I’ve always had a nagging feeling, any time I write XML parsing code in Python, that it just ought to be simpler.
Conceptually, I tend to think of an XML document as just a big nested Python dictionary that I should be able to walk as such or, even better, as a composition of objects. That’s probably a gross oversimplification, and not correct according to the spec or for some corner cases, but that’s just how my brain works. Show me an XML document and I’ll show you a big dictionary tree.
So I’ve always been disappointed with the experience of writing actual XML parsing code. In Flumotion I think we’ve gone through three different XML parsers, depending on which developer had to rewrite or add a piece of code that dealt with XML config files.
So each time I have to write XML parsing code again, I go looking for something better out there. This time I stumbled across a blog post that echoed my sentiments exactly. And it came with a solution: xml.etree.ElementTree. Finally! A library that more or less maps to my mental model. Reading through the tutorial and then writing the code I needed to parse my file took a lot less time than hand-rolling yet another set of functions to parse tags would have.
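To give an idea of what that looks like, here is a minimal sketch. It is not the code from my actual parser: the config document, its tag names and the element_to_dict helper are all made up for illustration. Only the ElementTree calls themselves (fromstring, findall, get, and the tag, attrib and text attributes) come from the library.

```python
import xml.etree.ElementTree as ET

# Hypothetical config document, invented for this example.
SAMPLE = """<config name="example">
  <component name="producer" type="videotest">
    <property name="width">320</property>
    <property name="height">240</property>
  </component>
  <component name="encoder" type="theora">
    <property name="bitrate">400</property>
  </component>
</config>
"""


def element_to_dict(element):
    """Turn an Element into a plain nested dictionary.

    This is deliberately lossy: it ignores tail text and mixed content,
    which is exactly the oversimplification described above.
    """
    return {
        "tag": element.tag,
        "attrib": dict(element.attrib),
        "text": (element.text or "").strip(),
        "children": [element_to_dict(child) for child in element],
    }


root = ET.fromstring(SAMPLE)

# Walk the document directly: iterating and findall() give child elements,
# get() reads an attribute, and .text holds the element's text content.
components = {}
for component in root.findall("component"):
    properties = {p.get("name"): p.text for p in component.findall("property")}
    components[component.get("name")] = {
        "type": component.get("type"),
        "properties": properties,
    }

# Or convert the whole thing into the "big dictionary tree" in one go.
tree = element_to_dict(root)
```

After this runs, components is an ordinary dictionary keyed by component name, and tree is the nested-dictionary view of the whole document. Note that all values come back as strings, so any conversion to numbers is up to you.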
I’m posting this to increase the Google juice of xml.etree.ElementTree. It simply doesn’t show up when you Google for “python xml”, and it should be hit number 1!