SAX parser
SAX - Simple API for XML.
SAX parser much more lightweight than DOM parser. So in case of large data structures with great amount of nodes selecting SAX is the only one right decision.
Python out of box already has all of You need to build SAX parser.
import sys from xml.sax import handler, parse class MyContentHandler(handler.ContentHandler): def __init__(self, out=sys.stdout): handler.ContentHandler.__init__(self) self._out = out def startDocument(self): pass def startElement(self, name, attrs): self.content = '' def characters(self, content): self.content += content def endElement(self, name): self._out.write('%s = %s\n' % (name, self.content)) def endDocument(self): pass def main(): """ # example.xml <ResultSet> <Result id="1"> <name> Electromaster </name> </Result> <Result id="2"> <name> When You Work Under The Sun, You Need To Stay Rehydrated </name> </Result> <Result id="3"> <name> Tokiwadai is Targeted </name> </Result> </ResultSet> Result: python parser.py name = Electromaster Result = Electromaster name = When You Work Under The Sun, You Need To Stay Rehydrated Result = When You Work Under The Sun, You Need To Stay Rehydrated name = Tokiwadai is Targeted Result = Tokiwadai is Targeted ResultSet = Tokiwadai is Targeted """ f = open('example.xml') parse(f, MyContentHandler()) if __name__ == "__main__": main()
Also can be useful: xml.dom.
Links:
- http://en.wikipedia.org/wiki/Simple_API_for_XML
- http://docs.python.org/2/library/xml.sax.html
- http://mail.python.org/pipermail/python-dev/2000-October/009946.html
Licensed under CC BY-SA 3.0