[Twisted-Python] Can anyone recommend a sensible XML parser for Python?
Glyph Lefkowitz
glyph at twistedmatrix.com
Mon Sep 2 05:07:45 EDT 2002
So, I'm really pretty discouraged and disgusted with the state of XML tools
that ship with Python today. Mainly, they do surprising and insecure things
when I try to parse XML, and I don't understand how to tell what will and won't
work between various versions of them.
I think my requirements of an XML parser are pretty simple. Here are the
basics of what I want it to do:
* adhere to a subset of both DOM and SAX APIs for both event-based and
synchronous processing of XML data
* allow creation of DOM trees from fragments of an XML stream so that
discrete "packets" can be processed, a la Jabber "XML streams"
* perform relatively well (optional)
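To make the first two wishes concrete, here is a minimal sketch, not a recommendation of any particular library. It assumes the standard library's xml.parsers.expat (a non-validating, event-based parser) and xml.dom.minidom are acceptable building blocks. The PacketStream class and its on_packet callback are hypothetical names invented for this illustration: SAX-style events come in, and each complete child of the stream's root element goes out as a small DOM tree.

```python
import xml.parsers.expat
from xml.dom import minidom


class PacketStream:
    """Feed XML text incrementally; hand each child of the root to on_packet.

    Hypothetical helper for illustration only.
    """

    def __init__(self, on_packet):
        self.on_packet = on_packet
        self.doc = minidom.Document()  # used only as a node factory
        self.stack = []                # open elements of the current packet
        self.depth = 0
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.parser.CharacterDataHandler = self._text

    def feed(self, data):
        # False means "more data may follow", so the stream root can stay open.
        self.parser.Parse(data, False)

    def _start(self, name, attrs):
        self.depth += 1
        if self.depth == 1:
            return  # the stream root itself is not a packet
        node = self.doc.createElement(name)
        for key, value in attrs.items():
            node.setAttribute(key, value)
        if self.stack:
            self.stack[-1].appendChild(node)
        self.stack.append(node)

    def _end(self, name):
        self.depth -= 1
        if not self.stack:
            return  # the stream root is closing
        node = self.stack.pop()
        if not self.stack:
            self.on_packet(node)  # a complete packet has been built

    def _text(self, data):
        if not self.stack:
            return  # ignore text directly under the stream root
        parent = self.stack[-1]
        last = parent.lastChild
        if last is not None and last.nodeType == last.TEXT_NODE:
            last.data += data  # merge so text is never split into pieces
        else:
            parent.appendChild(self.doc.createTextNode(data))
```

Usage would look something like PacketStream(handle).feed(chunk) for each chunk read off the wire, where handle is whatever function should receive the finished element. Merging adjacent character data in _text is one way to keep each run of text in a single Text node, even when it arrives across several feed() calls.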
More importantly, here are the things I *don't* want an XML parser to do:
* validate in any way, ever, at all
* fetch DTDs or otherwise do "helpful" things like eval()ing Python code
found in random attributes in the node tree
* break necessary extensions to SAX/DOM or subtle points of API compatibility
between versions, forcing my code to do lots of version checks
* look for Unicode flag characters
* pay attention to !DOCTYPE and ?xml directives
* split Text nodes into multiple pieces on newlines or whitespace
* pay attention specially to any attribute, like "xmlns"
* dump core
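A few of these "don'ts" can be checked against a candidate parser with a very small script. The sketch below again assumes xml.parsers.expat from the standard library; collect_events is a hypothetical helper name. It relies on two behaviours of that module: a parser created without a namespace_separator does no namespace processing, so "xmlns" arrives as an ordinary attribute, and with no external entity handler installed the DTD named in !DOCTYPE is not fetched.

```python
import xml.parsers.expat


def collect_events(text):
    """Return the start/end element events seen while parsing text.

    Hypothetical helper for illustration only.
    """
    events = []
    # No namespace_separator argument: namespace processing stays off.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = (
        lambda name, attrs: events.append(("start", name, dict(attrs))))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    # True marks this as the final (and only) chunk of the document.
    parser.Parse(text, True)
    return events


sample = (
    '<!DOCTYPE a SYSTEM "http://example.invalid/a.dtd">'
    '<a xmlns="urn:x" xmlns:b="urn:y"><b:c/></a>'
)
for event in collect_events(sample):
    print(event)
```

This only exercises the xmlns and !DOCTYPE items; it says nothing about the others on the list, and other parsers or other configurations of this one may behave differently.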
Does anybody know of an XML parser that meets these requirements, or am I going
to have to write my own?
--
| <`'> | Glyph Lefkowitz: Travelling Sorcerer |
| < _/ > | Lead Developer, the Twisted project |
| < ___/ > | http://www.twistedmatrix.com |