[Twisted-Python] Can anyone recommend a sensible XML parser for Python?
Eron Lloyd
elloyd at lancaster.lib.pa.us
Thu Sep 5 15:53:09 MDT 2002
Are you referring to PyXML? I know xml.* in the Standard Library is
pretty weak (but getting better!). PyXML, on the other hand,
currently supports at least two pretty powerful parsers: Expat ("the"
parser for many projects, including Mozilla), and xmlproc (a robust
pure-Python parser that does validation). In fact, I believe Fred Drake
of PythonLabs is the maintainer of Expat, so Python will always have
strong Expat support. Also, I know Daniel Veillard is very interested in
"guaranteeing" Python wrappers for the GNOME libxml/libxslt C library
(http://www.xmlsoft.org/python.html). There are many more options I just
can't think of right now. All in all, there *is* a wealth of parsers
available to you; you just have to decide what you need. Check PyXML
(http://pyxml.sf.net) and contact the Python XML-SIG for help. Have
faith, Python is quickly shaping up to be a powerful XML platform.
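
For what it's worth, here is a minimal sketch of driving Expat
incrementally through the standard library's xml.parsers.expat module,
collecting each direct child of a Jabber-style stream root as it
completes. The StanzaCollector class and its dict layout are made up for
illustration and are not part of any library. Expat itself is
non-validating, and no namespace processing is requested here.

```python
import xml.parsers.expat


class StanzaCollector:
    """Gather each direct child of the stream root as a small dict tree."""

    def __init__(self):
        # No namespace_separator is passed, so Expat does no namespace
        # processing and "xmlns" arrives as an ordinary attribute.
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.parser.CharacterDataHandler = self._text
        self.depth = 0      # current element nesting depth
        self.stack = []     # open nodes below the stream root
        self.stanzas = []   # completed top-level fragments

    def _start(self, name, attrs):
        self.depth += 1
        if self.depth > 1:  # skip the stream root itself
            self.stack.append({"name": name, "attrs": attrs, "children": []})

    def _text(self, data):
        if not self.stack:
            return  # ignore text directly under the stream root
        children = self.stack[-1]["children"]
        if children and isinstance(children[-1], str):
            children[-1] += data  # merge adjacent text into one piece
        else:
            children.append(data)

    def _end(self, name):
        self.depth -= 1
        if not self.stack:
            return  # the stream root itself closed
        node = self.stack.pop()
        if self.stack:
            self.stack[-1]["children"].append(node)
        else:
            self.stanzas.append(node)

    def feed(self, data):
        # isfinal=False: more data may follow, as on a network socket.
        self.parser.Parse(data, False)

    def close(self):
        # isfinal=True tells Expat that the document is complete.
        self.parser.Parse(b"", True)


# Chunks deliberately split in the middle of a tag and of a text run.
collector = StanzaCollector()
for chunk in (b"<stream><message to='al", b"ice'><body>hel", b"lo</body></message>"):
    collector.feed(chunk)
print(collector.stanzas)
```

Since the handlers fire as data is fed in, a finished fragment generally
lands in collector.stanzas without waiting for the stream root to close,
and adjacent text is merged so a Text run is not split by chunk
boundaries.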
Cheers,
Eron
On Mon, 2002-09-02 at 05:07, Glyph Lefkowitz wrote:
>
> So, I'm really pretty discouraged and disgusted with the state of XML tools
> that ship with Python today. Mainly, they do surprising and insecure things
> when I try to parse XML, and I don't understand how to tell what will and won't
> work between various versions of them.
>
> I think my requirements of an XML parser are pretty simple. Here are the
> basics of what I want it to do:
>
>
> * adhere to a subset of both DOM and SAX APIs for both event-based and
> synchronous processing of XML data
>
> * allow creation of DOM trees from fragments of an XML stream so that
> discrete "packets" can be processed, a-la jabber "xml streams"
>
> * perform relatively well (optional)
>
> More importantly, here are the things I *don't* want an XML parser to do:
>
> * validate in any way, ever, at all
>
> * fetch DTDs or otherwise do helpful things like eval()ing python code
> found in random attributes in the node tree
>
> * break necessary extensions to SAX/DOM and subtleties of API compatibility
> between versions, making my code do lots of checks
>
> * look for Unicode flag characters
>
> * pay attention to !DOCTYPE and ?xml directives
>
> * split Text nodes into multiple pieces on newlines or whitespace
>
> * pay attention specially to any attribute, like "xmlns"
>
> * dump core
>
> Does anybody know of an XML parser that meets these requirements or am I going
> to have to write my own?
>
> --
> | <`'> | Glyph Lefkowitz: Travelling Sorcerer |
> | < _/ > | Lead Developer, the Twisted project |
> | < ___/ > | http://www.twistedmatrix.com |
--
Eron Lloyd
Technology Coordinator
Lancaster County Library
elloyd at lancaster.lib.pa.us
Phone: 717-239-2116
Fax: 717-394-3083