[Twisted-Python] Can anyone recommend a sensible XML parser for Python?

Glyph Lefkowitz glyph at twistedmatrix.com
Fri Sep 6 02:39:22 EDT 2002

On 05 Sep 2002 17:53:09 -0400, Eron Lloyd <elloyd at lancaster.lib.pa.us> wrote:
> Are you referring to PyXML? I know xml.* in the Standard Library is 
> pretty weak by far (but getting better!).

Yes.  In fact, PyXML is a big part of the problem.  Its "minidom" module, for
example, is *far* buggier than the one found in the standard library.  (As an
example of that, try to figure out how to make cloneNode work on a Document.)

I could deal with one set of potential problems and pitfalls using XML in
Python and work around them, but I have to work around every combination of
versions to make a useful app that doesn't have very stringent installation
requirements: in practice this means 4 environments: python2.1 with pyxml,
python2.1 standalone, python2.2 with pyxml, python2.2 standalone.

I don't want a plethora of XML parsers with rich features, all of which are
broken.  I want *one* XML parser that can *reliably* transform a stream of
bytes into a stream of nodes, and a text file into a tree of nodes.  You
mentioned validation in your post and I explicitly said that validation is
worse than useless to me; in most cases I want to parse XHTML, which means
dealing with lots of potentially DTD-violating stuff which is still "valid" as
far as I'm concerned.
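
To make the two operations above concrete, here is a minimal sketch of
"bytes into a stream of nodes" and "text into a tree of nodes".  It assumes a
present-day Python 3 standard library (xml.sax and xml.dom.minidom over the
non-validating expat parser), not the 2.1/2.2 environments listed earlier.
NodeCollector, stream_nodes, tree_of_nodes and SAMPLE are hypothetical names
made up for illustration, and the input has to be well-formed:

```python
import io
import xml.sax
from xml.dom import minidom


class NodeCollector(xml.sax.ContentHandler):
    """Hypothetical handler: records a flat list of (kind, value) events."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Skip whitespace-only runs; a parser may also split long text
        # into several calls, which this sketch does not rejoin.
        if content.strip():
            self.events.append(("text", content.strip()))


def stream_nodes(data):
    """Turn a stream of bytes into a stream of node events (no validation)."""
    handler = NodeCollector()
    xml.sax.parse(io.BytesIO(data), handler)
    return handler.events


def tree_of_nodes(data):
    """Turn the same bytes into a DOM tree (no validation)."""
    return minidom.parseString(data)


# Assumed sample input, not taken from any real page.
SAMPLE = b"<html><body><p>hello</p></body></html>"

events = stream_nodes(SAMPLE)
document = tree_of_nodes(SAMPLE)
```

Neither call checks the document against a DTD, so markup that violates the
XHTML DTD but is still well-formed goes through both paths unchanged.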

Eventually I'll clean up the problem cases I'm having and submit them as bug
reports, but right now it's not worth my time, because I really don't want to
deal with the fragility of the PyXML or python-standard-library xml.* stuff.

 |    <`'>    |  Glyph Lefkowitz: Traveling Sorcerer   |
 |   < _/ >   |  Lead Developer,  the Twisted project  |
 |  < ___/ >  |      http://www.twistedmatrix.com      |