[Twisted-Python] Can anyone recommend a sensible XML parser for Python?

Eron Lloyd elloyd at lancaster.lib.pa.us
Fri Sep 6 12:36:12 EDT 2002

Hmm, I know that minidom has had some problems recently, but it has also
seen some good improvements. It sounds like you need more robust DOM
support--have you tried 4DOM? It's not as fast, but it does adhere to
the spec the best. Maybe (when you have time) if you let us know what
you expect to accomplish we can help out--the people in XML-SIG are some
of the sharpest in the community. Perhaps TREX or RELAX-NG would be more
suitable. I guess the only comforting thing I can say is that every
development community is experiencing growing pains when it comes to an
XML strategy.
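
As a rough sketch of the two jobs described in the message below (bytes in,
a stream of nodes out; text in, a tree of nodes out), something like this
might be a starting point using only xml.dom.pulldom and xml.dom.minidom
from the standard library. The sample document and the helper names are
made up for illustration, and it assumes well-formed input; how it copes
with real-world XHTML is a separate question.

```python
import io
from xml.dom import minidom, pulldom

# Made-up, well-formed sample input (an assumption for illustration only).
SAMPLE = b"<html><body><p>Hello, <b>world</b></p></body></html>"


def stream_nodes(byte_stream):
    """Yield (event, node) pairs as the parser reads the byte stream."""
    for event, node in pulldom.parse(byte_stream):
        yield event, node


def element_names(byte_stream):
    """Collect element names in document order from the event stream."""
    return [
        node.tagName
        for event, node in stream_nodes(byte_stream)
        if event == pulldom.START_ELEMENT
    ]


def parse_tree(data):
    """Parse a complete document held in memory into a DOM tree."""
    return minidom.parseString(data)


if __name__ == "__main__":
    print(element_names(io.BytesIO(SAMPLE)))
    print(parse_tree(SAMPLE).documentElement.tagName)
```

Neither call validates against a DTD; both only require well-formedness,
which seems to match the "no validation" requirement.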

Good luck,


On Fri, 2002-09-06 at 02:39, Glyph Lefkowitz wrote:
> On 05 Sep 2002 17:53:09 -0400, Eron Lloyd <elloyd at lancaster.lib.pa.us> wrote:
> > Are you referring to PyXML? I know xml.* in the Standard Library is
> > pretty weak (but getting better!).
> Yes.  In fact, PyXML is a big part of the problem.  Its "minidom" module, for
> example, is *far* buggier than the one found in the standard library.  (As an
> example of that, try to figure out how to make cloneNode work on a Document
> object.)
> I could deal with one set of potential problems and pitfalls using XML in
> Python and work around them, but I have to work around every combination of
> versions to make a useful app that doesn't have very stringent installation
> requirements: in practice this means 4 environments: python2.1 with pyxml,
> python2.1 standalone, python2.2 with pyxml, python2.2 standalone.
> I don't want a plethora of XML parsers with rich features, all of which are
> broken.  I want *one* XML parser that can *reliably* transform a stream of
> bytes into a stream of nodes, and a text file into a tree of nodes.  You
> mentioned validation in your post and I explicitly said that validation is
> worse than useless to me; in most cases I want to parse XHTML, which means
> dealing with lots of potentially DTD-violating stuff which is still "valid" as
> far as I'm concerned.
> Eventually I'll clean up the problem cases I'm having and submit them as bug
> reports, but right now it's not worth my time, because I really don't want to
> deal with the fragility of the PyXML or python-standard-library xml.* stuff.
> -- 
>  |    <`'>    |  Glyph Lefkowitz: Traveling Sorcerer   |
>  |   < _/ >   |  Lead Developer,  the Twisted project  |
>  |  < ___/ >  |      http://www.twistedmatrix.com      |
Eron Lloyd
Technology Coordinator
Lancaster County Library
elloyd at lancaster.lib.pa.us
Phone: 717-239-2116
Fax: 717-394-3083

