[XML-SIG] Re: [Twisted-Python] Can anyone recommend a sensible XML parser for Python?

Fred L. Drake, Jr. fdrake at acm.org
Sat Sep 7 00:07:28 EDT 2002


Glyph Lefkowitz writes:
 > The reason I mentioned the cloneNode bug is because it is the most
 > reliable and the most trivial to demonstrate.  Like I said; at some
 > point, I will clean up my complaints and submit some bug reports.
 > Here's a "bare test case" of that particular spurious accusation:

This particular bug has already been fixed in CVS.

 > In order to do the work I want to do, though, those bug reports
 > aren't going to help.  Even if you resolved every bug report that I
 > submitted within a week, I would be stuck in the same place I am
 > now: I have to work around the bugs in a bunch of old versions of
 > PyXML or produce what amounts to my own `implementation' of an XML

If you're shipping commercial applications, ship the versions of
relevant libraries and Python needed for the application.  Eating up
disk space may be annoying, but it's cheap enough not to be a real
problem.  Bugs are a real problem, no matter how unfortunate, even if
they're not your own.

If the issue is that you're shipping a framework that needs to work
with as many other packages as possible, then document which versions
it's known to work with, which versions it's known not to work with,
and keep moving.
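
To make the "document which versions" suggestion concrete, here is a
minimal sketch of a compatibility table a framework could check at
import time.  Everything in it is an assumption made for illustration:
the names KNOWN_GOOD, KNOWN_BAD, parse_version and compatibility are
hypothetical, and the version strings are placeholders rather than a
tested compatibility list.

```python
def parse_version(text):
    """Turn a dotted version string such as '0.8.1' into (0, 8, 1)."""
    return tuple(int(part) for part in text.split("."))


# Placeholder data for illustration only; NOT a tested compatibility list.
KNOWN_GOOD = {parse_version(v) for v in ("0.8.1",)}
KNOWN_BAD = {parse_version(v) for v in ("0.8",)}


def compatibility(version_text):
    """Classify a library version against the documented lists."""
    version = parse_version(version_text)
    if version in KNOWN_BAD:
        return "known not to work"
    if version in KNOWN_GOOD:
        return "known to work"
    # Anything not listed has simply not been tried yet.
    return "untested"
```

A framework could call compatibility() with whatever version string
the installed library reports and warn (rather than fail) on
"untested".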

Please understand, I'm really sorry PyXML 0.8 had bugs, but we're not
getting paid for this, so I don't feel it's my job to double-check
every checkin that every PyXML developer makes before a release; I try
to make my checkins work as well as I can, and I do test with 4
different major versions of Python.  If you need PyXML to become
increasingly bug free over time, I'd like to suggest two things:

1. Keep track of the CVS version regularly, and test it out with your
   components.  Sometimes this can be tedious, but good automated
   tests can make this substantially easier.  Report bugs quickly
   using the SourceForge tracker.

2. Contribute regression tests to the project.  We know our tests are
   not complete, and are improving them with each release, but some
   assistance with this, especially when you report bugs, can make
   more of a difference even than contributing fixes (which are also
   welcome).

 > parser.  Granted, if I packaged a newer, fixed-up version of PyXML
 > with Twisted, I wouldn't have to be mucking about with bits and
 > bytes -- but I *would* have to understand the entire ontology of
 > confusion associated with cross-language XML APIs.

I must be missing something.  Doesn't it just mean that you need to
provide a sufficiently updated PyXML distribution?

 > My main frustration is with packaging.  If all the world were
 > running Debian unstable, I'd be fine: I'd just say Depends:
 > python2.2-xml >= 0.9.  However, with lots of users in Windows, and

Yeah, the packaging sucks.  It's not any worse than for any other bit
of library code though, as far as I can tell.  (I'll admit the horizon
for my sight is substantially limited to open source software,
however.)

 > For the applications that I'm intending to write, just doing my own
 > parser and API is both more appealing and more rewarding.  Neither
 > DOM nor SAX will present an API which allows me to get network XML
 > events in quite the way I want, so I'm going to have to do some

If you don't think the interfaces match your application space very
well, please describe your requirements and explain how the current
APIs don't meet your requirements, and what sort of APIs you're
looking for.

 > If the general quality of XML parsers in Python were really high, I
 > would regard this impulse as contrary and counterproductive -- why

You talk about parsers, but I don't think that's what you mean.  The
bug you referred to in minidom had nothing to do with the underlying
parser; it would have manifested itself with any parser you picked
that reported processing instructions.  ("All of them.")
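
For illustration, a regression check for this kind of minidom problem
might look roughly like the sketch below.  It uses the standard
library's xml.dom.minidom; the sample document and the helper names
(SAMPLE, processing_instructions, clone_preserves_pis) are made up
here, and since the original failing test case isn't included in this
message, this only shows the general shape of such a test: clone a
document that contains a processing instruction and compare the clone
with the original.

```python
from xml.dom import minidom

# Made-up sample document with one processing instruction in the prolog.
SAMPLE = '<?xml version="1.0"?><?target some data?><root><child/></root>'


def processing_instructions(node):
    """Return (target, data) pairs for PI nodes directly under `node`."""
    return [
        (child.target, child.data)
        for child in node.childNodes
        if child.nodeType == child.PROCESSING_INSTRUCTION_NODE
    ]


def clone_preserves_pis(source=SAMPLE):
    """Deep-clone a parsed document and check the PIs survive the copy."""
    doc = minidom.parseString(source)
    clone = doc.cloneNode(True)  # True requests a deep copy
    return (
        clone is not doc
        and processing_instructions(clone) == processing_instructions(doc)
        and clone.documentElement.tagName == doc.documentElement.tagName
    )
```

Wrapped in a unittest.TestCase method, a check like this is small
enough to attach to a bug report.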

 > So maybe I'm just rationalizing what I would have done anyway.
 > Nevertheless, it is easier to write my own XML parser than to even
 > properly report the bugs that I have thus far discovered.

As an Expat maintainer, I wish you luck.  ;-)

 > I appreciate that.  At some point I hope to have the time to run
 > down every last bug I've found and help PyXML to become very
 > robust.

Yes, bug reports are definitely necessary to develop a solid piece of
software.  I do hope we can encourage you to produce a few.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



