[Twisted-Python] regarding xml elements
glyph at divmod.com
Sat Mar 29 23:08:15 MDT 2008
On 12:56 am, phil at bubblehouse.org wrote:
>On Mar 28, 2008, at 7:33 PM, Jean-Paul Calderone wrote:
>>On Fri, 28 Mar 2008 22:59:21 -0000, glyph at divmod.com wrote:
>>>On 02:55 pm, exarkun at divmod.com wrote:
>>>>On Fri, 28 Mar 2008 10:51:10 -0400, Phil Christensen
>>>><phil at bubblehouse.org> wrote:
>> >>> from twisted.web.microdom import parseString
>> >>> s = '<div><span>hello</span> <span>world</span></div>'
>> >>> parseString(s).toxml()
>> '<?xml version="1.0"?><div><span>hello</span><span>world</span></div>'
>> >>>
>>So if you need such advanced XML features as correct whitespace
>>handling, steer clear. ;)
>I have to say, I don't find this to be that big an issue. I think if
>you're using XML as a data interchange format (as I know the original
>poster was), whitespace is generally syntactically meaningless.
Like many things in Microdom, whitespace handling does not strive to be
particularly spec-compliant (the spec does say "An XML processor MUST
always pass all characters in a document that are not markup through to
the application."), but to be useful for simple cases and stable enough
that your code won't break. If you want whitespace you can probably
cram it in there. For example, it has a creative misinterpretation of
the "xml:space" attribute:
>>> from twisted.web.microdom import parseString
>>> s = '<div xml:space="preserve"><span>hello</span> <span>world</span></div>'
>>> parseString(s).toxml()
'<?xml version="1.0"?><div xml:space="preserve"><span>hello</span> <span>world</span></div>'
It is also hard-coded to preserve space in <pre> tags, which is broken
in its own way: it doesn't really honor namespaces, so it has no idea
whether your document is HTML, and it can't read DTDs, so it doesn't
know whether your elements have this attribute set implicitly (and so
on and so on).
This could be made into *slightly* less of a hack with a preserveSpace
argument to parse*(), of course; the implementation would probably be
very straightforward (cf. MicroDOMParser.shouldPreserveSpace). Maybe
someone who actually likes Microdom, such as Phil, will add one, since
all I'm committing to here is not totally hating it ;).
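
To make the shape of that idea concrete, here is a rough sketch of what a
preserve-space switch on a parse function might look like. It is written
against the standard library's xml.dom.minidom rather than Microdom, and
the names parse_string, preserve_space and _strip_whitespace_nodes are
made up for illustration; none of this is Microdom's actual API or a
proposed patch.

```python
from xml.dom import minidom


def parse_string(source, preserve_space=True):
    """Parse XML, optionally dropping whitespace-only text nodes.

    Hypothetical helper: it only illustrates a preserveSpace-style
    switch using the standard library, not Microdom itself.
    """
    doc = minidom.parseString(source)
    if not preserve_space:
        _strip_whitespace_nodes(doc.documentElement)
    return doc


def _strip_whitespace_nodes(node):
    # Iterate over a copy, because removeChild mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            _strip_whitespace_nodes(child)


s = '<div><span>hello</span> <span>world</span></div>'

# Default: whitespace between the two spans is passed through untouched.
kept = parse_string(s).documentElement.toxml()

# Opt out: whitespace-only text nodes are discarded after parsing.
stripped = parse_string(s, preserve_space=False).documentElement.toxml()
```

With the default, the output matches the input; with preserve_space=False
the space between the two spans goes away, much like the first Microdom
transcript quoted above. Defaulting to preservation is a deliberate choice
in this sketch, since that is what the spec asks of a processor.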