Opened 15 years ago

Closed 15 years ago

#1984 defect closed (fixed)

Microdom drops the first character in an "extremely lenient" document

Reported by: radix
Owned by: radix
Priority: highest
Milestone:
Component: core
Keywords:
Cc:
Branch:
Author:

Description

>>> from twisted.web import microdom
>>> microdom.parseString('Nope<br> Sup yo <div>hi</div>', beExtremelyLenient=True).toxml()
'<?xml version="1.0"?><html>ope<br /> Sup yo <div>hi</div></html>'

Change History (4)

comment:1 Changed 15 years ago by radix

Owner: changed from Glyph to radix
Status: changed from new to assigned

There are two more issues I found that are similar.

  • Parsing a string like "foo" that contains no XML tags results in an error. It should probably result in "<html>foo</html>".
  • Parsing a string like "foo<br>bar" results in a document containing only "<br />". It should probably result in "<html>foo<br />bar</html>".

comment:2 Changed 15 years ago by radix

Keywords: review added
Owner: changed from radix to Glyph
Status: changed from assigned to new

Please review; the changes are in /branches/xml-sux-1984

comment:3 Changed 15 years ago by Jean-Paul Calderone

Keywords: review removed
Owner: changed from Glyph to radix
Priority: changed from normal to highest

Document the new instance attribute leadingBodyData.

Rename newly added test methods to use test_ prefix.

Add a test like testLeadingTextDropping, but for trailing-only text.

XMLParser.connectionLost calls the 2nd element of the parser's current state, which is pretty opaque. How about the END_HANDLERth element, or something like that?
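To make the suggestion concrete, here is a rough sketch of named indices into a per-state handler tuple. It is a toy, not the actual XMLParser code: the class ToyParser, the (begin, do, end) tuple layout, the stateTable attribute, and the BEGIN_HANDLER and DO_HANDLER names are all assumptions made for illustration. Only END_HANDLER comes from the comment above.

```python
# Hypothetical slot names for a (begin, do, end) handler tuple.
# The layout is an assumption for this sketch, not taken from the branch.
BEGIN_HANDLER = 0
DO_HANDLER = 1
END_HANDLER = 2


class ToyParser:
    """Toy single-state parser; stands in for a real state machine."""

    def __init__(self):
        self.state = "bodydata"
        self.bodydata = ""
        self.delivered = []
        # state name -> (begin, do, end) handlers
        self.stateTable = {
            "bodydata": (
                self.begin_bodydata,
                self.do_bodydata,
                self.end_bodydata,
            ),
        }
        self.stateTable[self.state][BEGIN_HANDLER]()

    def begin_bodydata(self):
        self.bodydata = ""

    def do_bodydata(self, char):
        self.bodydata += char

    def end_bodydata(self):
        # Flush buffered text so it is not silently dropped.
        if self.bodydata:
            self.delivered.append(self.bodydata)
            self.bodydata = ""

    def dataReceived(self, data):
        do = self.stateTable[self.state][DO_HANDLER]
        for char in data:
            do(char)

    def connectionLost(self):
        # Reads better than self.stateTable[self.state][2]().
        self.stateTable[self.state][END_HANDLER]()


parser = ToyParser()
parser.dataReceived("Nope")
parser.connectionLost()
print(parser.delivered)
```

With the named constant, a reader of connectionLost can see that the end handler for the current state is being run, without having to look up the tuple layout.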

Otherwise looks good, particularly the elimination of that extremely convoluted loop in parse().

comment:4 Changed 15 years ago by radix

Resolution: fixed
Status: changed from new to closed

Fixed in r17819.
