Opened 5 years ago

Closed 4 years ago

#4503 defect closed fixed (fixed)

domish gets confused by spaces in xmlns names

Reported by: michich Owned by:
Priority: normal Milestone:
Component: words Keywords:
Cc: Branch: branches/domish-xmlns-spaces-4503
(diff, github, buildbot, log)
Author: exarkun Launchpad Bug:

Description

I maintain the package Jabbim in Fedora. It's a Twisted-based XMPP client. Leandro Vázquez Cervantes reported to me a crash in Jabbim which turns out to be a bug in Twisted:

	  File "/usr/lib/python2.6/site-packages/twisted/internet/tcp.py", line 140, in doRead
	    return Connection.doRead(self)
	  File "/usr/lib/python2.6/site-packages/twisted/internet/tcp.py", line 463, in doRead
	    return self.protocol.dataReceived(data)
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/xmlstream.py", line 75, in dataReceived
	    self.stream.parse(data)
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/domish.py", line 756, in parse
	    self.parser.Parse(buffer)
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/domish.py", line 774, in _onStartElement
	    e = Element(qname, self.defaultNsStack[-1], attrs, self.localPrefixes)
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/domish.py", line 404, in __init__
	    self.uri, self.name = qname
	exceptions.ValueError: too many values to unpack

After some googling I found the same bug was affecting pyaimt in Debian: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=583263

It happens when a received stanza contains spaces in a xmlns, e.g.:

<data xmlns="http://www.xmpp.org/extensions/xep-0084.html#ns-data ">

(notice the space before the closing quotation mark)

I don't know if any standard forbids spaces in xmlns IRIs, but a ValueError is certainly not an acceptable result.

The attached patch fixes it.

Attachments (2)

twisted-words-domish-handle-xmlns-names-with-spaces.patch (972 bytes) - added by michich 5 years ago.
domish: handle spaces in xmlns IRIs
expattest.py (751 bytes) - added by michich 5 years ago.
a short demo of the problem

Download all attachments as: .zip

Change History (7)

Changed 5 years ago by michich

domish: handle spaces in xmlns IRIs

Changed 5 years ago by michich

a short demo of the problem

comment:1 Changed 4 years ago by exarkun

  • Author set to exarkun
  • Branch set to branches/domish-xmlns-spaces-4503

(In [29626]) Branching to 'domish-xmlns-spaces-4503'

comment:2 Changed 4 years ago by exarkun

  • Keywords review added; patch removed
  • Owner exarkun deleted

As far as I can tell, spaces are allowed here and change the value. See http://www.w3.org/TR/REC-xml-names/#NSNameComparison which specifies the URI to be the normalized value of the attribute, and http://www.w3.org/TR/REC-xml/#AVNormalize which specifies how to normalize values. In particular:

For a white space character (#x20, #xD, #xA, #x9), append a space character (#x20) to the normalized value.

So

"http://www.xmpp.org/extensions/xep-0084.html#ns-data"

and

"http://www.xmpp.org/extensions/xep-0084.html#ns-data "

represent different namespaces. It seems quite unlikely to me that someone intentionally specified the latter value rather than the former, but it seems that the specification forces us to go along with their mistake.

In any case, it seems that the solution of just being more careful about how the values are split up is the correct solution. It will preserve spaces in the value and avoid the later TypeError.

Build results

comment:3 Changed 4 years ago by Screwtape

  • Keywords review removed
  • Owner set to exarkun

For some reason (possibly age), the build results mention that some tests failed but don't provide logs that show *which* tests failed, so I'm assuming they're known-intermittent failures (otherwise you probably wouldn't have put this patch up for review).

Also, the reference on "how to normalize values" has extra whitespace munging steps for attributes whose type "is not CDATA", but I can't find any reference to what the type of namespace values is. "All attributes for which no declaration has been read SHOULD be treated by a non-validating processor as if declared CDATA", so I guess CDATA is a reasonable assumption.

Let's merge it!

comment:4 Changed 4 years ago by exarkun

  • Resolution set to fixed
  • Status changed from new to closed

(In [29852]) Merge domish-xmlns-spaces-4503

Author: michich, exarkun
Reviewer: screwtape
Fixes: #4503

Be careful when splitting namespace URIs from element and attribute
names in the XMPP DOM creation code so that namespace URIs containing

spaces don't trigger an unhandled exception.

comment:5 Changed 4 years ago by <automation>

  • Owner exarkun deleted
Note: See TracTickets for help on using tickets.