Opened 9 years ago

Closed 9 years ago

#4503 defect closed fixed (fixed)

domish gets confused by spaces in xmlns names

Reported by: Michal Schmidt
Owned by:
Priority: normal
Milestone:
Component: words
Keywords:
Cc:
Branch: branches/domish-xmlns-spaces-4503
Author: exarkun


I maintain the package Jabbim in Fedora. It's a Twisted-based XMPP client. Leandro Vázquez Cervantes reported to me a crash in Jabbim which turns out to be a bug in Twisted:

	  File "/usr/lib/python2.6/site-packages/twisted/internet/", line 140, in doRead
	    return Connection.doRead(self)
	  File "/usr/lib/python2.6/site-packages/twisted/internet/", line 463, in doRead
	    return self.protocol.dataReceived(data)
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/", line 75, in dataReceived
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/", line 756, in parse
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/", line 774, in _onStartElement
	    e = Element(qname, self.defaultNsStack[-1], attrs, self.localPrefixes)
	  File "/usr/lib/python2.6/site-packages/twisted/words/xish/", line 404, in __init__
	    self.uri, = qname
	exceptions.ValueError: too many values to unpack

After some googling I found the same bug was affecting pyaimt in Debian:

It happens when a received stanza contains spaces in an xmlns value, e.g.:

<data xmlns=" ">

(notice the space before the closing quotation mark)

I don't know if any standard forbids spaces in xmlns IRIs, but a ValueError is certainly not an acceptable result.
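
A minimal sketch of the failure mode, assuming the parser reports each element name as the namespace URI, a single space, and the local name. The helper `naive_qname` and the `example.invalid` URI are hypothetical and only illustrate the unpacking error from the traceback; this is not the actual domish code.

```python
# Hypothetical illustration; not the actual twisted.words.xish.domish code.
# Assumption: the parser reports each element name as
# "<namespace URI><space><local name>".
NAMESPACE_SEPARATOR = " "


def naive_qname(name):
    """Split on every separator; breaks if the URI itself contains one."""
    return name.split(NAMESPACE_SEPARATOR)


# A namespace URI with a trailing space, then the separator, then the
# local name "data" (example.invalid is a placeholder URI).
reported = "http://example.invalid/ns " + NAMESPACE_SEPARATOR + "data"

parts = naive_qname(reported)

try:
    uri, local = parts  # expects exactly two items
    error = None
except ValueError as exc:
    # The extra space produced a third item, so unpacking fails.
    error = exc
```

Because the trailing space in the URI sits next to the separator, the split yields three items instead of two, and the two-name unpacking raises ValueError.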

The attached patch fixes it.

Attachments (2)

twisted-words-domish-handle-xmlns-names-with-spaces.patch (972 bytes) - added by Michal Schmidt 9 years ago.
domish: handle spaces in xmlns IRIs (751 bytes) - added by Michal Schmidt 9 years ago.
a short demo of the problem


Change History (7)

Changed 9 years ago by Michal Schmidt

domish: handle spaces in xmlns IRIs

Changed 9 years ago by Michal Schmidt

Attachment: added

a short demo of the problem

comment:1 Changed 9 years ago by Jean-Paul Calderone

Author: exarkun
Branch: branches/domish-xmlns-spaces-4503

(In [29626]) Branching to 'domish-xmlns-spaces-4503'

comment:2 Changed 9 years ago by Jean-Paul Calderone

Keywords: review added; patch removed
Owner: Jean-Paul Calderone deleted

As far as I can tell, spaces are allowed here and change the value. See the Namespaces in XML specification, which specifies the URI to be the normalized value of the attribute, and the XML specification, which specifies how to normalize values. In particular:

For a white space character (#x20, #xD, #xA, #x9), append a space character (#x20) to the normalized value.




" "

represent different namespaces. It seems quite unlikely to me that someone intentionally specified the latter value rather than the former, but it seems that the specification forces us to go along with their mistake.

In any case, just being more careful about how the values are split up seems to be the correct solution. It will preserve spaces in the value and avoid the later ValueError.
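
One way to be "more careful", sketched under the assumption that the parser joins the namespace URI and the local name with a single space and that local names never contain spaces: split only at the last separator. The helper `split_qname` is hypothetical and is not necessarily what the merged patch does.

```python
def split_qname(name, separator=" "):
    """Return (uri, local_name) for a parser-reported element name.

    Assumption: the name is either a bare local name or
    "<namespace URI><separator><local name>". Splitting once from the
    right keeps any separators inside the URI intact.
    """
    parts = name.rsplit(separator, 1)
    if len(parts) == 1:
        # No namespace was reported for this name.
        return ("", name)
    return (parts[0], parts[1])


# example.invalid is a placeholder URI; note the trailing space in it.
with_space = split_qname("http://example.invalid/ns  data")
without_space = split_qname("http://example.invalid/ns data")
no_namespace = split_qname("data")
```

With this approach a URI ending in a space stays distinct from the same URI without it, which matches the reading that the two are different namespaces.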

Build results

comment:3 Changed 9 years ago by Screwtape

Keywords: review removed
Owner: set to Jean-Paul Calderone

For some reason (possibly age), the build results mention that some tests failed but don't provide logs that show *which* tests failed, so I'm assuming they're known-intermittent failures (otherwise you probably wouldn't have put this patch up for review).

Also, the reference on "how to normalize values" has extra whitespace munging steps for attributes whose type "is not CDATA", but I can't find any reference to what the type of namespace values is. "All attributes for which no declaration has been read SHOULD be treated by a non-validating processor as if declared CDATA", so I guess CDATA is a reasonable assumption.
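
The difference between the two normalization modes can be sketched as follows. This is a simplified illustration that ignores character references, entity references, and line-end handling; the function names are hypothetical.

```python
import re

# The four XML white space characters: #x20, #xD, #xA, #x9.
XML_WHITESPACE = "\x20\x0d\x0a\x09"


def normalize_cdata(value):
    """CDATA attributes: each white space character becomes one #x20."""
    return "".join(" " if ch in XML_WHITESPACE else ch for ch in value)


def normalize_non_cdata(value):
    """Non-CDATA attributes: apply the CDATA step, then discard leading
    and trailing spaces and collapse runs of spaces to a single space."""
    return re.sub(" +", " ", normalize_cdata(value).strip(" "))


cdata_result = normalize_cdata("urn:x\t")
non_cdata_result = normalize_non_cdata(" urn:x \t")
```

Under the CDATA assumption a trailing tab or space survives as a trailing space, so it remains part of the namespace name; only the non-CDATA rules would strip it.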

Let's merge it!

comment:4 Changed 9 years ago by Jean-Paul Calderone

Resolution: fixed
Status: new → closed

(In [29852]) Merge domish-xmlns-spaces-4503

Author: michich, exarkun Reviewer: screwtape Fixes: #4503

Be careful when splitting namespace URIs from element and attribute names in the XMPP DOM creation code so that namespace URIs containing spaces don't trigger an unhandled exception.

comment:5 Changed 8 years ago by <automation>

Owner: Jean-Paul Calderone deleted