Ticket #3931 (new defect)

Opened 13 months ago

Last modified 10 months ago

t.p.amp.AmpBox.serialize will _not_ return a string if any value is unicode

Reported by: dotz Owned by: dotz
Priority: normal Milestone:
Component: core Keywords: win32
Cc: exarkun Branch:
Author: Launchpad Bug:

Description

If you place an unicode value in AmpBox, AmpBox.serialize will return an unicode string, not str.

For example:

>>> from twisted.protocols.amp import AmpBox
>>> a = AmpBox(asdf = 'foo')
>>> type(a.serialize())
<type 'str'>
>>> a = AmpBox(asdf = u'foo')
>>> type(a.serialize())
<type 'unicode'>

This will cause funny errors on some platforms (see #3930).

This is also against AmpBox.serialize docstring, which says that it should return a str type, not unicode.

    def serialize(self):
        """
        Convert me into a wire-encoded string.

        @return: a str encoded according to the rules described in the module
        docstring.
        """

Please let me know what is the preferred way to get this fixed in trunk and I'll prepare a patch.

Attachments

fix-3931.patch Download (1.5 KB) - added by dotz 10 months ago.
Let's check keys and values during serialize and if the strict mode is enabled, let's fail. Also includes an unit test

Change History

Changed 12 months ago by dotz

Do you need anything else from me to get this one small change merged?

Changed 12 months ago by glyph

  • owner changed from glyph to dotz

Best solution: don't put unicode strings into AmpBox :).

Thanks for poking this issue, I certainly would have forgotten about it. I am not going to fix it myself. In most Python libraries, these sorts of checks are left out; if you put other random 'str' subclasses or non-string objects into an AmpBox you're going to get bogus results too, this isn't unique to 'unicode' objects.

If this is a problem that you're running into frequently, then please feel free to add a "strict" or "debug" mode which will reject anything but str:str mappings in an AmpBox; make sure that you cover .update, .__setitem__, .setdefault, and any other methods that dicts might have that I've forgotten about. Submit a patch, with tests, submit it for review.

Changed 10 months ago by dotz

Let's check keys and values during serialize and if the strict mode is enabled, let's fail. Also includes an unit test

Changed 10 months ago by dotz

  • keywords win32 review added
  • owner dotz deleted

I added a patch, that makes debugging a lot easier. Thanks for your time!

Changed 10 months ago by therve

  • owner set to glyph

I don't feel super great about this "strict" mode, so I'll let glyph review this.

Changed 10 months ago by exarkun

  • cc exarkun added
  • keywords review removed
  • owner changed from glyph to dotz

I think the "strict" mode is silly too, particularly if it defaults to on. Just do the checks all the time. Doing it in serialize is a bit later than would be ideal, but it has the advantage of avoiding the need to do it in every single dict mutation method. So I guess I like it. Please use the one-argument form of raise, though, and include the name of the offending key.

Note: See TracTickets for help on using tickets.