[Twisted-Python] Unicode

Grant Baillie grant at osafoundation.org
Mon Oct 3 19:01:15 MDT 2005


Well, I agree the message could be more brutal :).

What's the developer use case for "transparent exchange" of unicode
strings in a network framework? Every protocol and data format has
some different (sometimes goofy, and sometimes nonexistent) scheme
for encoding non-ASCII end-user strings. Since the internet only
understands bytes, it's almost certainly programmer error (failing
to implement the protocol's encoding scheme) if you try to send a
unicode object over the wire.

I no more expect

self.transport.write(u"Shoot me with a \u2022")

to work than

self.transport.write(7)

inside my protocol code, for exactly the same reason in both cases.
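
To make the point concrete, here is a minimal sketch in Python 3 terms, where the thread's unicode type corresponds to str and its byte strings to bytes. BytesOnlyTransport and send_line are hypothetical names made up for illustration; they are not Twisted's classes, and UTF-8 with a CRLF terminator is just an assumed wire format.

```python
class BytesOnlyTransport:
    """Hypothetical stand-in for a transport; not Twisted's implementation."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        # Refuse anything that is not already encoded bytes,
        # whether it is text or an integer.
        if not isinstance(data, bytes):
            raise TypeError("data must be bytes, got %s" % type(data).__name__)
        self.sent.append(data)


def send_line(transport, text, encoding="utf-8"):
    # The protocol code, not the transport, picks the wire encoding.
    # UTF-8 and the CRLF terminator are assumptions for this sketch.
    transport.write(text.encode(encoding) + b"\r\n")


transport = BytesOnlyTransport()
send_line(transport, "Shoot me with a \u2022")
```

The encoding decision lives in one visible place in the protocol code, and passing either text or 7 straight to write() fails the same way.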

--Grant

Grant Baillie
Open Source Applications Foundation
http://www.osafoundation.org

PS: As an aside, I actually believe a "default encoding" (site-wide  
or application-wide) scheme isn't so great either. It leads to  
developers making assumptions about the global setting, and those  
assumptions lead to different modules being incompatible.
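
A small sketch of how such assumptions clash, again in Python 3 terms. The two encodings here are arbitrary picks for illustration, standing in for two modules that each assume a different global default.

```python
text = "caf\u00e9"

# Two modules, each assuming a different "default" encoding.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# Reading UTF-8 bytes as Latin-1 silently produces the wrong text.
misread = utf8_bytes.decode("latin-1")

# Reading Latin-1 bytes as UTF-8 fails outright for this input.
try:
    latin1_bytes.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

Neither module is wrong on its own; they only disagree about a setting that neither of them states explicitly.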

On Oct 3, 2005, at 17:19, Ken Kinder wrote:

> Perhaps like many developers, I came across this surprising bit of
> code inside a couple of Twisted's methods:
>
>         if isinstance(data, unicode): # no, really, I mean it
>             raise TypeError("Data must not be unicode")
>
> And of course, I simply removed those lines. But I'm sure if I submit
> that patch, a discussion similar to this one would develop, because
> it's unlikely that such code would have been accidentally included:
>
>     http://twistedmatrix.com/pipermail/twisted-python/2005-April/010199.html
>
> The Python library will kindly cast unicode objects to strings when
> necessary, as is mentioned in the above thread. It *would* be fair to
> say that not implicitly deciding on an encoding type is "taking the
> high road" if the behavior of encoding weren't so uniformly explicit
> and consistent in Python and its standard library:
>
>     http://www.python.org/peps/pep-0100.html
>     http://docs.python.org/api/arg-parsing.html
>     http://docs.python.org/api/stringObjects.html
>
> (There are more...)
>
> The purpose of Python's unicode type is transparent exchange of string
> objects, whether those string objects are of type str or type unicode.
> Pretending that isn't so and raising a TypeError is not helpful. I
> would urge you to AT LEAST provide a detailed explanation in that
> error, explaining the philosophical disagreement you have with
> Python's unicode-string conversion behavior, and to provide a flag you
> can set to disable that check.






