[Twisted-Python] Twisted 16.6.0rc1 Release Candidate Announcement

Mark Williams markrwilliams at gmail.com
Thu Nov 17 07:43:22 MST 2016


On Wed, Nov 16, 2016 at 11:22:49PM -0800, Glyph Lefkowitz wrote:
> However, is it really a regression to have py3 support for Words that just doesn't support other encodings yet?  It strikes me that this is just a bug, and that we should just fall back from UTF-8 to latin-1 in this scenario.  But adding that fallback is a small additional fix (perhaps one that should be slated for 16.6.0 if you want to make it).

Falling back to latin-1 will address the most obvious issue exposed by
the client in the re-opened ticket.  It will not fix the general issue.
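
To make the limitation concrete, here is a minimal sketch of such a
fallback.  The helper name decode_with_fallback is hypothetical and is
not part of Twisted; it only illustrates why latin-1 "works" for any
input yet still produces the wrong text for other 8-bit encodings.

```python
def decode_with_fallback(raw):
    """Decode bytes as UTF-8, falling back to latin-1 (hypothetical helper)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this never raises,
        # but the result is only correct if the sender really used latin-1.
        return raw.decode("latin-1")


utf8_text = decode_with_fallback("café".encode("utf-8"))
latin1_text = decode_with_fallback("café".encode("latin-1"))
# Bytes from a windows-1251 sender decode without error, but as mojibake.
cp1251_text = decode_with_fallback("привет".encode("windows-1251"))
```

The first two calls recover the original string; the third returns
text that no longer matches what the sender typed.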

Note that my sample was heavily biased towards European servers.
Other IRC servers in other regions might prefer a different 8-bit
encoding, like windows-1251 or Big5.  And often a single server will
see a long tail (or at least a tail) of different 8-bit encodings.
Listing all channels on a server, as the example script does, cannot
be done with an implementation that decodes input as text prior to
parsing it.  An implementation that parses bytes first can even use
chardet to guess each value's encoding afterwards.
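
A rough sketch of what "parse first, decode later" could look like.
parse_message is a hypothetical, simplified parser written for this
example (it is not the Words API), and the server name and channel
lines are made up.  It assumes each line is one RPL_LIST (322) reply
with the CRLF already stripped.

```python
def parse_message(line):
    """Split a raw IRC line (bytes) into (prefix, command, params)."""
    prefix = None
    if line.startswith(b":"):
        prefix, line = line[1:].split(b" ", 1)
    if b" :" in line:
        # Everything after " :" is a single trailing parameter.
        head, trailing = line.split(b" :", 1)
        params = head.split() + [trailing]
    else:
        params = line.split()
    command = params.pop(0)
    return prefix, command, params


# Three made-up channel listings, each in a different encoding.
raw_lines = [
    ":irc.example.net 322 me #café 3 :Un café".encode("latin-1"),
    ":irc.example.net 322 me #чат 5 :Привет".encode("windows-1251"),
    ":irc.example.net 322 me #日本語 2 :こんにちは".encode("utf-8"),
]

# Parsing never touches the non-ASCII bytes, so every line succeeds.
channels = [parse_message(line)[2][1] for line in raw_lines]

# Decoding the whole line as UTF-8 up front fails for two of the three.
utf8_failures = 0
for line in raw_lines:
    try:
        line.decode("utf-8")
    except UnicodeDecodeError:
        utf8_failures += 1
```

Each entry in channels is still bytes, so the caller can decide per
channel which encoding to try, or hand the bytes to a detector.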

IRC's encoding situation mirrors that of file systems on POSIX.  A given
path's components can be in multiple encodings.  I believe at least
part of the reason FilePath's paths are bytes, even when
surrogateescape exists, is that Unicode paths on POSIX systems would
make FilePath unusable for perfectly valid use cases.  We can pretend
that IRC has a defined encoding, but doing so will make Words unusable
for perfectly valid use cases.
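
For anyone unfamiliar with surrogateescape, a small sketch of what it
does and where it gets awkward.  The file name is made up, and this
only illustrates the error handler itself; it is not a statement of
how FilePath is implemented.

```python
# A made-up path component in latin-1; it is not valid UTF-8.
raw_name = b"caf\xe9.txt"

# surrogateescape smuggles the undecodable byte through as a lone surrogate.
as_text = raw_name.decode("utf-8", "surrogateescape")

# The original bytes can be recovered with the same error handler...
round_tripped = as_text.encode("utf-8", "surrogateescape")

# ...but the string is not real text: a strict encode rejects it.
try:
    as_text.encode("utf-8")
except UnicodeEncodeError:
    strict_encode_ok = False
else:
    strict_encode_ok = True
```

The round trip is lossless, but any code that treats as_text as
ordinary text (writing it to a log, sending it over the wire) has to
know about the escaped byte or it will raise.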

> -glyph
