[Twisted-Python] Python3: should paths be bytes or str?

exarkun at twistedmatrix.com
Sun Sep 7 20:14:10 MDT 2014


On 01:26 am, wolfgang.kde at rohdewald.de wrote:
>The porting guide says
>
>No byte paths in sys.path.

What porting guide is that?
>
>doc for FilePath says
>    On both Python 2 and Python 3, paths can only be bytes.
>
>
>I stumbled upon this while trying to find out how much work it might be
>to make bin/trial run with python3
>
>admin/run-python3-tests already passes for all twisted.spread related
>tests but I still need to clean up a lot.
>
>after adding an assert to FilePath.__init__, python3 bin/trial ... gives
>
>  File "/home/wr/ssdsrc/Twisted/twisted/scripts/trial.py", line 601, in run
>    config.parseOptions()
>  File "/home/wr/ssdsrc/Twisted/twisted/python/usage.py", line 277, in parseOptions
>    self.postOptions()
>  File "/home/wr/ssdsrc/Twisted/twisted/scripts/trial.py", line 472, in postOptions
>    _BasicOptions.postOptions(self)
>  File "/home/wr/ssdsrc/Twisted/twisted/scripts/trial.py", line 382, in postOptions
>    self['reporter'] = self._loadReporterByName(self['reporter'])
>  File "/home/wr/ssdsrc/Twisted/twisted/scripts/trial.py", line 369, in _loadReporterByName
>    for p in plugin.getPlugins(itrial.IReporter):
>  File "/home/wr/ssdsrc/Twisted/twisted/plugin.py", line 209, in getPlugins
>    allDropins = getCache(package)
>  File "/home/wr/ssdsrc/Twisted/twisted/plugin.py", line 134, in getCache
>    mod = getModule(module.__name__)
>  File "/home/wr/ssdsrc/Twisted/twisted/python/modules.py", line 781, in getModule
>    return theSystemPath[moduleName]
>  File "/home/wr/ssdsrc/Twisted/twisted/python/modules.py", line 702, in __getitem__
>    self._findEntryPathString(moduleObject)),
>  File "/home/wr/ssdsrc/Twisted/twisted/python/modules.py", line 627, in _findEntryPathString
>    if _isPackagePath(FilePath(topPackageObj.__file__)):
>  File "/home/wr/ssdsrc/Twisted/twisted/python/filepath.py", line 664, in __init__
>    assert isinstance(path, bytes), 'path must be bytes: %r' % (path,)
>AssertionError: path must be bytes: '/home/wr/ssdsrc/Twisted/twisted/__init__.py'

If paths are being represented using unicode somewhere and you want to 
use them with FilePath then you have to encode them (or you have to add 
unicode path support to FilePath and let FilePath encode them).
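
As a rough sketch of the "encode them yourself" option on Python 3: the 
helper name `as_bytes_path` below is made up for illustration and is not 
part of Twisted.  It relies only on the standard library's `os.fsencode`, 
which encodes a str path with the filesystem encoding.

```python
import os


def as_bytes_path(path):
    """Hypothetical helper (not a Twisted API): return `path` as bytes.

    bytes are passed through untouched; str is encoded with the
    filesystem encoding via os.fsencode (Python 3.2+).
    """
    if isinstance(path, bytes):
        return path
    return os.fsencode(path)


# A module's __file__ is a str on Python 3, so it would need this
# conversion before being handed to something that insists on bytes.
encoded = as_bytes_path(os.__file__)
```

Something like `FilePath(as_bytes_path(topPackageObj.__file__))` would then 
satisfy the assert in the traceback above, assuming the rest of the code 
is happy to receive bytes back.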

Unfortunately it's not entirely obvious how to make FilePath support 
unicode paths since not all platforms Twisted supports represent 
filesystem paths using unicode.

The choice python-dev made to bridge this gap was the creation of the 
"surrogateescape" error handler for the UTF-8 codec.  This lets you 
pretend that any time you need to convert between bytes and unicode the 
correct codec is UTF-8 (with this special error handler).
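
A small illustration of what that handler does, using only the standard 
codec machinery.  The file name is an invented example; the point is just 
that bytes which are not valid UTF-8 survive a round trip through str.

```python
# A path whose name was written in Latin-1; 0xE9 is not valid UTF-8 here.
raw = b"/tmp/caf\xe9.txt"

# Decoding with the surrogateescape handler never fails: each
# undecodable byte 0xNN is mapped to the lone surrogate U+DCNN.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns the surrogate back into the
# original byte, so the round trip is lossless.
round_tripped = text.encode("utf-8", "surrogateescape")
```

Here `text` contains U+DCE9 in place of the byte 0xE9, and `round_tripped` 
is equal to `raw`.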

It's not clear this was a good choice (since the result is unicode 
strings that may contain garbage which will confuse other software) but 
it's also not clear it's possible for Twisted to try to make any other 
choice (at some point Twisted has to interoperate with the path-related 
APIs in Python itself - `sys.path`, for example).
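
To make the "garbage" concern concrete, here is a sketch of what happens 
when such a string reaches code that does not know about the special 
error handler.  The path is again an invented example.

```python
# A str produced by decoding non-UTF-8 bytes with surrogateescape.
text = b"/tmp/caf\xe9.txt".decode("utf-8", "surrogateescape")

# Software that encodes with the default (strict) error handler
# cannot handle the lone surrogate and raises instead.
try:
    text.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True
```

So the string looks like ordinary unicode, but writing it to a log file, 
a socket, or a database with a plain UTF-8 encode will blow up.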

Not sure if that helps you at all.  Maybe it outlines the problem a 
little more clearly, at least.

Jean-Paul



