[Twisted-Python] twisted.vfs issues

Jp Calderone exarkun at divmod.com
Wed Sep 28 18:04:24 MDT 2005


On Thu, 29 Sep 2005 05:49:04 +1000, Andy Gayton <andy at thecablelounge.com> wrote:
>James Y Knight wrote:
>>So, starting to look through twisted.vfs, I'm finding a few things  that 
>>need work.
>
>Hey James,
>
>Thanks for the feedback.  We need it.  Heaps of decisions for the vfs stuff 
>have been put off to see what other use cases would need from the vfs. 
>Inparticular permissions and metadata.

With that in mind.... ;)

>>1) I see no way of reading from or writing to a file in 
>>ivfs.IFileSystemLeaf.
>
>The vfs stuff is still heavily influenced by the interface that conch 
>expects as sftp has been the main motivation for the current contributors.
>
>Reading and writing is done through writeChunk and readChunk - we've always 
>felt this wasn't quite right though for a general backend. But after two 
>sprints we still haven't come up with something that is better.  Adding the 
>web2.Stream adaptor seems to have glazed over the issue for protocols that 
>read/writeChunk doesn't work for.  Spiv even used streams for the vfs ftp 
>adaptor!

  twisted.vfs should not import things from or depend upon twisted.web2:

    * web2 is unreleased
    * web2's APIs are unstable
    * vfs is more generally applicable than web2 
    * web2's stream abstraction is not generally agreed upon

  If you like, we can talk more about how this interface should work.  However, my first inclination is to say that it should use the existing producer/consumer APIs.  While these are not the best APIs, they are used widely throughout Twisted, and therefore this will give the greatest usability to the resulting VFS code.  While there are adapters between these APIs and web2 streams, I still recommend against web2 streams for the reasons mentioned above.

>
>I've added read/writeChunk to ivfs.IFileSystemLeaf's interface.

  I mentioned these in a separate email, so I won't repeat those points.

>>2) createFile is racy -- it requires opening a file by the given  name, 
>>with default permissions, then immediately closing it.
>
>:), racy is good right?
>>In  addition, it doesn't specify whether it's an error if the file  already 
>>exists.
>
>It should, I've added this to the interface.
>>3) Looks like all operations are blocking? What about a remote vfs? I 
>>think every operation in the vfs interface ought to be non-blocking.
>
>The other option is the vfs interface could be maybe deferred.  Most 
>protocols are good at handling this (sftp, streams).  But given how easy it 
>is to return deferred.succeed - it's probably simpler to say always non- 
>blocking.

  I assume you mean that they should always return a Deferred.  In this case, I agree.  maybeDeferred is intended as a convenience for application-level code.  Framework-level code can avoid introducing the need for it at the application-level by simply always using Deferreds.

>>4) IFileSystemNode.remove doesn't say whether it's a recursive delete  (on 
>>a directory)
>
>Hrm yeah - should it?  Or should this be handled by higher level utilities 
>(eg shutil).  The current os backend uses os.rmdir, so doesn't do a 
>recursive delete.  I've updated the interface to say that it doesn't.
>>, and .rename don't specify whether newName can be in  a different 
>>directory, whether it replaces an existing file, or  whether it works on a 
>>directory.
>
>The method is against Node, so it works on directories.
>
>This is os.rename's spec:
>
>---
>Rename the file or directory src to dst. If dst is a directory, OSError will 
>be raised. On Unix, if dst exists and is a file, it will be removed silently 
>if the user has permission. The operation may fail on some Unix flavors if 
>src and dst are on different filesystems. If successful, the renaming will 
>be an atomic operation (this is a POSIX requirement). On Windows, if dst 
>already exists, OSError will be raised even if it is a file; there may be no 
>way to implement an atomic rename when dst names an existing file. 
>Availability: Macintosh, Unix, Windows.
>---
>
>Should vfs be aiming to provide consistent behaviour for all operations 
>across all backends?  Or should some behaviour be left down to the 
>particular backend to decide?
>
>For the moment I've updated the interface to read:
>
>Renames this node to newName.  newName can be in a different directory.  If 
>the destination is an existing directory, an error will be raised.

The semantics provided by vfs should be the same across all platforms and all backends.  Since os.rename's semantics vary between platforms, this probably eliminates it from (unaided) use in an implementation.  .rename() in VFS should work across filesystems, guarantee atomicity (if this is feasible - I think it is.  If not, it should explicitly deny atomicity), and have well-defined edge cases (for example, whether an exception is raised because the destination exists already should be defined one way or the other, and that's how it should always work).

>>5) Errors are coarse-grained. Everything is a VFSError, and the only 
>>detailed information is in human-readable text, not any nice computer- 
>>readable form.
>
>yeah :( that needs to be fixed.
>>6) Need some support in the interface for extended attributes.
>
>There's getMetadata. That let's you return arbitrary attributes.
>
>Would that cover what you're thinking?
>
>Protocol's should try to get by with as little metadata as they can.  If a 
>backend doesn't supply a bit of metadata a protocol must have, then it won't 
>be able to be used with the protocol.
>

There needs to be a convention for the format of this metadata.  Protocol implementations should not need to be familiar with the backend they are using, and different backends should provide the same metadata in the same way.  It may make sense to expand the example dictionary in getMetadata's docstring, and continue expanding it as new requirements are made (perhaps getMetadata's docstring isn't the best place for this information, either).  This still doesn't strike me as ideal, but it's better than nothing.

Going further, I'd like to see pathutils implemented in terms of twisted.python.filepath: there's a lot of code duplication between these two modules.

The code in twisted/vfs/adapters/dav.py is misplaced.  Itamar posted to this list about this issue a couple weeks ago, but I'll re-iterate.  Third-party package dependencies need to be considered carefully.  Most importantly, dependencies *must* not be cyclic.  Twisted cannot import from akadav, because akadav imports from Twisted.  If akadav can be used to provide VFS functionality, then the adapters to do so belong in akadav, or in some other package: not beneath the Python package "twisted".

As I mentioned above, twisted/vfs/adapters/ftp.py and stream.py shouldn't be importing from twisted.web2.  Likewise, twisted/vfs/adapters/sftp.py's dependence on twisted.conch is backwards: twisted.conch should provide code which augments twisted.vfs.  These are both great candidates for use of the plugin system.  This also lets you take care of the nasty registration-requires-import issues, since gathering plugins will necessarily import the required modules, or if not, will provide a hook so that they can be imported at precisely the right time.

Some easy things: new code in Twisted should use new-style classes; modules should have `test-case-name' declarations; zope Interface's convention is to not include "self" in method declarations; "type(x) is y" is generally wrong - osfs.py uses it in getMode() - both because isinstance() should really be used, and because type checking generally indicates some weakness in an API (why might the mode be either a string or an integer?  pick one and require only that).

I hope this doesn't come off as too critical :)  I'm very much looking forward to the day when setting up a dav server against a purely virtual, dynamic filesystem is as easy as implementing a couple interfaces out of ivfs.py.

Jp




More information about the Twisted-Python mailing list