[Twisted-Python] Large file transfers

Bruce Mitchener bruce at cubik.org
Fri Jul 26 09:41:22 EDT 2002


Steve Waterbury wrote:
> Twisted gurus,
> 
> I just noticed item 008 on the twisted TO DO list:
> 
>     File Transfer layer for PB.  This would be especially nice for
>     twisted.words; having standard a way to transfer "large" (100MB+) packets
>     across or in tandem with a PB connection without breaking anything would be
>     very good.
> 
> <sophomoric question>
> Would an ftp connection (authenticated using cred, of course) in 
> tandem or parallel to the PB connection work?  ... but maybe you 
> are referring to implementing file transfer as *part* of the 
> PB protocol, in which case this question might not make any sense 
> at all.    
> </sophomoric question>
> 
> And how close is this to being implemented?
> 
> My interest is not merely academic -- the application I am working 
> on will be "routinely" transferring 100MB+ files, and I'd like 
> to use the PB as one of our interfaces.

Steve,

This sort of thing is why some of the features of the BEEP protocol are 
nice: it supports multiple channels over a single connection, and 
messages on one channel needn't block the others, because BEEP chunks 
messages into frames and interleaves them.

BEEP is documented in RFC 3080:

   http://www.ietf.org/rfc/rfc3080.txt

and the design rationale is documented in RFC 3117, On the Design of 
Application Protocols:

   ftp://ftp.rfc-editor.org/in-notes/rfc3117.txt

and more information, including links to various implementations can be 
found at:

   http://www.beepcore.org/

I don't know that I'd directly use BEEP because the existing 
implementations are lacking (thread-heavy), and it doesn't support 
features that would be needed for PB-over-UDP support, but having the 
option to run PB over BEEP would let you do large file transfers over 
the same connection (without worrying about NAT or firewall issues) 
without blocking the usual PB messages.

I'd go so far as to say that this problem isn't just with file 
transfers.  It is a potential problem any time you have messages of 
different priorities being sent over PB.  Large, lower-priority 
messages block higher-priority messages because they all share the 
same connection and there are no logical channels in PB (as in BEEP).
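
As a rough illustration of the channel idea (this is not BEEP's actual 
wire format, and nothing here is a PB or Twisted API; chunk, interleave 
and reassemble are hypothetical helper names), each channel's payload is 
cut into frames and the sender takes one frame per channel in turn, so a 
short message never waits behind a whole large one:

```python
from collections import deque


def chunk(payload, size):
    """Split bytes into pieces of at most `size` bytes."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def interleave(channels, frame_size):
    """Yield (channel_id, frame) pairs, one frame per channel in turn.

    `channels` maps a hypothetical channel id to the bytes queued on it.
    """
    queues = {
        cid: deque(chunk(data, frame_size))
        for cid, data in channels.items()
        if data  # skip channels with nothing to send
    }
    while queues:
        for cid in list(queues):
            yield cid, queues[cid].popleft()
            if not queues[cid]:
                del queues[cid]  # this channel is finished


def reassemble(frames):
    """Receiver side: concatenate frames per channel, in arrival order."""
    out = {}
    for cid, frame in frames:
        out[cid] = out.get(cid, b"") + frame
    return out


# Channel 0 carries a "large" transfer, channel 1 a short message.
frames = list(interleave({0: b"x" * 10, 1: b"ping"}, frame_size=4))
```

Here the short message on channel 1 goes out as the second frame 
instead of after all of channel 0's data.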

You can work around this yourself by splitting large messages into 
small chunks on the sending side and pacing them, so that other 
messages get a chance to make it through.  Another way of handling 
this, and nicer than layering PB on top of BEEP, would be to start down 
the path towards some of the features that would be needed or useful 
for UDP support.  With UDP support, it'd be useful to be able to flag 
messages with different delivery properties:

    * Reliable
    * Unreliable
    * Time-sensitive (data which is useless after a given time)

In that sort of scenario, one could add one more behavior: a message 
carrying large, low-priority data would be flagged to let PB know that 
it could be spread out over time and that its timing wasn't a concern.
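
A minimal sketch of what such flagging could look like on the sending 
side, in plain Python.  Everything here is an assumption made for 
illustration: Message, Outbox and the flag names are hypothetical and 
not part of PB.  Urgent messages go out whole and first, expired 
time-sensitive ones are dropped, and a message flagged as bulk is sent 
one chunk at a time whenever nothing urgent is waiting:

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    payload: bytes
    reliable: bool = True               # carried only; retransmission is not modelled
    expires_at: Optional[float] = None  # None means the message never goes stale
    bulk: bool = False                  # large, low priority, may be spread out


class Outbox:
    """Hypothetical sender-side queue that honours the flags above."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.urgent = deque()
        self.bulk = deque()

    def send(self, msg):
        (self.bulk if msg.bulk else self.urgent).append(msg)

    def next_frame(self, now):
        """Return the next bytes to write, or None if there is nothing to send."""
        while self.urgent:
            msg = self.urgent.popleft()
            if msg.expires_at is not None and now >= msg.expires_at:
                continue  # time-sensitive data that is already useless
            return msg.payload
        if self.bulk:
            msg = self.bulk[0]
            frame = msg.payload[:self.chunk_size]
            msg.payload = msg.payload[self.chunk_size:]
            if not msg.payload:
                self.bulk.popleft()  # last chunk of this bulk message
            return frame
        return None


box = Outbox(chunk_size=4)
box.send(Message(b"AAAAAAAA", bulk=True))
first = box.next_frame(now=0.0)    # first chunk of the bulk message
box.send(Message(b"call"))
box.send(Message(b"old", reliable=False, expires_at=1.0))
second = box.next_frame(now=2.0)   # the urgent message jumps the queue
third = box.next_frame(now=2.0)    # "old" has expired, so bulk resumes
fourth = box.next_frame(now=2.0)   # nothing left to send
```

In a real transport, next_frame would be called each time the 
connection is ready for more data, which is what gives the small 
messages their chance to get through between chunks.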

Maybe there are already capabilities like this in PB .. but given the 
lack of docs, I haven't found them yet. :)

Cheers,

  - Bruce




