[Twisted-Python] Need words of wisdom regarding PB

Paul G paul-lists at perforge.com
Thu Jun 29 20:15:45 EDT 2006


quick note: please don't top post (in replies to me, anyway). just common 

in my context, the solution was meant as a synchronization method with some 
application-level logic to be used instead of rsync (i didn't need 
differencing and needed some complicated candidacy logic). the pb client 
simply opened multiple connections and sent data chunks labelled with chunk 
id and object id, multiplexed over the connection pool, with the chunks 
being pulled out of a queue that a threadpool reading from disk and 
optionally gzipping data was writing to. i didn't need to write logic to 
grow the connection pool to maximize bandwidth (i easily saturated a 
fast-e), but that would be trivial.
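
to make the chunk labelling concrete, here is a minimal sketch of the idea, not the original code. it uses only the stdlib: plain lists stand in for the pb connections, and the names (iter_chunks, multiplex, reassemble) and the 64 KiB chunk size are made up for illustration. in the real thing each chunk would go out as a remote call on whichever pb connection it was dealt to.

```python
import gzip
from collections import defaultdict

CHUNK_SIZE = 64 * 1024  # assumed chunk size, purely illustrative


def iter_chunks(object_id, fileobj, compress=False, chunk_size=CHUNK_SIZE):
    """Yield (object_id, chunk_id, compressed, payload) for one file-like object."""
    chunk_id = 0
    while True:
        data = fileobj.read(chunk_size)
        if not data:
            break
        # Each chunk is gzipped on its own so the receiver can
        # decompress it without waiting for its neighbours.
        payload = gzip.compress(data) if compress else data
        yield (object_id, chunk_id, compress, payload)
        chunk_id += 1


def multiplex(chunks, n_connections):
    """Deal chunks round-robin onto n_connections stand-in connections (lists)."""
    connections = [[] for _ in range(n_connections)]
    for i, chunk in enumerate(chunks):
        connections[i % n_connections].append(chunk)
    return connections


def reassemble(received):
    """Rebuild every object from labelled chunks arriving in any order."""
    parts = defaultdict(dict)
    for object_id, chunk_id, compressed, payload in received:
        data = gzip.decompress(payload) if compressed else payload
        parts[object_id][chunk_id] = data
    # The chunk id restores the original order within each object.
    return {
        object_id: b"".join(chunks[i] for i in sorted(chunks))
        for object_id, chunks in parts.items()
    }
```

because every chunk carries both ids, the receiver does not care which connection a chunk arrived on or in what order, which is what lets the pool grow or shrink freely.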

note: i do not advocate doing this unless there is a very good reason 
existing tools won't work for your application. reinventing the wheel makes 
baby jesus cry.


----- Original Message ----- 
From: "Chaz." <eprparadocs at gmail.com>
To: "Paul G" <paul-lists at perforge.com>
Cc: "Twisted general discussion" <twisted-python at twistedmatrix.com>
Sent: Wednesday, June 28, 2006 8:03 AM
Subject: Re: [Twisted-Python] Need words of wisdom regarding PB

> Paul,
> Thanks for the information. I hadn't thought about multiple pb
> connections, but I might even for my low-latency environment. I can have
> files that range from very small to very very large and getting them
> transferred in a reasonable period of time might require many
> connections at once.
> In your solution did you just allow the client to "divide" up the work
> or was there something else you did?
> Chaz
> Paul G wrote:
>> ----- Original Message -----
>> From: "Chaz." <eprparadocs at gmail.com>
>> To: "Twisted general discussion" <twisted-python at twistedmatrix.com>
>> Sent: Sunday, June 25, 2006 2:20 PM
>> Subject: [Twisted-Python] Need words of wisdom regarding PB
>>> I have a problem to solve: I need to get files from one machine to
>>> another. I had thought about all the obvious solutions (and implemented
>>> some of them); for instance adding an FTP server to my Twisted services
>>> and using a client. I thought about doing the file transfer with XML-RPC
>>> and even SOAP. And I even thought of WebDav.
>>> I started reading about PB and thought it might be useful. I thought
>>> about building a "remote" class that simulates open and all its
>>> functions, like read, write, close, etc.
>>> The more I thought about it the more I thought it cool and the way to do
>>> it. That got me thinking that I must be missing something. I am curious
>>> about what you might think of this approach. Is there another better 
>>> way?
>> i implemented file transfer with an optional intermediate gzip stage
>> using pb a long while ago. if you're dealing with fairly low latency, a
>> single pb connection is fine; for higher latencies, you'll want to
>> multiplex multiple pb connections (this has to do with tcp, not pb).
>> with my quick hack, i was able to saturate a fast-e with virtually no
>> significant cpu utilization - the disk io and the network were the
>> bottlenecks.
>> -p
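
on the "this has to do with tcp, not pb" point quoted above: a single tcp connection can have at most one window of data in flight per round trip, so its throughput is capped at roughly window / rtt. a rough back-of-the-envelope sketch follows; the window size, rtt values and function names are assumptions for illustration, not measurements from my setup.

```python
import math


def max_throughput_bps(window_bytes, rtt_seconds):
    """Upper bound on one TCP connection's throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds


def connections_needed(link_bps, window_bytes, rtt_seconds):
    """Rough count of parallel connections needed to fill a link."""
    per_connection = max_throughput_bps(window_bytes, rtt_seconds)
    return max(1, math.ceil(link_bps / per_connection))


FAST_ETHERNET_BPS = 100e6  # nominal fast-e line rate
WINDOW_BYTES = 65535       # largest TCP window without window scaling

# Assumed round-trip times: 1 ms (LAN-like) and 100 ms (WAN-like).
lan = connections_needed(FAST_ETHERNET_BPS, WINDOW_BYTES, 0.001)
wan = connections_needed(FAST_ETHERNET_BPS, WINDOW_BYTES, 0.1)
```

with these assumed numbers one connection is enough at 1 ms, while at 100 ms it takes about twenty to fill the same link. real stacks with window scaling or smaller default buffers will shift the figures, but the shape of the argument is the same.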
