[Twisted-Python] Streaming File Transfer Protocol?

Darren Govoni darren at ontrenet.com
Fri Feb 19 06:33:22 EST 2010


Hi again,
   OK, so now it seems the Int32StringReceiver does not receive
"stringReceived" events if the sent bytes exceed a certain amount.
If I send, say, 5000 bytes from the client, it receives them.

But when I send 7376896 bytes at a time, the client indicates it wrote
the bytes, but there is not a single stringReceived callback on the
server side of the protocol. No error or exception. Nothing.
Is this normal behavior?

thanks,
Darren
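
A possible explanation, offered as an assumption rather than something
confirmed in this thread: as far as I know, the Int##StringReceiver
classes in twisted.protocols.basic have a MAX_LENGTH class attribute
(default 99999), and a length prefix above that limit makes the receiver
close the connection instead of calling stringReceived. The sketch below
does not use Twisted at all. It only mimics that check with the standard
library so the two message sizes from this post can be compared;
prefix_is_accepted is a hypothetical helper name.

```python
import struct

# Assumed default limit, mirroring what Twisted's Int##StringReceiver
# classes are believed to use (MAX_LENGTH = 99999).
MAX_LENGTH = 99999


def prefix_is_accepted(payload_size, max_length=MAX_LENGTH):
    """Return True if a 4-byte length prefix would pass the size check.

    Hypothetical helper: packs the size as a big-endian unsigned 32-bit
    integer (the kind of header an Int32 framing uses), reads it back the
    way a receiver would, and compares it with the limit.
    """
    header = struct.pack("!I", payload_size)
    (length,) = struct.unpack("!I", header)
    return length <= max_length


print(prefix_is_accepted(5000))     # small message from the post
print(prefix_is_accepted(7376896))  # large message from the post
```

If this is indeed the cause, raising MAX_LENGTH on the server-side
protocol subclass, or sending the file in smaller framed pieces, would be
the things to try.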

On Sat, 2010-02-13 at 20:22 -0500, David Bolen wrote:

> Darren Govoni <darren at ontrenet.com> writes:
> 
> > I spoke too fast. But pardon my noobiness.
> >
> > Ok, so I am using a simple protocol that is listening on a TCP port.
> >
> > On the client side, I write 4096 bytes using
> > self.transport.write(bytes)
> >
> > on dataReceived side, I get only 1448. 
> 
> Quite possible, and even likely with a chunk of 4096, given likely
> network latencies and the physical packet sizes at each network hop
> along the way.
> 
> However, dataReceived will eventually be called additional times until
> all of the 4096 bytes that were transmitted and received over the
> socket connection have been handed off to your protocol.  That's just
> the nature of a stream protocol - it's a constant stream of data being
> fed by one end and drained on the other, without any natural
> boundaries or structures within (other than, I suppose, the boundary
> of an octet since you can't receive a partial octet).
> 
> The alternative is to use a datagram protocol like UDP, but then you
> have all the negatives of no guaranteed delivery, out of order
> delivery, completely impossible delivery (when trying a datagram
> larger than the UDP limit), etc...
> 
> Far easier to just handle the TCP stream properly.
> 
> > Now, what I "want" to happen is when I issue a write of a known
> > number of bytes. I "want" those bytes to arrive in total because
> > they represent a pickled object.  The server has no idea if the
> > bytes are split and scattered (again, I want the control protocol to
> > take effect).
> 
> I suspect it may just be a difference in phrasing, but note that I
> consider "arrive in total" to be different from "arrive in the same
> number of I/O operations".  TCP guarantees the former (sans dropped
> connections) but not the latter.  It's a trade-off that you make in
> order to get the other benefits of guaranteed delivery with TCP,
> regardless of network disruptions, latency, etc...
> 
> You're fine as long as you just accept up front that you can't make
> any assumptions as to how the data will arrive at the receiving end.
> So combine the data in whatever sizes it is received (and any number
> of received chunks) until you have it all.  You can then de-pickle it
> or do anything else with it.  As a comparison, that's really all PB is
> doing, although it's banana-encoding the object on the wire rather
> than pickling.
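
The "combine the data until you have it all" step can be sketched roughly
as follows. This is a minimal illustration, assuming the receiver already
knows the total size of the pickled object; PickleCollector and
data_received are hypothetical names, not part of Twisted. Unpickling
data from an untrusted peer is unsafe, so treat this as a sketch for a
trusted setup only.

```python
import pickle


class PickleCollector:
    """Hypothetical helper: gather chunks until expected_size bytes arrive."""

    def __init__(self, expected_size):
        self.expected_size = expected_size
        self.chunks = []
        self.received = 0

    def data_received(self, chunk):
        # Chunks may be any size; just keep them until the total is reached.
        self.chunks.append(chunk)
        self.received += len(chunk)
        if self.received < self.expected_size:
            return None  # not complete yet
        data = b"".join(self.chunks)[: self.expected_size]
        return pickle.loads(data)


# Simulate a pickled object arriving in 1448-byte pieces.
payload = pickle.dumps(list(range(1000)))
collector = PickleCollector(len(payload))
results = [
    collector.data_received(payload[i : i + 1448])
    for i in range(0, len(payload), 1448)
]
```

Only the last call returns the object; every earlier call returns None
because the data is still incomplete.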
> 
> Depending on the client/server interaction, you may also have the
> opposite problem - the final chunk of data received may cover more
> than one client transmission, and you'll have to split it up
> appropriately.
> 
> That's why if you will be transmitting multiple sets of data over a
> single connection, you'll want some structure (unique boundary codes,
> encoded length information, parseable data like XML, etc...) in the wire
> protocol so your server knows when it is done.
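
As one example of the "encoded length information" option, here is a
minimal sketch of a length-prefixed decoder that copes with both cases
described above: a message split across several chunks, and several
messages arriving in one chunk. The 4-byte big-endian header and the
names frame and FrameDecoder are assumptions made for the illustration.

```python
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian length prefix (assumed)


def frame(payload):
    """Prefix a payload with its length."""
    return HEADER.pack(len(payload)) + payload


class FrameDecoder:
    """Hypothetical decoder: feed it chunks, get back complete messages."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        self._buffer += data
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # body not fully received yet
            messages.append(self._buffer[HEADER.size:end])
            self._buffer = self._buffer[end:]  # keep any leftover bytes
        return messages


stream = frame(b"hello") + frame(b"world!")
decoder = FrameDecoder()
first = decoder.feed(stream[:3])     # partial header only
second = decoder.feed(stream[3:12])  # rest of message 1, part of header 2
third = decoder.feed(stream[12:])    # rest of message 2
```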
> 
> > 1) Am I doing something wrong here?
> 
> Not so much wrong, as perhaps a little misguided in terms of trying to
> have a stream protocol work less as a stream than it does.
> 
> I suspect you may also be over-estimating a little the complexity of
> handling this aspect of TCP in your own code.
> 
> > 2) Can I force twisted to send ALL the bytes I issue in the write
> > without re-thinking TCP or forcing me to re-implement TCP?
> 
> Again, distinguish between "send ALL the bytes" which *does* in fact
> happen, versus "receive bytes in identically sized chunks" which will
> not happen.  Though I seriously doubt that your demands are such that
> they require "re-thinking" or "re-implement[ing]" TCP.
> 
> Much easier to stick with the TCP base (loads of benefits), and just
> encode enough structure into your stream to permit the server to
> identify the boundaries of the requests.  Then, code the server to
> look for such boundaries while accepting data in any size chunks, and
> you're done.  It's pretty much what every other TCP protocol that has
> structure to its data does, whether that's length counted, flag bytes,
> specific textual content (such as the final empty line in an HTTP
> request), etc...
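
The boundary-marker variant can be sketched in a similar way. The example
below assumes a blank line (CRLF CRLF) ends each request, as in the HTTP
case mentioned above; DelimiterDecoder is a hypothetical name, and the
sketch ignores details such as a maximum buffer size.

```python
class DelimiterDecoder:
    """Hypothetical decoder: split a byte stream on a boundary marker."""

    def __init__(self, delimiter=b"\r\n\r\n"):
        self.delimiter = delimiter
        self._buffer = b""

    def feed(self, data):
        # Buffer everything so a delimiter split across chunks is still found.
        self._buffer += data
        # Everything before the last delimiter is complete; the final piece
        # (possibly empty) is an unfinished request kept for later.
        *complete, self._buffer = self._buffer.split(self.delimiter)
        return complete


decoder = DelimiterDecoder()
part_one = decoder.feed(b"GET / HTTP/1.1\r\nHost: x\r\n\r")
part_two = decoder.feed(b"\nGET /two HTTP/1.1\r\n\r\n")
```

Here the first call returns nothing because the blank line is cut in
half, and the second call returns both requests.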
> 
> As has been posted in another response, you may find some of the
> existing protocols in twisted.protocols.basic to be helpful for this.
> The older posting of mine that you referenced used a subclass of
> LineReceiver to encode the length in ASCII as part of an initial
> header, for example, though it closed the connection when done.  And,
> for example, Netstring or the Int##String classes take care of the
> counting on your behalf, and even give subclasses a nice single entry
> point (stringReceived) to use instead of dataReceived, so your server
> need not think about the aggregation or splitting of chunks.
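
For a feel of what the netstring framing does, here is a standalone
sketch of the wire format (a decimal length, a colon, the payload, and a
trailing comma). It is not Twisted's implementation; encode_netstring and
decode_netstrings are hypothetical helper names, and error handling is
kept to a minimum.

```python
def encode_netstring(payload):
    """Encode bytes as a netstring, e.g. b"hello" becomes b"5:hello,"."""
    return str(len(payload)).encode("ascii") + b":" + payload + b","


def decode_netstrings(buffer):
    """Return (complete_messages, leftover_bytes) for the given buffer."""
    messages = []
    while True:
        colon = buffer.find(b":")
        if colon == -1:
            break  # length field not complete yet
        length = int(buffer[:colon].decode("ascii"))
        end = colon + 1 + length
        if len(buffer) < end + 1:
            break  # payload or trailing comma still missing
        if buffer[end:end + 1] != b",":
            raise ValueError("missing netstring terminator")
        messages.append(buffer[colon + 1:end])
        buffer = buffer[end + 1:]
    return messages, buffer


data = encode_netstring(b"hello") + encode_netstring(b"") + b"3:ab"
messages, rest = decode_netstrings(data)
```

The leftover bytes would be kept and prepended to the next chunk, which
is the aggregation that a stringReceived-style callback hides from the
subclass.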
> 
> If nothing else, reading the source to one of those receiver classes
> might help provide a concrete example of the aggregation (or
> splitting) of the stream data that I mention above.
> 
> -- David
> 
> 