[Twisted-Python] Sending a long string/buffer without copying it

Jacob Gabrielson jacob at cozi.com
Thu Sep 28 19:18:42 EDT 2006


I'm probably missing something, but your outgoing bytes have to be
copied at least once, anyway, into the kernel's address space (unless
you're using some kind of trick like sendfile()).  Clearly creating a
duplicate of the entire multi-MB string would be bad.  But as long as
the Producer kept returning the same, say, 4KB buffer (I'm assuming
that's possible?), you're only talking about one extra copy per write.
That *might* still be significant, but without measuring, it would be
very hard to say.
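
A minimal sketch of that idea, assuming a pull-style producer (one
that is asked for data each time the transport is ready) and a
byte-oriented payload.  The method names resumeProducing and
stopProducing follow Twisted's producer interfaces, and with a real
transport the producer would be registered via
transport.registerProducer(producer, False).  ChunkedProducer,
FakeTransport, drain and the 4096-byte chunk size are hypothetical
names and values chosen for illustration, and memoryview is the
current-Python way to slice without copying:

```python
class ChunkedProducer:
    """Hands the transport one small slice of a large buffer per request."""

    def __init__(self, transport, data, chunk_size=4096):
        self.transport = transport
        self.view = memoryview(data)  # wraps the buffer, does not copy it
        self.chunk_size = chunk_size
        self.offset = 0

    def resumeProducing(self):
        # Called each time the transport wants more data.
        chunk = self.view[self.offset:self.offset + self.chunk_size]
        if len(chunk) == 0:
            self.transport.unregisterProducer()
            return
        self.offset += len(chunk)
        # bytes(chunk) copies only this slice: one small copy per write.
        self.transport.write(bytes(chunk))

    def stopProducing(self):
        # The connection went away; make sure nothing more is produced.
        self.offset = len(self.view)


class FakeTransport:
    """Hypothetical stand-in for a Twisted transport, for demonstration."""

    def __init__(self):
        self.writes = []
        self.producer = None

    def registerProducer(self, producer, streaming):
        self.producer = producer

    def unregisterProducer(self):
        self.producer = None

    def write(self, data):
        self.writes.append(data)


def drain(transport):
    # Imitates the reactor asking for data until the producer unregisters.
    while transport.producer is not None:
        transport.producer.resumeProducing()
```

Whether the per-chunk copy matters in practice is still something to
measure rather than assume.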

Just my $.02,

-- Jacob 

-----Original Message-----
From: twisted-python-bounces at twistedmatrix.com
[mailto:twisted-python-bounces at twistedmatrix.com] On Behalf Of Brian
Granger
Sent: Thursday, September 28, 2006 3:48 PM
To: Twisted general discussion
Subject: Re: [Twisted-Python] Sending a long string/buffer without
copying it

But a producer just makes sure the whole thing isn't copied at once,
right?  It still makes many smaller copies: the memory is saved, but
the performance hit remains.

I just wanted to make sure that I wasn't missing something obvious.  I
think the right way of doing this is to use a true read-write buffer,
such as those created by numpy.
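
As a rough illustration of what a read-write buffer allows, here is a
sketch using only the standard library.  The bytearray stands in for
the memory behind a large array (NumPy arrays expose the same buffer
protocol), and iter_slices is a hypothetical helper name.  On current
Python the tool for this is memoryview; slicing one yields another
view onto the same memory rather than a copy:

```python
# A writable buffer standing in for the memory behind a large array.
buf = bytearray(b"x" * 1_000_000)
view = memoryview(buf)  # no copy is made here


def iter_slices(view, size):
    """Yield consecutive slices of view; each slice is itself a view."""
    for start in range(0, len(view), size):
        yield view[start:start + size]


first = next(iter_slices(view, 4096))

# Changing the underlying buffer is visible through the slice, which
# shows that the slice shares memory with buf instead of copying it.
buf[0] = ord("y")
shares_memory = first[0] == ord("y")
```

Whether a given transport accepts such views directly is a separate
question; this only shows that the slicing itself need not copy.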

On 9/28/06, Jean-Paul Calderone <exarkun at divmod.com> wrote:
> On Thu, 28 Sep 2006 00:11:41 -0600, Brian Granger <ellisonbg.net at gmail.com> wrote:
> >Hi,
> >
> >In one of my Twisted-based applications, I need to send large strings
> >and buffers.  They can be hundreds of MB long (they come from large
> >numpy arrays).  I would like to be able to send them *without making
> >any copies* in the process.
> >
> >This seems to be difficult with the way that certain parts of Twisted
> >are written:
> >
> >In protocols.basic, many of the sendString/sendLine methods make a
> >copy of the string or line to be sent:
> >
> >    def sendLine(self, line):
> >        """Sends a line to the other end of the connection.
> >        """
> >        return self.transport.write(line + self.delimiter)
> >
> >If line is 100MB, this just made a second 100MB string.  To make 
> >things worse, in my case a server needs to send this line to many 
> >clients that are connected.  The line gets copied for each client!  
> >If I have 10 clients, I have nearly a GB worth of extra memory 
> >allocated for this temporary copy.
> >
> >This problem is easy to solve at the protocol level: you just do
> >separate writes for the delimiter and the line.  Or, if you are using
> >a length-prefixed protocol, write the length bytes and the string
> >separately.
> >
> >BUT....
> >
> >Even if I do that, it appears that Twisted is making copies
> >elsewhere, such as in FileDescriptor.doWrite.  So, how can I send
> >something without making a copy?  I don't mind making copies of
> >slices, just not the whole thing.
>
> Don't pass the entire thing to a single call to transport.write() (or 
> LineReceiver.sendLine).  Instead, write a producer.
>
> Jean-Paul
>
> _______________________________________________
> Twisted-Python mailing list
> Twisted-Python at twistedmatrix.com
> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
>
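
For reference, the protocol-level fix described in the quoted message
(separate writes instead of building line + delimiter) might look
roughly like the sketch below.  RecordingTransport is a hypothetical
stand-in for a real transport, send_line and send_length_prefixed are
illustrative names, and the 4-byte big-endian length prefix is an
assumption made for the example:

```python
import struct


class RecordingTransport:
    """Hypothetical stand-in for a transport; it only records writes."""

    def __init__(self):
        self.writes = []

    def write(self, data):
        self.writes.append(data)


def send_line(transport, line, delimiter=b"\r\n"):
    # Two writes: the large line object is passed through untouched,
    # so no second line-sized string is built by concatenation.
    transport.write(line)
    transport.write(delimiter)


def send_length_prefixed(transport, payload):
    # Assumed framing: 4-byte big-endian length, then the payload.
    transport.write(struct.pack("!I", len(payload)))
    transport.write(payload)
```

This only avoids the copy made by the concatenation itself; any
copying further down inside the transport is a separate matter.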
