[Twisted-Python] [Twisted-web] render_GET and memory consumption
glyph at twistedmatrix.com
Fri Dec 24 01:13:40 EST 2010
Cross-posting to Twisted core list because this is starting to get into general non-web stuff. Future replies about this fork of the thread should go there.
On Dec 23, 2010, at 11:22 AM, exarkun at twistedmatrix.com wrote:
> On 22 Dec, 06:51 am, glyph at twistedmatrix.com wrote:
>> On Dec 21, 2010, at 2:24 PM, exarkun at twistedmatrix.com wrote:
>>> Instead, you have to go all the way to producers/consumers, and only
>>> write more data to the transport buffer when it has finished dealing
>>> with what you previously gave it.
>> While everybody should of course use producers and consumers, I feel
>> like there should be a twisted core ticket for this behavior of
>> transport buffering, and a twisted web ticket for this behavior of the
>> request buffering. The naive implementation _could_ be much cheaper
>> memory-wise; at the very least, twisted.web.static.Data ought to do the
>> smart thing.
> Fixing Data sounds like a good idea. I don't know what improvement to
> the transport buffering you're thinking of, though. It doesn't seem
> like there is an obvious, generally correct fix.
Right now, FileDescriptor.write appends its input data directly to _tempDataBuffer. So far, so good: no string mangling.
So let's say we do fd.write(header); fd.write(veryBigBody).
Then we start writing.
FileDescriptor.doWrite comes along and notices that the dataBuffer is empty. The first thing it does in this case is ''.join(_tempDataBuffer), which copies the entire veryBigBody.
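To make the copy concrete, here is a toy model of the sequence just described. It is a sketch, not Twisted's actual code: ToyDescriptor is a made-up class that only borrows the attribute names from this message, and it uses bytes so it runs on a current Python.

```python
class ToyDescriptor:
    """Toy model of the buffering described above; not Twisted's real code."""

    def __init__(self):
        self.dataBuffer = b""
        self._tempDataBuffer = []

    def write(self, data):
        # Cheap: only stores a reference to the caller's bytes object.
        self._tempDataBuffer.append(data)

    def doWrite(self):
        if not self.dataBuffer:
            # Expensive: builds one new bytes object holding every queued chunk.
            self.dataBuffer = b"".join(self._tempDataBuffer)
            self._tempDataBuffer = []
        return self.dataBuffer


header = b"HTTP/1.1 200 OK\r\n\r\n"
veryBigBody = b"x" * (1024 * 1024)

fd = ToyDescriptor()
fd.write(header)
fd.write(veryBigBody)

# After write(), the queue still holds the original object: no copy yet.
queued_is_same_object = fd._tempDataBuffer[1] is veryBigBody

# After doWrite(), the body lives a second time inside the joined buffer.
joined = fd.doWrite()
```

While joined and veryBigBody are both alive, the process holds roughly two copies of the body.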
FileDescriptor is kinda sorta trying to avoid this problem by maintaining an 'offset' so it doesn't need to re-copy dataBuffer. It could use a similar tactic for individual chunks which are greater than SEND_LIMIT and do write()s directly out of them, rather than coalescing them together.
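A rough sketch of that tactic, under stated assumptions: ChunkedWriter, its send callable and the SEND_LIMIT value here are all invented for illustration and are not a patch against FileDescriptor. Small chunks are still joined, since copying them is cheap; a chunk of SEND_LIMIT bytes or more is sent in place through a memoryview slice and an offset.

```python
from collections import deque

SEND_LIMIT = 128 * 1024  # assumed value for this sketch


class ChunkedWriter:
    """Hypothetical: send big chunks in place, join only small ones."""

    def __init__(self, send):
        self.send = send       # callable(bytes-like) -> number of bytes accepted
        self.chunks = deque()  # queued chunks, oldest first
        self.offset = 0        # bytes of chunks[0] already sent

    def write(self, data):
        self.chunks.append(data)

    def doWrite(self):
        if not self.chunks:
            return 0
        head = self.chunks[0]
        if len(head) < SEND_LIMIT and self.offset == 0:
            # Coalesce a run of small chunks; stop at the first big one.
            small, size = [], 0
            while (self.chunks and len(self.chunks[0]) < SEND_LIMIT
                   and size < SEND_LIMIT):
                chunk = self.chunks.popleft()
                small.append(chunk)
                size += len(chunk)
            head = b"".join(small)
            self.chunks.appendleft(head)
        # Slicing a memoryview does not copy the underlying bytes.
        view = memoryview(head)[self.offset:self.offset + SEND_LIMIT]
        sent = self.send(view)
        self.offset += sent
        if self.offset == len(head):
            self.chunks.popleft()
            self.offset = 0
        return sent


sent = bytearray()

def fake_send(view):
    # Stand-in for a socket: accepts at most 64 KiB per call.
    n = min(len(view), 64 * 1024)
    sent.extend(view[:n])
    return n

header = b"HTTP/1.1 200 OK\r\n\r\n"
veryBigBody = b"x" * (1024 * 1024)

writer = ChunkedWriter(fake_send)
writer.write(header)
writer.write(veryBigBody)
while writer.chunks:
    writer.doWrite()
```

In this run only the header passes through join(); veryBigBody is only ever sliced. The price is more bookkeeping per write and one extra small write for the header.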
Or maybe we could wrap writev() instead of copying stuff at all?
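For reference, a minimal sketch of what a gathered write looks like. It assumes a Unix platform and a Python recent enough to expose os.writev (3.3 or later), and it uses a pipe as a stand-in for a socket; none of this is existing Twisted code.

```python
import os

header = b"HTTP/1.1 200 OK\r\n\r\n"
body = b"x" * 4096  # kept small so the pipe buffer can hold it

read_fd, write_fd = os.pipe()
try:
    # One system call, two separate buffers, no join on the Python side.
    written = os.writev(write_fd, [header, body])
    received = os.read(read_fd, written)
finally:
    os.close(read_fd)
    os.close(write_fd)
```

On a real non-blocking socket writev() can accept fewer bytes than were offered, so the caller would still need to track how far into the list of buffers the last call got.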