[Twisted-Python] process output modulation

David dvkeeney at gmail.com
Tue Dec 21 11:14:07 MST 2010

We are streaming a large amount of program output to the browser via a
twisted app, and we are seeing huge memory consumption.

We have a database process that writes large amounts of data to
stdout, and we are streaming that to the browser through a twisted
web2 app.  We are using web2 because it supports upload streaming as

Our code looks like:

        env = {}
        input = stream.MemoryStream('')
        SQLDUMP = '/usr/bin/dump'

        pstream = stream.ProcessStreamer(input, SQLDUMP,

        outstream = WatchedStream(pstream.outStream)

        response = http.Response(headers=headers, stream=outstream)

        class WatchedStream(object):
            """Wraps a stream and logs how much data it has buffered on each read."""

            def __init__(self, stream):
                self.stream = stream

            def split(self, point):
                ... some implementation

            def close(self):
                ... some implementation

            def read(self):
                d = self.stream.read()
                # Total bytes currently queued in the wrapped stream's buffer.
                bufSize = sum(len(b) for b in self.stream.buffer if b)
                log.msg('buffer size: %s' % bufSize)
                return d

Watching the log shows us that the stream (a web2.ProducerStream)
buffer is growing continuously to hundreds of MB.  Doesn't a stream
object have a bufferSize attribute and the ability to throttle the
flow of data based on buffer fullness?  Does that throttling behavior
have to be triggered explicitly?
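
To make the question concrete: the throttling I have in mind is the
pause/resume handshake that Twisted's push-producer interface describes
(pauseProducing / resumeProducing). The following is only a sketch of that
idea in plain Python, not web2 code. ThrottledBuffer, FakeProducer, and the
highWater parameter are hypothetical names made up for illustration, and
whether ProducerStream or the process transport can be wired together this
way is an assumption, not something confirmed.

```python
class FakeProducer(object):
    """Hypothetical stand-in for a push producer, e.g. a process transport."""

    def __init__(self):
        self.paused = False

    def pauseProducing(self):
        self.paused = True

    def resumeProducing(self):
        self.paused = False


class ThrottledBuffer(object):
    """Hypothetical buffer that pauses its producer when it fills up."""

    def __init__(self, producer, highWater=65536):
        self.producer = producer
        self.highWater = highWater  # byte count at which the producer is paused
        self.chunks = []
        self.size = 0

    def write(self, data):
        # Called when the producer delivers a chunk of output.
        self.chunks.append(data)
        self.size += len(data)
        if self.size >= self.highWater:
            self.producer.pauseProducing()

    def read(self):
        # Called when the consumer (the HTTP response) wants more data.
        if not self.chunks:
            return None
        data = self.chunks.pop(0)
        self.size -= len(data)
        if self.size < self.highWater:
            self.producer.resumeProducing()
        return data
```

With something like this between the process and the response, the buffer
would stay near highWater instead of growing without bound, because the
process is only read from while the browser is keeping up.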

Yes, I know that web2 is deprecated, but I don't know that the problem
is in the web2 components.  The reactor.spawnProcess documentation
does not seem to address the matter of modulating the read speed.  Any
assistance will be appreciated.

dkeeney at travelbyroad.net
Rdbhost -> SQL databases as a webservice [www.rdbhost.com]
