[Twisted-web] render_GET and memory consumption

exarkun at twistedmatrix.com
Tue Dec 21 09:00:31 EST 2010

On 05:19 am, psanchez at fosstel.com wrote:
>Here's a demo HTTP server that returns 10 MB of random data each time a
>client connects.
>
>import os
>from twisted.internet import reactor
>from twisted.web.server import Site
>from twisted.web.resource import Resource
>
>data = os.urandom(10*1024*1024)
>
>class TestPage(Resource):
>     isLeaf = True
>     def render_GET(self, request):
>         return data
>
>root = Resource()
>root.putChild('test', TestPage())
>reactor.listenTCP(8880, Site(root))
>reactor.run()
>
>Now, when I run N clients simultaneously from a different host I see
>that the server's memory consumption increases by N*10 MB. I can't
>reproduce this example when running the clients from the same host as
>the server; the test goes so fast that I can't gather any useful data.
>I run the test using the following httperf command on a different host
>and looking at the Gnome system monitor in the server (top will do as
>well):
>
>httperf --server --port 8880 --uri /test \
>         --rate 10 --num-conn 10
>
>When the server is idle memory consumption is 17.1 MB, but during the
>test it jumps to 117.2 MB. My questions are then:
>1. Given that 'data' is a global variable, eventually read-only as
>well, why is it replicated for each request? And who is replicating it?

It's copied as part of the process of writing it to the socket.  You 
can't write 10MB at once, and you can't slice a string (to throw away 
the part that you did manage to write) without making a copy of part of
it.

>2. What would be the proper way to re-write this example so that there
>is one and only one 'data' structure at any time?

Split data up into ~32kB-64kB chunks and write them to the request 
individually.  Then each chunk can just be dropped with no copying.
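
A rough sketch of what that could look like, not a tested drop-in.  The
names CHUNK_SIZE, split_into_chunks and write_chunks are made up for
illustration; only request.write(), request.finish() and NOT_DONE_YET come
from twisted.web.  The data is split once at startup, so every request
writes the same chunk objects and nothing has to be sliced per request.
Whether a plain loop like this keeps memory flat will depend on how the
transport buffers pending writes; if it doesn't, pacing the writes with a
producer would be the next thing to try.

```python
# Assumed chunk size, picked from the ~32kB-64kB range suggested above.
CHUNK_SIZE = 64 * 1024


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split data once, up front; the chunks are then shared by all requests."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def write_chunks(request, chunks):
    """Write each pre-split chunk to the request individually, then finish it.

    `request` is any object with write() and finish() methods, such as the
    twisted.web request passed to render_GET.
    """
    for chunk in chunks:
        request.write(chunk)
    request.finish()


# How this might be wired into the resource from the original example
# (sketch only, shown as comments so this block runs without Twisted):
#
#     from twisted.web.server import NOT_DONE_YET
#
#     chunks = split_into_chunks(data)
#
#     class TestPage(Resource):
#         isLeaf = True
#         def render_GET(self, request):
#             write_chunks(request, chunks)
#             return NOT_DONE_YET
```

Returning NOT_DONE_YET tells twisted.web that the response body is written
through request.write() and ended with request.finish(), rather than being
returned from render_GET as one big string.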

