[Twisted-Python] debugging a memory leak

Alec Matusis matusis at yahoo.com
Wed Feb 24 16:57:16 EST 2010

Out of desperation at not finding the real memory leak on the production
server, I wrote a test server whose RSS memory I can push arbitrarily high.
I am far from sure that this is the same leak that I observe in production,
but I would like to understand what this one is.
This is the server code:


import twisted.protocols.basic
from twisted.internet.protocol import Factory
from twisted.internet import reactor

class HiRate(twisted.protocols.basic.LineOnlyReceiver):
        MAX_LENGTH = 20000

        def lineReceived(self, line):
                # Answer every "get" line with a payload of about 4 KB.
                if line == 'get':
                        out = 'a'*4000+'\r\r'
                        self.transport.write(out)

factory = Factory()
factory.protocol = HiRate
# interface='' listens on all local addresses.
reactor.listenTCP(8007, factory, backlog=50, interface='')
reactor.run()

This server has to be flooded by "get" requests from this client:


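import socket, time

def client():
    """high rate client, needs a dedicated CPU to run"""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Placeholder address (assumption): use the server machine's address.
        s.connect(('127.0.0.1', 8007))
    except socket.error, e:
        print 'client error is %s' %e
        return
    s.setblocking(0)  # assumption: non-blocking, so recv() raises when idle
    while 1:
        #print "iter %s" %n
        try:
            s.send('get\r\n')
            r = s.recv(1024)
        except socket.error, e:
            r = ''
        while r:
            #print r
            try:
                r = s.recv(1024)
            except socket.error, e:
                r = ''
        #time.sleep(.001)
client()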


To reproduce the memory leak, I need either two machines with a fast LAN
between them (since the client program takes 100% CPU), or possibly one
machine with a dual-core CPU (I have not tried that). It is important that
client.py gets a CPU to itself.
When the response from the server is long enough (out = 'a'*4000+'\r\r';
4000 is enough in my case), the RSS of the server process starts to grow
without bound.
If you introduce a small delay in the client (#time.sleep(.001)), the leak
does not occur.

Looking at tcpdump on the server machine, I sometimes see many "get" packets
from the client in a row that are not followed by response packets from the
server with the 'aaaaa...' payload. The memory seems to grow without bound
only when the server is in this "overwhelmed" state.
I first thought it might be an issue of an unbounded send queue on the
server, but examining Send-Q with netstat shows that Send-Q saturates at a
certain ceiling value, while the RSS memory of the server process continues
to grow.
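
The Send-Q column in netstat only counts bytes that the kernel has already
accepted. In Twisted, transport.write() returns immediately and keeps data
that the socket cannot take yet in a buffer inside the Python process.
Whether that is what happens in this test is not established in this post;
the toy model below only illustrates that a saturated Send-Q and a growing
process are compatible. It is not Twisted code, and every name and number in
it (simulate, kernel_limit, the per-tick rates) is made up for illustration.

```python
def simulate(requests_per_tick, drained_per_tick, ticks,
             response_size=4002, kernel_limit=64 * 1024):
    """Toy model of a writer with a bounded kernel queue and an
    unbounded application buffer.

    Returns (kernel_queue, app_buffer) in bytes after `ticks` steps.
    """
    kernel_queue = 0   # what netstat would show as Send-Q
    app_buffer = 0     # bytes held inside the process, counted in RSS
    for _ in range(ticks):
        # The server answers every request by appending to its own buffer.
        app_buffer += requests_per_tick * response_size
        # The peer reads some bytes, which frees space in the kernel queue.
        kernel_queue = max(0, kernel_queue - drained_per_tick)
        # Move as much as fits from the application buffer to the kernel.
        moved = min(app_buffer, kernel_limit - kernel_queue)
        kernel_queue += moved
        app_buffer -= moved
    return kernel_queue, app_buffer


# Requests arrive faster than the peer drains the responses.
fast = simulate(requests_per_tick=10, drained_per_tick=20000, ticks=1000)
# Same drain rate, but the client is throttled (compare the sleep above).
slow = simulate(requests_per_tick=1, drained_per_tick=20000, ticks=1000)
print("fast client: kernel queue %d, app buffer %d" % fast)
print("slow client: kernel queue %d, app buffer %d" % slow)
```

In the fast case the kernel queue stops at kernel_limit while the
application buffer keeps growing; in the slow case both stay small.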

Here are some commands I was using to watch the parameters of the server:
Watch Send-Q and Recv-Q:
root$ watch -n1 netstat -an
RSS memory of the server:
root$ watch -n1 ps -orss -p`netstat -nlp | grep :8007 | awk '{print $7}' | cut -d/ -f1`
Traffic to/from the server (in my case I use eth1 for the LAN to the client):
root$ tcpdump -A -s10024 -nn -i eth1 'port 8007'
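
For a log of RSS over time rather than a live view, the same figure can be
read from /proc. This is a minimal sketch that assumes Linux, where
/proc/<pid>/status contains a VmRSS line in kB. The function names are
hypothetical, and it is written for Python 3.

```python
import time


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:      2600 kB".
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def read_rss_kb(pid):
    """Read the current RSS of a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_vm_rss_kb(f.read())


def watch_rss(pid, interval=1.0, samples=5):
    """Print a timestamped RSS reading every `interval` seconds."""
    for _ in range(samples):
        print("%s %d kB" % (time.strftime("%H:%M:%S"), read_rss_kb(pid)))
        time.sleep(interval)
```

Calling watch_rss(pid) with the pid of the test server prints one line per
second; redirecting that to a file gives a record that can be compared with
the Send-Q readings.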

> -----Original Message-----
> From: twisted-python-bounces at twistedmatrix.com [mailto:twisted-python-bounces at twistedmatrix.com] On Behalf Of Werner Thie
> Sent: Monday, February 22, 2010 11:39 PM
> To: Twisted general discussion
> Subject: Re: [Twisted-Python] debugging a memory leak
> Hi
> If memory that is not released to the OS can still be reused by the
> interpreter, because of the suballocation system used in the interpreter,
> then overall memory usage should eventually level out over time. That's
> what I observe with our processes (sitting at several
> 100 MB per process). We are using external C libraries which do lots of
> malloc/free, and one of the bigger sources of pain is indeed to bring
> such a library to a point where it's clean not only by freeing all memory
> allocated in every circumstance but also Python refcounting-wise. I
> usually go through all the motions to build up a complete debug chain for
> all modules involved in a project and write a test bed to prove a clean
> and proper implementation.
> So if you're using C/C++ based modules in your project, I would mark them
> as highly suspicious to be responsible for leaks until proven otherwise.
> Not to bother you with numbers but I usually allocate about 30% of
> overall project time to bring a server into a production ready state,
> meaning uptimes of months/years, no fishy feelings, no performance
> oscillations, predictable caving and recuperating when overloaded, just
> all the things you have to tick to sign off a project as completed,
> meaning you don't have to do daily 'tire kicking' maintenance and
> periodic reboots.
> Werner
> Alec Matusis wrote:
> > Hi Maarten,
> >
> > Your link
> > http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm
> > seems to suggest that even though the interpreter does not release memory
> > back to the OS, that memory can be re-used by the interpreter.
> > If this was our problem, I'd expect the memory to be set by the highest
> > usage, as opposed to it constantly leaking: in my case, the load is
> > virtually constant, but the memory still leaks over time.
> >
> > The environment is Linux 2.6.24 x86-64, the extensions used are MySQLdb,
> > pyCrypto (latest stable releases for both).
> >
> >> -----Original Message-----
> >> From: twisted-python-bounces at twistedmatrix.com [mailto:twisted-python-bounces at twistedmatrix.com] On Behalf Of Maarten ter Huurne
> >> Sent: Monday, February 22, 2010 6:24 PM
> >> To: Twisted general discussion
> >> Subject: Re: [Twisted-Python] debugging a memory leak
> >>
> >> On Tuesday 23 February 2010, Alec Matusis wrote:
> >>
> >>> When I start the process, both python object sizes and their counts
> >>> rise proportionally to the numbers of reconnected clients, and then they
> >>> stabilize after all clients have reconnected.
> >>> At that moment, the "external" RSS process size is about 260MB. The
> >>> "internal size" of all python objects reported by Heapy is about
> >>> After two days, the internal sizes/counts stay the same, but the external
> >>> size grows to 1500MB.
> >>>
> >>> Python object counts/total sizes are measured from the manhole.
> >>> Is this sufficient to conclude that this is a C memory leak in one of the
> >>> external modules or in the Python interpreter itself?
> >> In general, there are other reasons why heap size and RSS size do not
> >> match:
> >> 1. pages are empty but not returned to the OS
> >> 2. pages cannot be returned to the OS because they are not completely
> >> empty
> >> It seems Python has different allocators for small and large objects:
> >> http://www.mail-archive.com/python-list@python.org/msg256116.html
> >> http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm
> >>
> >> Assuming Python uses malloc for all its allocations (does it?), it is the
> >> malloc implementation that determines whether empty pages are returned to
> >> the OS. Under Linux with glibc (your system?), empty pages are returned,
> >> so there reason 1 does not apply.
> >>
> >> Depending on the allocation behaviour of Python, the pages may not be
> >> empty though, so reason 2 is a likely suspect.
> >>
> >> Python extensions written in C could also leak or fragment memory. Are you
> >> using any extensions that are not pure Python?
> >>
> >> Bye,
> >> 		Maarten
> >>
> >> _______________________________________________
> >> Twisted-Python mailing list
> >> Twisted-Python at twistedmatrix.com
> >> http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
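
Maarten's second reason in the quoted reply (pages that cannot be returned
to the OS because they are not completely empty) can be pictured with a toy
model. This is only a sketch under made-up assumptions: fixed-size pages,
objects freed uniformly at random, and hypothetical names. It does not model
the actual glibc or Python allocators.

```python
import random


def pages_in_use(num_objects, objects_per_page, survive_fraction, seed=0):
    """Fill pages with objects, free most of them at random, and count
    the pages that still hold at least one live object.

    Returns (pages_still_in_use, live_objects).
    """
    rng = random.Random(seed)
    # True means the object survives, False means it has been freed.
    live = [rng.random() < survive_fraction for _ in range(num_objects)]
    pages = 0
    for start in range(0, num_objects, objects_per_page):
        # A page can only be given back if every object on it is freed.
        if any(live[start:start + objects_per_page]):
            pages += 1
    return pages, sum(live)


pages, live_objects = pages_in_use(100000, 100, 0.02)
print("%d live objects keep %d of 1000 pages in use" % (live_objects, pages))
```

With only a small fraction of the objects still alive, most pages in this
model remain pinned, so the reported heap size can be far below the RSS
without any object having leaked.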
