[Twisted-web] xmlrpc resource file descriptor leak
Werner Thie
wthie at thiengineering.ch
Thu Jul 3 02:59:48 EDT 2008
Hi
Having suffered from the same problem a year or so ago, I dug into this by
following the call chain, and cgi.py is the source of the 'too many fd'
problem.
For an explanation read the comment starting at line 417 in cgi.py which
reads:
The class is subclassable, mostly for the purpose of overriding
the make_file() method, which is called internally to come up with
a file open for reading and writing. This makes it possible to
override the default choice of storing all files in a temporary
directory and unlinking them as soon as they have been opened.
The trick used here relies on the fact that an fd stays around for some
time even after the file it refers to has been unlinked. It takes some
time before all those fds are released, but they are released eventually.
The number of fds a process needs when using cgi.py (which twisted uses)
therefore depends on the burst rate of requests, because every request
has a FieldStorage by default, and therefore an fd.
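To illustrate the point about unlinked files still holding an fd, here is a minimal sketch (Unix only). It is not the code cgi.py uses; the function name unlinked_tempfile_demo and the payload are made up for this example. It only shows that unlinking the path does not free the descriptor, which stays valid until it is closed.

```python
import os
import tempfile


def unlinked_tempfile_demo():
    """Hypothetical helper: show that an unlinked file still occupies an fd."""
    # Create a temporary file and get a raw file descriptor for it.
    fd, path = tempfile.mkstemp()
    os.write(fd, b"<methodCall/>")

    # Remove the directory entry while the descriptor is still open.
    os.unlink(path)
    exists_on_disk = os.path.exists(path)

    # The descriptor is still usable and still counts against the limit.
    size_while_open = os.fstat(fd).st_size

    # Only closing the descriptor gives the slot back to the process.
    os.close(fd)
    try:
        os.fstat(fd)
        released = False
    except OSError:
        released = True

    return exists_on_disk, size_while_open, released


if __name__ == "__main__":
    print(unlinked_tempfile_demo())
```

In a server, every request whose temporary file has not yet been closed holds one such descriptor, which is why a burst of requests can hit the limit.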
The only solution is to raise the number of fds allowed per process and
per machine; how to do that depends on the OS:
MS Windows: if the CRT is used, hardcoded to 2048, otherwise limited by memory
On *nixes use 'ulimit -a' or 'sysctl -a | grep files' to get a printout of
the system value, usually something along the lines of kern.maxfiles=10000
Per machine:
/etc/sysctl.conf contains the values for the kernel preset when booting.
Per process:
/etc/login.conf usually contains a variable called openfiles-max
On my OpenBSD production system (avg load 30 req/sec) the values are
kern.maxfiles=10000
openfiles-max=8192
openfiles-cur=8192
which allows smooth operation of two twisted processes on a dual core
machine.
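The per-process limit can also be inspected from inside the Python process itself. The following is only a sketch using the standard library resource module (Unix only), not something from my production setup; the helper names nofile_limits and raise_soft_limit are made up for this example. A process can raise its soft limit up to the hard limit, but not beyond it without extra privileges.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open fds for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Hypothetical helper: raise the soft fd limit, capped at the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit unless it is unlimited.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    # Report the soft limit that is actually in effect afterwards.
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("soft/hard fd limits:", nofile_limits())
```

Calling something like raise_soft_limit(8192) early at startup would be one way to use this, but the system-wide and login class limits above still cap what the process can get.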
HTH, Werner
FYI the output of top:
load averages: 0.34, 0.31, 0.31    08:55:01
31 processes: 1 running, 29 idle, 1 on processor
CPU0 states: 10.8% user, 0.0% nice, 2.6% system, 0.0% interrupt, 86.6% idle
CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Memory: Real: 325M/608M act/tot Free: 2913M Swap: 0K/4096M used/tot
PID UNAME PRI NICE SIZE RES STATE WAIT TIME CPU COMMAND
4562 www 2 0 125M 97M sleep/0 poll 242:39 11.82% python2.5
6506 www 2 0 205M 181M run/0 - 34:20 2.00% python2.5
Phil Mayers wrote:
> This is a bit vague, and I wanted to get some feedback before I submit a
> ticket.
>
> We have a long-running twisted / nevow process that basically has:
>
> root
> \- RPC2 - a twisted.web.xmlrpc.XMLRPC sub-class
> \- ui - nevow pages
>
> The thing hung up over the weekend with "too many open file descriptors"
> and before I killed it I did an "lsof"; lots of the files were:
>
> python25 20163 nsg 31u REG 253,0 370 3276854 /tmp/tmp5QJivu (deleted)
>
> ...and "cat /proc/20163/fd/31" shows:
>
> <?xml version='1.0'?>
> <methodCall>
> <methodName>classify_maclist</methodName>
> <params>
> <param>
> <value><string>HORPROD</string></value>
> </param>
> <param>
> <value><array><data>
> <value><string>xxxx</string></value>
> </data></array></value>
> </param>
> <param>
> <value><int>-1</int></value>
> </param>
> <param>
> <value><int>5</int></value>
> </param>
> </params>
> </methodCall>
>
> ...which is an XMLRPC call from a Zope server on another machine to this
> process. I presume the t.w.http.Request content is getting written to a
> tempfile, but I can't understand why - the Content-Length is tiny (<400
> bytes).
>
> I can't seem to reproduce this in a sample application though; does
> anyone have any ideas how I can narrow down the problem?