[Twisted-Python] intermittent problem: not accepting new connections
matusis at yahoo.com
Wed Sep 10 14:32:06 MDT 2008
I have been running a heavily used Twisted epoll server, to the point that it
saturated CPU (100% shown by "top", about 5000 connections, intense message
I am using Twisted 2.5.0, which I patched for the epoll bug.
It ran on Python 2.4.4 and a 2.6.11 kernel, on a single-core Xeon 3.0 GHz
CPU. This server had been up for many months and was rock-stable.
A couple of days ago I migrated that server to a newer machine: same patched
twisted 2.5.0, same python 2.4.4, newer 2.6.24 kernel and a quad core xeon
CPU usage dropped from 100% to 30%, as expected, with the same rate of
However, the server now has the following intermittent problem: about twice a
day, it stops accepting new connections for a period of 5-10 minutes.
Telnet times out; I get this:
root@serv2:/proc/net/netfilter# telnet localhost 5229
Existing connections are not cut; the server receives and delivers messages
to and from them just fine.
These short periods of not accepting connections do not correlate with
increased CPU load or with the overall number of connections to the server.
I have had a problem with the same symptoms before, when a server process
ran out of its quota of file descriptors.
However, there were clear messages in the twisted log at that time, and
upping the ulimits solved the problem.
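
To rule out descriptor exhaustion on the new machine, the limit and the
current usage can be checked from inside the process. This is a minimal
sketch, assuming Linux (it reads /proc/self/fd) and a modern Python rather
than the 2.4.4 used here; fd_headroom is a hypothetical helper name, not
part of Twisted:

```python
import os
import resource


def fd_headroom(proc_fd_dir="/proc/self/fd"):
    """Return (soft_limit, hard_limit, open_fds) for the current process.

    open_fds is None when the /proc listing is unavailable (non-Linux).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Listing the directory briefly opens one descriptor itself,
        # so the count may be one higher than the steady-state value.
        open_fds = len(os.listdir(proc_fd_dir))
    except OSError:
        open_fds = None
    return soft, hard, open_fds


if __name__ == "__main__":
    soft, hard, open_fds = fd_headroom()
    print("soft=%s hard=%s open=%s" % (soft, hard, open_fds))
```

Logging this periodically would show whether the stalls line up with the
open count approaching the soft limit.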
This time, there are no errors in ANY logs (twisted log, /var/log/messages,
I am out of ideas on what this could be, because my setup is exactly the
same as I have been using in the last year, except for a faster CPU and a
I suspect that there are some new uncaught accept() exceptions in
internet/tcp.py in the part where it's looking for EMFILE, ENOBUFS, ENFILE,
ENOMEM, ECONNABORTED errors.
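
One way to test that suspicion is to log the errno of every accept()
failure that is not in the expected set. The following is a standalone
sketch of that idea, not Twisted's actual code: accept_pending, its
parameters, and the choice to skip ECONNABORTED and stop on the other
listed errors are all assumptions made for illustration, and it is written
for a modern Python.

```python
import errno

# The errors named above; anything else is treated as unexpected.
RESOURCE_ERRORS = (errno.EMFILE, errno.ENOBUFS, errno.ENFILE, errno.ENOMEM)


def accept_pending(listener, max_accepts=64, log=print):
    """Accept up to max_accepts connections from a non-blocking listener.

    Returns a list of (socket, address) pairs. Unexpected accept()
    errors are logged with their errno name and then re-raised.
    """
    accepted = []
    for _ in range(max_accepts):
        try:
            conn, addr = listener.accept()
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: nothing left in the accept queue.
            break
        except OSError as e:
            name = errno.errorcode.get(e.errno, str(e.errno))
            if e.errno == errno.ECONNABORTED:
                # The peer gave up before we accepted; try the next one.
                continue
            if e.errno in RESOURCE_ERRORS:
                log("accept() hit a resource limit: %s" % name)
                break
            log("unexpected accept() error: %s" % name)
            raise
        accepted.append((conn, addr))
    return accepted
```

If the stalls are caused by an errno outside the expected set, it would
show up in the "unexpected accept() error" line instead of disappearing
silently.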