[Twisted-Python] Many connections and TIME_WAIT

exarkun at twistedmatrix.com
Wed Jan 27 08:05:16 EST 2010

On 04:50 am, donal.mcmullan at gmail.com wrote:
>I've been prototyping a client that connects to thousands of servers and 
>calls some method. It's not real important to me at this stage whether
>that's via xmlrpc, perspective broker, or something else.
>What seems to happen on the client machine is that each network connection 
>that gets opened and then closed goes into a TIME_WAIT state, and 
>there are so many connections in that state that it's impossible to open 
>any more.

Yep.  That's what happens to a TCP connection when you close it.
>I'm keeping an eye on the output of
>netstat -an | wc -l
>Initially I've got 569 entries there. When I run my test client, that goes 
>up really quickly and peaks at about 2824. At that point, the client gets 
>a callRemoteFailure:

Presumably these numbers have something to do with how quickly you're 
opening and closing new connections.  TIME_WAIT lasts for 2MSL (4 
minutes) to ensure that a future connection doesn't receive data 
intended for a previous connection (clearly a bad thing).
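
For what it's worth, it's easy enough to watch just the TIME_WAIT 
population instead of every line netstat prints; divide that count by 
roughly 240 seconds and you get a ballpark figure for how fast you're 
churning through connections.  A throwaway sketch (Python 2, untested):

    import subprocess

    # Keep only the sockets that netstat reports as sitting in TIME_WAIT.
    output = subprocess.Popen(
        ["netstat", "-an"], stdout=subprocess.PIPE).communicate()[0]
    timeWait = [line for line in output.splitlines() if "TIME_WAIT" in line]
    print "%d sockets in TIME_WAIT" % (len(timeWait),)
    # With a ~4 minute 2MSL, the steady-state count is roughly
    # (new connections per second) * 240.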

However... 2824 is a pretty low number at which to run out of sockets. 
Perhaps you're running this software on Windows?  I think Windows has a 
ridiculously small number of "client sockets" allocated by default.  I 
seem to recall this being something you can change with a registry edit 
or something like that.
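
If memory serves, the value in question is MaxUserPort under the Tcpip 
parameters key (with TcpTimedWaitDelay controlling how long TIME_WAIT 
itself lasts), but treat that as a half-remembered guess.  A quick way 
to see whether it has ever been raised (sketch, Python 2 on Windows):

    import _winreg

    # Look up MaxUserPort, if it has been set at all.
    key = _winreg.OpenKey(
        _winreg.HKEY_LOCAL_MACHINE,
        r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters")
    try:
        value, valueType = _winreg.QueryValueEx(key, "MaxUserPort")
        print "MaxUserPort is %d" % (value,)
    except WindowsError:
        print "MaxUserPort not set; the default stops around port 5000"
    _winreg.CloseKey(key)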

Another option would be to switch to a POSIX platform instead.

If you're *not* on Windows, then this is odd and perhaps bears further 
investigation.
>callRemoteFailure [Failure instance: Traceback (failure with no frames): 
><class 'twisted.internet.error.ConnectionLost'>: Connection to the other 
>side was lost in a non-clean fashion: Connection lost.]

This isn't exactly how I'd expect it to fail, but I also don't know what 
"callRemoteFailure" is or where it comes from, so maybe that's not too 
surprising.
>Increasing the file descriptor limits doesn't seem to have any effect.

Quite so.  The process has, after all, already closed these sockets. 
They no longer count towards the process's file descriptor limit (oh 
dear, I suppose you're not using Windows if you have a file descriptor 
limit to raise).
>Is there an established Twisted-sanctioned canonical way to free up 
>this resource? Or am I doing something wrong? I'm looking into tweaking
>SO_REUSEADDR and SO_LINGER - does that sound sane?
>Just tapping the lazywebs to see if anyone's already seen this in the 
>wild.
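
You shouldn't need either of those (see below), but if you want to 
experiment anyway, the transport will give you the underlying socket 
object, and you can set whatever options you like on it.  A rough 
sketch (the class name is just for illustration):

    import socket, struct
    from twisted.internet.protocol import Protocol

    class ImpatientClient(Protocol):
        def connectionMade(self):
            # getHandle() returns the real socket underneath the transport.
            sock = self.transport.getHandle()
            # SO_LINGER with a zero timeout makes close() send a RST instead
            # of the normal FIN exchange, so the socket skips TIME_WAIT
            # entirely.  That's generally considered rude, which is one
            # reason I wouldn't reach for it first.  ("ii" is the Linux
            # struct linger layout; Windows wants two shorts.)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                            struct.pack("ii", 1, 0))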

On most reasonably configured Linux machines, you shouldn't run into 
this problem until you're doing at least an order of magnitude more 
work.  Many times, I have run clients that do many thousands of new 
connections per second, resulting in tens of thousands of TIME_WAIT 
sockets on the system with no problem.  So, I'm not sure why you're 
running into this after only a few thousand.
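
If it turns out you're on Linux after all, one quick thing to check is 
the ephemeral port range the kernel hands out for outgoing connections; 
the stock 32768-61000 range gives you roughly 28000 ports, which is an 
order of magnitude beyond where you're hitting the wall.  Something 
like:

    # Print the range of local ports the kernel will use for outgoing
    # connections.
    low, high = map(int, open(
        "/proc/sys/net/ipv4/ip_local_port_range").read().split())
    print "ephemeral port range: %d-%d (%d ports)" % (low, high, high - low + 1)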

