[Twisted-Python] TCP Proxy scalability issue

glyph at divmod.com
Fri May 1 11:54:56 EDT 2009


On 02:02 pm, saurav.mohapatra at dimdim.com wrote:
>Worker processes W1...n run listening on P1..n on the loopback and one
>router process (twisted based) runs on public port P0 exposed to the
>real world.

I haven't used it myself, but that sounds a bit like txloadbalancer: 
https://launchpad.net/txloadbalancer

Are you using that?

>The clients connect to P0 and the first few bytes they send indicates
>which worker process they wish to connect to. The Twisted protocol
>implementation then creates a relay TCP connection to loopback worker
>process port and after that forwards received data on P0 to the worker
>port on loop back and sends back data received from worker port to the
>external connection.

>We're noticing significant degradation / starvation of the clients under
>load (around 25 concurrent connections are enough to simulate this).

>We're running on Linux (CentOS 5.2) using python 2.5 and Twisted latest
>source tarball using the epoll reactor and all settings are default.
>Each connection sends back around 8-24kb data per second.

While every application is a unique beast in terms of performance 
tuning, these numbers all sound surprisingly low to me for something as 
simple as a TCP proxy.
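
For reference, the routing step in the quoted description (read the first few bytes, then pick a loopback port) might look roughly like the sketch below. The original post does not say what the header looks like or how ports are numbered, so the 2-byte big-endian index, BASE_PORT and NUM_WORKERS are all made-up values for illustration. It is plain Python with no Twisted dependency; in a real protocol this would presumably be called from dataReceived with the bytes buffered so far.

```python
import struct

# Assumptions for illustration only: the original post does not describe
# the header layout or the port numbering.
HEADER = struct.Struct("!H")   # hypothetical: 2-byte big-endian worker index
BASE_PORT = 9000               # hypothetical: worker k listens on BASE_PORT + k
NUM_WORKERS = 8                # hypothetical pool size


def route(buffer):
    """Return (port, leftover) once the header has arrived, else None.

    Raises ValueError if the worker index is outside 1..NUM_WORKERS.
    """
    if len(buffer) < HEADER.size:
        # TCP may deliver the header in pieces; the caller keeps buffering.
        return None
    (index,) = HEADER.unpack_from(buffer)
    if not 1 <= index <= NUM_WORKERS:
        raise ValueError("unknown worker index: %d" % index)
    # Bytes after the header already belong to the relayed stream.
    return BASE_PORT + index, buffer[HEADER.size:]
```

Returning the leftover bytes matters: anything that arrived in the same read as the header has to be forwarded to the worker once the relay connection is up, otherwise the stream is silently truncated.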

Are you saturating your CPU?  What is the load like on the box in 
question, both from the Twisted proxy and from the other processes?
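
One rough way to answer the CPU question from inside a Python process is to compare CPU time against wall-clock time between two samples. The CpuSampler name below is made up for this sketch; it only uses time.monotonic and time.process_time from the standard library and says nothing about other processes on the box (top or similar is still needed for that).

```python
import time


class CpuSampler:
    """Report what fraction of one core this process used between samples."""

    def __init__(self):
        self._wall = time.monotonic()
        self._cpu = time.process_time()

    def sample(self):
        wall, cpu = time.monotonic(), time.process_time()
        elapsed = wall - self._wall
        used = cpu - self._cpu
        self._wall, self._cpu = wall, cpu
        # Guard against a zero-length interval between two samples.
        return used / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    sampler = CpuSampler()
    time.sleep(0.2)                      # idle: fraction should be near 0
    print("idle: %.2f" % sampler.sample())
    deadline = time.monotonic() + 0.2
    while time.monotonic() < deadline:   # busy loop: fraction should be near 1
        pass
    print("busy: %.2f" % sampler.sample())
```

In a Twisted application, sample() could be called periodically (for example from a twisted.internet.task.LoopingCall) and logged. A value that stays close to 1.0 suggests the single-threaded process is CPU bound; a low value under load points at something else, such as blocking calls.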

Can you provide a benchmark that we can run somewhere else, to 
demonstrate the issue you're having?
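
As a starting point, a benchmark of this kind could be as small as the sketch below. Note that it uses the standard library's asyncio rather than Twisted, purely so that it runs without extra dependencies, and that the built-in echo server is only a stand-in for the service under test. The function names, the default of 25 clients and the payload size are assumptions chosen to mirror the numbers in the report, not anything taken from the actual application.

```python
import asyncio
import time


async def echo_handler(reader, writer):
    # Stand-in for the service under test: echo bytes back until EOF.
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def one_client(host, port, payload, rounds):
    # Send `payload` `rounds` times, waiting for the full echo each time.
    reader, writer = await asyncio.open_connection(host, port)
    received = 0
    for _ in range(rounds):
        writer.write(payload)
        await writer.drain()
        received += len(await reader.readexactly(len(payload)))
    writer.close()
    await writer.wait_closed()
    return received


async def run_benchmark(clients=25, payload_size=1024, rounds=20):
    # Start a local echo server on an ephemeral port, then drive it with
    # `clients` concurrent connections and time the whole run.
    server = await asyncio.start_server(echo_handler, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    payload = b"x" * payload_size
    start = time.monotonic()
    totals = await asyncio.gather(
        *(one_client("127.0.0.1", port, payload, rounds) for _ in range(clients))
    )
    elapsed = time.monotonic() - start
    server.close()
    return sum(totals), elapsed


if __name__ == "__main__":
    total, elapsed = asyncio.run(run_benchmark())
    print("echoed %d bytes in %.3f s" % (total, elapsed))
```

To measure a real proxy instead, one_client would be pointed at its public port, with whatever header the proxy expects sent first and a worker behind it that echoes. Something this small is usually enough for other people to reproduce a slowdown at 25 connections, if there is one.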

>My question is are there any "production" settings I need to do
>(threadpool etc.) to get the maximum out of twisted.

Nothing that you described will make any use of threads - is there some 
reason you mention threadpool size?

If you *are* using threads then perhaps you shouldn't be, and they're 
causing performance problems :)

However, aside from reactor selection, Twisted is designed to have very 
few knobs to turn; it has one button for performance tuning and we push 
it before it leaves the factory.  So if there's a performance problem, 
the issue is that we need to optimize something in Twisted, or you need 
to optimize something in your application.

You may be able to tweak various Linux kernel parameters to improve 
things a bit, but if you're running into problems at 25 connections, it 
doesn't sound to me like you're running into kernel issues.



