[Twisted-Python] Twisted server is 5 times SLOWER on Solaris than Linux?

Wed Jan 17 18:59:19 MST 2007

On 1/17/07, glyph at divmod.com <glyph at divmod.com> wrote:
>
> On 11:35 pm, jarrod at vertigrated.com wrote:
>
> >There is a "backend" C module that our Twisted server front ends, and it is
> >highly multi-threaded.
> >So the T1000 is PERFECT for our application, except that now Twisted is the
> >bottleneck. :-(
>
> This seems odd to me.
>
> If all the CPUs are going to be busy doing a multi-threaded back-end's
> work, and Twisted is just doing the I/O, then it seems the T1000 would still
> be a benefit.  The benchmark you mentioned was completely static; there was
> no backend library, no multithreaded CPU load.  Is the performance disparity
> similar when you're running actual workloads?
>

snipped a lot of good information :-)

> Again, it seems weird to me that this is necessary if the back-end library
> is really utilizing all the CPUs already and you are not I/O bound.
>

Here is what we are doing basically.

Twisted takes in data and in a C extension we send the data to multiple
backends in parallel to do processing on it.
Then we aggregate the results and send information back to the client.
This is basically a fancy proxy that parallelizes and distributes work to
other machines on the network.
All the clients run in "keep-alive" mode, so they don't create new
connections for each piece of work they send to the system; once they are
connected, they stay connected for their lifetime (a long time).
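
To make the fan-out / aggregate step concrete, here is a rough Python sketch.
It is only an illustration: the real work happens in a C extension and goes out
over the network, and the names process_on_backend and fan_out_and_aggregate
are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_on_backend(backend_id, data):
    # Hypothetical stand-in for one backend. The real system sends the
    # data across the network and waits for a response.
    return (backend_id, len(data))


def fan_out_and_aggregate(data, backend_ids):
    """Send the same data to every backend in parallel, then collect results."""
    with ThreadPoolExecutor(max_workers=len(backend_ids)) as pool:
        futures = [pool.submit(process_on_backend, b, data) for b in backend_ids]
        # Block until every backend has answered; results keep submission order.
        return [f.result() for f in futures]
```

The aggregated list is what would be turned into the reply to the client.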

On the Dell 2850s without any backend code, we see 600 ms latency with a
test suite of 400 clients.
With the Solaris SPARC machines (T1000 and V210) we see 4000-5000 ms latency
with the same no-op code and the same 400 clients.
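
For reference, the slowdown those numbers imply can be worked out directly.
This only restates the figures above; the variable names are just for
illustration.

```python
# Latencies reported above for 400 clients running the no-op code.
linux_ms = 600
sparc_ms_low, sparc_ms_high = 4000, 5000

# How many times slower the SPARC machines are than the Dell 2850s.
ratio_low = sparc_ms_low / linux_ms
ratio_high = sparc_ms_high / linux_ms

print(f"SPARC is {ratio_low:.1f}x to {ratio_high:.1f}x slower")
```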

With the backend code we see about an additional 250 ms latency on both
platforms. The "backend" code just takes the data and sends it out across
the network for processing, then sits waiting on responses, so it is not
doing enough work to stress the machine.

We have LOTS and LOTS of test harness code and profiling code to pinpoint
where bottlenecks are. We are going to have to process a couple of terabytes a
day thru this system. Latency thru the system is a high priority because of
what kind of system it is.

We can get up to about 1400 clients on the Dell 2850 hardware before latency
starts climbing out of control.
The SPARC hardware is falling over at 400 clients :-(

Thanks to everyone for all the ideas and help.