<br><br><div><span class="gmail_quote">On 1/17/07, <b class="gmail_sendername"><a href="mailto:glyph@divmod.com">glyph@divmod.com</a></b> &lt;<a href="mailto:glyph@divmod.com">glyph@divmod.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div><span class="q">On 11:35 pm, <a href="mailto:jarrod@vertigrated.com" target="_blank">jarrod@vertigrated.com</a> wrote:<br><br>&gt;There is a &quot;backend&quot; C module that our Twisted server front ends, and it is

<br>&gt;highly multi-threaded.<br>&gt;So the T1000 is PERFECT for our application, except that now Twisted is the<br>&gt;bottleneck. :-(<br><br></span>This seems odd to me.<br><br>If all the CPUs are going to be busy doing a multi-threaded back-end&#39;s work, and Twisted is just doing the I/O, then it seems the T1000 would still be a benefit. &nbsp;The benchmark you mentioned was completely static; there was no backend library, no multithreaded CPU load. &nbsp;Is the performance disparity similar when you&#39;re running actual workloads?

</div></blockquote><div><br>snipped a lot of good information :-)<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div>Again, it seems weird to me that this is necessary if the back-end library is really utilizing all the CPUs already and you are not I/O bound.

</div></blockquote><div><br>Here is basically what we are doing.<br><br>Twisted takes in data, and in a C extension we send that data to multiple backends in parallel for processing.<br>Then we aggregate the results and send information back to the client.

<br>This is basically a fancy proxy that parallelizes and distributes work to other machines on the network.<br>All the clients run in &quot;keep-alive&quot; mode: they don&#39;t create a new connection for each piece of work they send to the system, so

<br>once they are all connected, they stay connected for their lifetime (a long time).<br><br>On the Dell 2850s without any backend code, we see 600 ms latency with a test suite of 400 clients.<br>With the Solaris SPARC machines (T1000 and V210) we see 4000 to 5000 ms latency with the same no-op code and the same 400 clients.

<br><br>With the backend code we see about 250 ms of additional latency on both platforms. The &quot;backend&quot; code just takes the data and sends it out across the network for processing, then sits waiting on responses, so it is not doing enough work to stress the machine.

<br><br>We have LOTS and LOTS of test harness code and profiling code to pinpoint where bottlenecks are. We are going to have to process a couple of terabytes a day through this system. Latency through the system is a high priority because of the kind of system it is.

<br><br>We can get up to about 1400 clients on the Dell 2850 hardware before latency starts climbing out of control.<br>The SPARC hardware is falling over at 400 clients :-(<br><br>Thanks to everyone for all the ideas and help.

<br></div></div>