[Twisted-Python] fast high load protocol

Vlad Shevchenko vlad.shevchenko at gmail.com
Sun Feb 21 13:26:55 EST 2010


Hi, Johann

A few words about the load script: each "client" is a thread that waits
from 7 to 15 seconds and then makes a request to nginx, which proxies
the request to the api-server. On every request the api-server makes 3
callRemote calls to memory-db. The delay between creating clients is 0.1 sec.
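
A minimal sketch of a load generator with this shape, written from the
description above rather than taken from the actual script. The names
run_clients and make_request are hypothetical, and the sketch assumes one
request per client; make_request stands in for whatever HTTP call is sent
to nginx.

```python
import random
import threading
import time


def run_clients(n_clients, make_request, spawn_delay=0.1,
                min_wait=7.0, max_wait=15.0, sleep=time.sleep):
    """Start n_clients threads; each waits, then issues one request.

    make_request is a hypothetical callable taking the client id.
    Returns a list of (client_id, elapsed_seconds, outcome) tuples.
    """
    results = []
    lock = threading.Lock()

    def client(client_id):
        # Each client idles for a random 7-15 s before its request.
        sleep(random.uniform(min_wait, max_wait))
        started = time.monotonic()
        try:
            outcome = make_request(client_id)
        except Exception as exc:
            # Record the failure instead of losing it with the thread.
            outcome = exc
        elapsed = time.monotonic() - started
        with lock:
            results.append((client_id, elapsed, outcome))

    threads = []
    for i in range(n_clients):
        t = threading.Thread(target=client, args=(i,))
        t.start()
        threads.append(t)
        sleep(spawn_delay)  # 0.1 s between creating clients
    for t in threads:
        t.join()
    return results
```

Passing sleep as a parameter is only a convenience so the waits can be
skipped in a dry run; with the defaults the timings match the description.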

Currently I use 1 AMP connection for api<=>memory-db. Response time
depends on the number of clients:
clients < 1500         =>  api-response < 1s
1500 < clients < 3000  =>  api-response < 14s

Non-empty Recv-Q/Send-Q start to appear at about the 1500-client mark.
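
One way to track these queues during a test is to parse the output of
`netstat -tn`. The sketch below assumes the usual Linux column order
(Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State); the
function name parse_netstat and the port filter are hypothetical.

```python
def parse_netstat(text, port=None):
    """Return (recv_q, send_q, local, remote) for each TCP line.

    Assumes Linux `netstat -tn` columns:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    If port is given, keep only sockets using it on either end.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        recv_q, send_q, local, remote = fields[1:5]
        if not (recv_q.isdigit() and send_q.isdigit()):
            continue
        if port is not None:
            suffix = ":%d" % port
            if not (local.endswith(suffix) or remote.endswith(suffix)):
                continue
        rows.append((int(recv_q), int(send_q), local, remote))
    return rows


def total_queues(rows):
    """Sum Recv-Q and Send-Q over all parsed rows."""
    return (sum(r[0] for r in rows), sum(r[1] for r in rows))
```

Feeding it the captured netstat text once per second and logging
total_queues() gives a rough picture of when the queues start to grow.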

API-server utilization on 3K clients
=========================

Recv-Q ~ 750000...1000000

# top
top - 17:38:47 up 222 days, 23:29,  2 users,  load average: 1.05, 0.77, 0.34
Tasks:  45 total,   3 running,  42 sleeping,   0 stopped,   0 zombie
Cpu(s): 37.1%us,  2.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si, 60.3%st
Mem:   1747764k total,   574376k used,  1173388k free,   157632k buffers
Swap:   917496k total,        0k used,   917496k free,   273236k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2492 root      25   0  151m  46m 3568 R 39.7  2.7   2:19.66 python


Memory-db-server utilization on 3K clients
==============================
Send-Q ~ 400000...450000

# top
top - 17:38:08 up 16 days, 22:07,  4 users,  load average: 16.59, 9.40, 4.00
Tasks:  65 total,   6 running,  56 sleeping,   0 stopped,   3 zombie
Cpu(s): 37.6%us,  5.6%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.3%si, 56.4%st
Mem:   1747764k total,  1308636k used,   439128k free,   281048k buffers
Swap:   917496k total,        0k used,   917496k free,   616676k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
31804 root      15   0 56596  51m 2820 R 25.7  3.0  16:44.27 python   <----- memory-db
30788 root      15   0 1622m 164m 2648 S  6.9  9.6   0:31.89 python   <----- load script
 1449 root      25   0     0    0    0 Z  4.3  0.0   0:00.35 python <defunct>

I also ran tests with 30/50/100 AMP connections (choosing the
connection round-robin). That partially solves the problem: the server
can keep the response time at 2-4s, but the maximum number of clients
goes down as the number of AMP connections grows. With 100 AMP
connections the load script got 270 errors on reaching 2.7K clients
(nginx reported connections dropped because of timeouts; my nginx proxy
timeout is set to 120 sec). Recv-Q is still non-empty on most of the
100 connections, but proportionally smaller.
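
A minimal sketch of round-robin selection over several connections. It
is not the code used in the tests; the class name RoundRobinPool is
hypothetical, and the connections are assumed to be already-connected
objects exposing callRemote(command, **kwargs), as
twisted.protocols.amp.AMP instances do.

```python
import itertools


class RoundRobinPool(object):
    """Spread callRemote calls evenly over a fixed set of connections."""

    def __init__(self, connections):
        connections = list(connections)
        if not connections:
            raise ValueError("need at least one connection")
        # cycle() yields the connections in order, forever.
        self._cycle = itertools.cycle(connections)

    def callRemote(self, command, *args, **kwargs):
        # Pick the next connection in turn and delegate the call to it.
        conn = next(self._cycle)
        return conn.callRemote(command, *args, **kwargs)
```

Round-robin takes no account of how much data is already queued on each
connection, so one slow connection keeps receiving its share of calls.
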

Note about the environment: Amazon EC2, both servers are small instances:
1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2
Compute Unit), 32-bit platform.
One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2
GHz 2007 Opteron or 2007 Xeon processor.
OS: Ubuntu Hardy.

On Sun, Feb 21, 2010 at 8:18 PM, Johann Borck
<johann.borck at densedata.com> wrote:
> Vlad Shevchenko wrote:
>> Thanks a lot, Stephen.
>>
>> AMP probably is what I looking for. Now I can handle a much more
>> clients (2 times more without significantly increasing response time).
>> Server can also handle 3 times more clients without any errors, but
>> response time grow up. I check for netstat and find out Recv-Q about
>> 1706013 on client-side of AMP and Send-Q about 642288 on AMP
>> server-side. Is this meant:
>>     - OS needs tuning (ulimit or ifconfig)
>>     - or Twisted/Python can't handle so much amp-connections?
>>
>>
> How many connections are there? Are the queues that big for all of them?
>> CPU utilization < 10%, free memory about 500M from 1.7G.
>>
>>
> Is this the total system CPU load (how many cores/CPUs)?
>
> Johann
>
>



-- 
WBR, Vlad Shevchenko


