[Twisted-web] what are the advantages of using a single-threaded server? and when should we use deferreds?

glyph at divmod.com
Sat Apr 26 21:37:30 EDT 2008


On 26 Apr, 06:44 am, inhahe at gmail.com wrote:
>hi, excuse my noobness, I have a few basic questions about twisted, or
>probably about web servers in general.
>
>what is the advantage of using a single-threaded server?
>
>i figured it makes it more scalable because there's too much overhead to
>have a thread for each user when you have many simultaneous users.  but a
>friend i'm talking to now says that using i/o blocking threads is perfectly
>scalable for a large number of simultaneous users.

One way in which a single-threaded server scales better can be 
understood in terms of how the operating system copes with increased 
load.

Let's say your site is being hit very, very hard, and you have to SSH in
to change some stuff around to update it to deal with more load.  With a
single-threaded server, the operating system is scheduling two tasks:
your SSH session and your web server.  Therefore your SSH session gets
plenty of time to talk to you.  With a multi-threaded or multi-process
server, it is scheduling zillions of tasks, and you have to think about
limiting the number of processes that can run.  That puts a hard limit
on the number of concurrent users, and the limit can easily be too high
or too low for your hardware, especially if those processes are doing
network I/O of their own.

So, on a poorly-configured multithreaded server, your system slows to a
crawl under load, and you have to wait for the load to die down before
you can get in and fix anything.  With a single-process server like
Twisted, lighttpd, or nginx, you can easily get in and poke it with a
separate maintenance tool like SSH.

There are also potential performance differences between event-driven
and multithreaded servers, but there's a huge amount of optimization 
work that has gone into both approaches, so I wouldn't want to say one 
is definitely faster.  Twisted is definitely a lot slower than several 
competing servers which use a multi-process approach.  However, it can 
be made to scale in a variety of interesting ways.  (I would say that 
your friend is wrong in saying that i/o blocking threads is "perfectly 
scalable", though, especially in the naive case.)

>if that's true i can only see a disadvantage in using a single-threaded
>server -- having to use deferreds and stuff to make things asynchronous

This is not a disadvantage.  Deferreds are great; if you have a race
condition between two Deferreds firing, you can write a test that fires
them in a different order and easily replicate the problem in a debugger
to figure out what is going on.  If you have the same problem with threads,
you are basically screwed; it's very hard to reproduce in an environment 
where you can see what's going on and even harder to write a repeatable 
test for.

>i also don't understand how you're supposed to use deferreds
>the twisted doc says deferreds won't *make* your code asynchronous.  so
>let's say you have to do an sql query that takes 10 seconds, deferreds would
>be useless for making that not block unless you have a way of making that
>sql query non-blocking already?  how is that done?  do you run a separate
>thread of your own for each sql query?  one thread for all sql queries?

I could try to explain this, but you should really just read the 
Deferred howto and experiment at the Python prompt for a few hours with 
Deferred-returning APIs like twisted.web.client and 
twisted.enterprise.adbapi.  The short answer is "Twisted uses threads
under the covers to do stuff with SQL (adbapi runs the blocking database
calls in a thread pool), but to your code it just looks like a Deferred
because it's simpler".

>also I wonder in a typical twisted app, just how slow should an operation
>be before you use a deferred?  what if a user enters a username and password
>and i have to look that up in the database. do i use a deferred?  just how
>bad should the query be before using a deferred?

It's not a question of speed, it's a question of blocking.  If you are 
doing CPU-intensive stuff, you might want to put it into a separate 
process so you don't need to break it up into lots of little chunks. 
(Look into spawnProcess.)  However, in general the things that use 
Deferreds are the things that generate some output and then wait for 
some input in response.

>(reading the twisted docs is like reading a brick wall for me, it would be
>nice if someone could just explain things to me in simple terms.)

Good luck.


