[Twisted-web] what are the advantage of using a single-threaded
server? and when should we use deferreds?
glyph at divmod.com
Sat Apr 26 21:37:30 EDT 2008
On 26 Apr, 06:44 am, inhahe at gmail.com wrote:
>hi, excuse my noobness, I have a few basic questions about twisted, or
>probably about web servers in general.
>
>what is the advantage of using a single-threaded server?
>
>i figured it makes it more scalable because there's too much overhead to
>have a thread for each user when you have many simultaneous users. but a
>friend i'm talking to now says that using i/o blocking threads is
>perfectly scalable for a large number of simultaneous users.
One way in which a single-threaded server scales better can be
understood in terms of how the operating system copes with increased
load.
Let's say your site is being hit very, very hard, and you have to SSH in
to reconfigure it to deal with more load. With a single-threaded server,
the operating system is scheduling two tasks: your SSH session and your
web server. Therefore your SSH session gets plenty of time to talk to
you. With a multi-threaded or multi-process server, it is scheduling
zillions of tasks, and you have to think about limiting the number of
processes that can run (which puts a hard limit on the number of
concurrent users, and that limit can easily be too high or too low for
your hardware, especially if those processes are doing network I/O of
their own).
So, on a poorly-configured multithreaded server, once the system has
slowed to a crawl you have to wait for the load to die down before you
can get in and fix anything. With a single-process server like Twisted,
lighttpd, or nginx, you can easily get in and poke it with a separate
maintenance tool like SSH.
There are also potentially performance differences between event-driven
and multithreaded servers, but there's a huge amount of optimization
work that has gone into both approaches, so I wouldn't want to say one
is definitely faster. Twisted is definitely a lot slower than several
competing servers which use a multi-process approach. However, it can
be made to scale in a variety of interesting ways. (I would say that
your friend is wrong in saying that i/o blocking threads is "perfectly
scalable", though, especially in the naive case.)
>if that's true i can only see a disadvantage in using a single-threaded
>server -- having to use deferreds and stuff to make things asynchronous
This is not a disadvantage. Deferreds are great; if you have a race
condition between two Deferreds firing, you can easily write a test that
fires them in a different order and replicate the problem in a debugger
to figure out what is going on. If you have the same problem with
threads, you are basically screwed; it's very hard to reproduce in an
environment where you can see what's going on, and even harder to write
a repeatable test for.
>i also don't understand how you're supposed to use deferreds
>the twisted doc says deferreds won't *make* your code asynchronous. so
>let's say you have to do an sql query that takes 10 seconds, deferreds
>would be useless for making that not block unless you have a way of
>making that sql query non-blocking already? how is that done? do you
>run a separate thread of your own for each sql query? one thread for
>all sql queries?
I could try to explain this, but you should really just read the
Deferred howto and experiment at the Python prompt for a few hours with
Deferred-returning APIs like twisted.web.client and
twisted.enterprise.adbapi. The short answer is "twisted uses threads
under the covers to do stuff with SQL, but to your code it just looks
like a deferred because it's simpler".
>also I wonder in an typical twisted app, just how slow should an
>operation be before you use a deferred? what if a user enters a
>username and password and i have to look that up in the database. do i
>use a deferred? just how bad should the query be before using a
>deferred?
It's not a question of speed; it's a question of blocking. If you are
doing CPU-intensive stuff, you might want to put it into a separate
process so you don't need to break it up into lots of little chunks to
keep the event loop responsive. (Look into reactor.spawnProcess.)
However, in general the things that use Deferreds are the things that
generate some output and then wait for some input in response.
>(reading the twisted docs is like reading a brick wall for me, it would
>be nice if someone could just explain things to me in simple terms.)
Good luck.