[Twisted-Python] Benchmark of Python WSGI servers
exarkun at twistedmatrix.com
Fri Mar 18 21:35:08 EDT 2011
On 12:54 am, glyph at twistedmatrix.com wrote:
>On Mar 18, 2011, at 7:44 PM, Michael Thompson wrote:
>> From the guys who brought you async socket benchmark,
>>http://nichol.as/asynchronous-servers-in-python, comes Python WSGI
>Yep, I've seen that before. It's one of the better benchmarks of its
>kind in the Python world, but unfortunately stops short of being good
>The benchmark isn't really saying much that's interesting about WSGI
>servers anyway. It mostly says "all of these servers are more than 20x
>faster than your WSGI app could ever possibly be, if it does anything
>interesting, so at most the server will account for 5% of your
>performance". The logical conclusion: regardless of what server you're
>using, go optimize your app first.
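The arithmetic behind that 5% figure can be sketched in a few lines of Python. The function name and the fixed unit of server time are illustrative assumptions, not anything taken from the benchmark itself.

```python
def server_share(speed_ratio):
    """Fraction of per-request time spent in the server.

    Assumes (for illustration only) that the server needs 1 unit of time
    per request and the application needs speed_ratio units.
    """
    server_time = 1.0
    app_time = speed_ratio * server_time
    return server_time / (server_time + app_time)


# A server 20x faster than the app accounts for under 5% of each request.
print(f"{server_share(20):.1%}")  # 4.8%
```

Under the same assumption, a larger ratio only shrinks the server's share further, which is why the app is the first place to look.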
>While I'd love for Twisted to come out on top of that chart (it's
>always best to win at things, right?), such an improvement would be of
>little practical benefit to our users. First, almost nobody has a WSGI
>app so trivial that it would be significantly helped by speeding up
>that part of the server. Second, anyone with serious performance
>requirements in Twisted will be optimizing by calling Resource and
>Request APIs directly, asynchronously in the main loop (perhaps with
>multiple processes), not threading WSGI handlers for the critical fast
>path in their application. Which, I hasten to remind you, is rarely
>all of your application. A performance improvement to
>static.File, like making it truly non-blocking, would probably be a
>more significant benefit to most websites that want to be fast than
>making the thing that calls a WSGI function fast.
>>Is twisted coming out of this so badly because they are using the
>>default reactor, as opposed to epoll?
>There isn't really enough analysis to determine why exactly Twisted
>fares poorly on this particular benchmark.
>My pet theory is that it has something to do with transferring data
>from threads to the I/O loop via queue synchronization, and not being
>as smart as it could be about buffering, and that particular technique
>getting slammed really hard for very small request/response pairs. I
>hypothesize that more buffering would occur with larger responses with
>more chunks, and that would bring Twisted's performance up to that of
>these other servers.
>But it's hard to say, and, as I said above, the benchmark isn't
>measuring anything too interesting, so it's hard to work up the
>motivation to find out.
>>Perhaps the default reactor should be the best available rather than
>>the lowest common denominator.
>See <http://twistedmatrix.com/trac/ticket/2234>. There should be a
>ticket for the broader goal too, and maybe it's already filed; I
>couldn't find it quickly.
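For anyone who wants to rule the reactor in or out, selecting epoll explicitly is straightforward. This is a minimal sketch, assuming Linux and an installed Twisted; the helper name is made up for illustration, and install() has to run before anything imports twisted.internet.reactor.

```python
def install_epoll_reactor():
    """Try to install Twisted's epoll reactor; return True on success.

    Hypothetical helper. Returns False when Twisted or epoll support is
    missing, in which case the default reactor is used instead.
    """
    try:
        from twisted.internet import epollreactor
    except ImportError:
        return False
    epollreactor.install()
    return True


using_epoll = install_epoll_reactor()
print("epoll reactor installed" if using_epoll else "using default reactor")
```

Rerunning the benchmark both ways would show whether the reactor matters here; nothing in this thread says it does.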
This all seems right on to me. I just wanted to add that of the "top
performers", there is some difference in what's being benchmarked. Some
of them use green threads instead of threads. Some of them are
multiprocess. Compared to a thread-based WSGI container, these
approaches have some performance benefits. If someone wanted to make
Twisted's WSGI support benchmark better, implementing one (or both) of these
approaches would be one good way to go about it.
A multi-process WSGI container might actually be of practical use, since
it may make more cores available to your server. If an application is
bottlenecked on CPU, more cores can help. If it is instead bottlenecked
on some high-latency operation, the limit is the size of your
threadpool, since you can only process as many concurrent requests as
you have threads.
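The threadpool limit on concurrent requests can be seen with a small standalone sketch. It uses only the standard library, not Twisted's own threadpool, and the pool size, request count, and latency are arbitrary values chosen for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2    # threads in the (hypothetical) WSGI threadpool
REQUESTS = 6     # simulated requests
LATENCY = 0.05   # seconds each request spends blocked

stats = {"in_flight": 0, "peak": 0}
lock = threading.Lock()


def handle_request(_):
    """Simulate a handler blocked on a high-latency operation."""
    with lock:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
    time.sleep(LATENCY)
    with lock:
        stats["in_flight"] -= 1


start = time.monotonic()
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    list(pool.map(handle_request, range(REQUESTS)))
elapsed = time.monotonic() - start

print(f"peak concurrent requests: {stats['peak']}")
print(f"elapsed: {elapsed:.2f}s for {REQUESTS} requests")
```

The peak never exceeds POOL_SIZE, so the requests are served in batches. Adding cores would not change that; a bigger pool (or green threads) would.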