[Twisted-Python] Benchmark of Python WSGI servers

Fri Mar 18 20:54:26 EDT 2011

On Mar 18, 2011, at 7:44 PM, Michael Thompson wrote:

> From the guys who brought you async socket benchmark,
> http://nichol.as/asynchronous-servers-in-python, comes Python WSGI
> benchmark
> http://nichol.as/benchmark-of-python-web-servers.

Yep, I've seen that before.  It's one of the better benchmarks of its kind in the Python world, but unfortunately stops short of being good :).

The benchmark isn't really saying much that's interesting about WSGI servers anyway.  It mostly says "all of these servers are more than 20x faster than your WSGI app could ever possibly be, if it does anything interesting, so the server will account for at most 5% of your request time".  The logical conclusion: regardless of what server you're using, go optimize your app first.
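To make that arithmetic concrete, here is a small sketch.  It assumes the server's per-request overhead and the app's own time simply add up; the function name `server_share` is made up for this illustration and the speedup values are examples, not measurements from the benchmark.

```python
def server_share(speedup):
    """Fraction of total request time spent in the server.

    Assumes the app takes `speedup` times as long as the server does
    per request, and that the two costs simply add (an assumption).
    """
    server_time = 1.0
    app_time = speedup * server_time
    return server_time / (server_time + app_time)


# Example ratios only; none of these numbers come from the benchmark.
for speedup in (20, 50, 100):
    print(f"server {speedup}x faster than app: "
          f"{server_share(speedup):.1%} of request time")
```

At 20x the server's share is 1/21, just under 5%, and it only shrinks as the app gets slower relative to the server.  Even a server with zero overhead could recover no more than that slice.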

While I'd love for Twisted to come out on top of that chart (it's always best to win at things, right?), such an improvement would be of little practical benefit to our users.  First, almost nobody has a WSGI app so trivial that speeding up that part of the server would significantly help it.  Second, anyone with serious performance requirements in Twisted will be optimizing by calling the Resource and Request APIs directly, asynchronously in the main loop (perhaps with multiple processes), rather than running threaded WSGI handlers on the critical fast path of their application.  Which, I hasten to remind you, is rarely all of your application.  A performance improvement to static.File, like making it truly non-blocking, would probably be a more significant benefit to most websites that want to be fast than making the thing that calls a WSGI function fast.
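For reference, a "trivial" WSGI app of the kind such benchmarks tend to exercise looks roughly like the sketch below.  This is an assumed example, not the app the benchmark actually used; `hello_app` and `call_app` are hypothetical names, and the app is invoked directly with a test environ from the standard library's `wsgiref` rather than through any real server.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A WSGI app that does nothing interesting: fixed body, no I/O."""
    body = b"Hello, world!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app):
    """Call a WSGI app once, the way a server would, without a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(hello_app)
print(status, body)
```

With an app like this, nearly all of the measured time is the server's own plumbing, which is why the results say little about applications that do real work.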

> Is twisted coming out of this so badly because they are using the
> default reactor, as opposed to epoll?

There isn't really enough analysis to determine why exactly Twisted fares poorly on this particular benchmark.

My pet theory is that it has something to do with transferring data from threads to the I/O loop via queue synchronization, and not being as smart as it could be about buffering, and that particular technique getting slammed really hard for very small request/response pairs.  I hypothesize that more buffering would occur with larger responses with more chunks, and that would bring Twisted's performance up to that of the other servers.
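The general shape of that cost can be sketched with nothing but the standard library.  This is not Twisted's WSGI implementation; it is a stand-in in which a worker thread hands response chunks to a consuming loop through a `queue.Queue`, and the names `run_handler`, `serve`, and `batch_size` are hypothetical.  It only counts cross-thread handoffs, on the assumption that each handoff carries a roughly fixed synchronization cost.

```python
import queue
import threading


def run_handler(chunks, out_queue, batch_size):
    """Worker thread: pass response chunks to the I/O loop via a queue.

    batch_size=1 means one queue handoff per chunk; larger values
    coalesce several chunks into a single handoff.
    """
    pending = []
    for chunk in chunks:
        pending.append(chunk)
        if len(pending) >= batch_size:
            out_queue.put(b"".join(pending))
            pending = []
    if pending:
        out_queue.put(b"".join(pending))
    out_queue.put(None)  # sentinel: response finished


def serve(chunks, batch_size):
    """Stand-in for the I/O loop: drain the queue and count handoffs."""
    out_queue = queue.Queue()
    worker = threading.Thread(
        target=run_handler, args=(chunks, out_queue, batch_size))
    worker.start()
    handoffs = 0
    received = []
    while True:
        item = out_queue.get()
        if item is None:
            break
        handoffs += 1
        received.append(item)
    worker.join()
    return b"".join(received), handoffs


chunks = [b"x" * 10] * 100  # many very small writes
body_unbuffered, n_unbuffered = serve(chunks, batch_size=1)
body_buffered, n_buffered = serve(chunks, batch_size=25)
print(n_unbuffered, n_buffered)
```

Both runs deliver the same bytes, but the buffered one crosses the thread boundary far less often.  Whether that is what actually hurts Twisted in the benchmark would still need to be measured.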

But it's hard to say, and, as I said above, the benchmark isn't measuring anything too interesting, so it's hard to work up the motivation to find out.

> Perhaps the default reactor should be the best available rather than
> the lowest common denominator.

See <http://twistedmatrix.com/trac/ticket/2234>.  There should be a ticket for the broader goal too, and maybe it's already filed; I couldn't find it quickly.
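As a rough illustration of what "best available" could mean, the sketch below probes the standard library's `select` module for the multiplexing calls the platform offers.  It is not how Twisted chooses its reactor and does not reflect the contents of the ticket; the preference order and the name `best_available_mechanism` are assumptions made for this example.

```python
import select

# Preference order is an assumption for this sketch: mechanisms that
# generally scale better with many connections come first.
_CANDIDATES = ["epoll", "kqueue", "poll", "select"]


def best_available_mechanism():
    """Return the name of the first mechanism this platform provides."""
    for name in _CANDIDATES:
        if hasattr(select, name):
            return name
    raise RuntimeError("no I/O multiplexing mechanism found")


print(best_available_mechanism())
```

The result depends on the platform, so a default chosen this way would differ between, say, Linux and Windows, while plain `select` remains the lowest common denominator.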