[Twisted-web] [Twisted-Python] Speed of rendering?

Wed Feb 27 06:55:19 EST 2013

On Wed, Feb 27, 2013, at 0:39, Glyph wrote:
> 
> On Feb 26, 2013, at 10:05 AM, Peter Westlake <peter.westlake at pobox.com>
> wrote:
> 
> > On Sun, Jan 6, 2013, at 20:22, exarkun at twistedmatrix.com wrote:
> >> On 12:48 am, peter.westlake at pobox.com wrote:
> >>> On Fri, Jan 4, 2013, at 19:58, exarkun at twistedmatrix.com wrote:
> > ...
> >> Codespeed cannot handle more than one result per benchmark.
> >>>> The `timeit` module is probably not suitable to use to collect the data
> > .....
> >>> What method would you prefer?
> >> 
> >> Something simple and accurate. :)  You may need to do some investigation 
> >> to determine the best approach.
> > 
> > 1. This is simple:
> > 
> >    def do_benchmark(content):
> >        t1 = time.time()
> >        d = flatten(request, content, lambda _: None)
> >        t2 = time.time()
> >        assert d.called
> >        return t2 - t1
> > 
> > Do you think it's acceptably accurate? After a few million iterations,
> > the relative error should be pretty small.
> 
> Well it rather depends on the contents of 'content', doesn't it? :)

Yes, sorry, the loop is meant to be around the flatten call!
Corrected version below.

> I think we have gotten lost in the weeds here. We talked
> about using benchlib.py initially, and then you noticed a
> bug, and it was mentioned that benchlib.py was mostly
> written for testing asynchronous things and didn't have
> good support for testing the simple case here, which is
> synchronous rendering of a simple document. However, one
> of twisted.web.template's major features - arguably its
> reason for existing in a world that is practically overrun
> by HTML templating systems - is that it supports
> Deferreds. So we'll want that anyway.

That's true, and I'll include some Deferreds in the content
to be flattened. But if the Deferreds actually do any
lengthy processing, it makes a nonsense of the benchmark. It
only makes sense to use ones that have already fired, i.e.
defer.succeed(...). The other benchmarks are testing
asynchronous operations, as names like "ssl_throughput"
suggest. Flattening doesn't do any of that, and I'm only
trying to measure the speed of flattening.

> The right thing to do here would be to update benchlib
> itself with a few simple tools for doing timing of
> synchronous tasks, and possibly also to just fix the
> unbounded-recursion bug that you noticed, not to start
> building a new, parallel set of testing tools which use
> different infrastructure. That probably means implementing
> a small subset of timeit.

I'm not convinced that the unbounded recursion is actually a
bug. A callback on a fired Deferred will be executed
immediately, and that's correct behaviour. There's no chance
to return control to the reactor, and even if there were,
anything that happened in that time would only skew the
results. The real problem is that recursion-by-Deferred
doesn't have the optimisation for tail recursion found in
most functional languages, because that would be very
difficult and it's not how Deferreds are usually used.
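
To illustrate the point, here is a toy sketch of how the stack
grows. It does not use Twisted: FiredDeferred is a hypothetical
stand-in that only mimics the one behaviour under discussion (a
callback added to an already-fired Deferred runs immediately),
and run_n_times and stack_depth are illustrative names.

```python
import sys


class FiredDeferred:
    """Toy stand-in for an already-fired Deferred (not Twisted's class).

    addCallback runs the callback immediately, as a fired Deferred does.
    """

    def __init__(self, result):
        self.result = result

    def addCallback(self, callback):
        self.result = callback(self.result)
        return self


def stack_depth():
    """Count the frames currently on the Python call stack."""
    depth = 0
    frame = sys._getframe()
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth


def run_n_times(n, depths):
    """Repeat a synchronous 'operation' n times by chaining callbacks."""
    depths.append(stack_depth())
    if n > 1:
        # The callback fires inside addCallback, so this call recurses.
        FiredDeferred(None).addCallback(lambda _: run_n_times(n - 1, depths))


depths = []
run_n_times(5, depths)
# Each repetition sits deeper on the stack than the one before it.
```

With enough repetitions the same pattern would hit Python's
recursion limit, which is why a plain loop is the safer way to
repeat a synchronous call.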

> > 2. For the choice of test data, I had a quick search for
> > benchmarks from other web frameworks. All I found was
> > "hello world" benchmarks, that test the overhead of the
> > framework itself by rendering an empty page. I'll
> > include that, of course.
> 
> "hello world" benchmarks have problems because start-up
> overhead tends to dominate. A realistic web page with some
> slots and renderers sprinkled throughout would be a lot
> better. Although even better would be a couple of cases -
> let's say small, large-sync, and large-async - so we can
> see if optimizations for one case hurt another.

Yes, I'm just making my excuses for not copying benchmarks
from an existing framework.

> As Jean-Paul already mentioned in this thread, you can't
> have more than one result per benchmark, so you'll need to
> choose a fixed number of configurations and create one
> benchmark for each.
> 
> > 3. Regarding option parsing, is there any reason to
> >    prefer twisted.python.usage.Options over [...]
> 
> The reason to prefer usage.Options is consistency. ...

OK

> The thing to implement would be a different driver()
> function that makes a few simple synchronous calls without
> running the reactor.

If you don't mind the overhead of an extra function call,
that could be as simple as:

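import time
import benchlib  # benchlib.py, as discussed above

def sync_benchmark(iterations, name, func, *args):
    # Time `iterations` back-to-back calls without running the reactor.
    t1 = time.time()
    for _ in range(iterations):
        func(*args)
    t2 = time.time()
    # Report the call count and total elapsed seconds under `name`.
    benchlib.benchmark_report(iterations, t2 - t1, name)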

I'm not sure if options['iterations'] would be the right
thing to use here, because it gives the number of times to
repeat the whole benchmark, not the number of times round
the inner loop. The async code uses options['duration'],
but running synchronous code for a given duration would add
overhead, since the loop would have to check the clock
between calls.
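
For comparison, a duration-based synchronous loop might look
something like the sketch below. It is only an illustration
under assumptions: sync_benchmark_for_duration and its batch
parameter are hypothetical names, not anything in benchlib, and
the keyword-only argument needs Python 3. The clock is read once
per batch of calls rather than once per call, which is one way
to keep the timing overhead small.

```python
import time


def sync_benchmark_for_duration(duration, func, *args, batch=1000):
    """Call func(*args) repeatedly for roughly `duration` seconds.

    Hypothetical helper, not part of benchlib. The clock is read once
    per batch of calls so that time.time() does not dominate the loop.
    Returns (calls_made, elapsed_seconds).
    """
    calls = 0
    start = time.time()
    deadline = start + duration
    while True:
        for _ in range(batch):
            func(*args)
        calls += batch
        now = time.time()
        if now >= deadline:
            return calls, now - start
```

The trade-off is that the run can overshoot the requested
duration by up to one batch, so the reported rate should be
computed from the measured elapsed time, not the requested one.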

Peter.