[Twisted-Python] Soon to be not-a-newbie?

Sun Jan 27 21:47:48 EST 2008

On Sat, 2008-01-26 at 12:51 +0100, Maarten ter Huurne wrote:
> On Saturday 26 January 2008, Tristan Seligmann wrote:

> > Most deferred operations have no need to 
> > be serialized, and shouldn't be; the linear nature of
> > inlineCallbacks-style makes it very easy to accidentally serialize
> > operations that could otherwise run in parallel instead. Even if you
> > write the generator correctly, it's not as obvious what the actual flow
> > is since it has been crunched into a linear-looking function.

I dunno about this. Serialisation is really common, in my experience.
People are also more used to thinking about things in terms of
sequential actions (do this, then do this, then do this other thing),
which is probably one reason why parallel programming is hard, and why
inlineCallbacks are so appealing.

> There are quite a few cases in which the dependencies between the operations 
> force sequential processing. In those cases, inline callbacks are useful.
> 
> For example, to serve a web page, I want to authenticate the user, then run 
> a database query and finally present the result. Running the query before 
> authenticating the user is not something I'd recommend. In some cases it 
> might be possible to start presenting results before all queries are 
> finished, but in many cases that is not worth the complexity.

Hear, hear.

I have a bunch of code that has benefited from refactoring to use
inlineCallbacks for exactly these reasons. Sometimes there is a
sequential flow of data from function to function:

for device in device_list:
  get_some_data_from_device()
    |--parse_data_and_fetch_more_based_on_parse_result()
       |--parse_this_stuff_and_insert_into_database_maybe()
          |--reschedule_poll_for_data()

Manually setting up the Deferred callback chain made my code hard (for
me) to read. Maybe I'm just a bad coder.

> Writing all your routines as inline callbacks without thinking about the 
> dependencies is a bad idea though.

Not thinking about what you're doing is generally bad, yes, but what if
you were to write everything with inlineCallbacks, then optimise to make
needlessly sequential parts parallel?

I've used inlineCallbacks to quickly convert someone else's blocking
Python code to async Twisted code, with very little change to the code's
structural flow. The last time I did this, without inlineCallbacks, it
was painful, buggy and slow. Anything that helps me unblock code with
Twisted is good.

> > Finally, 
> > it is extremely hard to unit test a generator using inlineCallbacks, as
> > there is no easy way of resuming the generator at certain points with
> > certain state to test each part of the generator.
> 
> If the code using inline callbacks looks like this, there is no problem in 
> testing the parts separately:
> 
> 	result1 = yield function1(arg)
> 	result2 = yield function2(result1)
> 
> > I'm not necessarily convinced that inlineCallbacks is always bad, but it
> > certainly leads to subtle traps in most cases, while providing little
> > real benefit (despite the perceived benefit).
> 
> I've converted some routines I wrote before Twisted 2.5 to inline callbacks 
> and it became a lot easier for me to read. Like any tool, it can be used or 
> abused, but I definitely think it has its uses.

Yep. The biggest benefit for me has been to make my Twisted code feel
more readable overall, and readability is one of the best bits about
Python. In the past I've felt a bit guilty when writing complex deferred
chains with Twisted... almost like I'd just done something in Perl. ;)

Heh. I'm such a fanboy.

-- 
Justin Warren <daedalus at eigenmagic.com>