[Twisted-Python] Re-working a synchronous iterator to use Twisted
Terry Jones
terry at jon.es
Sat Jun 28 16:40:11 MDT 2008
Following the massive interest in my earlier postings on this thread, I'm
following up to myself again :-)
Here's what I was trying to do:
> In case it wasn't clear before, you're pulling "results" (e.g., from a
> search engine) in off the web. Each results page comes with an indicator
> to tell you whether there are more results. I wanted to write a function
> (see processResults below) that, when called, would call the process
> function below on each result, all done asynchronously.
I posted some cumbersome code to roughly do that. I've since been thinking
about this on and off, with help from Esteve Fernandez, and we've made the
code quite a bit simpler.
I think there's a general pattern here that's worth thinking about.
Roughly: the above need is like the Twisted analogue of using iterators in
regular synchronous programming.
By that I mean that the normal pattern of Twisted usage is: a single event
is anticipated (by the programmer), it occurs once, and its result is
passed down a call/errback chain. That's roughly like a single function
call in synchronous code.
But suppose you are expecting a sequence of external events to occur and you
want to asynchronously pass their results in turn down a call/errback
chain. The need to do this in synchronous code can be filled with a simple
iterator. But doing this asynchronously (when the fetch of the next batch
of results might take a while) doesn't seem to fit easily into the
single-shot asynchronous Twisted paradigm.
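For comparison, here is a minimal sketch of the synchronous case, where a plain generator hides the paging. The names PAGES, fetch_page_sync and iter_results are made up for illustration, and the in-memory list only stands in for a real (blocking) web fetch.

```python
# Hypothetical in-memory pages standing in for a remote search service.
PAGES = [["a", "b"], ["c"], [], ["d", "e"]]


def fetch_page_sync(page):
    """Blocking fetch: return (results, more) for one page."""
    return PAGES[page], page + 1 < len(PAGES)


def iter_results():
    """Yield every result, fetching the next page only when needed."""
    page, more = 0, True
    while more:
        results, more = fetch_page_sync(page)
        for result in results:
            yield result
        page += 1


for result in iter_results():
    print(result)
```

The caller just loops; the trouble in the asynchronous case is that fetch_page_sync would have to return before its results exist.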
I thought about modifying defer.py to allow a callback chain to be called
multiple times (and to have the "normal" single-shot chain be a special
case). But that was clearly going to get messy. BTW, I find defer.py is
really elegant.
After more thinking about how to make my previously posted code simpler,
Esteve and I came up with what you'll find at
http://python.pastebin.com/f7df56752 (code) and
http://python.pastebin.com/f1e582264 (simple tests)
The idea is that you provide a result fetcher function to the TwIterator
class. This function will be called repeatedly, as needed, to get more
results. It returns a deferred, which it should fire with three things: a
list of the next results (which may be empty), a bool indicating whether to
call the function again, and a dict of args to pass to it next time.
The TwIterator class provides you with a list() method that you can use
almost like an iterator:

    from twisted.internet.defer import inlineCallbacks

    @inlineCallbacks
    def printer(results):
        for x in results:
            print (yield x)

    fetcher.list().addCallback(printer)

This is in some sense like a general asynchronous iterator for Twisted. The
printer function receives an iterator, each element of which is a deferred,
and when that deferred fires it produces the next result.
The test code gives 4 simple example result-fetching functions, and calls
them all asynchronously. If you run it you'll see the results coming out in
a somewhat random order.
I won't go into more detail, given that no-one responded to the first two
postings. It's still possible that I'm trying to solve a problem that is
already solved by some standard Twisted module. I don't know enough about
Twisted to know for sure.
Terry