[Twisted-Python] Re-working a synchronous iterator to use Twisted

Terry Jones terry at jon.es
Tue Jun 17 17:10:48 EDT 2008


For the record, here's a follow-up to my own posting, with working code.
The earlier untested code was a bit of a mess; the code below runs fine.

In case it wasn't clear before: you're pulling "results" (e.g., from a
search engine) in off the web, one page at a time. Each page of results
comes with an indicator that tells you whether there are more results. I
wanted to write a function (see processResults below) that, when called,
would call the process function below on each result, with everything done
asynchronously.
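
For comparison, the blocking shape of the same problem might look like the
sketch below. It is only an illustration: getPageSync and iterResultsSync
are hypothetical names that do not appear in the original code, and the
canned data simply mirrors the getPage stub.

```python
def getPageSync(uri, offset=0):
    # Hypothetical blocking stand-in for getPage: same canned data,
    # returned directly instead of through a Deferred.
    data = [[0, 1, 2],
            [4, 7, 8, 10],
            [12, 14, 18, 30]]
    results = data[offset]
    more = offset < len(data) - 1
    return results, more

def iterResultsSync(uri):
    # Yield every result across all pages, fetching the next page only
    # when the current one has been consumed.
    offset = 0
    while True:
        results, more = getPageSync(uri, offset)
        for result in results:
            yield result
        if not more:
            break
        offset += 1

print(list(iterResultsSync('uri')))
# prints [0, 1, 2, 4, 7, 8, 10, 12, 14, 18, 30]
```

The asynchronous code has to reproduce this loop without ever blocking on
a page fetch, which is where the extra machinery comes from.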

The solution below feels cumbersome, but it does work (i.e., it prints the
expected output).

Comments welcome (just don't tell me to use version control :-))

Terry


from twisted.internet import defer

def process(result):
    # Stub: process a single result.
    print 'processed result', result

def getPage(uri, offset=0):
    # Stub: return some results and an indicator of whether more are available.
    data = [[0, 1, 2],
            [4, 7, 8, 10],
            [12, 14, 18, 30]]
    results = data[offset]
    more = offset < len(data) - 1
    return defer.succeed((results, more))
        
def getResultsFromPage(page):
    # Stub: get the results from page, return them in a list
    return page[0]

def needToCallAgain(page):
    # Stub: determine if there are more results, given current page. return bool.
    return page[1]
    
# ASYNCHRONOUS result producer
def getResults(uri, offset=0):
    def parsePage(page, offset):
        results = getResultsFromPage(page)
        if needToCallAgain(page):
            d = getResults(uri, offset + 1)
        else:
            d = None
        return results, d
    def returnTheseResults(page, offset):
        resultIterator, done = parsePage(page, offset)
        return resultIterator, done
    return getPage(uri, offset).addCallback(returnTheseResults, offset)

# ASYNCHRONOUS calling
def processResults(uri):
    def cb((resultIterator, deferred)):
        for result in resultIterator:
            process(result)
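        if deferred is not None:
            # Return the Deferred so the caller's Deferred waits for the
            # remaining pages instead of firing after the first one.
            return deferred.addCallback(cb)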
    return getResults(uri).addCallback(cb)


if __name__ == '__main__':
    def finished(x):
        print 'finished'
    processResults('uri').addCallback(finished)



The above prints

$ python xxx.py
processed result 0
processed result 1
processed result 2
processed result 4
processed result 7
processed result 8
processed result 10
processed result 12
processed result 14
processed result 18
processed result 30
finished



