[Twisted-Python] rewrite of flow.py completed

Clark C. Evans cce at clarkevans.com
Fri Apr 11 03:38:34 EDT 2003


Ok.  The complete re-write of flow.py based on the work of 
extrepum ("Bob Ippolito") is finished, with a test case
checked in.  This differs from extrepum's work [1] in 
some ways, but is almost identical in others.

1.  The implementation and test suite does not use generators,
    thus it is safe for 2.1 usage... although I can't imagine
    anyone using it without generators.   But the advantage 
    is that it could go into the current Twisted code base.
    The test suite uses iterators (with the corresponding
    generator commented out).

2.  This implementation has 4 distinct components:

    a) the core 'Flow' class and its supporting items, could be
       put into something like twisted.python.flow as they only
       depend upon other twisted.python stuff.  

    b) There is a DeferredFlow which uses Flow as a mix-in with
       Deferred, with behavior similar to DeferredList.  This 
       could go into twisted.internet.defer

    c) There is a ThreadedIterator which depends upon DeferredFlow, 
       and can be used to merge blocking behavior into the 
       framework.   This could go into twisted.internet.threads

    d) There is a very small class which builds upon ThreadedIterator
       that uses a pool from twisted.enterprise.adbapi; however, 
       in this context the async features of adbapi more or less
       get in the way.   So, perhaps this is best left as an example.

    In particular, extrepum's work build on top of Deferred, which
    I think is unnecessary; thus quite a bit of simplification 
    occurred when stripping out the Deferred stuff, and instead
    treating DeferredFlow as a cross product or mix-in of 
    Flow and Deferred.

3.  Extrepum's work was more 'flat' and didn't take recursive
    use of the technique into consideration; or perhaps I just 
    didn't understand how it would handle recursive usage.
    In any case, this implementation allows generators to call
    other generators and provides a nice stack unwinding, etc.

4.  I'd like to eventually look more at extrepum's work to 
    see if he had additional aspects that I've missed.  If 
    those aspects are generally useable, then I'll include 
    them.

Anyway, that's about it.  I'm now rebuilding some of my internal
code to use this module and ripping the old flow.py out.  This 
flow overall has been *greatly* simpler usage -- and I can't
thank extrepum enough for pointing me to his usage of generators
to create async flows within Twisted.

I'd also like to thank Itamar, Moshez, Radix, and Exarkun for
helping me along and pointing me in the right directions.

Best,

Clark

[1] twistedmatrix.com/pipermail/twisted-python/2003-February/002808.html

# Twisted, the Framework of Your Internet
# Copyright (C) 2003 Matthew W. Lefkowitz
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of version 2.1 of the GNU Lesser General
# Public License as published by the Free Software Foundation.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307
# USA
#
#
""" Flow -- async data flow

    This module provides a mechanism for using async data flows through
    the use of generators.  While this module does not use generators in
    its implementation, it isn't very useable without them.   A data flow
    is constructed with a top level generator, which can have three 
    types of yield statements:  flow.Cooperate, flow.Generator, or
    any other return value with exceptions wrapped using failure.Failure
    An example program...

    from __future__ import generators
    import flow
    def producer():
        lst = flow.Generator([1,2,3])
        nam = flow.Generator(['one','two','three'])
        while 1:
            yield lst; yield nam
            if lst.stop or nam.stop: 
                return
            yield (lst.result, nam.result)

    def consumer():
        title = flow.Generator(['Title'])
        yield title
        print title.getResult()
        lst = flow.Generator(producer())
        try:
            while 1:
                yield lst
                print lst.getResult()
        except flow.StopIteration: pass

    flow.Flow(consumer()).execute()

"""
from twisted.python import failure
from twisted.python.compat import StopIteration, iter

class FlowCommand: 
    """ Objects given special meaning when returned from yield """
    pass

class Cooperate(FlowCommand):
    """ Represents a request to delay and let other events process

        Objects of this type are returned within a flow when
        the flow would block, or needs to sleep.  This object
        is then used as a signal to the flow mechanism to pause
        and perhaps let other delayed operations to proceed.
    """
    def __init__(self, timeout = 0):
        self.timeout = timeout

class Generator(FlowCommand):
    """ Wraps a generator or other iterator for use in a flow 

        Creates a nested generation stage (a producer) which can provide
        zero or more values to the current stage (the consumer).  After 
        a yield of this object when control has returned to the caller,
        this object will have two attributes:

            stop    This is true if the underlying generator has not 
                    been started (a yield is needed) or if the underlying
                    generator has raised StopIteration

            result  This is the result of the generator if it is active, 
                    the result may be a fail.Failure object if an 
                    exception was thrown in the nested generator.
    """      
    def __init__(self, iterable):
        self._next  = iter(iterable).next
        self.result = None
        self.stop   = 1
    def isFailure(self):
        """ return a boolean value if the result is a Failure """ 
        if self.stop: raise StopIteration()
        return isinstance(self.result, failure.Failure)
    def getResult(self):
        """ return the result, or re-throw an exception on Failure """
        if self.isFailure():
            raise (self.result.value or self.result.type)
        return self.result
    def _generate(self):
        """ update the active and result member variables """ 
        try:
            self.result = self._next()
            self.stop = 0
        except StopIteration:
            self.stop = 1
            self.result = None
        except Cooperate, coop:
            self.stop = 0
            self.result = coop
        except failure.Failure, fail:
            self.stop = 1
            self.result = failure
        except:
            self.stop = 1
            self.result = failure.Failure()

class Flow:
    """ A flow contruct, created with a top-level generator/iterator

        The iterable provided to this flow is the top-level consumer
        object.  From within the consumer, multiple 'yield' calls can
        be made returning either Cooperate or Generate.  If a Generate
        object is returned, then it becomes the current context and
        the process is continued.  Communication from the producer 
        back to the consumer is done by yield of a non FlowItem
    """
    def __init__(self, iterable):
        self.results = []
        self._stack  = [Generator(iterable)]
    def _addResult(self, result):
        """ private called as top-level results are added"""
        self.results.append(result)
    def _execute(self):
        """ private execute, execute flow till a Cooperate is found """
        while self._stack:
            head = self._stack[-1]
            head._generate()
            if head.stop:
                self._stack.pop()
            else:
                result = head.result
                if isinstance(result, FlowCommand):
                    if isinstance(result, Cooperate):
                        return result.timeout
                    assert(isinstance(result, Generator))
                    self._stack.append(result)
                else:
                    if len(self._stack) > 1:
                        self._stack.pop()
                    else:
                        if self._addResult(result):
                            return
    def execute(self):
        """ continually execute, using sleep for Cooperate """
        from time import sleep
        while 1:
            timeout = self._execute()
            if timeout is None: break
            sleep(timeout)

from twisted.internet import defer
class DeferredFlow(Flow, defer.Deferred):
    """ a version of Flow using Twisted's reactor and Deferreds
 
        In this version, a call to execute isn't required.  Instead,
        the iterable is scheduled right away using the reactor.  And,
        the Cooperate is implemented through the reactor's callLater.
 
        Since more than one (possibly failing) result could be returned,
        this uses the same semantics as DeferredList
    """
    def __init__(self, iterable, delay = 0, 
                 fireOnOneCallback=0, fireOnOneErrback=0):
        """initialize a DeferredFlow
        @param iterable:          top level iterator / generator
        @param delay:             delay when scheduling reactor.callLater
        @param fireOnOneCallback: a flag indicating that the first good 
                                  yielded result should be sent via Callback
        @param fireOnOneErrback:  a flag indicating that the first failing
                                  yield result should be sent via Errback
        """
        from twisted.internet import reactor
        defer.Deferred.__init__(self)
        Flow.__init__(self,iterable)
        self.fireOnOneCallback = fireOnOneCallback
        self.fireOnOneErrback  = fireOnOneErrback
        reactor.callLater(delay, self._execute)
    def execute(self): 
        raise TypeError("Deferred Flow is auto-executing") 
    def _addResult(self, result):
        """ emulate DeferredList behavior, short circut if event is fired """
        if not self.called:
            if self.fireOnOneCallback:
                if not isinstance(result, failure.Failure):
                    self.callback((result,len(self.results)))
                    return 1
            if self.fireOnOneErrback:
                if isinstance(result, failure.Failure):
                    self.errback(fail.Failure((result,len(self.results))))
                    return 1
            self.results.append(result)
    def _execute(self):
        timeout = Flow._execute(self)
        if timeout is None:
            if not self.called:
                self.callback(self.results)
        else:
            from twisted.internet import reactor
            reactor.callLater(timeout, self._execute)
 
#
# The following is a thread package which really is othogonal to
# Flow.  Flow does not depend on it, and it does not depend on Flow.
# Although, if you are trying to bring the output of a thread into
# a Flow, it is exactly what you want.   The QueryIterator is 
# just an obvious application of the ThreadedIterator.
#

class ThreadedIterator:
    """
       This is an iterator base class which can be used to build
       iterators which are constructed and run within a Flow
    """
     
    def __init__(self):
        tunnel = _TunnelIterator(self)
        self._tunnel = tunnel
     
    def __iter__(self): 
        from twisted.internet.reactor import callInThread
        callInThread(self._tunnel.process)
        return self._tunnel
     
    def next(self):
        """ 
            The method used to fetch the next value, make sure
            to return a list of rows, not just a row
        """
        raise StopIteration

class _TunnelIterator:
    """
       This is an iterator which tunnels output from an iterator
       executed in a thread to the main thread.   Note, unlike
       regular iterators, this one throws a PauseFlow exception
       which must be handled by calling reactor.callLater so that
       the producer threads can have a chance to send events to 
       the main thread.
    """
    def __init__(self, source):
        """
            This is the setup, the source argument is the iterator
            being wrapped, which exists in another thread.
        """
        self.source     = source
        self.isFinished = 0
        self.failure    = None
        self.buff       = []
     
    def process(self):
        """
            This is called in the 'source' thread, and 
            just basically sucks the iterator, appending
            items back to the main thread.
        """
        from twisted.internet.reactor import callFromThread
        try:
            while 1:
                val = self.source.next()
                self.buff.extend(val)    # lists are thread safe
        except StopIteration:
            callFromThread(self.stop)
        self.source = None
     
    def setFailure(self, failure):
        self.failure = failure
     
    def stop(self):
        self.isFinished = 1
     
    def next(self):
        if self.buff:
           return self.buff.pop(0)
        if self.isFinished:  
            raise StopIteration
        if self.failure:
            raise self.failure
        raise Cooperate()

class QueryIterator(ThreadedIterator):
    def __init__(self, pool, sql, fetchall=0):
        ThreadedIterator.__init__(self)
        self.curs = None
        self.sql  = sql
        self.pool = pool
        self.data = None
        self.fetchall = fetchall
     
    def __call__(self,data):
        self.data = data
        return self
     
    def next(self):
        if not self.curs:
            conn = self.pool.connect()
            self.curs = conn.cursor()
            if self.data: self.curs.execute(self.sql % self.data) 
            else: self.curs.execute(self.sql)
        if self.fetchall:
            res = self.curs.fetchall()
        else:
            res = self.curs.fetchmany()
        if not(res): 
            raise StopIteration
        return res

----- End forwarded message -----




More information about the Twisted-Python mailing list