[Twisted-Python] Thoughts on Producer / Consumer

Clark C. Evans cce at clarkevans.com
Thu Mar 6 00:11:04 MST 2003


While I'm musing... I have a few thoughts on Producer/Consumer as well,
but I don't have any experience with the Twisted constructs yet, so I
could be speaking incorrectly.

Before I get going, I wanted to point out that there is some overlap
between this producer/consumer mechanism and the Deferred mechanism.  In
particular, the callback chain processing in _runCallbacks seems very
much like a sequence of steps in an operation.  And, the Deferred
package also has a mechanism to merge two or more deferreds into a
single deferred, etc.  These things are very much in the spirit of
event 'reactor' processing.  This observation is especially true when
you allow the Deferred operation to 'return multiple times', aka one
callback per row.

That said, I expected a Producer/Consumer chain to have something like
the following:
  
   class Producer:
      def cancel(self): pass           # permanently stop processing
      def pause(self): pass            # stop producing events
      def resume(self): pass           # resume after being paused
      
   class Consumer:
      def start(self): pass            # called before events start
      def handle(self,data): pass      # called for each event
      def finish(self): pass           # called after events are done

   class Processor(Producer,Consumer): pass

   Further, it could be possible that consumers and producers have
   a many-to-one relationship, in which case all of the Producer
   functions would take a consumer argument and all of the Consumer
   functions would take a producer argument.
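
To make the many-to-one variant concrete, here is a minimal sketch of what I
have in mind.  The attach() and emit() methods and the Recorder class are
names I made up for illustration; none of this is existing Twisted code.

```python
class Producer:
    """One producer feeding several consumers (hypothetical sketch)."""

    def __init__(self):
        self.consumers = []     # consumers still attached
        self.paused = set()     # consumers that asked us to pause

    def attach(self, consumer):             # hypothetical helper
        self.consumers.append(consumer)
        consumer.start(self)

    def cancel(self, consumer):
        # permanently stop sending events to this consumer
        self.consumers.remove(consumer)
        self.paused.discard(consumer)
        consumer.finish(self)

    def pause(self, consumer):
        self.paused.add(consumer)

    def resume(self, consumer):
        self.paused.discard(consumer)

    def emit(self, data):                   # hypothetical helper
        # deliver one event to every consumer that is not paused
        for consumer in list(self.consumers):
            if consumer not in self.paused:
                consumer.handle(self, data)


class Recorder:
    """Consumer that records every call it receives."""

    def __init__(self):
        self.events = []

    def start(self, producer):
        self.events.append(("start", None))

    def handle(self, producer, data):
        self.events.append(("handle", data))

    def finish(self, producer):
        self.events.append(("finish", None))
```

Each side gets the other party as an argument, so a consumer fed by several
producers can tell them apart, and a producer knows which consumer wants it
to pause.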

The IConsumer interface seems close; I guess start/handle/finish is the
same as registerProducer/write/unregisterProducer.  Besides not liking
the names, I'm not sure if it is semantically the same as what I was
thinking.  The registerProducer call seems to hint at a
many-producer-to-one-consumer call model, where a consumer can be the
target of many producers.  However, I don't see a corresponding producer
argument in 'write' or 'unregisterProducer'... so that can't be it.
Lastly, I think that 'write' is too specific: in a database case you
want to hand rows to the consumer, not strings (as implied by the
'write' name).
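
If the two really are equivalent, the mapping could be sketched as a thin
adapter like the one below.  This is only my guess at the correspondence; it
does not import Twisted, it merely reuses the three IConsumer method names,
and ConsumerAdapter and RowCollector are hypothetical.

```python
class ConsumerAdapter:
    """Expose a start/handle/finish consumer under IConsumer-style names."""

    def __init__(self, target):
        self.target = target
        self.producer = None

    def registerProducer(self, producer, streaming):
        # assumed equivalent of start(); the streaming flag is ignored here
        self.producer = producer
        self.target.start()

    def write(self, data):
        # assumed equivalent of handle(); data may be a row, not just a string
        self.target.handle(data)

    def unregisterProducer(self):
        # assumed equivalent of finish()
        self.producer = None
        self.target.finish()


class RowCollector:
    """Consumer that simply keeps the rows it is handed."""

    def __init__(self):
        self.rows = []
        self.state = "new"

    def start(self):
        self.state = "started"

    def handle(self, data):
        self.rows.append(data)

    def finish(self):
        self.state = "finished"
```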

I grok the IProducer interface much better, in fact it has almost the
same arguments that I expected.  As for the 'streaming' flag, if it is
set, it seems that the producer will generate one event and then
automatically pause.  So, I'd rename streaming to "autopause" or
"pauseonwrite" as this better reflects the semantics -- right now, after
just reading the code, I'm not sure if 'streaming' means that it pauses
automatically after each write or not.  Further, the choice of whether a
producer is initially started or initially stopped seems one that should
be made higher-up; tying it to 'streaming' or 'unstreaming' doesn't
really make sense.  Finally, the producer interface should probably have
two attributes, 'autopause' and 'isPaused' or something equivalent; the
registerProducer function doesn't need the streaming flag.
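
A minimal sketch of a producer carrying those two attributes might look like
this.  It reflects my reading of the semantics, not the current IProducer
behaviour; CountingProducer and its constructor arguments are hypothetical.

```python
class CountingProducer:
    """Producer with 'autopause' and 'isPaused' attributes (sketch)."""

    def __init__(self, handle, items, autopause=False):
        self.handle = handle        # callable invoked once per event
        self.items = iter(items)
        self.autopause = autopause
        self.isPaused = True        # whoever owns us decides when to resume()
        self.done = False

    def pause(self):
        self.isPaused = True

    def resume(self):
        self.isPaused = False
        while not self.isPaused and not self.done:
            try:
                item = next(self.items)
            except StopIteration:
                self.done = True
                break
            self.handle(item)
            if self.autopause:
                # one event per resume() when autopause is set
                self.isPaused = True
```

With autopause set, each resume() yields exactly one event; without it, the
producer keeps going until it runs dry or someone calls pause().  Whether it
starts out running is decided by the caller, not by the flag.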

Exception handling doesn't seem to be spelled out at all in this chain;
perhaps it need not be.  I don't know, but it seems that some sort of
error propagation would be very useful so that the offending 'initial'
data can be found when something down-stream goes berserk.  So, some way
to collect a stack trace up the producer/consumer chain would be quite
useful.  This could be done with one function on the Producer:
getErrorContext(), which returns a string.

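One way getErrorContext() could work, sketched under the assumption that each
stage remembers its last input and knows its upstream neighbour.  The Stage
class and its attributes are hypothetical.

```python
class Stage:
    """A link in the chain that can describe what it was last handed."""

    def __init__(self, name, func, upstream=None):
        self.name = name
        self.func = func
        self.upstream = upstream
        self.last_input = None

    def handle(self, data):
        self.last_input = data
        return self.func(data)

    def getErrorContext(self):
        # walk up the chain so the original data is reported first
        here = "%s: last input %r" % (self.name, self.last_input)
        if self.upstream is None:
            return here
        return self.upstream.getErrorContext() + "\n" + here


source = Stage("source", lambda row: row.split(","))
parse = Stage("parse", lambda fields: int(fields[1]), upstream=source)

try:
    parse.handle(source.handle("a,oops"))
except ValueError:
    # the downstream failure can now be traced back to the initial row
    context = parse.getErrorContext()
```
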
Another useful thing for this model to do is allow a 'tag' to be passed
as an argument to each of the start/handle/finish functions.  This,
together with allowing nested start/finish calls, would allow
hierarchical streams to be handled.  In this model, the 'start' call
could provide a subordinate consumer to handle events for a particular
child subtree.   This mechanism is very handy for common
content-producing patterns.
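
As a rough sketch of the tagged, nested start/finish idea: the consumer below
rebuilds a tree from a flat stream of events.  TreeBuilder is hypothetical,
and it uses a stack rather than subordinate consumers to keep things short.

```python
class TreeBuilder:
    """Consumer that rebuilds a nested structure from tagged events."""

    def __init__(self):
        self.root = {"tag": None, "items": []}
        self.stack = [self.root]

    def start(self, tag):
        # open a child subtree under the current node
        node = {"tag": tag, "items": []}
        self.stack[-1]["items"].append(node)
        self.stack.append(node)

    def handle(self, tag, data):
        self.stack[-1]["items"].append((tag, data))

    def finish(self, tag):
        if len(self.stack) == 1:
            raise ValueError("finish(%r) without a matching start" % (tag,))
        if self.stack[-1]["tag"] != tag:
            raise ValueError("finish(%r) does not match start(%r)"
                             % (tag, self.stack[-1]["tag"]))
        self.stack.pop()
```

A producer walking a result set would call start("table"), then start("row"),
handle("cell", value) and finish("row") for each row, and finally
finish("table").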

Lastly, the linkage back to the Deferred could be done by a
DeferredProcess, where each item added to a DeferredProcess is one of a
Producer (the first item in the stack), a Processor (middle items), and a
Consumer (final item).
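
The shape of such a stack could be sketched as follows.  This is a purely
synchronous stand-in with no Deferred involved: the producer is any iterable,
the middle items are plain callables, and add(), run() and ListConsumer are
names invented for the example.

```python
class DeferredProcess:
    """Stack of one producer, zero or more processors, one consumer."""

    def __init__(self):
        self.stages = []

    def add(self, stage):
        self.stages.append(stage)
        return self             # allow chained add() calls

    def run(self):
        if len(self.stages) < 2:
            raise ValueError("need at least a producer and a consumer")
        producer = self.stages[0]
        processors = self.stages[1:-1]
        consumer = self.stages[-1]
        consumer.start()
        for item in producer:
            for process in processors:
                item = process(item)
            consumer.handle(item)
        consumer.finish()


class ListConsumer:
    """Final item in the stack; logs what it receives."""

    def __init__(self):
        self.log = []

    def start(self):
        self.log.append("start")

    def handle(self, data):
        self.log.append(data)

    def finish(self):
        self.log.append("finish")
```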

Clark



