[Twisted-Python] Should I use asynchronous programming in my own modules?

Thu Oct 18 09:09:39 EDT 2007

On Thu, 2007-10-18 at 14:41 +0200, Jürgen Strass wrote:

> I'm rather new to twisted and asynchronous programming in general. 
> Overall, I think I've understood the asynchronous programming model and 
> its implications quite well. Nevertheless, there are some remaining 
> questions.
> 
> To give some example, I'd like to develop my own simplified document 
> format in XML and a corresponding parser. The output of the parser (a 
> specialized document object model) will be traversed and translated into 
> HTML afterwards. This module could be useful outside any twisted 
> application, of course. Instead of generating HTML one could develop a 
> generator that produces LaTeX, for example. But it could also be used to 
> render HTML pages in a twisted web application. The question is this: 
> since parsing and generating large documents could block the reactor in 
> a twisted app, should I use any of twisted's asynchronous programming 
> features in this module (for better integration with twisted) or should 
> I rather develop it in a traditional way and run it in a thread?

What you call "traditional" is really a pull parser. Parsing APIs come
in two styles: pull (blocking) or push (asynchronous). Well-designed
parsers are always push, because a push parser can be trivially wrapped
in a blocking pull interface, but not the other way around. Some
examples of push/async parsers: Twisted's Protocol class, or the SAX API.

The key difference: a pull parser *reads* data from some data source
through a blocking API. A push parser has the data *pushed* to it by
its caller, whenever the caller happens to have some.
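
To illustrate the push style with the SAX API mentioned above, here is a
rough sketch using the incremental feed()/close() interface of the parser
that Python's standard xml.sax module returns. The TitleCollector handler
and the <doc>/<title> element names are made up for the example; they are
not part of your document format.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collects the text of every <title> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None  # list of text pieces while inside <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._buffer = None


handler = TitleCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# The caller decides when data arrives; chunk boundaries are arbitrary
# and may fall in the middle of a tag.
for chunk in ["<doc><ti", "tle>Hel", "lo</title><title>World</ti", "tle></doc>"]:
    parser.feed(chunk)
parser.close()
```

Note that the parser never reads from anything itself. Whoever owns the
data source calls feed() as data shows up, which is what makes it usable
from an event loop.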

So: write a push parser. You won't need to use any Twisted facilities.
To make things a bit clearer, here's how you wrap a push parser in a
blocking, pull-style function:

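def parse(f):
    # MyParser is a placeholder for your own push parser class.
    parser = MyParser()
    # The blocking read happens here, in the wrapper, not in the parser.
    for line in f:
        parser.push(line)
    return parser.result()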
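
For a version you can actually run, here is a minimal sketch of the same
idea. The MyParser below is purely hypothetical: it collects "key: value"
lines into a dict, standing in for whatever your real parser builds. Only
the push()/result() shape matters.

```python
import io


class MyParser:
    """Hypothetical push parser: collects 'key: value' lines into a dict."""

    def __init__(self):
        self._fields = {}

    def push(self, line):
        # Called by whoever owns the data source; never reads or blocks.
        line = line.strip()
        if not line:
            return
        key, _, value = line.partition(":")
        self._fields[key.strip()] = value.strip()

    def result(self):
        return dict(self._fields)


def parse(f):
    """Blocking, pull-style wrapper around the push parser."""
    parser = MyParser()
    for line in f:
        parser.push(line)
    return parser.result()


# Any iterable of lines works: an open file, a StringIO, a list.
document = io.StringIO("title: Push parsers\nauthor: someone\n")
parsed = parse(document)
```

The same MyParser can be driven from a file, a thread, or an event loop
without changes, since the blocking part lives only in parse().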

In Twisted, a push parser will often get data pushed to it from
Protocol.dataReceived.
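
Something along these lines, as a sketch only. LineCollector and
ParsingProtocol are names invented for the example, and the try/except
around the import is there just so the snippet can be tried without
Twisted installed; in a real application you would import Protocol
directly and let the reactor call these methods.

```python
try:
    from twisted.internet.protocol import Protocol
except ImportError:
    # Fallback so the sketch can be exercised without Twisted installed.
    Protocol = object


class LineCollector:
    """Hypothetical push parser: splits incoming bytes into lines."""

    def __init__(self):
        self._pending = b""
        self._lines = []

    def push(self, data):
        # Chunks can end mid-line, so keep the unfinished tail around.
        self._pending += data
        *complete, self._pending = self._pending.split(b"\n")
        self._lines.extend(complete)

    def result(self):
        return list(self._lines)


class ParsingProtocol(Protocol):
    def connectionMade(self):
        self.parser = LineCollector()

    def dataReceived(self, data):
        # Twisted calls this with whatever bytes arrived; just push them on.
        self.parser.push(data)

    def connectionLost(self, reason=None):
        self.result = self.parser.result()


# Simulate what the transport would do, without starting a reactor.
proto = ParsingProtocol()
proto.connectionMade()
proto.dataReceived(b"ab\nc")
proto.dataReceived(b"d\n")
proto.connectionLost()
```

The parser class itself has no Twisted dependency; only the thin protocol
wrapper does.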