[Twisted-Python] Should I use asynchronous programming in my own modules?
Jürgen Strass
jrg718 at gmx.net
Fri Oct 19 04:32:59 EDT 2007
Johann Borck wrote:
> [...]
> I think the main misunderstanding is "[..]to use the reactor's
> scheduling mechanism instead of running it in a thread." Twisted's
> reactor is not a superior multi-purpose scheduler (as JP mentioned),
> but a domain-specific event handler for networking. While your
> use-case might (that's my guess) profit from choosing 'chunking' over
> Python's threading, it still wouldn't from choosing it over the
> scheduling of your OS.
>
> hm, did I get you right there?
>
>
Oh yes, you're right. The whole time I thought of the reactor as a
multi-purpose scheduler and consequently misunderstood JP's answer. The
misunderstanding stems in part from the core documentation. Chapter 1.3
says: "This document will give you a high level overview of concurrent
programming (interleaving several tasks) and of Twisted's concurrency
model: non-blocking code or asynchronous code." Two examples then
follow: (1) CPU-bound tasks and (2) tasks that wait for data. Because I
thought the introduction applied to both types of examples, I also
assumed that Twisted's concurrency model would apply to both types of
tasks. Of course, I did already wonder why I couldn't find any examples
for (1), while the whole rest of the docs deals with (2). ;-)
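For what it's worth, here is how I now understand the "chunking" idea for
case (1): a CPU-bound task is split into small slices, and each slice
reschedules the next one on the event loop, so I/O callbacks can run in
between. This is only a toy sketch with a hand-rolled callback queue, not
Twisted itself; the rescheduling step is meant to be loosely analogous to
`reactor.callLater(0, ...)`:

```python
from collections import deque

class ToyLoop:
    """Toy single-threaded callback loop, loosely analogous to a reactor."""
    def __init__(self):
        self._calls = deque()

    def call_soon(self, func, *args):
        # roughly the role reactor.callLater(0, func, *args) plays in Twisted
        self._calls.append((func, args))

    def run(self):
        while self._calls:
            func, args = self._calls.popleft()
            func(*args)

log = []

def count_chunk(loop, n, total):
    """CPU-bound work split into chunks; reschedules itself instead of looping."""
    total += n
    log.append("chunk %d" % n)
    if n < 3:
        loop.call_soon(count_chunk, loop, n + 1, total)
    else:
        log.append("sum=%d" % total)

def io_event(name):
    log.append("io %s" % name)

loop = ToyLoop()
loop.call_soon(count_chunk, loop, 1, 0)
loop.call_soon(io_event, "a")   # pretend an I/O callback arrived meanwhile
loop.call_soon(io_event, "b")
loop.run()
print(log)
# → ['chunk 1', 'io a', 'io b', 'chunk 2', 'chunk 3', 'sum=6']
```

The point being: the I/O callbacks run between the chunks, which a plain
`for` loop in a single callback would never allow.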
Moreover, I once read Douglas Schmidt's book "Pattern-Oriented Software
Architecture, Volume 2", which describes several patterns for
middleware-oriented applications, including the reactor pattern. The
book left me somewhat confused about when to use which pattern and which
concurrency mechanism is best for a particular situation.
The (possibly wrong) conclusion I've drawn from that book is that
context-switching overhead (be it for threads or processes) isn't only
bad for I/O-bound tasks, but also for most other concurrent tasks. As
you said, function calls in Python are expensive; nevertheless, I
thought they were still cheaper than the overhead caused by context
switching between threads and processes, at least on a single-processor
system. Or have I made a mistake here? Moreover, couldn't the creation
of whole new processes be even more expensive? By "long"-running
algorithms I really meant tasks that could take some minutes; processes
would be fine there. But what about, for example, CPU-bound tasks that
only take a few hundred milliseconds, yet would still block the
reactor? Would you use processes in this case, too? Maybe prespawned
processes? Or should I rather use threads in such a case?
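To make my question concrete: the "prespawned" idea, as I picture it, is
a pool of workers created up front, so that handing off a
few-hundred-millisecond task costs no per-task spawn overhead and the
reactor thread never runs the task itself. A stdlib sketch with a thread
pool (Twisted's own tool here would be
`twisted.internet.threads.deferToThread`; and for pure-Python CPU work
the GIL means a thread only keeps the reactor responsive rather than
adding parallelism, which is presumably why processes come up at all):

```python
from concurrent.futures import ThreadPoolExecutor

def fib(n):
    """Stand-in for a CPU-bound task of a few hundred milliseconds."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# A prespawned pool: workers exist before any task arrives, so there is
# no per-task thread (or process) creation cost.
pool = ThreadPoolExecutor(max_workers=2)

# The event-loop thread only submits the job and later collects the
# result; it never runs fib() itself, so it stays free for I/O callbacks.
future = pool.submit(fib, 20)
result = future.result()   # a real reactor would attach a callback instead
print(result)
# → 6765
pool.shutdown()
```

Swapping in `ProcessPoolExecutor` would give the process-based variant of
the same pattern, at the cost of pickling arguments and results.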
Many thanks for your enlightening reply,
Jürgen