[Twisted-Python] Really Basic clarification on defers

Johann Borck johann.borck at densedata.com
Tue Aug 4 17:14:03 MDT 2009


On Tue, Aug 4, 2009 at 10:08 AM, John Aherne <johnaherne at rocs.co.uk> wrote:
> This is a really basic problem we are trying to decide about,
>
> We have programs that run quite happily, so far. Its main task is to 
> receive data from port A and send it out via port B. Then receive data 
> via port B and send it out via port A. It's pretty much like a chat 
> setup. You just build up a list of connected clients and send data to 
> them as required
>
> One side A receives some input from a tcp port - about 100-200 
> characters, and forwards it to another port B. We do not need to wait 
> for any response. If we get a response we pick that up through 
> LineReceiver. We also run a callLater to check if we got a response on 
> LineReceiver within the timeframe specified. If not, we drop the 
> connection.
>
> Traffic coming in from port B is analysed and some subset is sent back 
> to port A.
>
> Ignoring port A for the moment, just concentrating on port B, we have 
> tried three options:--
>
> 1. We set up a defer to handle the sendline to port B so that the 
> reactor would schedule it in its own good time.

The reactor always schedules reads and writes "in its own good time", 
which means it writes whenever there's data to write and the socket is 
ready for writing. If you have more data than the socket can accept at 
once without blocking, the transport buffers it and the reactor writes 
the rest as the socket becomes writable again. Delivery is deferred for 
you; there is no need for any Deferreds of your own here.

Correct me if I'm wrong, but as I understand your description, options 1 
and 2 do not behave identically. This is how I interpret them:
option 1:

A sends msg1 to [svc]  : wrap msg1 in deferred1
[ - time - ]
B sends data? to [svc] : 1. callback deferred1: [svc] sends msg1 to B
                         2. handle data?
B sends rsp1 to [svc]  : [svc] sends rsp1 to A

option 2:

A sends msg1 to [svc] : [svc] sends msg1 to B
B sends rsp1 to [svc] :  [svc] sends rsp1 to A

If this is the case, you rely on some data? being sent to [svc] before 
msg1 can be forwarded to B. That means you keep msg1 in memory until 
you receive data? from B. This doesn't cause problems in your case, 
since you handle small messages at long intervals. But if you increased 
the load significantly, you'd also need significantly more RAM for no 
good reason. Option 1 might make sense if the decision whether or how to 
continue processing msg1 depended on data? provided by B. Then you would 
have a valid use case for Deferreds. Since there are no such 
requirements, option 2 is definitely the right choice.

> No threads involved using the standard twisted setup. When we get a 
> response through receiveline we fire the callback defer. If we timeout 
> via callLater we fire the errback to clear the defer. In this case the 
> defer does not seem to be doing very much
>
> 2. Now a fresh pair of eyes is looking at the code and saying why are 
> we using a deferred for sending data to port B. We could just issue a 
> straight sendline as part of the main code and carry on. If we get a 
> response via linereceiver, we process it normally, otherwise we set our 
> callLater running and timeout and lose the connection. So no deferreds 
> required at all. It does seem to work. What we are not sure about is 
> what penalty is incurred in terms of reliability or throughput by 
> using sendline without a deferred.

There's absolutely no penalty (unless you allow the notion of negative 
penalties). Using sendLine directly is faster than putting a Deferred in 
between, even if you don't count the memory overhead. I think there's a 
bit of confusion about the role of Deferreds in Twisted here. Deferreds 
don't help you (or the reactor) with scheduling; they only provide you 
with a means to continue some processing after a certain event has 
occurred.

> We are not too sure what the holdup will be and whether it could end 
> up halting the show. Is it better to schedule these messages via 
> deferreds or am I missing something obvious
>
> 3. So we then did an experiment and used deferToThread to run the 
> sendline in a separate thread with its own defer to maximise the 
> asynchronous running of the code. So now we are running threads when 
> one of the reasons for looking at twisted was that we could avoid 
> threads as much as possible.

Do you call sendLine (the Twisted API) from within the thread? If yes 
and it works, it works by accident, probably thanks to the very small 
load, and it is definitely wrong (as well as unnecessary). Twisted is 
not thread-safe, with the exception of a few methods/functions like 
callInThread/callFromThread/deferToThread etc. From inside a worker 
thread, any call into Twisted should go through reactor.callFromThread.
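
If code running in a thread ever does need to send a line, a sketch of 
the safe pattern looks like this. The helper name send_line_from_thread 
is made up; reactor.callFromThread is the Twisted call that hands the 
work back to the reactor thread.

```python
def send_line_from_thread(reactor, protocol, line):
    """Ask the reactor thread to run protocol.sendLine(line).

    Meant to be called from a worker thread. It does not touch the
    transport itself; it only queues the call for the reactor thread.
    """
    reactor.callFromThread(protocol.sendLine, line)


# Hypothetical usage inside a function run via deferToThread:
#   from twisted.internet import reactor
#   send_line_from_thread(reactor, proto, b"msg1")
```

For the service described here none of this is needed, since sendLine 
can simply be called from the reactor thread.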


hope that helps,
Johann



