[Twisted-Python] Really Basic clarification on defers

Kevin Horn kevin.horn at gmail.com
Tue Aug 4 14:35:10 EDT 2009


On Tue, Aug 4, 2009 at 10:08 AM, John Aherne <johnaherne at rocs.co.uk> wrote:

> This is a really basic problem we are trying to decide about.
>
> We have a program that runs quite happily so far. Its main task is to
> receive data from port A and send it out via port B, then receive data via
> port B and send it out via port A. It's pretty much like a chat setup: you
> just build up a list of connected clients and send data to them as required.
>
> One side, A, receives some input from a TCP port, about 100-200 characters,
> and forwards it to the other port, B. We do not need to wait for any response.
> If we get a response we pick that up through LineReceiver. We also run a
> callLater to check whether we got a response on LineReceiver within the
> timeframe specified. If not, we drop the connection.
>
> Traffic coming in from port B is analysed and some subset is sent back to
> port A.
>
> Ignoring port A for the moment, just concentrating on port B, we have tried
> three options:
>
> 1. We set up a Deferred to handle the sendLine to port B so that the reactor
> would schedule it in its own good time. No threads are involved, just the
> standard Twisted setup. When we get a response through lineReceived we fire
> the Deferred's callback. If we time out via callLater we fire the errback to
> clear the Deferred. In this case the Deferred does not seem to be doing very
> much.
>
> 2. Now a fresh pair of eyes is looking at the code and asking why we are
> using a Deferred for sending data to port B. We could just issue a straight
> sendLine as part of the main code and carry on. If we get a response via
> LineReceiver, we process it normally; otherwise we set our callLater running,
> time out and drop the connection. So no Deferreds are required at all. It
> does seem to work. What we are not sure about is what penalty is incurred in
> terms of reliability or throughput by using sendLine without a Deferred. We
> are not too sure what the holdup will be and whether it could end up halting
> the show. Is it better to schedule these messages via Deferreds, or am I
> missing something obvious?
>
> 3. So we then did an experiment and used deferToThread to run the sendLine
> in a separate thread with its own Deferred to maximise the asynchronous
> running of the code. So now we are running threads, when one of the reasons
> for looking at Twisted was that we could avoid threads as much as possible.
>
> The conundrum we are trying to resolve now is which option we should use.
> Do any of the options have a built-in problem awaiting the unwary? In theory
> all 3 options work. But if No 1 works well enough for our volume of traffic,
> should we adopt that one? Or is it better to start using the deferToThread
> option? Is there a simple answer?
>
> The traffic is not large: up to 100-200 remote devices on port B. They
> will send GPS data every 20 secs, and about 500 messages of about 200 bytes
> average throughout the day. The remote devices will respond in an irregular
> manner without dropping the connection, so we force a disconnect if
> important messages are not getting through. They are then forced to
> reconnect.
>
> We have looked through the code searching for enlightenment and it does seem
> to be well documented, but the information we are looking for comes well
> before the doc strings.
>
> Hopefully, someone can give us some pointers in the right direction.
>
> Thanks for any help.
>
> John Aherne
>
>
>
It seems to me that the volume of traffic you are dealing with isn't so high
that you need to worry too much about a direct sendLine causing problems.  If
I were writing this from scratch based on my understanding of what you've
written above, I would probably go with option 2. (Keep in mind, my
understanding may be flawed...so...)  However, if you've already got things
working with option 1, and the added complexity isn't causing you any
trouble, I don't see any real reason to change it.  Others may disagree...

Option 3 seems totally unnecessary to me.  I typically stay away from
threads in Twisted unless I have a long-running non-network process to deal
with (disk access, db access, heavy math processing, etc.).  Especially
because of the relative "heaviness" of threads when using Python (due to
complex interactions with the GIL), I would avoid this method...it will
probably hurt performance more than option 1 (though still probably not
enough to matter).

Others feel free to slap me if I'm giving bad advice :)

Kevin Horn