[Twisted-Python] Two main loops

Nitro nitro at dr-code.org
Mon Nov 12 15:35:05 EST 2007


On 12.11.2007 at 21:15, Jean-Paul Calderone <exarkun at divmod.com> wrote:

>> The OS is responsible for breaking everything down into individual TCP
>> packets and puzzling them together at the other end. The whole packet
>> thing is not visible to pb.
>
> True, but the actual performance bottleneck here is in PB's conversion of
> Python objects to strings, which is done by code entirely within Twisted.

Yes.

>> So if you want pb to split your data, then do split callRemotes.
>> It's not desirable because you don't want your program to be stalled at
>> uncontrollable times when the OS decides to send 20 TCP packets at once.
>
> Your program won't ever stall because of this.

If you do it as Jesper suggested and serialize when packets are about to
be sent, then it will stall when you try to send (= serialize) lots of
packets at once.
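
A minimal sketch of the manual splitting, under some assumptions:
`call_remote` is only a stand-in for a PB `callRemote`, and the method
name "addItems" and the chunk size are made up for the example. Each
step of the generator issues exactly one call, so only one chunk gets
serialized at a time.

```python
def chunked(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_in_chunks(call_remote, method, items, size):
    """Issue one remote call per chunk, one call per iteration step.

    `call_remote` is a hypothetical stand-in for callRemote; any
    callable taking (method, chunk) works. Nothing is sent until the
    caller advances the generator, so the caller decides when each
    chunk is serialized.
    """
    for chunk in chunked(items, size):
        yield call_remote(method, chunk)


if __name__ == "__main__":
    def fake_call_remote(method, chunk):
        # Pretend to send; a real callRemote would return a Deferred.
        print(method, chunk)
        return len(chunk)

    for sent in send_in_chunks(fake_call_remote, "addItems", list(range(5)), 2):
        pass  # other work could run here between chunks
```

In Twisted, a generator like this could probably be driven by something
like `twisted.internet.task.cooperate`, so that the reactor gets control
back between chunks; I have not tested that combination here.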

>> You want to spread the load as evenly as possible and you'll have to do  
>> this manually.
>
> PB's serializer could try very hard to avoid running for a long period of
> time without giving control back to the reactor.  Of course, someone would
> have to implement this.  Whether or not that is worthwhile to implement,
> instead of doing what Glyph suggested (manually breaking up the work into
> smaller pieces) is a separate question.

Of course you can do it. But does it make sense? If you ask pb to do lots
of work, you can't be surprised that it takes long. If I ask pb to
serialize things, I want it to do so *now*. If pb starts distributing the
work over time by itself, I will end up with lags, which are bad in my
situation.
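
To make the trade-off concrete, here is a rough sketch of what a
serializer that gives up control could look like. This is not how PB
works: `pickle` is only a stand-in for jelly/banana, and the function
name, the `budget` value and the `clock` parameter are assumptions made
for the example.

```python
import pickle
import time


def serialize_in_slices(objects, budget=0.005, clock=time.monotonic):
    """Serialize objects one by one, yielding once a time slice is used up.

    Each yielded value is the list of byte strings produced during one
    slice. `budget` is the slice length in the units of `clock`
    (seconds for time.monotonic).
    """
    out = []
    deadline = clock() + budget
    for obj in objects:
        out.append(pickle.dumps(obj))  # stand-in for the real serializer
        if clock() >= deadline:
            yield out  # hand control back to whoever drives the loop
            out = []
            deadline = clock() + budget
    if out:
        yield out  # whatever is left from the last, partial slice


if __name__ == "__main__":
    data = [{"id": i, "payload": "x" * 100} for i in range(10000)]
    for number, piece in enumerate(serialize_in_slices(data)):
        print("slice", number, "holds", len(piece), "objects")
```

The serialized data only becomes available slice by slice, and whatever
runs between two slices delays the rest. That delay is the lag I mean.
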

I'd like to hear about an actual situation where Twisted stalls because
of serialization. As I said before, I am using pb and a pb-like thing
which also uses banana for serialization (= even more overhead to
generate messages and spread them over multiple UDP packets). I am
serializing lots of data and it works like a charm. After all, if you
have that much data to serialize, you are very likely to do something
with it on the sending and receiving ends, and that processing is much
more likely to be a performance bottleneck than Twisted's serialization.

I guess we could discuss at length how pb *could* work. We should keep
real-world situations in mind, though, since they tend to shift
priorities a lot.

-Matthias



