[Twisted-Python] Performance issue in reactor.callLater
behnel_ml at gkec.informatik.tu-darmstadt.de
Thu Sep 9 03:21:35 EDT 2004
James Y Knight schrieb:
> However, I'm unconvinced about the reasoning in:
>> Ok, I thought about it. And I believe the best way to do this is to
>> write a specialized heap version. I copied the heapq implementation
>> and adapted it so that it keeps track of the position in the heap for
>> each DelayedCall. That makes it possible to remove a call without
>> linear overhead and in log(N) time.
> To me, it seems like the right thing is what I think you had before:
> 1) on cancel, mark it canceled
That works. However, in cases with many pending delayed calls leaving the cancelled ones on the heap may unnecessarily increase the heap size quite a bit, so I figured it might be better to keep the size low. But I admit that this is an undermotivated decision.
I personally had the case of an ACK tracker where my first approach was to start a timeout for each ACK and then cancel it. But when I saw the desastrous performance of the original Twisted implementation I quickly wrote my own queue for it. Anyway, in such a case you'd have a lot of cancelled delayed calls on the heap and they'd unnecessarily increase the time for new insertions (which is done, as I said, as late as possible, but before the run-loop).
I played with the idea of giving cancel() a new boolean parameter "quick_cancel" that would skip the removal step if set to True. All you'd need to (re-)add in that case would be the check in the runUntilCurrent loop if the current call was cancelled. That may be a good compromise.
> 2) on delay, mark it delayed
That doesn't work.
> 3) when about to run a DelayedCall, if it is cancelled, ignore it. If it
> is delayed, reinsert it into the heap in the proper place.
> Why did you change to the current method from that?
Delaying a call breaks the heap invariant and therefore may hinder new insertions. There is a simple work around for delay: Store the delay time somewhere in the object and set it when we find the call as first item. /Then/ move it to the right position. This lessens the number of reinsertions if the call is delayed more often. Could be worth it.
However, this doesn't work for reset. Reset accepts an arbitrary time value that may be lower than the original. A work around here could be to correct the heap position in either direction when the time value changes, but that is about as much work as removing and reinserting it.
I chose the current solution as I figured it would be a good trade-off between code complexity, special cases and performance. The quick_cancel thing is easy to implement (4 lines), though.
Could I have other opinions on this?
More information about the Twisted-Python