#5075 defect closed fixed (fixed)
IntNStringReceiver copies too much data in dataReceived, causing AMP to be slow
Reported by: | Glyph | Owned by: | Jean-Paul Calderone |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | core | Keywords: | optimization |
Cc: | Corbin Simpson | Branch: | branches/faster-amp-5075 (branch-diff, diff-cov, branch-cov, buildbot) |
Author: | glyph, zseil, exarkun |
Description
If a large packet containing many strings arrives at an IntNStringReceiver, the receiver copies the entire remaining buffer by re-slicing it once per string. Profiling indicates this is a bottleneck in AMP.
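To illustrate the problem, here is a minimal sketch of the pattern being described (illustrative pseudocode, not the actual twisted.protocols.basic implementation; the attribute names are assumptions):

```python
import struct

def naive_dataReceived(self, data):
    # Sketch only: each extracted string re-slices self.recvd, copying the
    # entire remaining buffer, so a packet with many strings costs O(n^2).
    self.recvd = self.recvd + data
    while len(self.recvd) >= self.prefixLength:
        length, = struct.unpack(self.structFormat,
                                self.recvd[:self.prefixLength])
        if len(self.recvd) < self.prefixLength + length:
            break
        packet = self.recvd[self.prefixLength:self.prefixLength + length]
        # This slice copies everything that has not been parsed yet:
        self.recvd = self.recvd[self.prefixLength + length:]
        self.stringReceived(packet)
```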
Attachments (1)
Change History (34)
comment:1 Changed 11 years ago by
Author: | → glyph |
---|---|
Branch: | → branches/faster-amp-5075 |
comment:2 follow-up: 3 Changed 11 years ago by
Keywords: | optimization review added |
---|---|
Owner: | Glyph deleted |
Build results, when they're ready.
Dear Reviewer - there are no unit tests in this branch, and I know that. This is because the code included in this branch is supposed to be completely functionally equivalent to the code it is changing, but faster. If you spot any functional changes that imply missing coverage and should involve adding new tests, I'll happily do that.
We do have a system for empirically verifying that code is faster but it's not quite as well integrated as our unit tests, which make sure the code is correct. In particular, speed.twistedmatrix.com does not have any way to run the benchmarks on a branch. If you look at http://speed.twistedmatrix.com/about/ you will notice a box that links to our benchmark suite. ("+junk" is just Launchpad's way of saying "critical production code", so you can ignore that part of the URL if it's confusing.)
I validated the code in this branch by running amp.py from the benchmark suite on both trunk and the branch. I got the following results:

the branch: 2655.0 amp/sec (13275 amp in 5 seconds)
trunk: 2337.0 amp/sec (11685 amp in 5 seconds)
That's a 13.6% speedup, but to account for variance in hardware, etc., I went with a more conservative 12% in the NEWS file.
Thanks for reviewing in this brave new world of optimization!
comment:3 follow-up: 4 Changed 11 years ago by
Keywords: | review removed |
---|---|
Owner: | set to Glyph |
Hello,
Replying to glyph:
Build results, when they're ready.
Dear Reviewer - there are no unit tests in this branch, and I know that. This is because the code included in this branch is supposed to be completely functionally equivalent to the code it is changing, but faster. If you spot any functional changes that imply missing coverage and should involve adding new tests, I'll happily do that.
Thanks for trying to speed this up!
The buildbot errors seem to show that delayed self.recvd assignment breaks some protocols that switch to a different protocol in stringReceived (namely AMP). Maybe you could work around that by making alldata and offt instance variables (private and with better names) and converting recvd into a property that slices the received data only when needed?
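A rough sketch of what that property-based compatibility shim could look like (the attribute names _unprocessed and _offset are placeholders, not necessarily what the branch uses):

```python
class _Sketch(object):
    """Illustrative only; not the branch's actual implementation."""
    _unprocessed = b""   # raw, unparsed network data
    _offset = 0          # how far the parser has advanced

    @property
    def recvd(self):
        # Materialize the sliced buffer only when legacy code (such as
        # AMP's protocol switching) actually reads the attribute.
        return self._unprocessed[self._offset:]

    @recvd.setter
    def recvd(self, value):
        # Assigning to recvd replaces the buffer and resets parse state.
        self._unprocessed = value
        self._offset = 0
```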
What was the effect of importing functions from struct directly? Did you try using struct.Struct and its unpack_from method instead? Now that Twisted is dropping Python 2.4 support, you might gain some speed with that.
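For reference, a small sketch of the variants being compared (the "!H" 16-bit prefix format is just an example):

```python
import struct

FMT = "!H"                      # example: 16-bit network-order length prefix
COMPILED = struct.Struct(FMT)   # precompiled; skips re-parsing the format

data = b"\x00\x05hello"

length, = struct.unpack(FMT, data[:2])    # module-level function
length, = COMPILED.unpack(data[:2])       # precompiled Struct
length, = COMPILED.unpack_from(data, 0)   # avoids creating the data[:2] slice
```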
In sendString, did you try using transport.writeSequence or two transport.write calls instead of string concatenation? It would be interesting to know which is faster.
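A sketch of the three sendString strategies under discussion (the prefix format is again illustrative):

```python
import struct

def sendString_concat(transport, string):
    # One write, at the cost of building a concatenated copy first.
    transport.write(struct.pack("!H", len(string)) + string)

def sendString_two_writes(transport, string):
    # No concatenation copy, but two calls into the transport.
    transport.write(struct.pack("!H", len(string)))
    transport.write(string)

def sendString_writeSequence(transport, string):
    # Let the transport decide how to coalesce the pieces.
    transport.writeSequence([struct.pack("!H", len(string)), string])
```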
I think that some direct test for Int*StringReceiver protocol switching would still be beneficial; it is slightly inconvenient to rely on AMP's tests to exercise this functionality.
comment:4 Changed 11 years ago by
Replying to zseil:
Thanks for trying to speed this up!
Thanks for a great review - although the failing tests did make it easy to see what the response should be, thanks for going beyond that :). I hope to respond soon.
comment:5 Changed 11 years ago by
Status: | new → assigned |
---|
comment:6 Changed 11 years ago by
Keywords: | review added |
---|---|
Owner: | Glyph deleted |
Status: | assigned → new |
I fixed the remaining twisted/test/test_stateful.py failures. These were boring, only caused by test_stateful.py re-using some code from test_protocols.py in a way which was no longer valid with this branch's changes to the latter. I just refactored some mixins so that test_stateful.py is unaffected by the new tests for the recvd attribute.
I think glyph already took care of all the other review points.
comment:7 Changed 11 years ago by
Keywords: | review removed |
---|
Thanks for pushing this forward!
There are still a few problems with this branch:
1. The if 'recvd' in self.__dict__: block on line 736 is still problematic since it doesn't reset the alldata buffer which gets assigned to self._unprocessed at the end of the while loop.
2. The recvd buffer accessed from lengthLimitExceeded is incorrect if that event happens in the first while loop iteration.
3. The _workingBuffer attribute should be cleared at the end of the while loop, so that it doesn't keep the already processed data in memory.
4. The _workingBuffer doesn't need to be set in each while iteration; it does not change.
5. Line 729 is not needed; _unprocessed will be set at the end of the while loop.
Failing tests for 1. and 2. are in the attachment.
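To make those points concrete, here is a rough sketch of the overall shape the optimized loop is aiming for (names such as _unprocessed, alldata, and currentOffset mirror the discussion above, but the code is illustrative rather than the branch itself):

```python
import struct

def dataReceived_sketch(self, data):
    # Concatenate once, then walk the buffer with an integer offset
    # instead of re-slicing it for every string.
    alldata = self._unprocessed + data
    currentOffset = 0
    prefixLength = self.prefixLength
    fmt = self.structFormat

    while len(alldata) >= currentOffset + prefixLength:
        messageStart = currentOffset + prefixLength
        length, = struct.unpack(fmt, alldata[currentOffset:messageStart])
        messageEnd = messageStart + length
        if len(alldata) < messageEnd:
            break
        packet = alldata[messageStart:messageEnd]
        currentOffset = messageEnd
        self.stringReceived(packet)
        # If stringReceived assigned to recvd (e.g. an AMP protocol
        # switch), the local slicing state above is stale and must be
        # abandoned here rather than silently continued.

    # Slice exactly once, at the very end; also drop any temporary working
    # buffer so already-processed data is not kept alive in memory.
    self._unprocessed = alldata[currentOffset:]
```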
Below are twisted-benchmarks/int16receiver.py benchmark numbers for possible optimizations suggested in the previous review; none of those suggestions brings any worthwhile gains:

without this branch:
--------------------
38352.0 int16strings/sec (191760 int16strings in 5 seconds)
38624.0 int16strings/sec (193120 int16strings in 5 seconds)
38352.0 int16strings/sec (191760 int16strings in 5 seconds)
38624.0 int16strings/sec (193120 int16strings in 5 seconds)
38352.0 int16strings/sec (191760 int16strings in 5 seconds)

current branch:
---------------
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)

current branch without direct pack, unpack import:
--------------------------------------------------
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)

current branch + 2 writes:
--------------------------
44336.0 int16strings/sec (221680 int16strings in 5 seconds)
44336.0 int16strings/sec (221680 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44336.0 int16strings/sec (221680 int16strings in 5 seconds)
44336.0 int16strings/sec (221680 int16strings in 5 seconds)

current branch + writeSequence:
-------------------------------
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)

current branch + struct.Struct.unpack:
--------------------------------------
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)

current branch + struct.Struct.unpack_from:
-------------------------------------------
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
Let me know if you would like to see the exact changes made in those runs, I can attach the patches that I used.
comment:8 Changed 11 years ago by
Owner: | set to Jean-Paul Calderone |
---|
Thanks for the review and patch. Assigning it back to exarkun because he made the latest changes to this branch and can take a look at the patch.
comment:9 Changed 11 years ago by
(In [32576]) Apply 5075-additional-tests.patch, slightly modified, and some fixes to make the new tests pass.
This makes setting recvd to an empty string work correctly. It also makes the recvd attribute have the same value inside a lengthLimitExceeded call as it did before these changes.
refs #5075
comment:10 Changed 11 years ago by
Author: | glyph → glyph, zseil, exarkun |
---|---|
Keywords: | review added |
Owner: | Jean-Paul Calderone deleted |
I made the new tests pass. I also made some of the other changes based on comment 7. I had trouble with the benchmark on my machine; it produced results with a starkly bimodal distribution in many cases. The result does seem faster than trunk, though - for both the int16 benchmark and the amp benchmark.
comment:12 Changed 11 years ago by
Hi,
The current status of this looks pretty good. There are a few things, but they are all cosmetic:
- Without reading t.p.amp.BinaryBoxProtocol._switchTo(), it is remarkably unobvious what the "recvd" attribute does, why it needs such an outrageous hack, etc. A comment would be greatly appreciated.
- Similarly, explaining why the loop is written to be so optimized would be pretty nice. Most future maintainers hopefully can learn from the oral tradition, but again, comments would be nice.
With the most recent changes ([32578]), the code is relatively readable.
This isn't a review. I'll do a proper review another time -- right now isn't going to work. Submitting what I've typed so far.
comment:14 Changed 11 years ago by
comment:15 Changed 11 years ago by
comment:16 Changed 11 years ago by
comment:17 Changed 11 years ago by
Thanks for taking a look, MostAwesomeDude. I added some more docs. In the process, I also noticed _workingBuffer is basically redundant, and that dataReceived is easily made non-recursive, so I fixed those issues too.
comment:18 Changed 11 years ago by
Cc: | Corbin Simpson added |
---|---|
Keywords: | review removed |
Hey,
With the new documentation patches, this new work is incredibly readable. The new docstring for _RecvdCompatHack really helps explain what recvd is for.
The test cases are convincing, especially the ones that cover recvd, which is really where I feel the iffy cases appear. test_protocols and test_amp pass with flying colors here.
I'm A-OK with this. Removing the "review" tag; this has passed my review. Thanks for your hard work!
comment:19 Changed 11 years ago by
Owner: | set to Jean-Paul Calderone |
---|
Whoops, reassigning. Forgot that.
comment:20 Changed 11 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:21 Changed 10 years ago by
Sorry to comment on a closed issue, but has the bug to deprecate recvd been opened?
comment:23 follow-up: 24 Changed 9 years ago by
Type: | enhancement → defect |
---|
Hi,
The optimized version of dataReceived() in IntNStringReceiver suffers from a race condition involving the buffer self._unprocessed. Therefore, dataReceived() is no longer re-entrant, as it should be.
The update self._unprocessed = alldata[currentOffset:] occurs only at the very end (which was of course the idea of the optimization). But keeping the function re-entrant is probably more important.
We experienced the problem for the first time about a year ago when switching to a version later than 11.0.0 (version 11.1.0 has the optimized code). Data is read from the buffer multiple times (as the update has not happened yet). Only a few weeks ago I took the time to track the problem down (as originally we thought it was our fault), and found this race condition. Our workaround is to override dataReceived() with the old code in our subclass of IntNStringReceiver; that way we can use the latest version of Twisted. (We work with a locally modified version of VIFF, from Aarhus University.)
Berry Schoenmakers, TU Eindhoven, The Netherlands
comment:24 follow-up: 25 Changed 9 years ago by
Replying to berry:
Hi,
The optimized version of dataReceived() in IntNStringReceiver suffers from a race condition involving the buffer self._unprocessed. Therefore, dataReceived() is no longer re-entrant, as it should be.
Nowhere does it say that dataReceived should be re-entrant. And in fact I am quite surprised you expect it to be :).
The update self._unprocessed = alldata[currentOffset:] occurs only at the very end (which was of course the idea of the optimization). But keeping the function re-entrant is probably more important.
Again I am surprised. Why do you think this? Normally I am a huge proponent of keeping compatibility, but this is an extremely fiddly detail to be depending upon.
We experienced the problem for the first time about a year ago when switching to a version later than 11.0.0 (version 11.1.0 has the optimized code).
This is why we ask our users to test pre-releases. They are usually announced on the Twisted mailing list. Would it be useful to you if they were announced on the labs.twistedmatrix.com blog as well?
At this point, it's too late to revert this change, although perhaps we could add some guard rails, like a useful exception to be raised around dataReceived if it's called re-entrantly.
Data is read from the buffer multiple times (as the update has not happened yet). Only a few weeks ago I took the time to track the problem down (as originally we thought it was our fault), and found this race condition. Our workaround is to override dataReceived() with the old code in our subclass of IntNStringReceiver; that way we can use the latest version of Twisted. (We work with a locally modified version of VIFF, from Aarhus University.)
It seems obvious to me that the solution is to simply stop calling dataReceived re-entrantly. Why do you do it in the first place?
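For the guard-rail idea mentioned above, a hedged sketch of what such a check might look like (purely illustrative; this is not an existing Twisted API, and the attribute and helper names are made up):

```python
def dataReceived(self, data):
    # Illustrative only: fail loudly instead of silently corrupting the
    # buffer if the method is re-entered.
    if getattr(self, '_insideDataReceived', False):
        raise RuntimeError("dataReceived() called re-entrantly")
    self._insideDataReceived = True
    try:
        self._actuallyProcess(data)  # hypothetical real parsing logic
    finally:
        self._insideDataReceived = False
```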
comment:25 follow-up: 26 Changed 9 years ago by
Replying to glyph:
Replying to berry:
...
It seems obvious to me that the solution is to simply stop calling dataReceived re-entrantly. Why do you do it in the first place?
Thanks for your quick response. Let me try to give some more context.
As I'm not that familiar with all the (implementation) details of the Twisted reactor etc. you may have to correct me at times.
We are using the VIFF (viff.dk) framework, which provides Runtimes that can be used to perform secure multiparty computations. VIFF introduces a subclass of Int16StringReceiver, and in that subclass the function stringReceived() is implemented, as it should be. Function dataReceived() is never called by 'us' directly; it is only called from within the reactor.
The way the reactor is set up and used by VIFF may indirectly be the cause of 'forbidden' re-entrant calls to dataReceived(). So, maybe the way VIFF handles the reactor (and uses calls to SelectReactor.doIteration() etc.) is wrong. Or, the VIFF implementation of stringReceived() is indirectly leading to another call of dataReceived() which shouldn't happen.
In any case, what happens is that dataReceived() is called re-entrantly, and the same data is read twice. The problem only occurs if we use the 'boosted' version of VIFF (with a C implementation of Deferreds), i.e. when things go 'fast'. Otherwise, the race condition doesn't occur. The problem only happens once, during initialization, so there may be something wrong in the 'logic' that VIFF is using in that part.
Anyway, it is reassuring to hear from you that a function like dataReceived() was never supposed to be re-entrant. Frankly, when browsing through the Twisted code before, I was already wondering how things could work at all, as re-entrancy didn't seem to be addressed anywhere. But it was the only conclusion I could draw, assuming that VIFF is doing things correctly.
comment:26 follow-up: 27 Changed 9 years ago by
Replying to berry:
Replying to glyph:
Replying to berry:
...
It seems obvious to me that the solution is to simply stop calling dataReceived re-entrantly. Why do you do it in the first place?
Thanks for your quick response. Let me try to give some more context.
Thanks.
As I'm not that familiar with all the (implementation) details of the Twisted reactor etc. you may have to correct me at times.
In the absence of code examples, I'm not sure I can correct you :).
We are using the VIFF (viff.dk) framework which provides Runtimes that can be used to perform secure multiparty computations. VIFF introduces a subclass of Int16StringReceiver and in that subclass the function stringReceived() is implemented, as it should. Function dataReceived() is never called by 'us' directly; it is only called from within the reactor.
Unfortunately I can't browse the code, since http://hg.viff.dk/viff/ is a 500 error.
The way the reactor is set up and used by VIFF may indirectly be the cause of 'forbidden' re-entrant calls to dataReceived(). So, maybe the way VIFF handles the reactor (and uses calls to SelectReactor.doIteration() etc.) is wrong. Or, the VIFF implementation of stringReceived() is indirectly leading to another call of dataReceived() which shouldn't happen.
In order for this to be called re-entrantly, the only thing I could think of is that VIFF would need to explicitly call dataReceived.
In any case, what happens is that dataReceived() is called re-entrantly, and the same data is read twice. Only if we use the 'boosted' version of VIFF (with a c implementation of deferreds) the problem occurs. So, when things go `fast'. Otherwise, the race condition doesn't occur. The problem only happens once, during initialization, so there may be something wrong in the 'logic' that VIFF is using in that part.
It sounds like VIFF's C implementation of Deferreds is the culprit then. Deferreds are quite tricky to implement properly, so it's not surprising to me that they have a bug. Do they run against Twisted's Deferred test suite and pass? If not, they should probably get rid of them :).
Have you considered using PyPy instead? http://pypy.org In some cases this can give you equivalently good performance to a C extension, just by running Python.
Anyway, it is reassuring to hear from you that a function like dataReceived() was never supposed to be re-entrant. Frankly, when browsing through the Twisted code before I was already wondering how things could work at all, as re-entrancy didn't seem to be addressed at all. But it was the only conclusion I could make -- assuming that VIFF is doing things correctly.
It sounds to me like there is a bug in VIFF. If you can extract some stack-traces from a re-entrant call to dataReceived, it would be useful to see where the recursive call is coming from. If there is a place where Twisted causes dataReceived to be called re-entrantly, it's a bug that we should address.
comment:27 Changed 9 years ago by
Replying to glyph:
Replying to berry:
Replying to glyph:
Replying to berry:
...
Thanks again for your very useful suggestions. The problem may come from many 'directions' indeed.
You may have a look at the VIFF code yourself (see http://www.win.tue.nl/~berry/TUeVIFF/ for our 'local' version), but maybe the following stack trace already gives you sufficient info. There are exactly two calls to dataReceived(). The program starts by exchanging an ID number between the 'players' and then needs to exchange some data. When reading the data, the ID number is read a second time because it is apparently still in dataReceived()'s buffer.
File "millionaires.py", line 156, in <module>
reactor.run()
File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 1173, in run
self.mainLoop()
File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 1185, in mainLoop
self.doIteration(t)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\reactor.py", line 46, in doIteration
SelectReactor.doIteration(self, t)
File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 145, in doSelect
_logrun(selectable, _drdw, selectable, method, dict)
File "C:\Python27\lib\site-packages\twisted\python\log.py", line 88, in callWithLogger
return callWithContext({"system": lp}, func, *args, kw)
File "C:\Python27\lib\site-packages\twisted\python\log.py", line 73, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, kw)
File "C:\Python27\lib\site-packages\twisted\python\context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, kw)
File "C:\Python27\lib\site-packages\twisted\python\context.py", line 81, in callWithContext
return func(*args,kw)
File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 151, in _doReadOrWrite
why = getattr(selectable, method)()
File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 215, in doRead
return self._dataReceived(data)
File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 221, in _dataReceived
rval = self.protocol.dataReceived(data)
File "C:\Python27\lib\site-packages\twisted\protocols\basic.py", line 755, in dataReceived
self.stringReceived(packet)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 345, in stringReceived
self.factory.identify_peer(self)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 488, in identify_peer
self.protocols_ready.callback(self.runtime)
File "millionaires.py", line 77, in init
m1_ge_m2 = m1 >= m2
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\comparison.py", line 93, in greater_than_equal
b, bits = self.decomposed_random_sharing(field, m)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\comparison.py", line 55, in decomposed_random_sharing
bits = [self.prss_share_bit_double(field) for _ in range(bits)]
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\passive.py", line 815, in prss_share_bit_double
b_p = self.prss_share_random(field, binary=True)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\passive.py", line 727, in prss_share_random
threshold=2*self.threshold)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\passive.py", line 113, in open
self.activate_reactor()
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 962, in activate_reactor
reactor.doIteration(0)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\reactor.py", line 46, in doIteration
SelectReactor.doIteration(self, t)
File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 145, in doSelect
_logrun(selectable, _drdw, selectable, method, dict)
File "C:\Python27\lib\site-packages\twisted\python\log.py", line 88, in callWithLogger
return callWithContext({"system": lp}, func, *args, kw)
File "C:\Python27\lib\site-packages\twisted\python\log.py", line 73, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, kw)
File "C:\Python27\lib\site-packages\twisted\python\context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, kw)
File "C:\Python27\lib\site-packages\twisted\python\context.py", line 81, in callWithContext
return func(*args,kw)
File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 151, in _doReadOrWrite
why = getattr(selectable, method)()
File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 215, in doRead
return self._dataReceived(data)
File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 221, in _dataReceived
rval = self.protocol.dataReceived(data)
File "C:\Python27\lib\site-packages\twisted\protocols\basic.py", line 755, in dataReceived
self.stringReceived(packet)
File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 367, in stringReceived
traceback.print_stack()
comment:28 follow-up: 29 Changed 9 years ago by
The problem is in VIFF's activate_reactor method. The reactor is very explicitly not re-entrant, and they are calling doIteration from within dataReceived. I'm surprised this even appears to work! Please file a bug with them telling them to stop doing that; perhaps we should have a bug which makes doIteration raise an exception if called reentrantly.
comment:29 Changed 9 years ago by
Replying to glyph:
The problem is in VIFF's activate_reactor method. The reactor is very explicitly not re-entrant, and they are calling doIteration from within dataReceived. I'm surprised this even appears to work! Please file a bug with them telling them to stop doing that; perhaps we should have a bug which makes doIteration raise an exception if called reentrantly.
Right, thanks a lot! I was assuming that VIFF's way of calling doIteration() was normal behavior, and that's why I thought that dataReceived() had to be re-entrant. Indeed, VIFF is working fine for quite a diverse set of programs, and the faulty behavior seems to arise only during initialization, under special conditions (among which is the use of a compiled C implementation of Deferreds). But you're saying that VIFF's approach may be fundamentally flawed. I'll see who I can (still) contact about this.
comment:30 follow-up: 31 Changed 9 years ago by
Two things I noticed about VIFF that might help the authors:
- There's no need for a custom reactor at all. twisted.internet.task.cooperate will help them implement the fair queuing/scheduling that seems to be desired (basically, callLater(0, f) is how you schedule something to happen "as soon as possible"; there's no need to hook into the implementation of doIteration); see the sketch after this comment.
- The activate_reactor calls actually seem entirely superfluous. The reactor is already "activated" whenever it is running. Turning the implementation of activate_reactor into a no-op (or deleting it and all users of it, depending on how VIFF handles API compatibility) should fix the re-entrancy issue.
Hope this helps.
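A small hedged sketch of that first suggestion (process_pending_shares is a hypothetical stand-in for whatever work activate_reactor() was trying to push forward): schedule the work with the reactor instead of re-entering it.

```python
from twisted.internet import reactor, task

def process_pending_shares():
    # Hypothetical work item; stands in for the processing that VIFF's
    # activate_reactor()/doIteration() call was trying to force.
    pass

# "As soon as possible", without re-entering the reactor:
reactor.callLater(0, process_pending_shares)

# For a larger batch of work, cooperate() interleaves it fairly with I/O:
def _steps(items):
    for item in items:
        process_pending_shares()
        yield  # hand control back to the reactor between items

task.cooperate(_steps(range(100)))
```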
comment:31 follow-up: 32 Changed 9 years ago by
Replying to exarkun:
Two things I noticed about VIFF that might help the authors: ... Hope this helps.
Brilliant! Indeed, turning activate_reactor() into a no-op makes the problem disappear entirely! This explains a lot, and your other remark will also help a lot. Things in VIFF should end up much simpler as well, with proper use of the reactor.
comment:32 Changed 9 years ago by
Quick addition: it turns out that no-op'ing activate_reactor() has a negative impact on performance, resulting in much larger memory usage. Apparently, activate_reactor()'s calls to doIteration() ensure that processing happens faster in VIFF.
comment:33 Changed 9 years ago by
Quick addition: it turns out that no-op'ing activate_reactor() has a negative impact on performance, resulting in much larger memory usage.
That makes some sense, since by re-entering the reactor immediately after scheduling any I/O, buffers are going to be emptied sooner. However, some other solution will need to be found, since re-entering the reactor leads to bugs like the one you observed. :)
We should probably continue this discussion elsewhere, perhaps on the mailing list. If you have some reproducible benchmarks, I'm sure there are people who would be interested in discussing how to make them perform better.
(In [31740]) Branching to 'faster-amp-5075'