Opened 7 years ago

Closed 6 years ago

Last modified 5 years ago

#5075 closed defect (fixed)

IntNStringReceiver copies too much data in dataReceived, causing AMP to be slow

Reported by: Glyph
Owned by: Jean-Paul Calderone
Priority: normal
Milestone:
Component: core
Keywords: optimization
Cc: Corbin Simpson
Branch: branches/faster-amp-5075
Author: glyph, zseil, exarkun

Description

If large packets containing many strings arrive at an IntNStringReceiver, it repeatedly copies the entire remaining buffer by re-slicing it for each string it parses. Profiling indicates this is a bottleneck in AMP.
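
For illustration, here is a minimal sketch (not the actual Twisted code; HEADER_LEN, parse_slicing and parse_offset are made-up names) of the repeated-slicing pattern versus the offset-tracking direction the branch takes:

    import struct

    HEADER_LEN = 2  # e.g. Int16StringReceiver's length prefix

    def parse_slicing(buf):
        # Old pattern: re-slice the remaining buffer for every string,
        # copying everything that is left each time through the loop.
        strings = []
        while len(buf) >= HEADER_LEN:
            (length,) = struct.unpack("!H", buf[:HEADER_LEN])
            if len(buf) < HEADER_LEN + length:
                break
            strings.append(buf[HEADER_LEN:HEADER_LEN + length])
            buf = buf[HEADER_LEN + length:]  # full copy of the tail
        return strings, buf

    def parse_offset(buf):
        # Optimized pattern: walk an offset and slice the leftover once.
        strings = []
        offset = 0
        while len(buf) - offset >= HEADER_LEN:
            (length,) = struct.unpack("!H", buf[offset:offset + HEADER_LEN])
            if len(buf) - offset < HEADER_LEN + length:
                break
            strings.append(buf[offset + HEADER_LEN:offset + HEADER_LEN + length])
            offset += HEADER_LEN + length
        return strings, buf[offset:]  # one copy of the unprocessed tail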

Attachments (1)

5075-additional-tests.patch (3.4 KB) - added by zseil 6 years ago.
Additional tests


Change History (34)

comment:1 Changed 7 years ago by Glyph

Author: glyph
Branch: branches/faster-amp-5075

(In [31740]) Branching to 'faster-amp-5075'

comment:2 Changed 7 years ago by Glyph

Keywords: optimization review added
Owner: Glyph deleted

Build results, when they're ready.

Dear Reviewer - there are no unit tests in this branch, and I know that. This is because the code included in this branch is supposed to be completely functionally equivalent to the code it is changing, but faster. If you spot any functional changes that imply missing coverage and should involve adding new tests, I'll happily do that.

We do have a system for empirically verifying that code is faster but it's not quite as well integrated as our unit tests, which make sure the code is correct. In particular, speed.twistedmatrix.com does not have any way to run the benchmarks on a branch. If you look at http://speed.twistedmatrix.com/about/ you will notice a box that links to our benchmark suite. ("+junk" is just Launchpad's way of saying "critical production code", so you can ignore that part of the URL if it's confusing.)

I validated the code in this branch by running amp.py from the benchmark suite on both trunk and the branch. I got the following results:

the branch: 2655.0 amp/sec (13275 amp in 5 seconds)
trunk: 2337.0 amp/sec (11685 amp in 5 seconds)

That's a 13.6% speedup, but to account for variances in hardware, etc, I went with a more conservative 12% in the NEWS file.

Thanks for reviewing in this brave new world of optimization!

comment:3 in reply to:  2 ; Changed 7 years ago by zseil

Keywords: review removed
Owner: set to Glyph

Hello,

Replying to glyph:

Build results, when they're ready.

Dear Reviewer - there are no unit tests in this branch, and I know that. This is because the code included in this branch is supposed to be completely functionally equivalent to the code it is changing, but faster. If you spot any functional changes that imply missing coverage and should involve adding new tests, I'll happily do that.

Thanks for trying to speed this up!

The buildbot errors seem to show that delayed self.recvd assignment breaks some protocols that switch to a different protocol in stringReceived (namely AMP). Maybe you could work around that by making alldata and offt instance variables (private and with better names) and convert recvd to a property that would slice the received data only when needed?
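
A minimal sketch of that property idea (hypothetical attribute names; not the code that eventually landed):

    class Int16StringReceiverSketch(object):
        def __init__(self):
            self._unprocessed = b""
            self._offset = 0

        @property
        def recvd(self):
            # Only pay for the slice when legacy code actually reads it.
            return self._unprocessed[self._offset:]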

What was the effect of importing functions from struct directly? Did you try using struct.Struct and its unpack_from method instead? Now that Twisted is dropping Python 2.4 support you might gain some speed with that.
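
For reference, a sketch of what the struct.Struct variant would look like (a standalone example, not branch code):

    import struct

    _prefix = struct.Struct("!H")  # compiled once, reused for every string

    data = struct.pack("!H", 5) + b"hello"
    offset = 0

    # unpack() needs an exact-length slice, so it still copies two bytes:
    (length,) = _prefix.unpack(data[offset:offset + _prefix.size])

    # unpack_from() reads straight out of the buffer at the given offset:
    (length2,) = _prefix.unpack_from(data, offset)
    assert length == length2 == 5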

In sendString, did you try using transport.writeSequence or two transport.write calls instead of string concatenation? It would be interesting to know which is faster.
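
Roughly, the three sendString strategies being compared (sketches only; transport is assumed to provide ITransport's write/writeSequence):

    import struct

    def send_concat(transport, string):
        # one write of a freshly concatenated string
        transport.write(struct.pack("!H", len(string)) + string)

    def send_two_writes(transport, string):
        # two writes, no concatenation
        transport.write(struct.pack("!H", len(string)))
        transport.write(string)

    def send_write_sequence(transport, string):
        # one writeSequence call with both pieces
        transport.writeSequence([struct.pack("!H", len(string)), string])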

I think that some direct test for Int*StringReceiver protocol switching would still be beneficial, it is slightly inconvenient to rely on AMP's tests to test this functionality.

comment:4 in reply to:  3 Changed 7 years ago by Glyph

Replying to zseil:

Thanks for trying to speed this up!

Thanks for a great review - although the failing tests did make it easy to see what the response should be, thanks for going beyond that :). I hope to respond soon.

comment:5 Changed 7 years ago by Glyph

Status: new → assigned

comment:6 Changed 6 years ago by Jean-Paul Calderone

Keywords: review added
Owner: Glyph deleted
Status: assigned → new

I fixed the remaining twisted/test/test_stateful.py failures. These were boring, only caused by test_stateful.py re-using some code from test_protocols.py in a way which was no longer valid with this branch's changes to the latter. I just refactored some mixins so that test_stateful.py is unaffected by the new tests for the recvd attribute.

I think glyph already took care of all the other review points.

comment:7 Changed 6 years ago by zseil

Keywords: review removed

Thanks for pushing this forward!

There are still a few problems with this branch:

  1. The if 'recvd' in self.__dict__: block on line 736 is still problematic, since it doesn't reset the alldata buffer, which gets assigned to self._unprocessed at the end of the while loop.
  2. The recvd buffer accessed from lengthLimitExceeded is incorrect if that event happens in the first while loop iteration.
  3. The _workingBuffer attribute should be cleared at the end of the while loop, so that it doesn't keep the already processed data in memory.
  4. The _workingBuffer doesn't need to be set on each while iteration; it does not change.
  5. Line 729 is not needed; _unprocessed will be set at the end of the while loop.

Failing tests for 1. and 2. are in the attachment.

Below are twisted-benchmarks/int16receiver.py benchmark numbers for the possible optimizations suggested in the previous review; none of those suggestions brings any worthwhile gain:

without this branch:
--------------------
38352.0 int16strings/sec (191760 int16strings in 5 seconds)
38624.0 int16strings/sec (193120 int16strings in 5 seconds)
38352.0 int16strings/sec (191760 int16strings in 5 seconds)
38624.0 int16strings/sec (193120 int16strings in 5 seconds)
38352.0 int16strings/sec (191760 int16strings in 5 seconds)

current branch:
---------------
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)

current branch without direct pack, unpack import:
--------------------------------------------------
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)
42160.0 int16strings/sec (210800 int16strings in 5 seconds)

current branch + 2 writes:
--------------------------
44336.0 int16strings/sec (221680 int16strings in 5 seconds)
44336.0 int16strings/sec (221680 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44336.0 int16strings/sec (221680 int16strings in 5 seconds)
44336.0 int16strings/sec (221680 int16strings in 5 seconds)

current branch + writeSequence:
-------------------------------
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)
44064.0 int16strings/sec (220320 int16strings in 5 seconds)

current branch + struct.Struct.unpack:
--------------------------------------
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)
45152.0 int16strings/sec (225760 int16strings in 5 seconds)

current branch + struct.Struct.unpack_from:
-------------------------------------------
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)
42432.0 int16strings/sec (212160 int16strings in 5 seconds)

Let me know if you would like to see the exact changes made in those runs, I can attach the patches that I used.

Changed 6 years ago by zseil

Attachment: 5075-additional-tests.patch added

Additional tests

comment:8 Changed 6 years ago by Thijs Triemstra

Owner: set to Jean-Paul Calderone

thanks for the review and patch. Assigning it back to exarkun because he made the latest changes to this branch and can take a look at the patch.

comment:9 Changed 6 years ago by Jean-Paul Calderone

(In [32576]) Apply 5075-additional-tests.patch, slightly modified, and some fixes to make the new tests pass.

This makes setting recvd to an empty string work correctly. It also makes the recvd attribute have the same value inside a lengthLimitExceeded call as it did before these changes.

refs #5075

comment:10 Changed 6 years ago by Jean-Paul Calderone

Author: glyph → glyph, zseil, exarkun
Keywords: review added
Owner: Jean-Paul Calderone deleted

I made the new tests pass. I also made some of the other changes based on comment 7. I had trouble with the benchmark on my machine; it produced results with a starkly bimodal distribution in many cases. The result does seem faster than trunk, though, for both the int16 benchmark and the amp benchmark.

comment:11 Changed 6 years ago by Jean-Paul Calderone

comment:12 Changed 6 years ago by Corbin Simpson

Hi,

The current status of this looks pretty good. There are a few things, but they are all cosmetic:

  • Without reading t.p.amp.BinaryBoxProtocol._switchTo(), it is remarkably unobvious what the "recvd" attribute does, why it needs such an outrageous hack, etc. A comment would be greatly appreciated.
  • Similarly, explaining why the loop is written to be so optimized would be pretty nice. Most future maintainers hopefully can learn from the oral tradition, but again, comments would be nice.

With the most recent changes ([32578]), the code is relatively readable.

This isn't a review. I'll do a proper review another time -- right now isn't going to work. Submitting what I've typed so far.

comment:13 Changed 6 years ago by Jean-Paul Calderone

(In [32632]) Beef up the maintainer documentation

refs #5075

comment:14 Changed 6 years ago by Jean-Paul Calderone

(In [32633]) Remove the _workingBuffer entirely - it is the same as _unprocessed with one trivial tweak

refs #5075

comment:15 Changed 6 years ago by Jean-Paul Calderone

(In [32634]) Make the recvd-was-set case iterate instead of recursing

refs #5075

comment:16 Changed 6 years ago by Jean-Paul Calderone

(In [32635]) One more comment about recvd, and move it to the top

refs #5075

comment:17 Changed 6 years ago by Jean-Paul Calderone

Thanks for taking a look, MostAwesomeDude. I added some more docs. In the process, I also noticed that _workingBuffer is basically redundant and that dataReceived is easily made non-recursive, so I fixed those issues too.

comment:18 Changed 6 years ago by Corbin Simpson

Cc: Corbin Simpson added
Keywords: review removed

Hey,

With the new documentation patches, this new work is incredibly readable. The new docstring for _RecvdCompatHack really helps explain what recvd is for.

The test cases are convincing, especially the ones that cover recvd, which is really where I feel the iffy cases appear. test_protocols and test_amp pass with flying colors here.

I'm A-OK with this. Removing the "review" tag; this has passed my review. Thanks for your hard work!

comment:19 Changed 6 years ago by Corbin Simpson

Owner: set to Jean-Paul Calderone

Whoops, reassigning. Forgot that.

comment:20 Changed 6 years ago by Jean-Paul Calderone

Resolution: fixed
Status: new → closed

(In [32684]) Merge faster-amp-5075

Author: glyph, zseil, exarkun
Reviewer: zseil, exarkun, MostAwesomeDude
Fixes: #5075

Speed up IntNStringReceiver (and its subclasses, notably AMP) by avoiding some unnecessary string copying in its parser.

comment:21 Changed 5 years ago by therve

Sorry to comment on a closed issue, but has the bug to deprecate recvd been opened?

comment:22 Changed 5 years ago by Jean-Paul Calderone

Not obviously.

comment:23 Changed 5 years ago by berry

Type: enhancement → defect

Hi,

The optimized version of dataReceived() suffers from a race condition involving the buffer self._unprocessed. Therefore, dataReceived() is not re-entrant anymore, as it should be.

The update self._unprocessed = alldata[currentOffset:] occurs at the very end only (which was of course the idea of the optimization). But keeping the function re-entrant is probably more important.

We experienced the problem for the first time about a year ago when switching to a version later than 11.0.0 (version 11.1.0 has the optimized code). Data is read from the buffer multiple times (as the update did not happen yet). Only a few weeks ago I took the time to track the problem (as originally we thought it was our fault), and found this race condition. Our workaround is to overload dataReceived() with the old code in our subclass of IntNStringReceiver--that way we can use the latest version of Twisted. (We work with a locally modified version of VIFF, from Aarhus University).
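
A toy example of the double-read being described (a made-up class, not VIFF or Twisted code; the re-entry is simulated directly instead of going through the reactor):

    import struct

    class ToyReceiver(object):
        def __init__(self):
            self._unprocessed = b""
            self.seen = []
            self._reenter_once = True

        def dataReceived(self, data):
            buf = self._unprocessed + data
            offset = 0
            while len(buf) - offset >= 2:
                (length,) = struct.unpack("!H", buf[offset:offset + 2])
                if len(buf) - offset < 2 + length:
                    break
                string = buf[offset + 2:offset + 2 + length]
                offset += 2 + length
                self.stringReceived(string)
            # The buffer is only trimmed once the whole loop is done:
            self._unprocessed = buf[offset:]

        def stringReceived(self, string):
            self.seen.append(string)
            if self._reenter_once:
                self._reenter_once = False
                # Simulate a nested dataReceived call (as happens when the
                # reactor is re-entered): it still sees the stale buffer.
                self.dataReceived(b"")

    r = ToyReceiver()
    r._unprocessed = struct.pack("!H", 2) + b"id"   # leftover from an earlier call
    r.dataReceived(struct.pack("!H", 3) + b"abc")
    print(r.seen)  # the leftover "id" string is delivered twice: id, id, abc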

Berry Schoenmakers
TU Eindhoven
The Netherlands

comment:24 in reply to:  23 ; Changed 5 years ago by Glyph

Replying to berry:

Hi,

The optimized version of dataReceived() suffers from a race condition involving the buffer self._unprocessed. Therefore, dataReceived() is not re-entrant anymore, as it should be.

Nowhere does it say that dataReceived should be re-entrant. And in fact I am quite surprised you expect it to be :).

The update self._unprocessed = alldata[currentOffset:] occurs at the very end only (which was of course the idea of the optimization). But keeping the function re-entrant is probably more important.

Again I am surprised. Why do you think this? Normally I am a huge proponent of keeping compatibility, but this is an extremely fiddly detail to be depending upon.

We experienced the problem for the first time about a year ago when switching to a version later than 11.0.0 (version 11.1.0 has the optimized code).

This is why we ask our users to test pre-releases. They are usually announced on the Twisted mailing list. Would it be useful to you if they were announced on the labs.twistedmatrix.com blog as well?

At this point, it's too late to revert this change, although perhaps we could add some guard rails, like a useful exception to be raised around dataReceived if it's called re-entrantly.
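
Such a guard might look roughly like this (a sketch only; the class and its _parse stand-in are hypothetical):

    class GuardedReceiver(object):
        # Hypothetical guard: raise a clear error instead of silently
        # re-parsing stale data when dataReceived is re-entered.
        _inDataReceived = False

        def dataReceived(self, data):
            if self._inDataReceived:
                raise RuntimeError("dataReceived called re-entrantly")
            self._inDataReceived = True
            try:
                self._parse(data)
            finally:
                self._inDataReceived = False

        def _parse(self, data):
            pass  # stand-in for the real parsing loop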

Data is read from the buffer multiple times (as the update did not happen yet). Only a few weeks ago I took the time to track the problem (as originally we thought it was our fault), and found this race condition. Our workaround is to overload dataReceived() with the old code in our subclass of IntNStringReceiver--that way we can use the latest version of Twisted. (We work with a locally modified version of VIFF, from Aarhus University).

It seems obvious to me that the solution is to simply stop calling dataReceived re-entrantly. Why do you do it in the first place?

comment:25 in reply to:  24 ; Changed 5 years ago by berry

Replying to glyph:

Replying to berry:

...

It seems obvious to me that the solution is to simply stop calling dataReceived re-entrantly. Why do you do it in the first place?

Thanks for your quick response. Let me try to give some more context.

As I'm not that familiar with all the (implementation) details of the Twisted reactor etc. you may have to correct me at times.

We are using the VIFF (viff.dk) framework which provides Runtimes that can be used to perform secure multiparty computations. VIFF introduces a subclass of Int16StringReceiver and in that subclass the function stringReceived() is implemented, as it should. Function dataReceived() is never called by 'us' directly; it is only called from within the reactor.

The way the reactor is set up and used by VIFF may indirectly be the cause of 'forbidden' re-entrant calls to dataReceived(). So, maybe the way VIFF handles the reactor (and uses calls to SelectReactor.doIteration() etc.) is wrong. Or, the VIFF implementation of stringReceived() is indirectly leading to another call of dataReceived() which shouldn't happen.

In any case, what happens is that dataReceived() is called re-entrantly, and the same data is read twice. The problem only occurs if we use the 'boosted' version of VIFF (with a C implementation of Deferreds), i.e. when things go 'fast'. Otherwise, the race condition doesn't occur. The problem only happens once, during initialization, so there may be something wrong in the 'logic' that VIFF is using in that part.

Anyway, it is reassuring to hear from you that a function like dataReceived() was never supposed to be re-entrant. Frankly, when browsing through the Twisted code before I was already wondering how things could work at all, as re-entrancy didn't seem to be addressed at all. But it was the only conclusion I could make -- assuming that VIFF is doing things correctly.

comment:26 in reply to:  25 ; Changed 5 years ago by Glyph

Replying to berry:

Replying to glyph:

Replying to berry:

...

It seems obvious to me that the solution is to simply stop calling dataReceived re-entrantly. Why do you do it in the first place?

Thanks for your quick response. Let me try to give some more context.

Thanks.

As I'm not that familiar with all the (implementation) details of the Twisted reactor etc. you may have to correct me at times.

In the absence of code examples, I'm not sure I can correct you :).

We are using the VIFF (viff.dk) framework which provides Runtimes that can be used to perform secure multiparty computations. VIFF introduces a subclass of Int16StringReceiver and in that subclass the function stringReceived() is implemented, as it should. Function dataReceived() is never called by 'us' directly; it is only called from within the reactor.

Unfortunately I can't browse the code, since http://hg.viff.dk/viff/ is a 500 error.

The way the reactor is set up and used by VIFF may indirectly be the cause of 'forbidden' re-entrant calls to dataReceived(). So, maybe the way VIFF handles the reactor (and uses calls to SelectReactor.doIteration() etc.) is wrong. Or, the VIFF implementation of stringReceived() is indirectly leading to another call of dataReceived() which shouldn't happen.

In order for this to be called re-entrantly, the only thing I could think of is that VIFF would need to explicitly call dataReceived.

In any case, what happens is that dataReceived() is called re-entrantly, and the same data is read twice. The problem only occurs if we use the 'boosted' version of VIFF (with a C implementation of Deferreds), i.e. when things go 'fast'. Otherwise, the race condition doesn't occur. The problem only happens once, during initialization, so there may be something wrong in the 'logic' that VIFF is using in that part.

It sounds like VIFF's C implementation of Deferreds is the culprit then. Deferreds are quite tricky to implement properly, so it's not surprising to me that they have a bug. Do they run against Twisted's Deferred test suite and pass? If not, they should probably get rid of them :).

Have you considered using PyPy instead? http://pypy.org In some cases this can give you equivalently good performance to a C extension, just by running Python.

Anyway, it is reassuring to hear from you that a function like dataReceived() was never supposed to be re-entrant. Frankly, when browsing through the Twisted code before I was already wondering how things could work at all, as re-entrancy didn't seem to be addressed at all. But it was the only conclusion I could make -- assuming that VIFF is doing things correctly.

It sounds to me like there is a bug in VIFF. If you can extract some stack-traces from a re-entrant call to dataReceived, it would be useful to see where the recursive call is coming from. If there is a place where Twisted causes dataReceived to be called re-entrantly, it's a bug that we should address.

comment:27 in reply to:  26 Changed 5 years ago by berry

Replying to glyph:

Replying to berry:

Replying to glyph:

Replying to berry:

...

Thanks again for your very useful suggestions. The problem may come from many 'directions' indeed.

You may have a look into the VIFF code yourself (see http://www.win.tue.nl/~berry/TUeVIFF/ for our 'local' version), but maybe the following stack trace is already giving you sufficient info. There are exactly two calls to dataReceived(). The program starts by exchanging an ID number between the 'players' and then needs to exchange some data. When reading the data, however, the ID number is read a second time, as it is apparently still in dataReceived()'s buffer.

File "millionaires.py", line 156, in <module>

reactor.run()

File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 1173, in run

self.mainLoop()

File "C:\Python27\lib\site-packages\twisted\internet\base.py", line 1185, in mainLoop

self.doIteration(t)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\reactor.py", line 46, in doIteration

SelectReactor.doIteration(self, t)

File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 145, in doSelect

_logrun(selectable, _drdw, selectable, method, dict)

File "C:\Python27\lib\site-packages\twisted\python\log.py", line 88, in callWithLogger

return callWithContext({"system": lp}, func, *args, kw)

File "C:\Python27\lib\site-packages\twisted\python\log.py", line 73, in callWithContext

return context.call({ILogContext: newCtx}, func, *args, kw)

File "C:\Python27\lib\site-packages\twisted\python\context.py", line 118, in callWithContext

return self.currentContext().callWithContext(ctx, func, *args, kw)

File "C:\Python27\lib\site-packages\twisted\python\context.py", line 81, in callWithContext

return func(*args,kw)

File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 151, in _doReadOrWrite

why = getattr(selectable, method)()

File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 215, in doRead

return self._dataReceived(data)

File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 221, in _dataReceived

rval = self.protocol.dataReceived(data)

File "C:\Python27\lib\site-packages\twisted\protocols\basic.py", line 755, in dataReceived

self.stringReceived(packet)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 345, in stringReceived

self.factory.identify_peer(self)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 488, in identify_peer

self.protocols_ready.callback(self.runtime)

File "millionaires.py", line 77, in init

m1_ge_m2 = m1 >= m2

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\comparison.py", line 93, in greater_than_equal

b, bits = self.decomposed_random_sharing(field, m)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\comparison.py", line 55, in decomposed_random_sharing

bits = [self.prss_share_bit_double(field) for _ in range(bits)]

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\passive.py", line 815, in prss_share_bit_double

b_p = self.prss_share_random(field, binary=True)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\passive.py", line 727, in prss_share_random

threshold=2*self.threshold)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\passive.py", line 113, in open

self.activate_reactor()

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 962, in activate_reactor

reactor.doIteration(0)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\reactor.py", line 46, in doIteration

SelectReactor.doIteration(self, t)

File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 145, in doSelect

_logrun(selectable, _drdw, selectable, method, dict)

File "C:\Python27\lib\site-packages\twisted\python\log.py", line 88, in callWithLogger

return callWithContext({"system": lp}, func, *args, kw)

File "C:\Python27\lib\site-packages\twisted\python\log.py", line 73, in callWithContext

return context.call({ILogContext: newCtx}, func, *args, kw)

File "C:\Python27\lib\site-packages\twisted\python\context.py", line 118, in callWithContext

return self.currentContext().callWithContext(ctx, func, *args, kw)

File "C:\Python27\lib\site-packages\twisted\python\context.py", line 81, in callWithContext

return func(*args,kw)

File "C:\Python27\lib\site-packages\twisted\internet\selectreactor.py", line 151, in _doReadOrWrite

why = getattr(selectable, method)()

File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 215, in doRead

return self._dataReceived(data)

File "C:\Python27\lib\site-packages\twisted\internet\tcp.py", line 221, in _dataReceived

rval = self.protocol.dataReceived(data)

File "C:\Python27\lib\site-packages\twisted\protocols\basic.py", line 755, in dataReceived

self.stringReceived(packet)

File "C:\Users\Berry\AppData\Roaming\Python\Python27\site-packages\viff\runtime.py", line 367, in stringReceived

traceback.print_stack()

comment:28 Changed 5 years ago by Glyph

The problem is in VIFF's activate_reactor method. The reactor is very explicitly not re-entrant, and they are calling doIteration from within dataReceived. I'm surprised this even appears to work! Please file a bug with them telling them to stop doing that; perhaps we should have a bug which makes doIteration raise an exception if called reentrantly.

comment:29 in reply to:  28 Changed 5 years ago by berry

Replying to glyph:

The problem is in VIFF's activate_reactor method. The reactor is very explicitly not re-entrant, and they are calling doIteration from within dataReceived. I'm surprised this even appears to work! Please file a bug with them telling them to stop doing that; perhaps we should have a bug which makes doIteration raise an exception if called reentrantly.

Right, thanks a lot! I was assuming that VIFF's way of calling doIteration() was normal behavior, and that's why I thought that dataReceived() had to be re-entrant. Indeed VIFF is working fine for quite a diverse set of programs, and the faulty behavior seems to arise only during initialization, under special conditions (among which is the use of a compiled c implementation of Deferreds). But you're saying that VIFF's approach may be fundamentally flawed. I'll see who I can (still) contact on this.

comment:30 Changed 5 years ago by Jean-Paul Calderone

Two things I noticed about VIFF that might help the authors:

  1. There's no need for a custom reactor at all. twisted.internet.task.cooperate will help them implement the fair queuing/scheduling that seems to be desired (basically, callLater(0, f) is how you schedule something to happen "as soon as possible"; there's no need to hook into the implementation of doIteration).
  2. The activate_reactor calls actually seem entirely superfluous. The reactor is already "activated" whenever it is running. Turning the implementation of activate_reactor into a no-op (or deleting it and all users of it, depending on how VIFF handles API compatibility) should fix the re-entrancy issue.

Hope this helps.
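
As a rough illustration of the first point (a standalone sketch using twisted.internet.task.cooperate; the per-chunk work is just a stand-in):

    from twisted.internet import reactor, task

    def process_chunks(chunks):
        # Do one unit of work per iteration, then yield so the reactor
        # can service I/O and timed calls in between.
        for chunk in chunks:
            print("processed", chunk)  # stand-in for the real work
            yield

    # cooperate() interleaves this work with everything else the reactor
    # does; no custom reactor or explicit doIteration() calls are needed.
    d = task.cooperate(process_chunks(range(5))).whenDone()
    d.addCallback(lambda ignored: reactor.stop())

    reactor.run()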

comment:31 in reply to:  30 ; Changed 5 years ago by berry

Replying to exarkun:

Two things I noticed about VIFF that might help the authors: ... Hope this helps.

Brilliant! Indeed turning activate_reactor() into a no-op makes the problem disappear entirely! This explains a lot, and also your other remark will help a lot. Things in VIFF should be much simpler in the end as well with a proper use of reactors.

comment:32 in reply to:  31 Changed 5 years ago by berry

Quick addition: turns out that no-oping activate_reactor() has a negative impact on the performance, resulting in much larger memory usage. Apparently, activate_reactor()'s calls to doIteration() ensure that processing is faster in VIFF.

comment:33 Changed 5 years ago by Jean-Paul Calderone

Quick addition: turns out that no-oping activate_reactor() has a negative impact on the performance, resulting in much larger memory usage.

That makes some sense, since by re-entering the reactor immediately after scheduling any I/O, buffers are going to be emptied sooner. However, some other solution will need to be found, since re-entering the reactor leads to bugs like the one you observed. :)

We should probably continue this discussion elsewhere, perhaps on the mailing list. If you have some reproducible benchmarks, I'm sure there are people who would be interested in discussing how to make them perform better.
