Opened 7 years ago

Closed 6 years ago

#4376 enhancement closed fixed (fixed)

gtk2reactor wakes up at least 10 times a second

Reported by: thomasvs Owned by: Itamar Turner-Trauring
Priority: normal Milestone:
Component: core Keywords:
Cc: Andoni Morales, Jean-Paul Calderone, Paul Sladen, Jason Branch: branches/gtk2-wakeups-4376
branch-diff, diff-cov, branch-cov, buildbot
Author: itamarst

Description

The current implementation does timeout scheduling from Twisted instead of glib's timeout_add for efficiency reasons. In the case with lots of pending callLater's that's indeed better.

However, in the case of having few or non pending callLaters, this is highly inefficient. I have a long-running program on my N900 that wakes up 10 times a second and drains my phone's battery.

Attachments (1)

4376.patch (1.1 KB) - added by Andoni Morales 7 years ago.
Don't schedule a timeout if there are no pendings calls

Download all attachments as: .zip

Change History (20)

comment:1 Changed 7 years ago by Andoni Morales

Cc: Andoni Morales added
Keywords: review gtk2reactor added
Owner: Glyph deleted

comment:2 Changed 7 years ago by Andoni Morales

Keywords: review removed
Owner: set to Andoni Morales

Forget this patch, it's totally wrong.

Changed 7 years ago by Andoni Morales

Attachment: 4376.patch added

Don't schedule a timeout if there are no pendings calls

comment:3 Changed 7 years ago by Andoni Morales

Keywords: review added
Owner: Andoni Morales deleted

The first time I uploaded the wrong file. This time it should be the good one :)

comment:4 Changed 7 years ago by Jean-Paul Calderone

Cc: Jean-Paul Calderone added
Keywords: review removed
Owner: set to Andoni Morales

The patch looks pretty good. It definitely fixes a big part of the issue. I think there's a bit more room for improvement though. ;)

  1. simulate still imposes a 0.1 second maximum on the timeout it uses to reschedule itself. Can we get away with removing this as well? If it stays, the reactor will still wake up 10 times a second as long as there are any outstanding delayed calls.
  2. The 0.1 in callLater looks like a slightly cheesy hack to make ensure that an earlier delayed call set up after a later one will actually get run on time. It'd be nice to do this a bit more cleanly (perhaps looking at _pendingTimedCalls[0] and cancelling and rescheduling _simtag if it's earlier than the previous one).
  3. The multiplications by 1010 still in gtk2reactor.py are a sort of irksome. I dunno if that really falls under this ticket or not; it might be worth thinking about a bit though.

comment:5 Changed 7 years ago by Jean-Paul Calderone

I talked to thomasvs offline about this a bit. He said there was potentially a performance issue with the extra calls to reactor.timeout() necessary to address point 1 above, but we both agreed this bears further investigation (timeout should really be a cheap operation, and something else in glib2reactor should be fixed if it's not there; another possibility is that the performance problem was coming from somewhere we still don't entirely understand).

comment:6 in reply to:  5 ; Changed 7 years ago by Andoni Morales

Replying to exarkun:

I talked to thomasvs offline about this a bit. He said there was potentially a performance issue with the extra calls to reactor.timeout() necessary to address point 1 above, but we both agreed this bears further investigation (timeout should really be a cheap operation, and something else in glib2reactor should be fixed if it's not there; another possibility is that the performance problem was coming from somewhere we still don't entirely understand).

Trying to address 2 rescheduling the next timeout in callLater(), we saw a degradation increasing with the number of concurrent connections. Whith few concurrent connections we were able the get almost twice iterations per seconds more, but this performance decreased linearly with the number of concurrent connections. My guess is the that in the case of X concurrent connections, timeout() is called only once in simulate(), whilst it would be called X times in callLater(). I think profiling the benchmark test would let us understand much better what is really going on :)

Regarding 1, the idea was to create a new timeout as it would be done in simulate(), which is actually also using min(timeout, 0.1) and this would wake up the reactor 10 times per seconds as long as there still are delayed call.

Probably looking at _pendingTimedCalls[0] would be a good approach. I'll try to profile the benchmark with gprof2dot[1], since a graphic representation of the time spent on each call is probably the best way to figure out where this improvement should be done.

[1]http://code.google.com/p/jrfonseca/wiki/Gprof2Dot

comment:7 in reply to:  6 Changed 7 years ago by mik

I don't get it... Why can't callLater() just do a gobject.timeout_add(), make a custom IDelayedCall class and let glib handle the timeout?

comment:8 Changed 7 years ago by Jean-Paul Calderone

glib handles timeouts much less efficiently than any of the existing Twisted reactors. Letting glib take care of these would probably make the glib-based reactors unusable for applications with many timeouts.

comment:9 Changed 7 years ago by Paul Sladen

Cc: Paul Sladen added

This appears to have been the cause of:

"-syncd polls at 10 Hz when not enabled/signed up " https://bugs.launchpad.net/twisted/+bug/571648

...originally discovered via powertop.

comment:10 Changed 6 years ago by Jason

Cc: Jason added

comment:11 Changed 6 years ago by Itamar Turner-Trauring

Owner: changed from Andoni Morales to Itamar Turner-Trauring

Some playing around with gtk3 reactor suggests we can just get rid of the 0.1 minimum timeout and be done with it... possibly with an addition of self.wakeUp() call to reactor.stop(). It doesn't seem necessary at all - unless someone can tell me why it's there in the first place?

comment:12 Changed 6 years ago by itamarst

Author: itamarst
Branch: branches/gtk2-wakeups-4376

(In [33265]) Branching to 'gtk2-wakeups-4376'

comment:13 Changed 6 years ago by Itamar Turner-Trauring

Keywords: review added; gtk2reactor removed
Owner: Itamar Turner-Trauring deleted

I deleted the timeouts... and all the tests still pass. And I see no reason why they shouldn't. (There's one test that was failing with dirty reactor, but that seems to be because it's not a great test, so I fixed it.)

http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/gtk2-wakeups-4376

comment:14 Changed 6 years ago by dobey

Keywords: review removed
Owner: set to itamarst

Looks good to me. +1.

comment:15 Changed 6 years ago by Jean-Paul Calderone

  1. This seems to cause problems for the manhole gtk client. While that application is broken in a number of other ways, this change seems to prevent it from establishing a network connection at all. :/
  2. The change around line 481 to test_ssl.py adds a new pyflakes warning. It looks like that change isn't actually necessary?

comment:16 Changed 6 years ago by Itamar Turner-Trauring

Some things I learned:

  1. doIteration is only used when running trial using gtk2 reactor. "trial -r gtk2" thus uses completely different code path than a running gtk2reactor program.
  2. The 100ms wakeup is how reactor.callFromThread works. This e.g. adds 100ms latency to every DNS call by default.
  3. We have no ReactorMixin tests that prove callFromThread works.

I have a sketch of a fix for the first two, to be checked in soon. It needs to be backed out and then re-added TDD style once the third item is done.

comment:17 Changed 6 years ago by Itamar Turner-Trauring

Keywords: review added
Owner: itamarst deleted

(I was wrong about trial, iterate is only used in test cleanup, not test running. I was also partially wrong about callFromThread, see below for new solution.)

I added a couple more tests, rewrote lots of the reactor, and removed the changes to test_ssl.

Previously, simulate was called every 100ms, and on every I/O event. This ensured callFromThread scheduled events happened, at the expense of extra processing and power usage. The new code only calls _simulate when the reactor is woken up via the waker (so that callFromThread works), or when the earliest scheduled callLater is pending. Since reactor.stop only sets a flag which is relevant to Twisted code, we use wakeUp to ensure glib will call back into Twisted immediately after current event handler is done, otherwise it wouldn't necessarily notice immediately.

http://buildbot.twistedmatrix.com/boxes-supported?branch=/branches/gtk2-wakeups-4376

comment:18 Changed 6 years ago by Jean-Paul Calderone

Keywords: review removed
Owner: set to Itamar Turner-Trauring

Thanks. manhole now works on my Ubuntu 10.04 desktop. strace confirms that it is not waking up when there is nothing to do (although blinking the input cursor while the window is visible counts as something to do).

  1. Can you stick the notion of idleness into the news fragment? The reactors may well wake up 10 times per second, if there is work to do. :)
  2. I might have made the waker customizable with a factory instead of an argument, like:
        if not self.waker:
            self.waker = self._wakerFactory()
        ...
    
    It's a minor point, but it keeps the public interface as small as it was.
  3. I think that I had build up some kind of understanding of how gtk2reactor worked before (over years of poking it). I don't think I understand how it works now. If you feel like it, it would probably be a pretty big maintenance win if you added a few maintainer notes about how the pieces fit together.
  4. The way ReactorBuilder is used by GlibTimeTestsBuilder isn't really what I had in mind for ReactorBuilder. I think we should eventually figure out a better way to test particular reactors.

Build results look good. Please merge when you're happy you've addressed these points.

comment:19 Changed 6 years ago by itamarst

Resolution: fixed
Status: newclosed

(In [33295]) Merge gtk2-wakeups-4376.

Fixes: #4376 Review: exarkun Author: itamar

When gtk2 and glib2 reactors are idle, they no longer wake up 10 times a second, and in general are more efficient.

Note: See TracTickets for help on using tickets.