[Twisted-Python] A pseudo-deferred class that can be canceled

Thu Jan 7 17:29:04 EST 2010

Terry Jones wrote:

> After I went to bed I realized that someone is immediately going to
> want to have a cancel function that returns a deferred. And what
> happens if something goes wrong in a cancel function?

FWIW, the way I've dealt with these sorts of things (in Tahoe, at least)
has been to set a flag that is checked at a variety of useful stopping
points, and if the flag is set, raise some sort of "Interrupted"
exception, which bypasses the rest of the callback chain. Then, at the
end, I add an errback which only catches Interrupted and ignores it. I
think of this as the Deferred equivalent of returning early from a
subroutine. It has the nice property that anything that goes wrong in
the interrupt/cancellation process will be reported in the same place as
any other errors.

Something like:

 class Interrupted(Exception): pass

 class Foo:
   interrupted = False
   def start(self):
     d = self.do_one()
     d.addCallback(self.do_two)
     d.addCallback(self.do_three)
     d.addErrback(self.eat_interrupt)
     return d
   def interrupt(self):
     self.interrupted = True
   def do_one(self):
     return startSomething()
   def do_two(self, res):
     if self.interrupted: raise Interrupted()
     return startSomethingElse()
   def do_three(self, res):
     if self.interrupted: raise Interrupted()
     return startSomethingOther()
   def eat_interrupt(self, f):
     f.trap(Interrupted)
     return None

 d = Foo().start()

This doesn't explicitly cancel whatever step is currently in progress,
but it makes sure that we won't move on to the next step. When the steps
are small and reasonably side-effect free, this seems to work pretty
well.
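
To watch the early-return behaviour end to end without a reactor, here is a minimal sketch. Note that run_chain is a simplified, synchronous stand-in I'm assuming for illustration; it is not Twisted's Deferred, and the Job class and its log attribute are hypothetical names. It only mimics the one property that matters here: the first exception skips the remaining steps and lands in the final handler.

```python
class Interrupted(Exception):
    """Raised at a checkpoint to abandon the remaining steps."""


def run_chain(steps, eat):
    """Simplified synchronous stand-in for a callback chain (not Twisted).

    Each step receives the previous result. The first exception skips the
    remaining steps and is handed to `eat`, which may swallow or re-raise it.
    """
    result = None
    try:
        for step in steps:
            result = step(result)
    except Exception as exc:
        return eat(exc)
    return result


class Job:
    def __init__(self):
        self.interrupted = False
        self.log = []  # records which steps actually ran

    def interrupt(self):
        self.interrupted = True

    def checkpoint(self):
        # The "useful stopping point": bail out if someone asked us to stop.
        if self.interrupted:
            raise Interrupted()

    def do_one(self, _):
        self.log.append("one")
        return 1

    def do_two(self, res):
        self.checkpoint()
        self.log.append("two")
        return res + 1

    def do_three(self, res):
        self.checkpoint()
        self.log.append("three")
        return res + 1

    def eat_interrupt(self, exc):
        if isinstance(exc, Interrupted):
            return None  # swallow it: this is the early return
        raise exc  # anything else is reported like any other error

    def start(self):
        steps = [self.do_one, self.do_two, self.do_three]
        return run_chain(steps, self.eat_interrupt)
```

Calling interrupt() before start() lets do_one run (it has no checkpoint, as in the code above), stops at do_two, and start() returns None. Any other exception still propagates out of start().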

If the steps were larger, then I'd have do_one/do_two/do_three record a
counter to indicate what step was currently in progress, and then change
interrupt() to perform whatever sort of cancellation was appropriate for
that particular step. This usually makes it more obvious what sorts of
references or objects or whatever you'll be needing to cancel the work
that's been started, because you have to stash them (for use by
interrupt()) at the same time that you start the work:

   def do_two(self, res):
     if self.interrupted: raise Interrupted()
     self.current_step = 2
     handle = self.start_something_long()
     self.handle_to_cancel_step_2 = handle
     return handle.start()
   def interrupt(self):
     self.interrupted = True
     ...
     if self.current_step == 2:
       self.handle_to_cancel_step_2.cancel()
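
One possible variation on the same idea, as a rough sketch: keep the handles in a dict keyed by step number, so interrupt() doesn't need one branch per step. FakeHandle, Worker, and begin_step are hypothetical names made up for this example; the only assumption about a real handle is that it has a cancel() method.

```python
class FakeHandle:
    """Hypothetical stand-in for whatever object lets you cancel a step."""

    def __init__(self, name):
        self.name = name
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class Worker:
    def __init__(self):
        self.interrupted = False
        self.current_step = None
        self.handles = {}  # step number -> handle for that step

    def begin_step(self, number, handle):
        # Stash the handle at the same moment the work is started.
        self.current_step = number
        self.handles[number] = handle

    def interrupt(self):
        self.interrupted = True
        # Cancel only the step that is in progress right now, if any.
        handle = self.handles.get(self.current_step)
        if handle is not None:
            handle.cancel()
```

Each do_N method would call begin_step(N, handle) right after creating its handle, and interrupt() then cancels only the step recorded as current.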

I've also had systems (somewhere in Buildbot, I think) where interrupt()
took and stashed a Failure argument, and made sure that the
already-running Deferred chain errbacked with that, by using:

 if self.interrupted: raise self.interrupted

but I'm not fond of that technique anymore, since the Failure that pops
out of the chain won't have a stack trace that references anything in
the chain. It's a tricky subject: you care both about who called
interrupt() and about the point in the chain where the interrupt was
recognized.
One other trick I've used is to have self.interrupted be a string,
recording "why" the process was interrupted, and arrange for the
Interrupted class to include that string in its repr.
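
A minimal sketch of that string-reason trick, assuming the exception takes the reason as a constructor argument (the checkpoint() helper and the exact repr format are my own choices for illustration):

```python
class Interrupted(Exception):
    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason

    def __repr__(self):
        # Include the "why" so it shows up wherever the exception is logged.
        return "<Interrupted: %s>" % (self.reason,)


class Foo:
    interrupted = None  # None, or a string saying why we were interrupted

    def interrupt(self, why):
        self.interrupted = why

    def checkpoint(self):
        # Raised here, inside the chain, so the traceback points at the
        # step that noticed the flag; the string records who asked.
        if self.interrupted is not None:
            raise Interrupted(self.interrupted)
```

Since the exception is created and raised at the checkpoint, its traceback says where the interrupt was recognized, and the reason string carries the other half of the story.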

All that said, the handful of places where I've used these techniques
have since grown large enough that I'm planning to rewrite them in terms
of state machines, and to have exactly one Deferred (used to indicate
overall completion). The immediate problem that the big Deferred chain
is causing me is that remote Foolscap calls to hosts that have silently
disconnected (e.g. they got unplugged from the network) will stall for
20 minutes, causing the rest of the chain to stall, and a state machine
approach will make it easier to build adaptive timeouts around these
calls.
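
To give a rough idea of the shape I have in mind, here is a sketch only. Transfer, its event methods, and the "four times the average" timeout policy are all assumptions made up for this example, and a plain callable (on_done) stands in for the single completion Deferred. The actual timer is left to the caller; in a Twisted program it would presumably be scheduled with reactor.callLater using the current value of self.timeout.

```python
class Transfer:
    """Hypothetical three-step job driven by events instead of a chain."""

    STEPS = ("one", "two", "three")

    def __init__(self, on_done, base_timeout=5.0):
        self.on_done = on_done  # the single completion notification
        self.state = "idle"
        self.index = -1
        self.timeout = base_timeout  # seconds to allow for the current step
        self.durations = []

    def start(self):
        self._advance()

    def _advance(self):
        self.index += 1
        if self.index == len(self.STEPS):
            self._finish("done")
        else:
            self.state = self.STEPS[self.index]

    def step_finished(self, elapsed):
        if self.state not in self.STEPS:
            return  # late event after we already finished: ignore it
        self.durations.append(elapsed)
        # Assumed policy: allow four times the average observed duration.
        self.timeout = 4 * sum(self.durations) / len(self.durations)
        self._advance()

    def step_timed_out(self):
        if self.state in self.STEPS:
            self._finish("timed-out")

    def interrupt(self):
        if self.state in self.STEPS:
            self._finish("interrupted")

    def _finish(self, outcome):
        # Leaving the step states guarantees on_done fires exactly once.
        self.state = outcome
        self.on_done(outcome)
```

Every event handler checks the current state first, so a late reply, a timeout, and an interrupt() arriving in any order still produce exactly one completion notification.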

cheers,
 -Brian