[Twisted-Python] A pseudo-deferred class that can be canceled

Wed Jan 6 23:55:30 EST 2010

Hi Glyph

It's very late here, so I'll limit myself to a few thousand lines of reply.

>>>>> "Glyph" == Glyph Lefkowitz <glyph at twistedmatrix.com> writes:
Glyph> On Jan 6, 2010, at 7:09 AM, Terry Jones wrote:

Glyph> What I mean is, there are a lot of weird little edge-cases in how
Glyph> multiple layers of the stack interact when they're dealing with a
Glyph> shared Deferred, and if we're

The above was truncated.

Glyph> However, upon further inspection I think that the key distinction
Glyph> between what you've proposed and what I'm talking about is the
Glyph> distinction between cancelling *one* layer of the callback chain and
Glyph> cancelling *all* layers of the callback chain.

Yes, that's right. I nearly made a diagram for people today, but didn't
know if anyone would be interested. But here's one way to look at it.

In today's deferred world, you have (in general) situations like this:

  func makes d -> c1 -> c2 -> c3 -> c4 -> c5 -> c6 -> c7 -> client -> c8 -> c9

I.e., the client makes a call, gets its hands on a deferred (which already
has zero or more call/errbacks on its chain) and adds its own callbacks.

At that point cancellation is very hard. Neither the client, nor the
deferred itself, nor the original function, can know how to cancel the
operation. From the POV of the client and the deferred, that callback chain
is just a bunch of indistinguishable functions.

My ControllableDeferred class, if used by just the client, makes it
possible for the client to cut the link between c7 and c8, either by itself
(the client) calling the deferred it receives, which causes c8 to fire/err,
or by deactivating it, thereby arranging that c8 is never called.

So the ControllableDeferred in some sense introduces a cut point between
two callbacks in the original chain. And the cut point is in a sensible
logical position because the client added c8 and c9 and can be expected to
know what to do to clean them up if it decides it's done waiting for the
original d to fire.
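
To make the cut point concrete, here is a minimal sketch in plain Python.
It is not the real ControllableDeferred and it does not use Twisted:
MiniDeferred is a toy stand-in with callbacks only (no errbacks), and
CutPoint is a hypothetical name for the wrapper sitting between c7 and c8.

```python
class MiniDeferred:
    """Toy stand-in for a Deferred: callbacks only, no errbacks."""

    def __init__(self):
        self.callbacks = []
        self.called = False
        self.result = None

    def addCallback(self, fn):
        if self.called:
            # Already fired: run the new callback immediately.
            self.result = fn(self.result)
        else:
            self.callbacks.append(fn)
        return self

    def callback(self, result):
        self.called = True
        self.result = result
        for fn in self.callbacks:
            self.result = fn(self.result)
        self.callbacks = []


class CutPoint(MiniDeferred):
    """Hypothetical client-side wrapper: the link between c7 and c8."""

    def __init__(self, upstream):
        MiniDeferred.__init__(self)
        self._done = False
        upstream.addCallback(self._fromUpstream)

    def _fromUpstream(self, result):
        # Forward the upstream result unless the client already acted.
        if not self._done:
            self._done = True
            MiniDeferred.callback(self, result)
        return result

    def callback(self, result):
        # Client fires early: c8 runs now, the later upstream result is dropped.
        if not self._done:
            self._done = True
            MiniDeferred.callback(self, result)

    def deactivate(self):
        # Client gives up waiting: c8 is never called.
        self._done = True


log = []
d = MiniDeferred()
d.addCallback(lambda r: r + 1)   # plays the role of c7
cut = CutPoint(d)
cut.addCallback(log.append)      # plays the role of c8
cut.deactivate()
d.callback(1)                    # the original d fires, but c8 never sees it
```

After this runs, log is still empty: c7 ran on the original d, but nothing
crossed the cut.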

But the above picture is more uniform than the reality: it hides the fact
that the callbacks were added to the deferred in groups (of zero or more).
That is, the chain really looks like this:

  func makes d -> w1 -> w2 -> x1 -> x2 -> y1 -> y2 -> y3 -> client -> c8 -> c9

which is to say that the client in fact called function Y. Y called X. X
called W, and W called something that returned a deferred. W then adds w1
and w2 to d and returns it to X. X adds x1 and x2 to d and returns it to Y.
Y adds y1-3 and returns it to the client.

So you can imagine now that we insert a cut point at each logical boundary,
and then the cancel information can flow back up the chain and each logical
unit presumably knows how to discard / abort etc., whatever it may have in
progress.
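
A rough sketch of that flow, assuming each logical unit keeps a handle on
the thing it received from the unit it called. LayerHandle and its cleanup
argument are hypothetical names made up for this illustration, and nothing
here is Twisted API:

```python
class LayerHandle:
    """Hypothetical cut point owned by one logical unit (W, X or Y)."""

    def __init__(self, name, upstream=None, cleanup=None):
        self.name = name
        self.upstream = upstream    # handle from the unit this one called
        self.cleanup = cleanup      # what this unit does to abort its work
        self.cancelled = False

    def cancel(self):
        if self.cancelled:
            return                  # a second cancel is a no-op
        self.cancelled = True
        if self.cleanup is not None:
            self.cleanup(self.name)
        if self.upstream is not None:
            # Pass the cancel information back toward the source.
            self.upstream.cancel()


released = []
w = LayerHandle("W", cleanup=released.append)
x = LayerHandle("X", upstream=w, cleanup=released.append)
y = LayerHandle("Y", upstream=x, cleanup=released.append)

y.cancel()   # the client gives up on what Y returned
# released is now ["Y", "X", "W"]
```

Each unit cleans up only what it owns, then hands the cancel to the unit
it called.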

That picture is mainly for clarity. I'm sure you're miles ahead already...

Glyph> Your description (elided for brevity's sake) was very helpful.
Glyph> You've got resources which your callbacks are consuming by way of
Glyph> being "currently outstanding", and you want to be able to free
Glyph> *those* resources, without necessarily worrying about

However you were going to finish that sentence, I agree :-)

I want to free the resources, and I want to be able to get on with whatever
it is I'm supposed to be doing.

>> Yes, agreed. I like the fact that the class is simple and that it deals
>> with the client-side issues, allowing ignoring, timing out, early firing,
>> etc.  As you say, the much harder problem remains. But the harder problem
>> is a bit less messy now (at least in my mind): it's "just" cancellation.
>> Responsibilities are cleanly divided by my class - the client takes care of
>> itself, and cancellation has *only* to deal with callbacks placed on a
>> deferred that was generated by what the client called.

Glyph> I don't think that you can completely separate the problems.  You
Glyph> seem to have a reasonable solution to the problem of one layer of
Glyph> the Deferred stack, but once you're trying to deal with multiple
Glyph> layers of the stack at once, interactions occur which can be
Glyph> difficult to reconcile with the same API, many of which are already
Glyph> documented in the ticket's discussion.

It may be that there are interactions between W and Y (for example) in my
above (2nd) diagram, but I expect that would be infrequent. E.g., W might
decide to attach a callback to d after it has been returned to (and added
to by) X, Y, etc.  That seems to be a problem, but if W were to add those
extra callbacks within another logical unit of the callback chain, it would
be alerted of the cancellation in the normal fashion (twice). Make sense?

>> Looked at from this POV, an approach to cancellation would be for code that
>> is able to cancel operations it has begun to also provide a cancel method.
>> One way to think about doing this would be to have the cancel method take a
>> deferred as an argument.

Glyph> This is a *very* interesting idea, although I don't like the API
Glyph> that you propose for it.  By separating the cancel method from the
Glyph> Deferred itself, you remove the ability for a trivial client of that
Glyph> Deferred to say "forget about it" without also maintaining a
Glyph> reference to the thing that gave it the Deferred in the first place.

I agree that's less desirable, but I'm not sure it's a necessary
consequence of the approach. Or maybe it is.

Today I modified my ControllableDeferred class to allow a cancelFunc
argument. The __init__ is a tiny bit more clunky, but it has methods just
like the old class, e.g.

    def callback(self, result):
        # Fire at most once; tell the cancel function first, then fire.
        if not self._called:
            self._called = True
            if self.cancelFunc:
                self.cancelFunc(self._calld)
            defer.Deferred.callback(self, result)

    def deactivate(self):
        # Give up without firing: only the cancel function is told.
        if not self._called:
            self._called = True
            if self.cancelFunc:
                self.cancelFunc(self._calld)

This is just what you suggest - the cancel function is inside the deferred
class (my ControllableDeferred is a subclass of Deferred, so that's
literally true).  The client, receiving an instance of this class, can just
say "forget about it" and the cancel goes back to wherever it should go, if
anywhere.  So you can imagine writing a getPage function (or class) that
returns ControllableDeferred instances. Calling the deferred or
deactivating it would then result in an HTTPClientFactory instance calling
transport.loseConnection.
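
Here is a toy version of that getPage idea, without Twisted. FakeTransport,
CancellableResult and fakeGetPage are all hypothetical stand-ins; a real
version would involve HTTPClientFactory and an actual connection. As a
simplification, the cancel function here is passed the result object
itself.

```python
class FakeTransport:
    """Stand-in for a transport; it only records that it was closed."""

    def __init__(self):
        self.connected = True

    def loseConnection(self):
        self.connected = False


class CancellableResult:
    """Toy analogue of the methods shown above, callbacks only."""

    def __init__(self, cancelFunc=None):
        self.cancelFunc = cancelFunc
        self._called = False
        self.callbacks = []

    def addCallback(self, fn):
        self.callbacks.append(fn)
        return self

    def _finish(self):
        # Shared by both paths: mark as done and notify the canceller.
        self._called = True
        if self.cancelFunc:
            self.cancelFunc(self)

    def callback(self, result):
        if not self._called:
            self._finish()
            for fn in self.callbacks:
                result = fn(result)

    def deactivate(self):
        if not self._called:
            self._finish()


def fakeGetPage(transport):
    # Hypothetical: a real getPage would start an HTTP request here.
    return CancellableResult(lambda d: transport.loseConnection())


transport = FakeTransport()
page = fakeGetPage(transport)
seen = []
page.addCallback(seen.append)
page.deactivate()        # "forget about it"
page.callback("late")    # ignored, the result is not going through
```

After deactivate, transport.connected is False and seen stays empty.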

>> Something like my class could then hand the deferred back, effectively
>> saying "my client is no longer interested in this deferred. You can
>> call/errback it, or not, it makes no difference to us". If you've done
>> that once, you can do it multiple times - by which I mean that I might
>> write code that's a client of getPage, and getPage is a client of XXX,
>> and XXX is a client of YYY, etc. Each could in turn pass the deferred it
>> got back to the thing that created it.

Glyph> This implies, to me, that the cancellation callback would be better
Glyph> passed to addCallbacks(): effectively creating a third callback
Glyph> chain going from invoker to responder rather than the other way
Glyph> 'round as callbacks and errbacks do.

Yes, I like that a lot, at least in a 5:30am superficial kinda way.

A key difference between what I'd imagined and what you're suggesting is
that in my approach, the cancel call goes directly to the thing (it would
need to be a class instance, I suspect) that got the deferred. I.e. from my
2nd diagram, if the client calls cancel (or deactivate, as in the code),
then the thing that added y1 to the chain is going to have its cancel
method called (or some method that it asked to have called). So the control
in a sense jumps back over y3, y2 and y1 to the root of the logical Y
section.

Your approach passes the signal back up the chain. Most secondary steps,
like y3 and y2, will pass the call along without taking any action. But
they don't have to, which is good. And the first callback of a logical unit
can always do exactly what would have been done in my approach above.

I think your approach is better.
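
As I understand the suggestion, a toy model would look something like the
sketch below. CancelChain is a hypothetical name, and its two-argument
addCallbacks is invented for illustration (Twisted's real addCallbacks
takes a callback and an errback). A step with no canceller simply lets the
cancel pass through.

```python
class CancelChain:
    """Toy model: an optional canceller stored next to each callback."""

    def __init__(self):
        self.entries = []   # (callback, canceller) in the order added

    def addCallbacks(self, callback, canceller=None):
        self.entries.append((callback, canceller))
        return self

    def cancel(self):
        # Travel from the invoker end back toward the responder.
        for _callback, canceller in reversed(self.entries):
            if canceller is not None:
                canceller()
            # No canceller: do nothing and pass the cancel along.


events = []
chain = CancelChain()
chain.addCallbacks(lambda r: r, lambda: events.append("x1 cleanup"))
chain.addCallbacks(lambda r: r, lambda: events.append("y1 cleanup"))
chain.addCallbacks(lambda r: r)   # y2: no canceller
chain.addCallbacks(lambda r: r)   # y3: no canceller

chain.cancel()
# events is now ["y1 cleanup", "x1 cleanup"]
```

Here y3 and y2 pass the call along untouched, and y1, as the first
callback of its logical unit, does the cleanup for all of Y.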

Glyph> I have stumbled in the direction of this thought a few times already
Glyph> but this is the first time I've had a really clear grasp of how it
Glyph> would work.  Now I can see that each layer of the stack may have its
Glyph> own resources that it might want to clean up... previously I thought
Glyph> this could be done entirely with errbacks, but in this version, it
Glyph> doesn't matter if the base deferred doesn't know how to kick off the
Glyph> errback chain: all the resources on the *rest* of the callback chain
Glyph> can be cleaned up.

Yes. And the logical divisions of the call/errback chain are going to
ignore each other in any case. Once a further-down-the-chain function has
either called or deactivated the deferred (to put it simply - it's actually
not just one deferred, at least in my implementation), it doesn't matter at
all what the upstream (earlier) functions do - the result, if any, is not
going through.

Glyph> I'm going to need to figure out some good values for XXX and YYY
Glyph> here in order to truly dispel the fog, though.

I'm a bit foggy too. That's why I started playing with getPage to try to
use a common example with at least a few levels of processing. But I didn't
have time to think about it clearly. I wrote some foggy code, which I won't
inflict on you. I'm pretty sure there's a clean solution in here though,
that we can get to with a bit more back & forth.

>> If there's no cancel method, then that's as far as can be gone with
>> canceling.

Glyph> This is one of the really tricky issues that has faced this feature
Glyph> all along: what happens when some part of the chain involved doesn't
Glyph> know what to do with a canceller?  And your solution here seems like
Glyph> it may be a very elegant hack: do exactly the same thing as other
Glyph> parts of the callback chain.  What I mean is: currently, if a
Glyph> particular callback pair doesn't have a callback or an errback, the
Glyph> behavior is to do nothing and pass the result through.  Cancellation
Glyph> could do exactly the same thing!

Yes, that's great. That's *your* solution, btw :-)

>> At that point the result is no longer passed because the first
>> ControllableDeferred instance that's involved will effectively snip the
>> link (or send an early result) in the sequence of steps that would
>> originally have been done.

Glyph> Severing the link seems like a problem though; if we do that, then
Glyph> introducing any non-cancellation-aware Deferred - or callback, for
Glyph> that matter - into a cancellation-aware pipeline will prevent
Glyph> cancellations from propagating further up, and there should be no
Glyph> reason to do that.

Yes, agreed.

>> And it keeps all code for doing cancellation out of the Deferred class.

Glyph> Why is it that you want to keep the cancellation code out of
Glyph> Deferred?  It seems very useful to me to have one object that you
Glyph> can say "stop" to, without necessarily knowing what's going on above
Glyph> it or where it came from.

Yes, I guess I didn't want to keep it out of there - especially since I
already put it in today.... I guess what I really meant was that I wanted
it to be clean / simple, because Deferreds are that way already (once
you've spent a couple of years thinking about them).

>> OK, sorry for so many words. I hope this seems like it's heading in a
>> useful direction. It does to me.

Glyph> Yes, this has been very useful.  I hope we can distill this into
Glyph> some useful conclusions soon. :)

I think we can / will.  It should be fairly easy to build an example based
on getPage. I badly wanted to today, but we have a ton of stuff going on
right now and I forced myself to put this aside for some hours.

Terry