Opened 4 years ago

Closed 2 years ago

#4710 defect closed fixed (fixed)

TCP4ClientEndpoint.connect's deferred cannot be cancelled successfully

Reported by: marienz Owned by: glyph
Priority: high Milestone:
Component: core Keywords:
Cc: glyph Branch: branches/endpoint-cancel-4710
(diff, github, buildbot, log)
Author: glyph Launchpad Bug:

Description

If you try to cancel the deferred returned from TCP4ClientEndpoint before it has connected the following happens:

  • the _canceller function in t.i.endpoints calls connector.stopConnecting()
  • this ends up calling connector.connectionFailed()
  • connectionFailed calls factory.clientConnectionFailed
  • factory is the endpoints._WrappingFactory, whose clientConnectionFailed errbacks its deferred (the same deferred we are cancelling)
  • the canceller then tries to errback its deferred with a ConnectingCancelledError, which triggers AlreadyCalledError

test_endpoints misses this because the MemoryReactor's _FakeConnector's stopConnecting does not do anything (so it does not hit the factory's clientConnectionFailed).

Attached is a pretty terrible testcase demonstrating this, which fails for me with an AlreadyCalledError on twisted 10.1.0 and trunk.

Attachments (1)

endpoint_test.py (906 bytes) - added by marienz 4 years ago.
failing testcase

Download all attachments as: .zip

Change History (15)

Changed 4 years ago by marienz

failing testcase

comment:1 Changed 4 years ago by glyph

  • Priority changed from normal to high

comment:2 Changed 4 years ago by <automation>

  • Owner glyph deleted

comment:3 Changed 2 years ago by exarkun

#5485 was a duplicate of this.

comment:4 Changed 2 years ago by glyph

  • Cc glyph added
  • Owner set to glyph
  • Status changed from new to assigned

comment:5 Changed 2 years ago by glyph

  • Author set to glyph
  • Branch set to branches/endpoint-cancel-4710

(In [34082]) Branching to 'endpoint-cancel-4710'

comment:6 Changed 2 years ago by glyph

  • Keywords review added
  • Owner glyph deleted
  • Status changed from assigned to new

First attempt at a fix is in the branch. Build results also available, eventually.

comment:7 Changed 2 years ago by wirepair

  • Owner set to wirepair

comment:8 follow-up: Changed 2 years ago by wirepair

  • Owner changed from wirepair to glyph

Just to confirm, on line 141: http://twistedmatrix.com/trac/changeset/34083#file0 you have the canceler returning error.ConnectingCancelledError but the documentation for ConnectingCancelledError states that this exception is: "An C{Exception} that will be raised when an L{IStreamClientEndpoint} is cancelled before it connects." Although in this case can't it be cancelled *after* it connects (say while waiting for the remote end to do something)? Should the documentation state that the returned exception could also be returned in the above case? Or does it not matter?

comment:9 in reply to: ↑ 8 Changed 2 years ago by glyph

  • Owner glyph deleted

Hi wirepair,

Thanks for looking at this. To answer your question:

Replying to wirepair:

Just to confirm, on line 141: http://twistedmatrix.com/trac/changeset/34083#file0 you have the canceler returning error.ConnectingCancelledError

(By the way, your terminology's a bit weird here. The canceler is failing the existing Deferred, or perhaps calling it back; it's not "returning" ConnectingCancelledError.)

but the documentation for ConnectingCancelledError states that this exception is: "An C{Exception} that will be raised when an L{IStreamClientEndpoint} is cancelled before it connects." Although in this case can't it be cancelled *after* it connects (say while waiting for the remote end to do something)?

In that case, you can call cancel() on the Deferred, but it won't do anything. If you cancel a Deferred A that is waiting for another Deferred B (in other words: a callback on A returned B), then B will be cancelled. But in that case, where A is the Deferred returned from connect, you won't necessarily get a ConnectingCancelledError; you'll get whatever exception results from a cancelled B at that point in its callback chain. Normally, the code that called connect is going to be dealing with a Protocol anyway; something invoking connect directly won't be waiting for the remote end to do things unless it's added additional callbacks to the Deferred returned by connect before its own.

Should the documentation state that the returned exception

(Again, exceptions are raised or Deferreds are failed with exceptions; they're not "returned" unless you're actually writing a function that does return MyException().)

could also be returned in the above case? Or does it not matter?

I don't think it matters. Perhaps you could construct an experiment to satisfy your curiosity? I'm not 100% sure what you're asking.

On the "review etiquette" side of things, though, you made a couple of mistakes here. If this comment was intended to be a review, you should generally number your points (so it's easy to reference them when addressing them), say whether you think it's ready to land on trunk once those bits of feedback are dealt with. Also, there are some obvious violations of the coding standard in this branch which you should be looking for according to the Twisted development and review process pages, so f it is a review you missed some stuff :). Also, if it's a review you should remove the 'review' keyword and re-assign it to me at the same time. If it's not intended to be a review, and is instead just a comment or a question, you should not change the assignment state or the keywords. By re-assigning it to me, you are hinting to other reviewers (members of TMLabs) that this ticket will require some special expertise to review and I'm the only one who should do it, so they should look at other tickets first.

I'm sorry that these points aren't documented more clearly, and thanks for taking a look regardless; it is by getting new folks to attempt to do reviews that we educate new people and also how we notice the flaws and gaps in our existing process documentation. I wanted to point you at a "new reviewer's guide" but we don't seem to have any such thing :-).

comment:10 Changed 2 years ago by exarkun

  • Owner set to exarkun
  • Status changed from new to assigned

comment:11 follow-up: Changed 2 years ago by exarkun

  • Keywords review removed
  • Owner changed from exarkun to glyph
  • Status changed from assigned to new

Thanks! This is a very sad misbehavior.

  1. The change to proto_helpers.py adds more unverified fakeness. Please file a ticket for verifying fake connectors against actual connectors somehow.
  2. The behavior added to the endpoint tests, above the comment This should probably be a feature of MemoryReactor., should probably also be part of the fake connectors in proto_helpers.py somehow. Please file a ticket for putting it there (with the appropriate verification).
  3. Add a bugfix fragment.
  4. _WrappingFactory is missing documentation for its two new methods and its one new attribute.
  5. Looks like ClientFactory.doStart is called before ClientFactory.startedConnecting. Application code might cancel the connection Deferred in doStart, but there is no _connector attribute until ClientFactory.startedConnecting. Not sure how much I really care; these callback methods are all a little bit stupid. Maybe it's not too hard to account for, though.

Otherwise looks good. Please merge after addressing the above and you get another good buildbot run.

comment:12 in reply to: ↑ 11 Changed 2 years ago by glyph

Replying to exarkun:

Thanks! This is a very sad misbehavior.

  1. The change to proto_helpers.py adds more unverified fakeness. Please file a ticket for verifying fake connectors against actual connectors somehow.

Done, #5629.

  1. The behavior added to the endpoint tests, above the comment This should probably be a feature of MemoryReactor., should probably also be part of the fake connectors in proto_helpers.py somehow. Please file a ticket for putting it there (with the appropriate verification).

Done, #5630.

  1. Add a bugfix fragment.

Done.

  1. _WrappingFactory is missing documentation for its two new methods and its one new attribute.

Done.

  1. Looks like ClientFactory.doStart is called before ClientFactory.startedConnecting. Application code might cancel the connection Deferred in doStart, but there is no _connector attribute until ClientFactory.startedConnecting. Not sure how much I really care; these callback methods are all a little bit stupid. Maybe it's not too hard to account for, though.

I was going to go ahead and fix this, but then I made this awesome discovery: it's not possible for an application to get a reference to the Deferred returned by connect() in doStart, because the Deferred is created by the function that is calling (indirectly) synchronously calling startedConnecting. This point is somewhat subtle, though, so I added a @note in _canceller.

Otherwise looks good. Please merge after addressing the above and you get another good buildbot run.

Working on that now.

comment:13 Changed 2 years ago by glyph

A new build is running.

comment:14 Changed 2 years ago by glyph

  • Resolution set to fixed
  • Status changed from new to closed

(In [34165]) Merge endpoint-cancel-4710.

Make endpoint.cancel() work.

Author: glyph

Reviewer: exarkun

Fixes: #4710

Calling .cancel() on any Twisted-provided client endpoint
(TCP4ClientEndpoint, UNIXClientEndpoint, SSL4ClientEndpoint) now works as
documented, rather than logging an AlreadyCalledError.

Note: See TracTickets for help on using tickets.