Opened 4 years ago

Last modified 4 years ago

#6104 enhancement new

Make it possible to cancel getPage and downloadPage

Reported by: Lukas Lalinsky Owned by: Lukas Lalinsky
Priority: normal Milestone:
Component: web Keywords:
Cc: jknight Branch: branches/get-page-cancel-6104
branch-diff, diff-cov, branch-cov, buildbot
Author: glyph


Currently there is no way to cancel the deferreds returned by getPage or downloadPage. The attached patch adds a canceller to these deferreds, so calling d.cancel() will actually close the connection and fail with CancelledError. The implementation is similar to twisted.web.xmlrpc.Proxy.callRemote, with a different way of calling the errback.

I had to change HTTPClientFactory.clientConnectionFailed, because even though the docstring suggests it should be possible to send a result to the deferred before this method is called, it wouldn't fire _disconnectedDeferred in that case and the user-specified callback/errback would never be actually called.

Attachments (2)

get-page-cancel.diff (6.0 KB) - added by Lukas Lalinsky 4 years ago.
get-page-cancel-2.diff (553 bytes) - added by Lukas Lalinsky 4 years ago.

Download all attachments as: .zip

Change History (11)

comment:1 Changed 4 years ago by DefaultCC Plugin

Cc: jknight added

Changed 4 years ago by Lukas Lalinsky

Attachment: get-page-cancel.diff added

comment:2 Changed 4 years ago by Glyph

Author: glyph
Branch: branches/get-page-cancel-6104

(In [36747]) Branching to 'get-page-cancel-6104'

comment:3 Changed 4 years ago by Glyph

I checked it into a branch and forced some builds.

comment:4 Changed 4 years ago by Glyph

Keywords: review removed
Owner: set to Lukas Lalinsky

Hi luks, thanks for your contribution.

This branch looks pretty good; coding-standard compliant, starting with unit tests. However, it has one significant flaw in its test cases. They use the real reactor and wait for real time. To be fair, they took from a bad example sitting right next to them, but the 10 second delay is noticeably slow when running the tests. Tests should always run quickly; it's important to be able to get through the whole suite as fast as is reasonable, so development doesn't bog down in endless waiting.

Rather than just shorten the delay though, I suggest you follow the example set forth in the trial tutorial of how to test code that relies on time passing, with twisted.internet.task.Clock.

Please re-submit for review when you have tuned up the unit tests, and submit a patch against the branch I created rather than trunk.


comment:5 Changed 4 years ago by Lukas Lalinsky

Thank you for the review.

I'm not 100% sure now, but I think it did not wait the full 10 seconds when I tested it. I wouldn't have submitted the patch with that. It does slow the tests now, so I'll fix it and add the new patch.

Changed 4 years ago by Lukas Lalinsky

Attachment: get-page-cancel-2.diff added

comment:6 Changed 4 years ago by Lukas Lalinsky

Keywords: review added
Owner: changed from Lukas Lalinsky to Glyph

Ok, it turns out the patch wasn't correctly merged and the web-specific canceller wasn't called at all. However, the default canceller from Deferred was used, so it still raised the CancelledError exception. See the new patch with the missing change.

This means that the code it not in fact properly tested, but I'm afraid I do not know how to test it. The end result, with or without the specific canceller implementation, is the same -- the deferred raises CancelledError. The only thing that comes to my mind is a time-based test, e.g. if the client timeout is 10 seconds (can be lowered) and it took 10 or more seconds to raise the CancelledError exception, consider it a failure.

Do you have any ideas how to fix the tests?

Another problem is the test_getPageCancelLater which calls callLater. The getPage API doesn't have any kind of hook, where I could detect that I'm in the middle of a request, but it's not finished yet. I used 1ms as an estimate, but it's not an explicit test.

I think the Clock example will not help me, because when using getPage, I do not have access to the low level protocol and I'm not really testing timeouts. I'm testing if I can disconnect from the server in various stages of the connection and I want the disconnection to happen almost immediately after I call d.cancel().

comment:7 Changed 4 years ago by Glyph

Owner: Glyph deleted

I still might review this, but no reason somebody else shouldn't do it if I can't get to it soon.

comment:8 Changed 4 years ago by Itamar Turner-Trauring

Keywords: review removed
Owner: set to Lukas Lalinsky

The way to test this is to use a fake reactor (probably subclassing twisted.test.proto_helpers.InMemoryReactor and Clock). That way you can manually push the system through various stages and make sure cancellation happens correctly in each particular phase. For example, to test cancellation after connection was made, you would call buildProtocol on the factory in the reactor, and attach a proto_helpers.StringTransport to the protocol, then call .cancel(). Then you can write a deterministic, fast test for each possible place where cancellation may happen.

As you say, there's no way to get to the reactor right now. Therefore what you'll need to do is add an extra parameter to twisted.web.client.getPage, namely reactor, which defaults to current reactor and used by both getPage and the various classes (factory and protocol) it is integrated with.

comment:9 Changed 4 years ago by Itamar Turner-Trauring

Oh, and of course some of the test may work by creating a protocol instance directly, rather than going via getPage.

Note: See TracTickets for help on using tickets.