Opened 10 years ago

Last modified 8 years ago

#5192 enhancement new

100-continue support for twisted.web.client.Agent

Reported by: Tristan Seligmann Owned by: Stacey Sern
Priority: normal Milestone:
Component: web Keywords:
Cc: jknight Branch: branches/100continue-5192-2
branch-diff, diff-cov, branch-cov, buildbot
Author: dreid, glyph

Attachments (6)

100Continue.patch (12.1 KB) - added by Doru Arfire 10 years ago.
Adds 100-Continue to HTTP11ClientProtocol
5192_2.patch (15.8 KB) - added by Doru Arfire 10 years ago.
5192_3.patch (30.8 KB) - added by Doru Arfire 10 years ago.
5192_4.patch (35.9 KB) - added by Doru Arfire 9 years ago.
5192_5.patch (3.0 KB) - added by sdsern 8 years ago.
5192_6.patch (3.3 KB) - added by sdsern 8 years ago.

Download all attachments as: .zip

Change History (33)

comment:1 Changed 10 years ago by DefaultCC Plugin

Cc: jknight added

comment:2 Changed 10 years ago by Doru Arfire

How transparent needs this to be to the user? Should it see the '100-continue' response and maybe call a writeBodyTo method on the request or just set the Expect and let twisted handle it.

comment:3 Changed 10 years ago by Jean-Paul Calderone

It seems like there probably aren't many cases where you will care if you got the 100 status (as opposed to not getting anything). It should perhaps be possible to tell if you did or not, but the default behavior should probably be to just do the right thing when the status arrives (ie, begin writing the body using the object already passed to the request method).

comment:4 Changed 10 years ago by Doru Arfire

Is there a reason not to move this logic down to HTTP11ClientProtocol? It seems a more natural place for things that have to do with the HTTP protocol. The interface would be the same, only the 100-Continue response will be handled internally. A way the user can find out whether the body was actually written would be to use the body producer.

comment:5 Changed 10 years ago by Jean-Paul Calderone

Sure; I doubt this feature could be implemented without some changes to HTTP11ClientProtocol. The summary says Agent because Agent is the public-facing API.

Changed 10 years ago by Doru Arfire

Attachment: 100Continue.patch added

Adds 100-Continue to HTTP11ClientProtocol

comment:6 Changed 10 years ago by Doru Arfire

Keywords: review added

Adding 100-Continue support to HTTP11ClientProtocol.

comment:7 Changed 10 years ago by Itamar Turner-Trauring

What curl does to handle badly behaving servers that don't support "100 continue" (like twisted.web until #4673 is merged) is send headers, wait for response, if none is forthcoming after 1 second it then sends body on assumption server doesn't do 100 continue support. This seems like it'd be nice to have.

This is not necessary to close this ticket, so if this behavior is not included at merge time, please open a separate ticket.

Changed 10 years ago by Doru Arfire

Attachment: 5192_2.patch added

comment:8 Changed 10 years ago by Doru Arfire

5192_2.patch adds support for sending the request body after a timeout expires in case the server didn't send a response. This timeout is fixed and there's no mechanism for the moment for the caller to know whether the request body was sent because the server replied with 100-continue or because the timeout expired (apart from the 1 second delay).

comment:9 Changed 10 years ago by Jean-Paul Calderone

Keywords: review removed
Owner: set to Doru Arfire

Thanks. It's really great to see progress here. Some thoughts about the patch:

  1. Instead of re-using the WAITING state, it's probably better to introduce a new state. WAITING already means something, and the state machine is easier to follow if the states don't get overloaded.
  2. Similarly, instead of the callbacks and errbacks underneath the if _expects100Continue: check, it would be better to extend the state machine with more states and transitions. For example, the current brokenServerTimer approach looks like it will cause problems if the 100-Continue response arrives after the timeout fires.
  3. In the tests, BufferProducer can be replaced with FileBodyProducer(StringIO(...)).
  4. Both the implementation and the tests should avoid using the global reactor. Instead parameterize it and use a twisted.internet.task.Clock instance in the tests to keep them isolated from the real time.
  5. The tests don't exercise all of the new/changed code. brokenServerTimer and ebFirstResponse in particular need more test coverage. I highly recommend to report the state of test coverage.

Thanks a lot for your efforts here! I'm looking forward to the next version.

Changed 10 years ago by Doru Arfire

Attachment: 5192_3.patch added

comment:10 Changed 10 years ago by Doru Arfire

Ok, added another iteration of this patch. Some notes:

  • As exarkun suggested I've expanded the state machine by adding states for waiting for the 100-continue response and its body.
  • When we send the request body because we didn't get a response in time (maybe because the server doesn't know about expectations), we check the first response received. If it is an 100-Continue response this means that the server was just slow and we consume it and return the second response to the user.
  • As a result of the approach at 2. the TRANSMITTING and WAITING states are a little bit overloaded, in the sense that while in there we might have self._forcedRequestBody == True but it seemed overkill to double them for this situation.
  • Finally, a request call produces a few request specific variables in the HTTP11ClientProtocol and these should probably be migrated into something like a RequestContext, maybe when we do persistent HTTP connections.

comment:11 Changed 9 years ago by Itamar Turner-Trauring

Keywords: review added
Owner: Doru Arfire deleted

Looks like this is ready for review again.

comment:12 Changed 9 years ago by therve

Keywords: review removed
Owner: set to Doru Arfire

Sorry for the long wait.

As a preliminary comment, I'd say it's too bad the changes are so deep in HTTP11ClientProtocol. I guess it makes sense, but it feels weird.

  • I've noticed a couple of things untested (ie no tests fail if I touch it). Eg:
    +        _requestBodyDeferred.chainDeferred(self._requestDeferred)
    As well as _cleanupOn100ContinueError_TRANSMITTING, _cleanupOn100ContinueError_WAITING, _connectionLost_WAITING_100_CONTINUE_RESPONSE, _connectionLost_WAITING_100_CONTINUE_RESPONSE_BODY

Alright, for the cosmetics:

  • +        self.protocol = HTTP11ClientProtocol(reactor = self.clock)
    +        cooperator = Cooperator(scheduler = _immediateScheduler, started = False)
    +        producer = FileBodyProducer(StringIO(body), cooperator = cooperator)
    +    def __init__(self, reactor = None):
    +    def _setupParser(self, request, data = ''):
    Please remove the spaces around the "=" operator in these cases.
  • Please put 2 blank lines between your test methods.
  • +               TEorCL = "Transfer-Encoding: chunked\r\n"
    +           else:
    +               TEorCL = 'Content-Length: %d\r\n' % (self.bodyProducer.length,)
    You should use single quotes consistently. Also, TEorCL is not a great variable name. What about "header"?
  • +        Format this L{Request}'s headers as HTTP/1.1 and write them
    +        synchronously to the given transport
    I'm not sure what you mean by synchronously in this context, but it sounds a bit weird.
  • +            _expects100Continue = '100-continue' in [x.lower() for x in
    +                    _expectations]
    Indentation is a little bit weird. I also think the code would be more clear if you had a distinct for loop.
  • +            if self._state in ['TRANSMITTING',
    +                    'WAITING_100_CONTINUE_RESPONSE',
    +                    'WAITING_100_CONTINUE_RESPONSE_BODY']:
    Again strange indentation. What about:
    +            if self._state in ['TRANSMITTING',
    +                               'WAITING_100_CONTINUE_RESPONSE',
    +                               'WAITING_100_CONTINUE_RESPONSE_BODY']:
    Otherwise, when you break a line you should indent by one level only, like
    +        self._firstResponseDeferred.addCallbacks(
    +            self._handleFirstResponse, self._handle100ContinueError)
  • +        self._forcedRequestBody = False #yet
    You can remove the comment.
  • +        L{Request} has been parse. If the L{Response} wasn't an 100-Continue
    Typo: parsed.
  • +        expects 100-Continue. This are errors on parsing the first response
    Typo: these.

Thanks a lot for the patch, it's in a great shape. I hope we can review it faster next time...

comment:13 Changed 9 years ago by Jean-Paul Calderone

What about "header"?

That's bad in a different way. TEorCL stands for "Transfer-Encoding or Content-Length". How about "bodyFramingHeader"?

Changed 9 years ago by Doru Arfire

Attachment: 5192_4.patch added

comment:14 Changed 9 years ago by Doru Arfire

Sorry for the delay. Also, thank you for the updates, they were very useful.

I've updated the patch to work with quiescent callbacks.

Also, I've integrated your suggestions.

Regarding the parts that are not tested:

+        _requestBodyDeferred.chainDeferred(self._requestDeferred)

and cleanupOn100ContinueError_TRANSMITTING, _cleanupOn100ContinueError_WAITING seem to cause some tests to fail when I touch (comment parts of ) them. However, the failure consists in the test timeouting. Is that an accepted mode of failure?

The tests for _connectionLost_* were indeed missing. Also, I've added tests for the correct interaction with quiescent callbacks.

Thank you for the suggestions, looking forward to the next iteration.

comment:15 Changed 9 years ago by Doru Arfire

Keywords: review added

comment:16 Changed 9 years ago by Thijs Triemstra

Owner: Doru Arfire deleted

comment:17 Changed 9 years ago by Glyph

Owner: set to Glyph
Status: newassigned

comment:18 Changed 9 years ago by Glyph

Author: glyph
Branch: branches/100continue-5192

(In [35762]) Branching to '100continue-5192'

comment:19 Changed 9 years ago by Glyph

Keywords: review removed


Thank you so much for your continued patience and this excellent and substantial patch.

There are a couple of issues, which are minor enough that I think I can fix them for you and then commit it.

  1. As per the api-documentation builder, there are newly-invalid docstrings along with your changes.
  2. There should actually be two spaces after a period in a docstring, in HTTP11ClientProtocol where you "fixed" that it should be adjusted back :).
  3. This needs a news file.
  4. Timing out is not a particularly great failure mode. Tests should always fail (not error, not time out, not hang) when you write them. Luckily this is easy to fix, because you're not using the global reactor any more! When you have a Clock(), that means that you can assert that the Deferred was called by advancing the clock rather than waiting for the Deferred to be called by the reactor. (If this requires more extensive fixing than it appears at first glance, then I'll have to send this back to you.)
  5. All new and modified functions and classes need to have docstrings; even private ones, for future maintainers. So _writeHeaders should have a docstring. Thanks for covering everything else though!


comment:20 Changed 8 years ago by sdsern

Owner: changed from Glyph to sdsern
Status: assignednew

comment:21 Changed 8 years ago by sdsern

Status: newassigned

Changed 8 years ago by sdsern

Attachment: 5192_5.patch added

Changed 8 years ago by sdsern

Attachment: 5192_6.patch added

comment:22 Changed 8 years ago by sdsern

Keywords: review added
Owner: sdsern deleted
Status: assignednew

Attachment 5192_6.patch contains the changes mentioned in comment 19. Ignore attachment 5192_5.patch which is missing a new file.

comment:23 Changed 8 years ago by David Reid

(In [38667]) Apply 5192_6.patch from sdsern. Refs #5192.

comment:24 Changed 8 years ago by David Reid

Author: glyphdreid, glyph
Branch: branches/100continue-5192branches/100continue-5192-2

(In [38668]) Branching to 100continue-5192-2.

comment:25 Changed 8 years ago by David Reid

Keywords: review removed
Owner: set to sdsern


Thanks for your contribution.

I applied your patch in anticipation of reviewing this branch but I was unable to merge forward and review it because of non-trivial merge conflicts the work done in #4330.

I did however notice a few issues:

  1. In HTTP11ClientProtocol the docstring:
    +    @ivar _firstResponseTimer: A L{Deferred} that fires after
    +        TIMEOUT_100_CONTINUE seconds and forcefully sends the L{Request} body
    +        to the server.

_firstResponseTimer is not a Deferred, it is a DelayedCall.

  1. The original patch introduces trailing whitespace in
  2. In _finishResponse_WAITING:
    +        if self._state in ['WAITING',
    +                'WAITING_100_CONTINUE_RESPONSE']:
    should be:
+        if self._state in ['WAITING',
+                           'WAITING_100_CONTINUE_RESPONSE']:
  1. All these new tests should be able to use self.successResultOf and self.failureResultOf instead of returning a Deferred from the test method. Tests using these methods are generally shorter and easier to read while also allowing you to:
    1. Easily assert things about the expected result of a deferred.
    2. Ensure that the tests are entirely synchronous and remove unwanted non-determinism.

Again, thank you for your contribution, I look forward to a more thorough review of this patch once it applies cleanly to trunk.

comment:26 Changed 8 years ago by Stacey Sern

Owner: changed from sdsern to Stacey Sern
Status: newassigned

comment:27 Changed 8 years ago by Stacey Sern

Status: assignednew
Note: See TracTickets for help on using tickets.