Opened 3 years ago

Closed 17 months ago

Last modified 17 months ago

#5434 enhancement closed fixed (fixed)

Web browser-like redirection Agent for twisted.web.client

Reported by: twm Owned by: jonathanj
Priority: normal Milestone:
Component: web Keywords: agent
Cc: jknight, tom.most@…, jonathanj Branch: branches/browser-like-redirect-agent-5434-2
(diff, github, buildbot, log)
Author: jonathanj Launchpad Bug:

Description (last modified by exarkun)

The existing twisted.web.client.RedirectAgent implements redirection in a strictly by-the-RFC manner, which breaks on web sites which assume the historic behavior of web browsers. An additional browser-like implementation would be useful for interacting with legacy systems. It should implement the following quirks:

  • Treat 301 Moved Permanently and 302 Found redirects on non-{HEAD,GET} requests like 303 See Other: change the method to GET and proceed, instead of failing.
  • Follow redirects which give relative URLs in the Location header, resolving them to absolute.

Change History (21)

comment:1 Changed 3 years ago by DefaultCC Plugin

  • Cc jknight added

comment:2 Changed 3 years ago by exarkun

  • Description modified (diff)

comment:3 Changed 3 years ago by twm

  • Cc tom.most@… added

comment:4 Changed 3 years ago by exarkun

#5462 was a duplicate.

comment:5 Changed 19 months ago by jonathanj

  • Author set to jonathanj
  • Branch set to branches/browser-like-redirect-agent-5434

(In [37854]) Branching to 'browser-like-redirect-agent-5434'

comment:6 Changed 19 months ago by jonathanj

(In [37855]) Add tests for 301 and 302 redirects failing on POST requests too, refs #5434.

comment:7 Changed 19 months ago by jonathanj

(In [37856]) Refactor RedirectAgent redirect reponse code handling, refs #5434.

comment:8 Changed 19 months ago by jonathanj

(In [37857]) Implement BrowserLikeRedirectAgent, a redirect agent that treats HTTP 301 and 302 like HTTP 303 for non-HEAD/GET requests, changing the method to GET before proceeding. refs #5434

comment:9 Changed 19 months ago by jonathanj

(In [37885]) Resolve relative URIs in RedirectAgent and BrowserLikeRedirectAgent, refs #5434, #5462.

comment:10 Changed 19 months ago by jonathanj

  • Keywords agent review added

comment:12 Changed 19 months ago by jonathanj

  • Cc jonathanj added

comment:13 Changed 19 months ago by rwall

  • Owner set to rwall
  • Status changed from new to assigned

Reviewing...

comment:14 follow-up: Changed 19 months ago by rwall

  • Keywords review removed
  • Owner changed from rwall to jonathanj
  • Status changed from assigned to new

Code Review

Thanks Jonathan. This branch looks pretty good to me. Here are a few
notes and comments.

Notes:

Please fix the following:

  1. Build Failures
    1. Python3 test failures need fixing (I think t.p.compat.nativeString may help): https://buildbot.twistedmatrix.com/builders/python-3.3-tests/builds/683/steps/shell/logs/stdio
    2. Twisted Checker. Some of these warnings need fixing: https://buildbot.twistedmatrix.com/builders/twistedchecker/builds/509/steps/run-twistedchecker/logs/new%20twistedchecker%20errors
  2. source:branches/browser-like-redirect-agent-5434/twisted/web/test/test_agent.py
    1. Minor nit...Why change C{GET} to I{GET}? I'd have thought C{GET} was more appropriate. Choose one and use it consistently throughout the tests and the client module.
    2. Typo "changes the methods" looks like it should be "changes the method"
    3. Create a new ticket or add an existing trac ticket number alongside the comment "XXX: If we had a way to get the final absolute URI"
    4. Minor nit...You seem to be using a mixture of L{RedirectAgent} and L{client.RedirectAgent} in the test docstrings. You should use L{client.RedirectAgent} because that's how it's imported in this module, but it doesn't really matter because API docs aren't generated for the test modules. It would be nice to be consistent though.
    5. Typo "origialResolveLocation"
    6. Add a link to "HTTP 1.1 bis draft" in the test docstring.
    7. BrowserLikeRedirectAgentTests.test_redirectToGet301 and test_redirectToGet302 docstrings contain L{RedirectAgent}. Should be L{BrowserLikeRedirectAgent}.
  3. source:branches/browser-like-redirect-agent-5434/twisted/web/client.py
    1. Add @param, @type and @return to _urljoin. It's a private function, so I don't think it's strictly required by the coding standard, but it will be helpful to anyone reading the code.
    2. Should '#' be b'#' in _urljoin, to catch early any attempt to concat unicode string with byte string in Python3?

Please resubmit for review after answering or addressing the numbered
comments above.

-RichardW.

comment:15 Changed 19 months ago by rwall

I also suggest updating the Agent Howto documentation, which currently describes the strict redirect behaviour.

comment:16 in reply to: ↑ 14 Changed 19 months ago by jonathanj

  • Keywords review added

Replying to rwall:
Thank you for the review, Richard!

  1. Minor nit...Why change C{GET} to I{GET}? I'd have thought C{GET} was more appropriate. Choose one and use it consistently throughout the tests and the client module.

C{} is usually used for content that could be valid Python code while I{} simply produces emphasis. Since the "GET" in this documentation is about the HTTP concept and not a Python string or constant it seems like I{} is more correct than C{}. I've changed the documentation, local to my changes, to use I{}.

  1. Create a new ticket or add an existing trac ticket number alongside the comment "XXX: If we had a way to get the final absolute URI"

There is already a ticket for this and it's actually in review (which causes a weird chicken and egg problem) but I've updated the comment to mention it.

I've addressed all your other review points.

comment:17 Changed 19 months ago by jonathanj

  • Owner jonathanj deleted

comment:18 Changed 17 months ago by tom.prince

  • Keywords review removed
  • Owner set to jonathanj

The XXX in _testRedirectURI should have a ticket to resolve it, or a note to in #5435 to change this.

Other than that, this looks good.

(The python3.3 builder is currently down, but running the tests locally, they pass)

I like the cleaned up tests.

Please merge, and add a note to #5435.

comment:19 Changed 17 months ago by jonathanj

  • Branch changed from branches/browser-like-redirect-agent-5434 to branches/browser-like-redirect-agent-5434-2

(In [38669]) Branching to 'browser-like-redirect-agent-5434-2'

comment:20 Changed 17 months ago by jonathanj

  • Resolution set to fixed
  • Status changed from new to closed

(In [38672]) Merge browser-like-redirect-agent-5434-2.

Author: jonathanj
Reviewer: tom.prince, rwall
Fixes: #5434, #5462

Introduce BrowserLikeRedirectAgent, a RedirectAgent subclass, that handles redirects more like a web browser, specifically it treats HTTP 301 and 302 like HTTP 303 on non-HEAD/GET requests and changes the method to GET before proceeding.

RedirectAgent now also resolves relative URI references in the Location header against the most recent request URI.

comment:21 Changed 17 months ago by exarkun

Note that assertEquals is on its way to being deprecated but this branch introduces several new uses of it.

Note: See TracTickets for help on using tickets.