Opened 9 years ago

Closed 9 years ago

#3614 defect closed duplicate (duplicate)

HTTPPageGetter bug on getPage with followRedirect: relative vs full url paths

Reported by: ostinelli
Owned by:
Priority: normal
Milestone:
Component: web
Keywords: HTTPPageGetter getPage followRedirect
Cc:
Branch:
Author:

Description

When following redirects on a getPage method, HTTPPageGetter will redirect to the URL specified in the LOCATION header of a 301 response.

However, a Location header may carry a relative reference rather than the complete destination URL, and in that case the redirect may end in a 404 error. For instance, a 301 response with a Location header such as

Location: /redirected.php

will result in a 404, since HTTPPageGetter 'forgets' the full URL of the redirecting page and tries to get /redirected.php instead of http://www.example.com/redirected.php.
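
To make the expected behaviour concrete, here is a minimal sketch of how a relative Location value resolves against the URL that was originally requested. It uses the standard library's urljoin (urllib.parse on Python 3; the same function lives in the urlparse module on Python 2). The helper name resolve_location and the example URLs are assumptions for illustration and are not part of twisted.web.client.

```python
from urllib.parse import urljoin


def resolve_location(request_url, location):
    """Resolve a Location header value against the URL that was requested.

    Hypothetical helper for illustration only; not a Twisted API.
    """
    return urljoin(request_url, location)


# Hypothetical URL of the page that answered with the 301.
request_url = "http://www.example.com/old/page.php"

for location in ("/redirected.php",                # absolute path
                 "redirected.php",                 # path relative to /old/
                 "http://other.example.org/new"):  # already a full URL
    print(location, "->", resolve_location(request_url, location))
```

With these inputs the absolute path replaces the whole path of the request URL, the bare file name is resolved inside /old/, and the full URL comes back unchanged.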

The proposed solution is a modification in twisted.web.client:

    def handleStatus_301(self):
        l = self.headers.get('location')
        if not l:
            self.handleStatusDefault()
            return
        url = l[0]
        if self.followRedirect:
            scheme, host, port, path = \
                _parse(url, defaultPort=self.transport.getPeer().port)

            # ---- suggested correction: begin ----
            # Prefix a relative Location with the redirecting page's base URL.
            if url[:4] != 'http':
                url = "%s/%s" % (self.factory.url[:self.factory.url.rfind('/')], url)
            # ---- suggested correction: end ----

            self.factory.setURL(url)
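
One way to sanity-check the suggested correction is to lift its string logic into a standalone function and run it on a few Location values. The sketch below does that; the function name proposed_redirect_url and the sample URLs are hypothetical, and the function only mirrors the two corrected lines, not the rest of HTTPPageGetter.

```python
def proposed_redirect_url(factory_url, location):
    """Standalone copy of the string logic from the suggested correction.

    Hypothetical helper for illustration only; not a Twisted API.
    """
    if location[:4] != 'http':
        # Keep everything before the last '/' of the redirecting URL,
        # then append the Location value after a single '/'.
        return "%s/%s" % (factory_url[:factory_url.rfind('/')], location)
    return location


page = "http://www.example.com/index.php"  # hypothetical redirecting page

print(proposed_redirect_url(page, "redirected.php"))
print(proposed_redirect_url(page, "/redirected.php"))
print(proposed_redirect_url(page, "httpdocs/new.php"))
```

Under these assumptions the first call yields http://www.example.com/redirected.php. The second, which is the case from the description, yields http://www.example.com//redirected.php with a doubled slash, and the third leaves the relative path httpdocs/new.php untouched because it happens to start with 'http'. A general-purpose resolver such as the standard library's urljoin may be a safer basis for a fix than prefix checks.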

Change History (2)

comment:1 Changed 9 years ago by Jean-Paul Calderone

Resolution: duplicate
Status: new → closed

Duplicate of #3384

comment:2 Changed 7 years ago by <automation>

Owner: jknight deleted