[Twisted-web] getting response headers from client.getPage
Tommi Virtanen
tv at twistedmatrix.com
Thu Nov 11 07:35:19 MST 2004
Lars Woetmann Pedersen wrote:
> How do I get the response headers in printResult?
> The code looks like this now. The content of the page
> is in data, but I also need the headers to check the ETag and
> Last-Modified.
>
> d = client.getPage(myurl)
> d.addCallback(printResult)
>
>
> def printResult(data):
>     print 'printing result:'
>     print data
I do not want to be rude, but honestly, I think this requires some
rudeness: ASKING QUESTIONS WITHOUT SPENDING FIVE MINUTES
ON THE PROBLEM YOURSELF DOES NOT LEAVE A GOOD IMPRESSION
OF YOU IN THE MINDS OF OTHERS.
Looking for the word "header" in twisted.web.client:
class HTTPClientFactory(protocol.ClientFactory):
    """Download a given URL.

    @type deferred: Deferred
    @ivar deferred: A Deferred that will fire when the content has
          been retrieved. Once this is fired, the ivars `status', `version',
          and `message' will be set.

    @type status: str
    @ivar status: The status of the response.

    @type version: str
    @ivar version: The version of the response.

    @type message: str
    @ivar message: The text message returned with the status.

    @type response_headers: dict
    @ivar response_headers: The headers that were specified in the
          response from the server.
    """
Okay, so the headers are in the factory after the deferred fires.
Let's see how we can get our hands on them:
def getPage(url, contextFactory=None, *args, **kwargs):
    """Download a web page as a string.

    Download a page. Return a deferred, which will callback with a
    page (as a string) or errback with a description of the error.

    See HTTPClientFactory to see what extra args can be passed.
    """
    scheme, host, port, path = _parse(url)
    factory = HTTPClientFactory(url, *args, **kwargs)
    if scheme == 'https':
        from twisted.internet import ssl
        if contextFactory is None:
            contextFactory = ssl.ClientContextFactory()
        reactor.connectSSL(host, port, factory, contextFactory)
    else:
        reactor.connectTCP(host, port, factory)
    return factory.deferred
Okay, so you can't really access the factory via getPage.
Just write a custom getPage that returns factory instead of
factory.deferred.
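from twisted.internet import reactor
from twisted.web.client import HTTPClientFactory, _parse

def myGetPage(url, contextFactory=None, *args, **kwargs):
    # Plain HTTP only; copy the https branch from getPage if you need SSL.
    scheme, host, port, path = _parse(url)
    factory = HTTPClientFactory(url, *args, **kwargs)
    reactor.connectTCP(host, port, factory)
    # Return the factory: read factory.response_headers after factory.deferred fires.
    return factory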
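
Once factory.deferred has fired, the headers you asked about can be read
from factory.response_headers. Here is a minimal sketch of pulling out the
ETag and Last-Modified values. The helper names header_value and
cache_validators are made up for illustration, and the code assumes the
dict is keyed by lowercased header names with either a string or a list of
strings as the value; check what your Twisted version actually stores.

```python
def header_value(response_headers, name):
    """Return the first value stored for header `name`, or None if absent.

    Assumption: keys are lowercased header names, and each value is
    either a plain string or a list of strings.
    """
    value = response_headers.get(name.lower())
    if isinstance(value, (list, tuple)):
        # Several values may have been collected; take the first one.
        return value[0] if value else None
    return value


def cache_validators(response_headers):
    """Return (etag, last_modified) from a response header dict."""
    return (header_value(response_headers, 'etag'),
            header_value(response_headers, 'last-modified'))

# Hypothetical wiring with the factory-returning myGetPage:
#   factory = myGetPage(myurl)
#   factory.deferred.addCallback(
#       lambda data: cache_validators(factory.response_headers))
```

The callback closes over the factory, so it only looks at
response_headers after the page has been retrieved.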