[Twisted-web] Gzipped Response with web.client.Agent

exarkun at twistedmatrix.com
Tue Aug 10 10:13:23 EDT 2010


On 9 Aug, 11:34 am, sergei.vokdin at yandex.ru wrote:
>Hello,
>
>I've put together examples from the web (see below) to replace 
>getPage with web.client.Agent, and it works fine. Now I'd like to get a 
>gzipped response, but I can't get the gunzipping to work before 
>returning the result.
>
>Thanks for help!

What are you having trouble with?  Your code looks okay, more or less. 
If I were writing it, I'd try to make the gzip support more transparent, 
but the way you've done it seems like it should probably work.
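
By "more transparent" I mean something along these lines: a wrapper that gunzips the body as it arrives and hands the plain bytes to whatever receiver you already have, so the receiver never needs to know about gzip. This is only a rough sketch, written against the Python 3 standard library and not against any Twisted API. The names GzipDecodingReceiver and Collector are made up for illustration, and the wrapped object is only assumed to have dataReceived and connectionLost methods.

```python
import gzip
import zlib


class GzipDecodingReceiver(object):
    """Hypothetical wrapper: gunzips incoming data and forwards it.

    `wrapped` is any object with dataReceived(data) and
    connectionLost(reason) methods (an assumption of this sketch).
    """

    def __init__(self, wrapped):
        self.wrapped = wrapped
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        self._decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)

    def dataReceived(self, data):
        # Decompress incrementally and pass on whatever output is ready.
        plain = self._decompressor.decompress(data)
        if plain:
            self.wrapped.dataReceived(plain)

    def connectionLost(self, reason):
        # Flush any buffered output before reporting the end of the body.
        tail = self._decompressor.flush()
        if tail:
            self.wrapped.dataReceived(tail)
        self.wrapped.connectionLost(reason)


class Collector(object):
    """Stand-in for the application's own receiver (illustration only)."""

    def __init__(self):
        self.chunks = []
        self.reason = None

    def dataReceived(self, data):
        self.chunks.append(data)

    def connectionLost(self, reason):
        self.reason = reason


def demo():
    body = gzip.compress(b"hello " * 1000)
    collector = Collector()
    receiver = GzipDecodingReceiver(collector)
    # Feed the compressed body in small pieces, as a transport would.
    for i in range(0, len(body), 64):
        receiver.dataReceived(body[i:i + 64])
    receiver.connectionLost("done")
    return b"".join(collector.chunks), collector.reason
```

With something like this, the receiver passed to deliverBody would be the wrapper, and the rest of the code would stay as it is.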

A few simple things I notice that could cause problems, but won't 
necessarily...

  * Using repeated string concatenation to buffer the response is going 
to be extremely slow for responses of any significant size.

  * In general, there's no guarantee you'll be able to decode the 
un-gzipped bytes using utf-8.  You could easily have downloaded a gzipped 
TIFF image.

  * Similarly, there's no guarantee that the un-gzipping will succeed if 
you got a truncated response (represented by the PotentialDataLoss 
failure).

  * The server might have sent back un-gzipped contents.  You have to 
check the Content-Encoding response header to see if it's appropriate to 
do the decompression.
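
To make those four points a bit more concrete, here is a rough sketch of the decoding step on its own, written against the Python 3 standard library. The function name decode_body and its (data, complete) return value are my own invention, not anything Twisted provides. In the Agent code I would expect the encoding value to come from something like response.headers.getRawHeaders('content-encoding'), but treat that wiring as an assumption.

```python
import gzip
import zlib


def decode_body(chunks, content_encoding=None):
    """Join buffered chunks and gunzip only if the server says to.

    Returns (data, complete).  `complete` is False when the gzip stream
    ended early or could not be read; `data` is then whatever was
    recovered (possibly nothing).
    """
    # One join over a list of chunks instead of repeated concatenation.
    raw = b"".join(chunks)
    if (content_encoding or "").strip().lower() != "gzip":
        # The server did not gzip the body, so leave it alone.
        return raw, True
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    try:
        data = decompressor.decompress(raw)
    except zlib.error:
        # Not gzip data at all, or corrupt.
        return b"", False
    # eof is only True once the whole stream, trailer included, was seen.
    return data, decompressor.eof


def demo():
    payload = bytes(range(256)) * 64  # binary data, not valid UTF-8
    compressed = gzip.compress(payload)
    chunks = [compressed[i:i + 100] for i in range(0, len(compressed), 100)]
    full = decode_body(chunks, "gzip")
    plain = decode_body([payload], None)
    truncated = decode_body(chunks[:-1], "gzip")
    return payload, full, plain, truncated
```

Note that it returns bytes and leaves any text decoding to the caller, since only the caller knows whether the body is text at all.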

Jean-Paul
>class StringGzipReceiver(Protocol):
>    def __init__(self):
>        self.string = None
>        self.deferred = defer.Deferred()
>
>    def dataReceived(self, bytes):
>        print "dataReceived"
>        print type(bytes)
>        if self.string:
>            self.string += bytes
>        else:
>            self.string = bytes
>
>    def connectionLost(self, reason):
>        if reason.check(ResponseDone) or reason.check(PotentialDataLoss):
>            gzipper = gzip.GzipFile(fileobj=StringIO(self.string))
>            gz = gzipper.read()
>            result = unicode(gz, 'UTF-8')
>            self.deferred.callback(result)
>        else:
>            self.deferred.errback(reason)
>
>
>class StringReceiver(Protocol):
>    def __init__(self):
>        self.string_io = codecs.getwriter('utf_8')(StringIO())
>        self.deferred = defer.Deferred()
>
>    def dataReceived(self, bytes):
>        self.string_io.write(bytes)
>
>    def connectionLost(self, reason):
>        if reason.check(ResponseDone) or reason.check(PotentialDataLoss):
>            self.deferred.callback(self.string_io.getvalue())
>        else:
>            self.deferred.errback(reason)
>
>
>class StringProducer(object):
>    implements(IBodyProducer)
>
>    def __init__(self, body):
>        self.body = body
>        self.length = len(body)
>
>    def startProducing(self, consumer):
>        consumer.write(self.body)
>        return succeed(None)
>
>    def pauseProducing(self):
>        pass
>
>    def stopProducing(self):
>        pass
>
>
>def SearchHotelsByID():
>    host = 'demo.com'
>    postdata = 'some data'
>    headers = {
>        'Host'              : [host],
>        'Accept-Encoding'   : ['gzip']
>        }
>
>    def cbRequest(response):
>        stringReceiver = StringGzipReceiver()
>        response.deliverBody(stringReceiver)
>        return stringReceiver.deferred
>
>    def _noPage(failure):
>        print "Error: %s" % failure.getErrorMessage()
>        print failure.getTraceback()
>        return failure
>
>    agent = Agent(reactor)
>    d = agent.request(
>        'POST',
>        url,
>        headers=Headers(headers),
>        bodyProducer=StringProducer(postdata)
>        )
>    d.addCallback(cbRequest)
>    d.addErrback(_noPage)
>    d.addBoth(finish)
>
>    return d
>