[Twisted-Python] FW: Large (GB) File Upload

Michael Schlenker msc at contact.de
Tue Jul 31 05:42:22 EDT 2012


On 28.07.2012 20:55, exarkun at twistedmatrix.com wrote:
> On 04:09 pm, bob.novas at shinkuro.com wrote:
>> Hi - I'd like some guidance, please, on writing a TCPServer that can
>> efficiently receive and process (to a database) large files received
>> from a browser client form (multipart/form-data).  I'd like to be able
>> to process the command and headers without waiting for
>> allContentReceived, if possible.  In other words, I'd like to actually
>> handle the incoming stream rather than buffer it to a file or a string
>> and then handle the Request when all content is received.  Note that
>> the server is used only by a local browser by a single user.  It is
>> not used by a large user populace.
>>
>> Is this possible? Are there any examples? Any guidance would be
>> appreciated.  This was the best I could find -
>> http://twistedmatrix.com/pipermail/twisted-python/2007-July/015738.html,
>> but it's pretty dated.
> 
> There are a few answers.
> 
> One is <http://twistedmatrix.com/trac/ticket/288>, an enhancement 
> request for a nice, documented API for handling request bodies as they 
> arrive.
> 
> Another is to override Request.handleContentChunk, which is called each 
> time request body bytes are received (and decoded).
> 
> A third is to override Request.gotLength and initialize the `content` 
> attribute differently somehow.  The default implementation of 
> `handleContentChunk` just calls `self.content.write` with the content 
> chunk.
This works fine; it just needs some care when errors happen.
We have used this in production for some years now, replacing
self.content with a Python object that wraps a file object.

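It looks basically like this:

from twisted.python import log
from twisted.web import http, server

class Request(server.Request):
    def gotLength(self, length):
        # _command is a str on Python 2 (b'POST' on Python 3)
        if self.channel._command == 'POST':
            try:
                # plug in our own writer (fileStore is our own storage
                # object, not part of Twisted)
                self.content = fileStore.getWriter()
                self.total_length = length
            except Exception as e:
                reason = "Exception in fileStore.getWriter"
                log.err(e, reason)
                # plug in an error writer class that discards the body
                self.content = ErrorWriter(http.INTERNAL_SERVER_ERROR, reason)
        else:
            # assumption: all other methods keep Twisted's default buffer
            server.Request.gotLength(self, length)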

class ErrorWriter(object):
    """Fake writer object that accepts the request body after an error
    in PUT or POST and throws it away. Done this way because there is
    no way to signal the client that it should stop sending data
    (Expect: 100-continue header processing is missing in Twisted, see
    http://twistedmatrix.com/trac/wiki/TwistedWebClient).
    """
    def __init__(self, code, reason):
        self.code = code
        self.reason = reason

    def write(self, data):
        pass

    def read(self):
        return ''

    def seek(self, offset, whence=0):
        pass

    def close(self):
        raise Exception(self.reason, self.code)
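
For comparison with ErrorWriter, here is a minimal sketch of what a real
writer might look like, assuming it only needs the same four methods
(write, read, seek, close). HashingFileWriter, bytes_written and
hexdigest are hypothetical names chosen for illustration; they are not
part of Twisted and are not the fileStore implementation mentioned above.
The sketch spools each chunk to a temporary file and updates a digest as
data arrives, so the body never has to sit in memory as one string.

```python
import hashlib
import tempfile


class HashingFileWriter(object):
    """Hypothetical stand-in for the object fileStore.getWriter() returns.

    Wraps a temporary file and processes every chunk as it is written.
    """

    def __init__(self):
        self._file = tempfile.TemporaryFile()
        self._digest = hashlib.sha256()
        self.bytes_written = 0

    def write(self, data):
        # Intended to be called once per received body chunk.
        self._file.write(data)
        self._digest.update(data)
        self.bytes_written += len(data)

    def read(self, size=-1):
        return self._file.read(size)

    def seek(self, offset, whence=0):
        self._file.seek(offset, whence)

    def close(self):
        self._file.close()

    def hexdigest(self):
        return self._digest.hexdigest()


# Feed the writer two chunks, the way a request body would arrive.
writer = HashingFileWriter()
for chunk in (b"first chunk, ", b"second chunk"):
    writer.write(chunk)
writer.seek(0)
body = writer.read()
```

Any per-chunk processing (here a running SHA-256) goes into write(), which
is the only place that sees the data while the upload is still in
progress.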

Michael

-- 
Michael Schlenker
Software Architect

CONTACT Software GmbH           Tel.:   +49 (421) 20153-80
Wiener Straße 1-3               Fax:    +49 (421) 20153-41
28359 Bremen
http://www.contact.de/          E-Mail: msc at contact.de

Registered office: Bremen
Managing directors: Karl Heinz Zachries, Ralf Holtgrefe
Registered in the commercial register of the Bremen District Court
(Amtsgericht Bremen) under HRB 13215


