[Twisted-web] rest webservice and big data.

Jean-Paul Calderone exarkun at divmod.com
Mon Aug 20 14:13:09 EDT 2007


On Mon, 20 Aug 2007 12:40:49 -0500, "L. Daniel Burr" <ldanielburr at mac.com> wrote:
>Hi Sébastien,
>
>On Mon, 20 Aug 2007 12:11:33 -0500, Sébastien HEITZMANN <2le at 2le.net> 
>wrote:
>>Hi
>>
>>I'm new to Twisted programming and I'm wondering how to do the following
>>thing.
>>
>>I would like to save the body of a PUT request to a file, but I need to
>>do this in a streaming mode (the data may be hundreds of MB).
>>
>>Here is a part of my code.
>>
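>>from twisted.web import resource
>>
>>class DataResource(resource.Resource):
>>     def __init__(self, dbConnection):
>>         resource.Resource.__init__(self)
>>
>>     def render_PUT(self, request):
>>         # request.content is a file-like object that already holds the
>>         # complete request body by the time render_PUT is called.
>>         request.content.seek(0)
>>         with open('data.dat', 'wb') as f:
>>             f.write(request.content.read())
>>         # Returning bytes lets twisted.web write and finish the response.
>>         return b'OK'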
>>
>>
>>That's inspired by an example from the O'Reilly book.
>>
>>Is there a way to get a callback just after the headers have been sent,
>>and to handle reading the remaining data myself?
>>
>>I only use web, not the new web2 API.
>
>You cannot stream large files using twisted.web unless you write your
>own mechanism.  On the other hand, web2 *does* support streaming file
>uploads, so I would advise you to think about using web2 instead, if
>you really want streaming.
>
>Someone with deeper knowledge of twisted.web may be able to propose
>a strategy for implementing streaming file uploads

Don't mind if I do ;)

HTTPChannel already notices the difference between when the headers have
all been received and when the body has been received entirely. When the
former occurs, allHeadersReceived is called.  In the base implementation
this sets up a file-like object into which the body will be written.  It
would be possible to do something slightly different here in order to
support streaming uploads: do resource traversal to find the IResource
the upload is being sent to and then let it deal with bytes received in
the body of the request.
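
A rough sketch of that flow, in plain Python rather than real twisted.web
code.  ToyChannel is a simplified stand-in for HTTPChannel (the real
allHeadersReceived takes no arguments), and UploadToFile, bodyDataReceived
and bodyFinished are hypothetical names made up for illustration:

```python
import io


class UploadToFile:
    """Hypothetical resource that consumes body bytes as they arrive."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.finished = False

    def bodyDataReceived(self, data):
        # Called once per chunk; nothing is held beyond this chunk.
        self.fileobj.write(data)

    def bodyFinished(self):
        self.finished = True


class ToyChannel:
    """Toy stand-in for HTTPChannel; not the real twisted.web class."""

    def __init__(self, resources):
        self.resources = resources  # maps request path -> resource
        self.target = None

    def allHeadersReceived(self, path):
        # Traverse now, at header time, instead of after the full body.
        self.target = self.resources[path]

    def rawDataReceived(self, data):
        # Hand each chunk straight to the resource found above.
        self.target.bodyDataReceived(data)

    def allContentReceived(self):
        self.target.bodyFinished()


sink = io.BytesIO()  # stands in for open('data.dat', 'wb')
upload = UploadToFile(sink)
channel = ToyChannel({"/data": upload})

channel.allHeadersReceived("/data")
for chunk in (b"first ", b"second ", b"third"):
    channel.rawDataReceived(chunk)
channel.allContentReceived()
```

The point is only the ordering: the resource is located as soon as the
headers are complete, so each chunk of the body can go to it directly
instead of being collected on the request first.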

A couple of other things might not be obvious here.  Changes to
twisted.web should be backwards compatible so that existing twisted.web
applications continue to work without being modified.  Implementing
what I've described above without regard for backwards compatibility would
probably mean subjecting existing applications to two things:

  * Resource traversal would be performed earlier than usual for the
    application.  This might have adverse consequences, or it might not.
    In the absence of any way to know for sure, we shouldn't change this
    behavior.  So, instead, the code might require a new kind of site, or
    an attribute to be set on the root resource, or something else of this
    sort which would allow new applications to indicate their preference
    for the new behavior while preserving the existing behavior for existing
    applications.

  * The body of a request is currently available in the request object
    itself.  Existing applications won't expect it to be elsewhere, nor
    will they expect to have to handle the upload as it is happening.  It
    should be required that resources indicate in some way that they are
    capable of handling streaming uploads.  This might be done by adding a
    new interface which they must implement (since they will need to provide
    methods for handling bytes from the upload, this is necessary anyway).
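
The two opt-ins above could be sketched like this.  Again this is plain
Python, not twisted.web: the siteAllowsStreaming flag, the
acceptsStreamingUpload marker and the bodyDataReceived/bodyFinished methods
are all assumptions for illustration, and render_PUT is simplified to take
the body file directly instead of a request:

```python
import io


class LegacyResource:
    """Existing-style resource: expects the whole body in one file object."""

    def render_PUT(self, content):
        self.body = content.read()


class StreamingResource:
    """Hypothetical opt-in resource; the marker attribute is an assumption."""

    acceptsStreamingUpload = True

    def __init__(self):
        self.chunks = []
        self.done = False

    def bodyDataReceived(self, data):
        self.chunks.append(data)

    def bodyFinished(self):
        self.done = True

    def render_PUT(self, content):
        # Fallback for sites that have not enabled streaming.
        self.bodyDataReceived(content.read())
        self.bodyFinished()


def deliver(resource, chunks, siteAllowsStreaming):
    """Deliver body chunks by the new path only when both sides opt in."""
    if siteAllowsStreaming and getattr(resource, "acceptsStreamingUpload", False):
        for chunk in chunks:
            resource.bodyDataReceived(chunk)
        resource.bodyFinished()
        return "streamed"
    # Existing behavior: buffer the whole body, then render.
    content = io.BytesIO()
    for chunk in chunks:
        content.write(chunk)
    content.seek(0)
    resource.render_PUT(content)
    return "buffered"


legacy = LegacyResource()
legacyMode = deliver(legacy, [b"ab", b"cd"], siteAllowsStreaming=True)

streaming = StreamingResource()
streamingMode = deliver(streaming, [b"ab", b"cd"], siteAllowsStreaming=True)
```

An existing resource keeps getting the fully buffered body even on a site
that has enabled streaming, and a streaming-capable resource still works on
a site that has not.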

>but I expect it would be a fair amount of work, and end up looking similar
>to what is already in web2.

Well, "fair" is quite subjective, so maybe it is and maybe it isn't ;)  It
doesn't strike me as a massive undertaking, though.  I think an initial
patch could probably be done in a day or two.  Allow another couple of days
(not necessarily elapsed - there might be some latency in finding reviewers,
etc) to get feedback and make whatever improvements are suggested, and that
would probably be it.

FWIW, what I described doesn't resemble the support for this functionality
in web2 at all, I think.

Jean-Paul


