[Twisted-web] streaming request (was: status of Twisted Web and Web 2)

Jean-Paul Calderone exarkun at divmod.com
Thu Mar 6 17:05:08 EST 2008


On Thu, 6 Mar 2008 14:02:53 -0700, Andrew McNabb <amcnabb at mcnabbs.org> wrote:
>On Thu, Mar 06, 2008 at 01:00:44PM -0500, Jean-Paul Calderone wrote:
>>
>> FWIW, IConsumer is a better match here than IProtocol (that's a more or
>> less irrelevant detail, though).
>
>I can see why IConsumer would be more appropriate than IProtocol, but
>I'm entirely unconvinced by the arguments that IStreamingRequestHandler
>would be better than IConsumer.
>
>To start with, I don't understand what an IConsumer has to do with an
>IResource.

Inherently?  Nothing.  I'm sure you're driving at something, but alas I
can't see what.

>
>> That said, here's where I see IStreamingRequestHandler as better than
>> re-using an existing interface like IProtocol or IConsumer:
>>
>>  * IConsumer is a very general interface.  It provides no clues about its
>>    interaction with the HTTP protocol implementation.  There's no way to
>>    deduce that if an IResource is also an IConsumer that uploads will be
>>    streamed to it instead of delivered all at once at the end.  On the
>>    other hand, IStreamingRequestHandler can have documentation which
>>    describes its purpose and behavior.
>
>I didn't understand this.
>

What I mean is that regardless of which approach is taken, it must be
documented.  The IStreamingRequestHandler approach gives you an interface
and a method to give docstrings.  The IConsumer approach doesn't, because
IConsumer encompasses far more than streaming upload handling.

>>  * IConsumer has existing semantics.  There's no reason an existing
>>    IResource implementation might not already be an IConsumer as well,
>>    for a completely unrelated reason.  If it suddenly starts to receive
>>    streaming upload data, it will almost certainly break.  On the other
>>    hand, IStreamingRequestHandler is a new interface, so it cannot be
>>    misinterpreted.
>
>If you have to consume two different things, shouldn't you have two
>objects?
>

Perhaps so!  What I tried to point out is that there may be /existing
applications/ which have IResources which are also IConsumers.  There
is no reason not to have done this.  We should avoid breaking those
applications.

>
>>  * IConsumer isn't actually sufficiently expressive for this case.  It
>>    would be quite useful if the resource had a chance to look at the
>>    request headers before streaming begins (okay, I didn't bring this up
>>    earlier, so it may have seemed like IConsumer was sufficient).  On the
>>    other hand, IStreamingRequestHandler can have a method which takes the
>>    request as an argument.
>
>In my opinion, if you need to be looking at request headers before
>streaming begins, you're not doing simple streaming, and you should be
>hooking in at a lower level.

That may be valid, but I'm not really convinced.  For example, you may
want to see query arguments from the request before deciding how to handle
the upload.  Or you may want to look at the value of a cookie or some
HTTP AUTH credentials.  Or there might be an If-Modified-Since header in
the request which needs to be respected.  I don't know for sure that the
correct way to handle any of these use cases is to hook into the stream
setup the way I've described, but the plethora of possibilities suggests
that it shouldn't be ruled out.

>
>> It may be the case that the right method for IStreamingRequestHandler to
>> have, though, is one which takes the request and returns an IConsumer to
>> which the request body will be streamed.
>
>All of the IStreamingRequestHandler stuff sounds more complicated than
>necessary.  I think I'm convinced that IConsumer is a good way to go.
>It uses existing mechanisms to do something simple.
>IStreamingRequestHandler, however, would create new mechanisms to do
>something complicated.

I think you're overestimating the complexity.  Here's a sample
implementation of what I'm talking about:

    class FileSaver(object):
        implements(IResource, IStreamingRequestHandler)

        def getChild(self, name):
            return FourOhFour()

        def consume(self, request):
            cc = ClientCreator(
                reactor,
                FTPClient,
                request.args['username'][0],
                request.args['password'][0])
            d = cc.connectTCP(request.args['hostname'], 21)
            def connected(ftp):
                return ftp.storeFile(request.args['path'][0])
            d.addCallback(connected)
            def storeInitiated((consumer, control)):
                # I'm too lazy to make this stateless, so FileSaver will
                # only be good for one request.  Also I don't really know
                # how to use FTPClient, so I'm probably making a mistake
                # by ignoring `control´
                self.consumer = consumer
                return consumer
            d.addCallback(storeInitiated)
            return d

        def render(self, request):
            # I don't really like IFinishableConsumer, but oh well.
            self.consumer.finish()
            return "File uploaded!"

If we jump straight to IConsumer and leave IStreamingRequestHandler out
then it'll look something like this:

    class FileSaver(object):
        implements(IRequest, IConsumer)

        def __init__(self, username, password, hostname, path):
            # Whoever implements getChild to return a FileSaver will
            # have to pull the args out of the request and pass them
            # in here, since I have no request here.  Although I'm
            # not sure if you can actually do that with twisted.web!

            self._buffer = []
            self._consumer = None
            self._finished = None
            self._request = None

            cc = ClientCreator(reactor, username, password)
            d = cc.connectTCP(hostname, 21)
            def connected(ftp):
                return ftp.storeFile(path)
            d.addCallback(connected)
            def storeInitiated((consumer, control)):
                self._consumer = consumer
                map(self._consumer.write, self._buffer)
                if self._request is not None:
                    self._consumer.finish()
                    self._request.write("File uploaded!")
                    self._request.finish()
                else:
                    self.write = self._consumer.write
            d.addCallback(storeInitiated)
            

        def getChild(self, name):
            return FourOhFour()

        def write(self, bytes):
            self._buffer.append(bytes)

        def render(self, request):
            if self._consumer is not None:
                self._consumer.finish()
                return "File uploaded!"
            else:
                self._request = request
                return NOT_DONE_YET


I actually had to think a bit harder about the 2nd version.  I wonder
if there's a simpler way to express it using that API.  Anyway, neither
of these is horrendously more complex than the other, so I don't think
the complexity argument carries much weight for me.

Jean-Paul



More information about the Twisted-web mailing list