[Twisted-web] Re: [Web-SIG] WSGI woes

Thu Sep 16 14:08:48 MDT 2004

At 08:29 PM 9/16/04 +0100, Alan Kennedy wrote:
>[Alan Kennedy]
> >> In an asynchronous situation, the application cannot simply do a
> >> blocking read on the input: that will tie up the server thread.
>
>[Phillip J. Eby]
> > What do you mean by "server thread"?  A truly asynchronous server (one
> > using "no threads") cannot serve multiple WSGI requests
> > simultaneously.  In the general case, a WSGI server can only serve as
> > many requests simultaneously as it has available threads for.
>
>Sorry, I should have paid more attention to phrasing in this context.
>
>By  "server thread" I mean the thread of execution that is running the 
>select/poll operation in the server (which needs at least *one* thread). 
>If the application did a blocking read of the input running in a simple, 
>single-threaded asyncore-style server, that single thread would block, 
>holding up event processing.

Right, which is (one reason) why a WSGI server can in the general case only 
serve as many WSGI requests simultaneously as it has available threads for, 
although it's possible to improve on that worst-case condition by 
appropriate use of iterators.

>But I don't see the need for pausing logic or queues? Why can't the server 
>simply call directly into the application, e.g. using a "process_input" 
>method, in effect saying "you have some input ready".
>
>And I'm not sure I see the need for the application to check that the 
>wsgi.input hasn't been replaced: if there were middleware further down 
>that stack that was intercepting and transforming the input stream, then 
>*it* should be the one receiving the asynchronous notification from the 
>server. This lower level component would then read some input, process it, 
>and then call a "process_input" method on the next component up in the 
>stack, etc, etc.
>
>I suppose I'm talking about the server "pushing" the input through the 
>middleware stack, whereas you're talking about the application at the stop 
>of the stack "pulling" the data up through the stack. Is that right?

That's correct, and that's what I'm trying to avoid if at all possible, 
because it enormously complicates middleware, to the sole benefit of 
asynchronous apps -- that mostly aren't going to be portable anyway.

So, going by STASCTAP theory (Simple Things Are Simple, Complex Things Are 
Possible), the pause/resume approach makes asynchronous applications 
*possible*, while keeping the nominal synchronous cases and middleware 
*simple*.

>And I'd be interested to see how your approach would handle a situation 
>where there is both streaming input and output. For example, a server that 
>takes strings of any length, say 10**9 bytes, and .encode('rot13')'s each 
>byte in turn, before sending it back to the client.

Presumably, the function to pause for input needs to take a minimum length, 
or have some way to communicate available length to the application.

I don't pretend to fully understand the needed use cases here, because I 
have little experience writing web applications that need to wait on other 
network services (other than databases) while a client is waiting.  And if 
I were writing an asynchronous server, I'd probably at least consider using 
Greenlets to context-switch blocking operations so that they wouldn't tie 
up an active thread.  Such an approach is conceptually easier to deal with, 
IMO, than writing everything in continuation-passing style.

But I *do* want WSGI to make it *possible* to meet async apps' use cases, 
which is why I'm seeking input from those that do have the relevant 
experience.  The trade-off is that it shouldn't excessively complicate 
nominal compliance with WSGI.  In particular, I'd prefer that the current 
"example CGI gateway" in PEP 333 not require any major changes or 
significant expansion.