[Twisted-Python] Synchronization techniques

Daniel Miller daniel at keystonewood.com
Wed Apr 4 22:25:12 MDT 2007


On Apr 4, 2007, at 1:43 PM, glyph at divmod.com wrote:

> On 04:35 pm, daniel at keystonewood.com wrote:
> >On Apr 3, 2007, at 8:42 PM, Itamar Shtull-Trauring wrote:
> >>On Tue, 2007-04-03 at 17:07 -0400, Daniel Miller wrote:
> >>>>twisted.internet.defer.DeferredLock and some of the related classes
> >>>>are what you ought to be using.
> >>>
> >>>Unfortunately that only gets me half way there. DeferredLock.acquire()
> >>>returns a deferred. How do I return the result of a deferred from
> >>>a PB remote_xxx() function?
> >>
> >>Just return the Deferred from the remote_xxx() function.
> >
> >Thanks, I didn't know I could return a deferred from a PB remote_xxx()
> >method. That detail doesn't seem to be documented in the Perspective
> >Broker documentation, which I have read quite a few times.
>
> The PB documentation is not too great.  Perhaps this paper would be  
> helpful to you, if you haven't seen it:
>
> http://www.lothar.com/tech/papers/PyCon-2003/pb-pycon/pb.html#auto7
>
>     """
>     In addition, the remote method can itself return a Deferred instead
>     of an actual return value. When that Deferred fires, the data given
>     to the callback will be serialized and returned to the original
>     caller.
>     """

Thanks, I had not read that before, and it does explain the behavior.
It's such a short note, though, that it could easily be missed. A code
example would be much better.

>
> >Maybe this could be highlighted in the "Complete Example" [0] section of
> >the PB usage documentation? The examples use the TwistedQuotes
> >application, and the IQuoter.getQuote() method always returns a string
> >(at least I couldn't find any implementations that return a deferred).
>
> Please feel free to write some patches for the documentation, or  
> open a doc bug describing this issue in more detail.  It's  
> definitely an under-documented feature of PB.

I'll try to do that sometime soon.

>
> >However, that would require rewriting most if not all implementations of
> >IQuoter to return deferreds and/or the code that calls
> >IQuoter.getQuote(), which demonstrates the viral nature of twisted when
> >used in conjunction with other libraries.
>
> I don't think that would really be the terrible burden that you  
> suggest, considering the relatively small amount of tutorial  
> documentation that implements or calls IQuoter.  One could also  
> propose a separate interface, IDeferredQuoter, to make the  
> distinction clearer.

Well of course it's no big deal to change IQuoter, but that specific  
case wasn't really my point. My point is that in the real world it's  
a BAD THING to have to rewrite perfectly good/working/tested code  
just because we want to use twisted. But this is exactly what  
happened to me when twisted was introduced into my project.

>
> >So anyway, I rewrote my server-side library to do it the twisted way and
> >return deferreds instead of trying to rig up some way of waiting for them.
> >I still think it would be super useful to be able to pseudo-block on a
> >deferred (i.e. allow the reactor to process other events while waiting
> >for the deferred). It is very annoying to have to rewrite many layers of
> >code when twisted is introduced into a program. I did find
> >gthreadless.py, and maybe that would do it. Unfortunately discussion on
> >that seems to have been dropped some time ago...
>
> I'm afraid that the feature you want doesn't make any sense and is,  
> in a broad sense, impossible.

Maybe it's impossible for you to see things the way I see them  
because you have become drunk on Twisted Kool-Aid. In my specific
case I am running twisted in a single-threaded environment with a  
single synchronized resource where each request that needs to access  
that resource must gain an exclusive lock before doing anything with  
it (a classic locking scenario). This is not "I'm being lazy and I do  
not want to learn how to use Deferreds." Rather, it is a requirement  
that is dictated by the system with which I am communicating (it does  
not support concurrent access through the API provided by the  
vendor). Thus, my code would be much simpler (both to write and
maintain) if I had blockOn(), and it would not have any risk of
deadlock or other such concurrency bugs. You might ask why I bother
to use Twisted: Perspective Broker is the most elegant way I could
find to call remote methods in Python. If it were abstracted from  
Twisted to become a fully synchronous library I would use that  
instead, but at this point it seems that if I want PB I am stuck with  
Twisted too.

In short, this feature does "make sense" in my environment. Whether  
it's possible or not is another matter entirely.

>  There are some things like it which might be possible - for  
> example, http://twistedmatrix.com/trac/ticket/2545 - but the  
> reactor is not reentrant and in some sense could not be made  
> reentrant.
>
> Consider this innocuous looking block of code:
>
>     from twisted.internet.protocol import Protocol
>     from twisted.web.client import getPage
>     from make_believe import magicallyBlockOn  # imaginary module
>
>     class MagicalProtocol(Protocol):
>         buf = ""  # unparsed remainder of the previous dataReceived call
>
>         def dataReceived(self, data):
>             commands = (self.buf + data).split()
>             self.buf = commands[-1]
>             for command in commands[:-1]:
>                 if command == 'QUIT':
>                     self.transport.loseConnection()
>                     return
>                 else:
>                     # Deferreds are hard, let's go shopping
>                     page = magicallyBlockOn(
>                         getPage("http://example.com/%s" % (command,)))
>                     # Format the length; "SIZE:" + len(page) would
>                     # raise a TypeError.
>                     self.transport.write("SIZE:%d" % (len(page),))

Your "Deferreds are hard" comment is an insult. You make it sound
like I don't want to think. If I didn't want to think I wouldn't be
a software developer.

This code obviously won't work, because getPage() has to wait and
another dataReceived() call could come in with a QUIT command while
the first one is still waiting for getPage(). To use
blockOn(getPage(...)) here you'd need to accumulate the data in a
buffer and do the command processing after all data has been
received. That probably wouldn't be the smartest approach, since it
would be nice to start getting pages before all of the data has
arrived. But this is just one case that doesn't work with blockOn().
I've never said that it would magically make every case easier; it
just makes some less complicated cases much simpler.

Everything I've read about this issue suggests that the twisted  
developers just don't want to give people what they want because it  
would allow them to shoot themselves in the foot (for example, by  
using blockOn() in a multi-threaded environment or in inappropriate  
places such as the example above). But this is Python and we're  
consenting adults. With the proper warnings a feature like this could  
make twisted much more palatable for people with large existing  
projects that do not wish to rewrite entire sections of code just to  
work with deferreds. It would allow people to get the easiest thing  
working as quickly as possible, and then go back and write the  
optimal deferred implementation later when performance/blocking/etc.  
becomes an issue.

Most people who would use blockOn() would probably use it in an
entirely synchronous fashion where there would only be one deferred
being processed at any given time. In these cases blockOn() would
work just fine (if inefficiently). From your point of view that
probably totally defeats the purpose of using twisted, but as I have
pointed out above there are other useful features in twisted besides
its deferred mechanism (PB, for example).

The concept that I am thinking of seems entirely possible, although I  
am sure it would require rewriting existing reactor implementations.  
However, in the long run that seems like a small cost if twisted  
could be more widely adopted because it would play nicer with  
existing non-async code.

>
> If you were using Deferreds to track the result of the 'getPage'  
> operation, you could cancel the callbacks that write to the  
> transport in connectionLost.  However, with magical blocking, one  
> dataReceived method might be interleaved with another.  That means  
> that every time through the loop, you have to check to see if the  
> transport has already been disconnected - the code as presented  
> here is buggy and will spuriously fail depending on the order of  
> the connection being lost and the remote page being retrieved.
>
> In this example I've been careful to accumulate all the
> buffer-management and parsing logic at the top of the method, before any
> potential re-entrancy can happen, but other code (most everything  
> in Twisted's existing protocol implementations, not to mention just  
> about all application code) would not be so lucky.
>
> It might be quite feasible to implement a microthreaded runtime  
> environment that lived on _top_ of Twisted and explicitly accounted  
> for issues like these, but that would not really be substantially  
> different than 2.5+inlineCallbacks.
>
> >For the record, I've included updated versions of the previously posted
> >code below. I'd be happy if someone pointed out if I'm doing anything
> >wrong (with respect to twisted) in this code.
>
> Nothing immediately jumps out at me.

Thanks for the review.

~ Daniel





