[Twisted-Python] questions about twisted usage

Thu Mar 22 05:37:14 MDT 2012

On Thu, Mar 22, 2012 at 12:20:37PM +0200, Uri Okrent wrote:
> 1. Is deferToThread running the function in a real python thread?
> Should this be used (rather than a standard deferred) for any function
> that might block?

Yes, deferToThread() runs the function in a real Python thread (one
from the reactor's thread pool). Code that returns very quickly, or is
written in Twisted's asynchronous style (with Deferreds), is fine to
call directly; anything that blocks for a while should go through
deferToThread(). Note that wrapping a blocking call in a plain Deferred
does not make it non-blocking.

> 2. I understand that deferreds run "later".  However, once a deferred
> (or a deferToThread) is picked up and run, does it run from start to
> finish?  Can it be interrupted in the middle of the function?  How
> about its callback/errback?  Can another deferred jump in for
> processing in between a deferred and it's callback/errback, or in the
> middle of processing another deferred?

A Deferred can wait on the result of other Deferreds; while one Deferred
is waiting (say, waiting for a timer to go off, or waiting for network
activity), others may be running. Each individual callback/errback
function is run in its entirety, though.

> 3. Are there any guarantees regarding the order of execution of
> deferreds? (I.e., are deferreds processed in the order in which they
> are created?)

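Callbacks on a single Deferred run in the order they were added, but
there is no ordering guarantee between separate Deferreds: whenever a
Deferred waits on the result of another Deferred, you're at the mercy
of whatever they're waiting for.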

For example, say you use Twisted to retrieve the contents of two
web-pages:

    getPage("http://a.example.com").addCallback(process_data)
    getPage("http://b.example.com").addCallback(process_data)

The request for "http://a.example.com" will be launched first, but if
that server takes longer to respond, process_data() might receive the
response from server B first.

> 4. Related to #2, and #3, does it make sense to use twisted when
> requests that are serviced may depend on one another.  For example, a
> client makes a request 'add-A' which is deferred (so that the server
> can keep processing requests), and immediately afterwards makes a
> request 'modify-A' (which is also run as a deferred).  Can I count on
> add-A being done so that modify-A doesn't attempt to work on something
> which hasn't been created yet?

In general, no. For example, if you get a request from the network that
causes you to send "insert into table" to the database, and meanwhile
get another request from the network that causes you to send "update
table" to the database, you have no way of knowing whether one will
complete before the other. Twisted's asynchronous database wrapper uses
a connection pool, so the two requests might be sent down separate
connections in parallel. In that case the database could actively
prevent the "update" from seeing the results of the "insert".

If you're getting stateful requests without any kind of stateful
framing (say, 'begin transaction'/'commit transaction' messages, or some
kind of session ID, or something like that), you have a problem Twisted
cannot help with.

If you *do* have some kind of session, you can set up your server so
that you have a DeferredSemaphore per session, which will ensure that
the next Deferred does not run until you've finished with the previous
one. For example:

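    from twisted.internet.defer import DeferredSemaphore

    # One token: at most one message is processed at a time.
    ds = DeferredSemaphore(1)

    def got_message(msg):
        # acquire() returns a Deferred that fires once the token is free.
        newDefer = ds.acquire()
        newDefer.addCallback(lambda _: processMsg(msg))
        # Release the token whether processMsg() succeeded or failed.
        newDefer.addBoth(lambda _: ds.release())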

> 5. Related to all of the above.  What If I want to modify a database
> inside a deferred?  Is that incorrect usage?  Specifically, if all my
> requests run as deferred, and they all start a transaction and the
> beginning, and commit the transaction at the end, will I run into
> problems due to context switching in the middle of deferreds? (such as
> one request committing for both requests, starting two transactions in
> a row, committing twice in a row, and so on.)

As somebody else mentioned, what you want is twisted.enterprise.adbapi.
It maintains a connection pool, and every database call you make
(usually via the .runQuery() or .runOperation() methods) runs in its
own transaction on a connection from the pool, so .commit() or
.rollback() will be run on the correct connection, and you won't have
problems with cross-talk between concurrent requests. If several
statements must share one transaction, use .runInteraction().
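
As a rough sketch of keeping several statements in one transaction with
adbapi's runInteraction(): the table, the column names and the choice
of sqlite3 with an in-memory database are all assumptions made up for
the example.

```python
def add_then_modify(txn, name, value):
    # "txn" behaves like a DB-API cursor. Under runInteraction() all of
    # these statements share one transaction, which is committed when
    # this function returns and rolled back if it raises.
    txn.execute("CREATE TABLE IF NOT EXISTS things (name TEXT, value TEXT)")
    txn.execute("INSERT INTO things VALUES (?, ?)", (name, "initial"))
    txn.execute("UPDATE things SET value = ? WHERE name = ?", (value, name))
    txn.execute("SELECT value FROM things WHERE name = ?", (name,))
    return txn.fetchall()


def main():
    from twisted.enterprise import adbapi
    from twisted.internet import reactor

    # sqlite3 and ":memory:" are assumptions chosen to keep this small.
    pool = adbapi.ConnectionPool("sqlite3", ":memory:",
                                 check_same_thread=False)

    def start():
        d = pool.runInteraction(add_then_modify, "A", "modified")
        d.addCallback(print)
        d.addBoth(lambda _: reactor.stop())

    reactor.callWhenRunning(start)
    reactor.run()


if __name__ == "__main__":
    main()
```

Because the whole function runs in one thread on one connection,
another request cannot commit or roll back in the middle of it.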

Tim.