[Twisted-Python] Deferreds vs sys.getrecursionlimit()

Brian Warner warner at lothar.com
Fri Nov 14 23:06:06 EST 2008


> It's always something weird. This time, I took notes. I offer these hints to
> help future searchers find a starting point in their own debugging efforts.

And as a followup (since the problem I encountered today happened to be a
third case):


The first step to tracking down these problems is to temporarily apply the
following patch to your twisted/internet/defer.py:

Index: twisted/internet/defer.py
===================================================================
--- twisted/internet/defer.py	(revision 24958)
+++ twisted/internet/defer.py	(working copy)
@@ -325,6 +325,12 @@
                 try:
                     self._runningCallbacks = True
                     try:
+                        if len(traceback.extract_stack()) > 900:
+                            print "running", len(traceback.extract_stack())
+                            traceback.print_stack()
+                            print "running", len(traceback.extract_stack())
+                            import os
+                            os.abort()
                         self.result = callback(self.result, *args, **kw)
                     finally:
                         self._runningCallbacks = False
@@ -337,6 +343,12 @@
                         # self.callbacks until it is empty, then return here,
                         # where there is no more work to be done, so this call
                         # will return as well.
+                        if len(traceback.extract_stack()) > 900:
+                            print "chaining", len(traceback.extract_stack())
+                            traceback.print_stack()
+                            print "chaining", len(traceback.extract_stack())
+                            import os
+                            os.abort()
                         self.pause()
                         self.result.addBoth(self._continue)
                         break

That will let you know when the stack is getting close to exhaustion. By
looking at the trace that it prints out, you can find out what other code to
investigate. It is then useful to add the same traceback.extract_stack()-based
instrumentation to that code.
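
For instrumenting your own code, the same check can be wrapped in a small
helper. This is only a sketch: the name near_recursion_limit and its margin
parameter are made up for illustration, the threshold is derived from
sys.getrecursionlimit() instead of being hard-coded to 900, and it returns a
flag rather than calling os.abort() so that the caller decides what to do.

```python
import sys
import traceback


def near_recursion_limit(label, margin=100):
    """Report when the Python stack is within `margin` frames of the limit.

    Hypothetical helper, not part of Twisted. Returns True (after printing
    the stack to stderr) when the stack is too deep, otherwise False.
    """
    depth = len(traceback.extract_stack())
    threshold = sys.getrecursionlimit() - margin
    if depth <= threshold:
        return False
    sys.stderr.write("%s: stack depth %d (threshold %d)\n"
                     % (label, depth, threshold))
    traceback.print_stack()
    return True
```

Calling near_recursion_limit("maybe_start_task") at the top of a suspect
method, and aborting when it returns True, gives roughly the same effect as
the patch above.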

The two problems I described in my previous message were confined to the
methods of Deferred: even though the problems were set up by my application
code, the actual cycle/loop was entirely inside defer.py. The third problem
(that I just finished debugging) had a cycle that passed through my own
application code. In this case, the troublesome class looked like:


from twisted.internet import defer
from foolscap.eventual import eventually

class ConcurrencyLimiter:
    """I implement a basic concurrency limiter. Add work to it in the form of
    (callable, args, kwargs) tuples. No more than LIMIT callables will be
    outstanding at any one time.
    """

    def __init__(self, limit=10):
        self.limit = limit
        self.pending = []
        self.active = 0

    def add(self, cb, *args, **kwargs):
        d = defer.Deferred()
        task = (cb, args, kwargs, d)
        self.pending.append(task)
        self.maybe_start_task()
        return d

    def maybe_start_task(self):
        if self.active >= self.limit:
            return
        if not self.pending:
            return
        (cb, args, kwargs, done_d) = self.pending.pop(0)
        self.active += 1
        d = defer.maybeDeferred(cb, *args, **kwargs)
        d.addBoth(self._done, done_d)

    def _done(self, res, done_d):
        self.active -= 1
        eventually(done_d.callback, res)
        self.maybe_start_task()

(You can safely ignore the eventually() call there: that done_d callback was
not involved in this problem.)

In this case, I had a Limiter instance with somewhere around 200 items in the
self.pending queue. All of those items were immediate (synchronous) functions,
so each call to defer.maybeDeferred returned a Deferred that was already in
the 'fired' state. That meant d.addBoth() ran the callback right away,
synchronously, leading to a recursive cycle that looked like:

 self.maybe_start_task()
  d.addBoth(self._done, done_d)
   Deferred.addCallbacks(self._done,self._done)
   Deferred._continue
 self._done()
  self.maybe_start_task()

That gives 5 frames per cycle, so 200 items (5 x 200 = 1000) is enough to hit
the default recursion limit of 1000 frames.
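
The linear growth is easy to reproduce without Twisted. The following is a
sketch only: FiredResult, RecursiveLimiter and stack_growth are hypothetical
stand-ins, with FiredResult.add_both() imitating what addBoth() does on a
Deferred that has already fired. The work queue is filled directly so that
all items are pending before the first task starts.

```python
import traceback


class FiredResult:
    """Hypothetical stand-in for a Deferred that has already fired."""

    def __init__(self, value):
        self.value = value

    def add_both(self, callback, *args):
        # Already fired, so the callback runs synchronously, right here.
        callback(self.value, *args)


class RecursiveLimiter:
    """Simplified limiter with the same call cycle as the class above."""

    def __init__(self, limit=10):
        self.limit = limit
        self.pending = []
        self.active = 0
        self.depths = []  # stack depth seen on each maybe_start_task() call

    def maybe_start_task(self):
        self.depths.append(len(traceback.extract_stack()))
        if self.active >= self.limit or not self.pending:
            return
        cb = self.pending.pop(0)
        self.active += 1
        FiredResult(cb()).add_both(self._done)

    def _done(self, res):
        self.active -= 1
        self.maybe_start_task()  # re-enters before the previous call returns


def stack_growth(n_items):
    """Queue n_items immediate tasks and return how many frames they add."""
    limiter = RecursiveLimiter()
    limiter.pending.extend([lambda: None] * n_items)
    limiter.maybe_start_task()
    return max(limiter.depths) - min(limiter.depths)
```

In this stand-in each item costs 3 frames (maybe_start_task, add_both, _done)
rather than the 5 seen through Twisted, so stack_growth(50) is 150. Either
way the depth grows linearly with the length of the queue.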


As before, the fix was to break up the stack by using Foolscap's
eventual-send operation:

    def _done(self, res, done_d):
        self.active -= 1
        eventually(done_d.callback, res)
        eventually(self.maybe_start_task)
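
To see why the eventual-send breaks the cycle, here is a sketch that uses a
toy queue in place of Foolscap's eventually(). EventualQueue, flush() and
FlatLimiter are hypothetical names; the real eventually() runs the call in a
later reactor turn, whereas this stand-in just drains a deque from a loop.
The point is only that each maybe_start_task() call starts from the same
shallow stack instead of nesting inside the previous one.

```python
import collections
import traceback


class EventualQueue:
    """Toy stand-in for an eventual-send queue (not Foolscap's API)."""

    def __init__(self):
        self._calls = collections.deque()

    def eventually(self, cb, *args, **kwargs):
        # Record the call; it runs later, from flush(), on a fresh stack.
        self._calls.append((cb, args, kwargs))

    def flush(self):
        while self._calls:
            cb, args, kwargs = self._calls.popleft()
            cb(*args, **kwargs)


class FlatLimiter:
    """Simplified limiter that reschedules itself instead of recursing."""

    def __init__(self, queue, limit=10):
        self.queue = queue
        self.limit = limit
        self.pending = []
        self.active = 0
        self.results = []
        self.depths = []  # stack depth seen on each maybe_start_task() call

    def add(self, cb):
        self.pending.append(cb)
        self.queue.eventually(self.maybe_start_task)

    def maybe_start_task(self):
        self.depths.append(len(traceback.extract_stack()))
        if self.active >= self.limit or not self.pending:
            return
        cb = self.pending.pop(0)
        self.active += 1
        self._done(cb())

    def _done(self, res):
        self.active -= 1
        self.results.append(res)
        # Schedule the next task rather than calling it from this frame.
        self.queue.eventually(self.maybe_start_task)
```

With 500 immediate tasks added and the queue flushed, every entry in depths
is the same value, so the stack no longer grows with the number of pending
items.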



hope someone eventually (hah!) finds this useful,
 -Brian



