[Twisted-Python] Unhandled exceptions and observability

Svein Seldal sveinse at seldal.com
Wed Dec 27 21:18:32 MST 2017


Hi


I'm not sure how to write this email, but please let me try. I'd like to 
address something that I see as a limitation in Twisted. It might be 
that my use case is odd or that I'm outside the scope of Twisted, but 
nonetheless, I hope this is a relevant topic.

Problem:

Unhandled exceptions can leave the application in a half-working state, 
and in-app observability of them is difficult to obtain. Instead of the 
whole application terminating, the rest of the app keeps running, 
completely unaware of the failure.

This applies to unhandled errbacks in Deferreds and, in principle, to 
any other reactor callback. E.g. it can occur in Deferreds used 
internally in Twisted, where direct access to the object isn't available 
to the caller.

As a user of Twisted, I would like to have the option to catch or fail 
my application completely when these unhandled exceptions occur, as 
would be expected in a sequential program.


Background:

I have a larger application using many simultaneous TCP, UDP and UNIX 
connections. As with Twisted, the app is grouped in functions, where 
most of the heavy lifting is done in black-box-ish modules. There is, of 
course, no guarantee that everything works smoothly, and if something 
fails, the entire application stops as a clear indication of the 
failure. However, on some occasions this application has been found 
half-dead, due to a failure in a reactor-based callback that can only be 
seen by reading the logs. The main application is unfortunately unaware 
of its own failure.

AFAIK Twisted has no direct mechanism for handling errors that occur 
when user code is called from the reactor. Even worse, the caller does 
not learn of the failure unless it has direct access to the failing 
object. I believe this is more dangerous to reliability than an 
application that plainly fails, because of the lower observability.

Let's say the following code is used in a running application:

    from twisted.internet.task import LoopingCall

    class Foo:
      def __init__(self):
        self.loop = LoopingCall(self.cb)
        self.loop.start(2, False)  # first call after 2 s, from the reactor
      def cb(self):
        self.count += 1  # AttributeError: self.count was never initialised

    # Main app does this:
    try:
      foo = Foo()
    except Exception:
      print("Won't happen")
      raise

The code will fail due to the programming error in cb (self.count is 
never initialised), but the calling application won't fail and thinks 
everything is fine. The only way to debug errors like this is to look 
through the logs.


The 0-solution:

Everywhere a function is called from the reactor, the user is 
responsible for handling all exceptions. This is the current situation.

However, this is not completely straightforward. try-except is great 
for catching expected errors, but it's easy to forget or ignore the 
unexpected ones, as in the example above. The safeguard for this would 
be something like:

    def cb(self):
      try:
        self.count += 1
      except Exception:
        print("Whoops. Unexpected")
        signal_main_app()

And in a large application, there are many entry points (e.g. methods in 
a protocol handler), so the code becomes very cluttered. Plus it puts the 
responsibility on the user to implement the signal_main_app() framework.
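
The clutter can be reduced somewhat by moving the try-except into a 
decorator. This is only a sketch: guarded and on_failure are hypothetical 
names, with on_failure standing in for whatever signal_main_app() would 
do. Unlike the snippet above, it re-raises after reporting, so whatever 
the caller (e.g. the reactor) normally does with the exception is 
unchanged:

```python
import functools


def guarded(on_failure):
    """Return a decorator that reports any exception raised by a callback."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Tell the main application, then let the error propagate.
                on_failure(func.__name__, exc)
                raise
        return wrapper
    return decorate
```

Each entry point still has to be decorated by hand, so a forgotten one 
fails as silently as before.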


Proposal:

The ideal solution would be a way to configure Twisted to inform the 
application about unhandled exceptions. It could be an 
addSystemEventTrigger(), or a software signal, or a process signal, or 
perhaps a global execute-last-errback function, possibly only in a 
debug context.

With this, one could inform the application that a Deferred has an 
unhandled error. The main application is then given the choice to 
respond appropriately, for example by shutting down.
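
One approximation that needs no patching is a log observer, since 
unhandled errors end up in the log. The sketch below is not an 
established recipe. The helper names are hypothetical, and which event 
keys mark an error (log_failure, the legacy isError flag, or a critical 
log level) is an assumption about twisted.logger's event conventions. 
Twisted is imported lazily, only when the observer is installed:

```python
def is_failure_event(event):
    """Guess whether a log event describes an error (assumed keys)."""
    level = event.get("log_level")
    return (
        "log_failure" in event
        or bool(event.get("isError"))
        or getattr(level, "name", None) == "critical"
    )


def make_failure_observer(on_failure):
    """Build an observer that forwards error events to on_failure."""
    def observer(event):
        if is_failure_event(event):
            on_failure(event)
    return observer


def install_failure_observer(on_failure):
    """Register the observer with Twisted's global log publisher."""
    from twisted.logger import globalLogPublisher
    observer = make_failure_observer(on_failure)
    globalLogPublisher.addObserver(observer)
    return observer
```

The main application could pass a handler that calls reactor.stop(). Two 
caveats: an unhandled error in a Deferred is only logged once the 
Deferred is garbage collected, so the notification may come late, and 
errors the application logs deliberately trigger the handler too.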


Is my concern about the non-observability of unhandled exceptions at all 
warranted? Is my thinking wrong? Are there other kinds of solutions to 
this problem? (I would like to avoid having to patch Twisted to do it.)


Best regards,
Svein


