Opened 11 years ago

Last modified 11 years ago

#2545 enhancement new

Multiple select reactors to support use of Twisted by non-Twisted libraries

Reported by: j1m Owned by: j1m
Priority: normal Milestone:
Component: core Keywords:
Cc: Jean-Paul Calderone, spiv, Jonathan Lange, therve Branch:
Author:

Description

Introduction

I would like to use Twisted in Zope Enterprise Objects (ZEO) which is the client-server component of the Zope Object Database (ZODB). After some discussion on the #twisted IRC channel and some follow-on private e-mail, I'll classify ZEO as a non-Twisted library that uses Twisted. It is a non-Twisted library because it provides operations that may block and are therefore inappropriate for use by code invoked directly from a Twisted reactor.

There are two issues which may potentially affect such libraries. First, such a library may need a reactor to be running. For example, to read data using ZEO, the reactor used by ZEO has to be running. Consider two cases:

  1. ZEO is used in an application that doesn't use Twisted.

In this case, the host application won't start a reactor. The use of Twisted is an internal implementation detail of ZEO, so it should be ZEO's responsibility to start the reactor.

  1. ZEO is used in an application that uses Twisted. (Note that this application may not be a Twisted application. For example, a web application that uses the Twisted Web 2 WSGI server is not a Twisted application.) The application will need to start the reactor to function. Normally, this is done at the end of the application's main entry point. If the application uses ZEO, this is likely to be too late. The application is likely to need to read database data during start up. Coordinating reactor management is problematic, at best, in a situation like this.

The second issue has to do with use of non-Twisted libraries that use Twisted in Twisted applications. For example, consider a Twisted application that requests data from ZEO while responding to a call from a reactor. If the needed data isn't available locally, then ZEO will need to download the data from the server. If ZEO uses the same reactor as the application, then it will block waiting for data that will never arrive because the reactor is blocked waiting for the return of the application code that requested data from ZEO and deadlock will occur. The Twisted application in this scenario is broken. The deadlock described here should not occur, because code called directly from a reactor should only invoke ZEO in a separate thread, however, the risk exists that someone will inadvertently call ZEO when they shouldn't, and, given the severity of deadlock, the risk seems worth mitigating. If ZEO had it's own reactor, there would be no deadlock in this scenario.

An apparent obstacle to having multiple reactors is the use of the twisted.internet.reactor global. Most code can avoid this, however, because reactors are available via transport reactor attributes. For example, a protocol used with a specialized reactor can access the reactor via self.transport.reactor. I'm using this mechanism now to allow use of a test reactor in a protocol I'm working on.

Another obstacle is the use of module globals in the SelectReactor implementation.

Minimal Proposal

I propose to:

  1. Add the reactor attribute to ITransport.

This will make it acceptable to use the attribute and will, hopefully, encourage use of the attribute rather than the global variable.

  1. Add the ability to create multiple SelectReactors.

It appears that this will entail: moving the reads and writes global dictionaries to instance variables.

With these changes, I believe that non-Twisted libraries will be able to manage and use their own reactors. When a library is imported, it will set up it's own internal reactor with code along the following lines:

   import threading
   import twisted.internet.selectreactor

   reactor = twisted.internet.selectreactor.SelectReactor()
   thread = threading.Thread(target=reactor.run, args=(False,))
   thread.setDaemon(True)
   thread.start()

It will then register servers or request connections by using the reactor's callFromThread method to call listenXXX or connectXXX methods.

I believe that this change will meet my needs and the needs of others who want to create non-Twisted libraries that use Twisted.

I volunteer to do this work.

Grand Proposal

Existing code that uses `twisted.internet.reactor' could be converted to use local reactors. This can be easily done for code, such as protocol implementations, that has access to transports.

I think this would provide a certain level of cleanliness and would make it easier to reuse existing software with private reactors, including test reactors.

I'd be happy to at least help with this, especially when the changes are mechanical and, therefore, easy. :)

Change History (12)

comment:1 Changed 11 years ago by Jean-Paul Calderone

Cc: Jean-Paul Calderone added

comment:2 Changed 11 years ago by spiv

Cc: spiv added

comment:3 in reply to:  description Changed 11 years ago by j1m

Loopback transports don't have reactor attributes.

I'm guessing that there should be a loopback reactor that provides: IReactorTime, and IReactorThreads.

Alternatively, loopbackAsync coud grow a reactor argument.

It looks like there are other things treated as transports that would need reactor attributes, including: protocols.TrafficLoggingProtocol, and conch.insults.ServerProtocol.

comment:4 Changed 11 years ago by Jonathan Lange

Cc: Jonathan Lange added

loopbackAsync should probably get a reactor argument.

By the way, it'll make the review process much smoother if you write tests for the things you change as you go along.

comment:5 Changed 11 years ago by therve

Cc: therve added

I think this can improve the quality of Twisted, but removing completely twisted.internet.reactor would break to much code to be actually done.

I think we could change this object to be some kind of property that returns the reactor of the current thread (using threading.currentThread). That may a little black magic, but would provide a good migration path.

comment:6 Changed 11 years ago by spiv

therve: I do not think having twisted.internet.reactor magically returning a different reactor per-thread is a good idea.

Legacy code will expect there to be one global reactor, accessible at twisted.internet.reactor. Such code may depend on twisted.internet.reactor.callFromThread to do what it does now. Newer, multi-reactor aware code can use other reactors without breaking this sort of behaviour.

If there's a use case for an automatic per-thread reactor feature, then add the feature somewhere other than twisted.internet.reactor.

comment:7 in reply to:  description ; Changed 11 years ago by Antoine Pitrou

Having to get the reactor from a transport sounds like a bizarre idea. There are lots of things you may want to do (like a callLater) which don't have anything to do with transports or protocols, so this solution is fairly artificial IMHO.

Also, I don't think the right solution to the ZEO problem is to run multiple reactors in parallel. It would be better for ZEO to start the reactor only if it not already running. For example, with a hypothetical reactor.runIfNotAlreadyRunning() or reactor.run(ignoreIfRunning=True).

comment:8 in reply to:  7 ; Changed 11 years ago by Glyph

Replying to antoine:

Having to get the reactor from a transport sounds like a bizarre idea. There are lots of things you may want to do (like a callLater) which don't have anything to do with transports or protocols, so this solution is fairly artificial IMHO.

Sure, IDelayedCall and IListeningPort should also have an official reactor attribute. There aren't that many other things that you can do with a reactor though.

It is generally a much better thing to get the reactor locally rather than globally. This is good practice in all kinds of ways - it makes testing easier (you don't need to make sure your tests are talking to the 'real' reactor), it makes controlling the code from a third-party library easier (give it a fake reactor which allows you to determine what events it is scheduling and what ports it's listening on if you need to shut down that system).

Also, I don't think the right solution to the ZEO problem is to run multiple reactors in parallel. It would be better for ZEO to start the reactor only if it not already running. For example, with a hypothetical reactor.runIfNotAlreadyRunning() or reactor.run(ignoreIfRunning=True).

I think there are a lot of subtle issues in the particular way that ZEO would use such an implementation based on what thread it's running in and so on, but this ticket probably isn't the right place to discuss them (unless you have a technique for ZEO which wouldn't require multiple reactors).

comment:9 in reply to:  8 Changed 11 years ago by Antoine Pitrou

Replying to glyph:

Sure, IDelayedCall and IListeningPort should also have an official reactor attribute.

Yes, but you only have an IDelayedCall after you have called reactor.callLater. Which is kind of circular (not to mention that the function passed to reactor.callLater won't receive the IDelayedCall anyway). Similarly, if you have a transport, you probably had to call reactor.connectTCP or friends.

It is generally a much better thing to get the reactor locally rather than globally.

Agreed, but I think the best practice is for the application to store the reactor instance on its own, rather than relying on the reactor being available as an attribute in a set of seemingly arbitrary places. There is no strong reason why the reactor should be made an attribute of transports rather than, say, protocols, credentials, etc. (or each and every Twisted class could grow a reactor attribute)

comment:10 Changed 11 years ago by Jean-Paul Calderone

I agree that it doesn't seem compelling aesthetically pleasing to have a reactor attribute on these objects. However, that's a pretty weak objection, and it doesn't override the problem this ticket exists to resolve.

comment:11 Changed 11 years ago by Jean-Paul Calderone

I created #2811 for a small portion of this task.

comment:12 Changed 11 years ago by Jean-Paul Calderone

(In [21216]) Merge isolated-state-2811

Author: exarkun Reviewer: therve Fixes #2188 Refs #2545

Remove global state from reactor implementations in favor of per-instance state.

Note: See TracTickets for help on using tickets.