[Twisted-Python] twisted compatibility with multiprocessing module in fork+execv mode

Glyph Lefkowitz glyph at twistedmatrix.com
Sat Oct 3 05:07:03 MDT 2015


> On Sep 30, 2015, at 03:25, Flavio Grossi <fgrossi at voismart.it> wrote:
> 
> I know the multiprocessing module is not properly supported by twisted apps because of the interactions among duplicated file descriptors and signal handling, as discussed other times.

To be fair, the multiprocessing module has most of these issues by itself :-).  The main reason Twisted didn't work with things like multiprocessing in the past was that we didn't install the SIGCHLD handler with SA_RESTART.  That has long since been resolved; you can now fork, os.system, popen, and use multiprocessing more or less as you can in any other Python program.

> But Python 3.4 introduces a new mode to use that module by spawning (i.e. fork() followed by execv()) the new processes instead of simply forking them.

That is a definite improvement and will be far more reliable.
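
For reference, here is a minimal sketch of selecting that mode with the standard library, assuming Python 3.4 or later.  The helper names spawn_context and factorials are made up for illustration and are not part of Twisted or of multiprocessing:

```python
import math
import multiprocessing


def spawn_context():
    """Return a multiprocessing context that starts workers via 'spawn'."""
    # get_context() leaves the process-wide default start method untouched.
    return multiprocessing.get_context("spawn")


def factorials(numbers, processes=2):
    """Compute factorials in worker processes started with 'spawn'."""
    ctx = spawn_context()
    with ctx.Pool(processes=processes) as pool:
        # Functions are pickled by reference, so the worker function must be
        # importable in the child; a stdlib function keeps this self-contained.
        return pool.map(math.factorial, numbers)
```

Call factorials([5, 10]) from under an `if __name__ == "__main__":` guard, because with spawn each child imports the main module again.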

> So my question is how well supported this is by Twisted, and in general how safe it is to use subprocesses created by duplicating the parent immediately followed by the execv of a fresh interpreter.

This is what Twisted would do if it were spawning a subprocess, so... safe enough.

> What I'm thinking is something like this, to asynchronously process requests and delegate the CPU-bound work to some processes:

This will probably work, but it still has the drawbacks of multiprocessing:

1. you will be serializing 'work' via pickle, which is fraught with problems,
2. you will have no way to tell when 'work' has completed, so you will easily overload all of your worker processes under heavy load.

Instead, using something like ampoule <https://pypi.python.org/pypi/ampoule> would allow you to use Twisted's spawnProcess facility to send and receive data via a more reliable serialization mechanism than pickle, and get straightforward feedback (Deferreds firing with results) when work is complete.
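
To make the shape of that idea concrete without Twisted or ampoule, here is a rough standard-library sketch: start a fresh interpreter, hand it the work as JSON rather than pickle, and learn that the work is done when the reply comes back.  This is not ampoule's API.  The worker program and the name sum_of_squares_in_child are invented for illustration, and unlike spawnProcess this call blocks, so it should not be made from the reactor thread:

```python
import json
import subprocess
import sys

# Hypothetical worker program, run with "python -c" in a fresh interpreter.
WORKER_SOURCE = """
import json, sys
numbers = json.load(sys.stdin)
json.dump({"sum_of_squares": sum(n * n for n in numbers)}, sys.stdout)
"""


def sum_of_squares_in_child(numbers):
    """Run the worker in a new interpreter and return its JSON reply."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER_SOURCE],
        input=json.dumps(numbers),  # request goes to the child's stdin
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit status raises CalledProcessError
    )
    # Reaching this line means the child has exited, so the work is complete.
    return json.loads(completed.stdout)["sum_of_squares"]
```

With spawnProcess or ampoule the same exchange would be non-blocking, and the result would arrive through a Deferred instead of a return value.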

In fairness, even doing this with ampoule takes altogether too much boilerplate.  We should probably have something for quick-and-dirty multiprocessing, like a 'deferToProcess' that just uses pickle and presents a similarly convenient API, spawning Python interpreters as necessary behind the scenes.  So I can understand why you're looking at multiprocessing.  All I can tell you for now is that it is probably worth setting up the necessary infrastructure to do this the "right way": it will be more reliable, and you will rapidly need to expand to bi-directional communication.

Thanks for using Twisted,

-glyph
