[Twisted-Python] sharing a dict between child processes

Scott, Barry barry.scott at forcepoint.com
Wed Nov 6 10:07:17 MST 2019


On Wednesday, 6 November 2019 16:43:52 GMT Waqar Khan wrote:
> Hi Barry,
>         Thanks for the response. Where can I read more about (1). It seems
> like that is something I need to explore.
> As we already have (2) (cache for each process).
> Thanks again for your help.

We use UDS (Unix domain sockets) to talk to a master process.
Twisted has support for this, but you need a small patch to avoid data loss.

UDS does not lose data and is message based, not byte based. We
use pickle to encode requests and responses.
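
As a rough illustration of that idea (a standard-library sketch, not the
actual code we run and not Twisted's API), the following sends pickled
requests and replies as single datagrams over a Unix domain socket pair.
The CacheMaster class, the request dict layout, and MAX_DATAGRAM are all
made up for this example.

```python
import pickle
import socket

MAX_DATAGRAM = 65536  # assumed upper bound on one message, for this sketch only


class CacheMaster:
    """Hypothetical single owner of the shared dict."""

    def __init__(self):
        self.cache = {}

    def handle(self, request):
        # Requests and replies are plain dicts; this shape is invented here.
        if request["op"] == "get":
            key = request["key"]
            return {"found": key in self.cache, "value": self.cache.get(key)}
        if request["op"] == "set":
            self.cache[request["key"]] = request["value"]
            return {"ok": True}
        return {"error": "unknown op"}


def round_trip(worker_sock, master_sock, master, request):
    # Worker side: one pickled request goes out as one datagram.
    worker_sock.send(pickle.dumps(request))
    # Master side: recv() hands back one whole datagram (the buffer is
    # large enough here), so no framing or length prefix is needed.
    reply = master.handle(pickle.loads(master_sock.recv(MAX_DATAGRAM)))
    master_sock.send(pickle.dumps(reply))
    # Worker side: decode the reply.
    return pickle.loads(worker_sock.recv(MAX_DATAGRAM))


# Both ends live in one process here purely to keep the demo self-contained.
worker_sock, master_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
master = CacheMaster()
set_reply = round_trip(worker_sock, master_sock, master,
                       {"op": "set", "key": "user:1", "value": {"name": "example"}})
hit_reply = round_trip(worker_sock, master_sock, master,
                       {"op": "get", "key": "user:1"})
miss_reply = round_trip(worker_sock, master_sock, master,
                        {"op": "get", "key": "user:2"})
worker_sock.close()
master_sock.close()
```

Note that unpickling can execute arbitrary code, so this only makes sense
between processes that trust each other.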

Barry

The patch is:

--- Twisted-18.4.0.orig/src/twisted/internet/unix.py.orig       2018-08-01 12:45:38.711115425 +0100
+++ Twisted-18.4.0/src/twisted/internet/unix.py 2018-08-01 12:45:47.946115123 +0100
@@ -509,11 +509,6 @@
                 return self.write(datagram, address)
             elif no == EMSGSIZE:
                 raise error.MessageLengthError("message too long")
-            elif no == EAGAIN:
-                # oh, well, drop the data. The only difference from UDP
-                # is that UDP won't ever notice.
-                # TODO: add TCP-like buffering
-                pass
             else:
                 raise

With the patch applied, EAGAIN is raised to the caller instead of the
datagram being silently dropped, so you have to handle the EAGAIN error
and do the retries yourself. As it stands the patch is not good enough
to put into Twisted, as a full fix would need Twisted itself to handle
the retries.
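
A minimal sketch of what "do retries yourself" could look like, written
against a plain callable rather than Twisted's transport so that it runs
on its own. The names send_with_retry and FlakySender, the attempt count,
and the backoff delays are assumptions for illustration. In a real
Twisted program you would schedule the retry with reactor.callLater
instead of blocking in a sleep.

```python
import errno
import time


def send_with_retry(send, datagram, attempts=5, delay=0.01, sleep=time.sleep):
    """Call send(datagram), retrying with exponential backoff on EAGAIN."""
    for attempt in range(attempts):
        try:
            return send(datagram)
        except OSError as e:
            # Only a full buffer is retried; any other error propagates.
            if e.errno not in (errno.EAGAIN, errno.EWOULDBLOCK):
                raise
            # Out of attempts: let the caller see the EAGAIN.
            if attempt == attempts - 1:
                raise
            sleep(delay * (2 ** attempt))


class FlakySender:
    """Hypothetical stand-in for a socket whose buffer is full at first."""

    def __init__(self, failures):
        self.failures = failures
        self.sent = []

    def __call__(self, datagram):
        if self.failures > 0:
            self.failures -= 1
            raise BlockingIOError(errno.EAGAIN, "socket buffer full")
        self.sent.append(datagram)
        return len(datagram)


# Record the requested delays instead of actually sleeping.
delays = []
sender = FlakySender(failures=2)
result = send_with_retry(sender, b"ping", sleep=delays.append)
```

Here the first two attempts fail with EAGAIN and the third succeeds, so
the datagram is delivered once rather than dropped.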

I guess (2) does not work for you because the cache hit rate is low
and you need to share the cache to get a benefit. Do cache entries
only get used a few times?

In our case the hit rate is high (99%+) and we just pay the cost of
populating the caches on process start up, which ends up being
noise.

Barry

> 
> On Wed, Nov 6, 2019 at 8:39 AM Scott, Barry <barry.scott at forcepoint.com> wrote:
> > On Wednesday, 6 November 2019 14:21:22 GMT Maarten ter Huurne wrote:
> > > On Wednesday, 6 November 2019 07:19:56 CET Waqar Khan wrote:
> > > > Hi,
> > > > So, I am writing a Twisted server. This server spawns multiple child
> > > > processes using reactor.spawnProcess, each with its own process
> > > > protocol.
> > > > 
> > > > Now, each of the child processes receives some REST requests. Each
> > > > process has a dict that acts as a cache.
> > > > Now, I want to share this dict across the processes.
> > > > In general, Python has SharedMemoryManager in the multiprocessing
> > > > module, which would have helped.
> > > > https://docs.python.org/3/library/multiprocessing.shared_memory.html#multiprocessing.managers.SharedMemoryManager.SharedMemory
> > > > But since I am using Twisted's internal process implementation, how
> > > > do I share this dict across the processes so that all the processes
> > > > use this common cache?
> > > 
> > > Keeping a dictionary in SharedMemoryManager seems far from trivial. I
> > > don't think you can allocate arbitrary Python objects in the shared
> > > memory and even if you could, you would run into problems when one
> > > process mutates the dictionary while another is looking up something or
> > > also mutating it.
> > > 
> > > It could in theory work if you implement a custom lock-less dictionary,
> > > but that would be a lot of work and hard to get right. Also having
> > > shared memory mutations be synced between multiple CPU cores could
> > > degrade performance, since keeping core-local CPU caches in sync is
> > > expensive.
> > > 
> > > Would it be an option to have only one process accept the REST requests,
> > > check whether the result is in the cache and only distribute work to the
> > > other processes if you get a cache miss? Typically the case where an
> > > answer is cached is pretty fast, so perhaps you don't need multiple
> > > processes to handle incoming requests.
> > 
> > We have used a couple of ways to cache.
> > 1. Use a singleton process to hold the cache and ask it, via IPC, for
> >    answers from the other processes.
> > 2. Have a cache in each process.
> > 
> > Barry
> > 
> > > Bye,
> > > 
> > >               Maarten
> > > 
> > > _______________________________________________
> > > Twisted-Python mailing list
> > > Twisted-Python at twistedmatrix.com
> > > https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
