[Twisted-Python] sharing a dict between child processes

Wed Nov 6 09:35:27 MST 2019

Hi Maarten,
  Thanks for the response.
When you say "when one process mutates the dictionary while another is
looking up something or also mutating it", do you mean that an individual
key/value pair is being modified, or that the dict as a whole is being
modified?
The first case is not really a concern, as the key-to-value mapping is
unique (so all the processes will need the same value for the same key). So
reads and writes to the dict don't really have to be "threadsafe" or anything
like that.
But the dict being modified, and those changes being made available to the
rest of the processes, will be common.
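
To illustrate what I mean by the mapping being unique, here is a minimal
sketch. The names (merge_updates, the sample keys) are made up for this
example and are not from our code. It only shows that the logical contents
of the cache do not depend on the order in which processes report entries;
it says nothing about whether concurrent mutation of one dict in shared
memory is safe.

```python
def merge_updates(cache, updates):
    """Apply (key, value) updates to cache, assuming each key has one value."""
    for key, value in updates:
        existing = cache.setdefault(key, value)
        if existing != value:
            # Would violate the assumption that a key maps to a unique value.
            raise ValueError(f"conflicting values for {key!r}")
    return cache


# Hypothetical updates reported by two different child processes.
updates_a = [("user:1", "alice"), ("user:2", "bob")]
updates_b = [("user:2", "bob"), ("user:3", "carol")]

# Whichever process reports first, the resulting cache is the same.
a_then_b = merge_updates({}, updates_a + updates_b)
b_then_a = merge_updates({}, updates_b + updates_a)
```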

The thing is, the major cost of our task is I/O. So, when a request comes
in, we fetch some data and then cache it. Now, each process has its own
cache, and that is very inefficient. One idea is to share the cache across
processes.
Does that make sense?
Thanks for the help.

On Wed, Nov 6, 2019 at 6:22 AM Maarten ter Huurne <maarten at treewalker.org>
wrote:

> On Wednesday, 6 November 2019 07:19:56 CET Waqar Khan wrote:
> > Hi,
> > So, I am writing a twisted server. This server spawns multiple child
> > processes using reactor.spawnProcess that initializes a process
> > protocol.
> >
> > Now, each of the child processes receives some REST requests. Each
> > process has a dict that acts as a cache.
> > Now, I want to share the dict across processes.
> > In general, Python has SharedMemoryManager in the multiprocessing
> > module, which would have helped:
> > https://docs.python.org/3/library/multiprocessing.shared_memory.html#multiprocessing.managers.SharedMemoryManager.SharedMemory
> > But since I am using twisted's internal process implementation, how do
> > I share this dict across the processes so that all the processes use
> > this common cache?
>
> Keeping a dictionary in SharedMemoryManager seems far from trivial. I
> don't think you can allocate arbitrary Python objects in the shared
> memory and even if you could, you would run into problems when one
> process mutates the dictionary while another is looking up something or
> also mutating it.
>
> It could in theory work if you implement a custom lock-less dictionary,
> but that would be a lot of work and hard to get right. Also having
> shared memory mutations be synced between multiple CPU cores could
> degrade performance, since keeping core-local CPU caches in sync is
> expensive.
>
> Would it be an option to have only one process accept the REST requests,
> check whether the result is in the cache and only distribute work to the
> other processes if you get a cache miss? Typically the case where an
> answer is cached is pretty fast, so perhaps you don't need multiple
> processes to handle incoming requests.
>
> Bye,
>                 Maarten
>
>
>
> _______________________________________________
> Twisted-Python mailing list
> Twisted-Python at twistedmatrix.com
> https://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
>
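
The single-front-process design suggested in the quoted reply could be
sketched roughly as follows. This is only an illustration under assumptions:
FrontCache, handle and slow_fetch are hypothetical names, and slow_fetch is a
plain function standing in for handing work to a child process. In a real
Twisted server the dispatch step would be asynchronous rather than a blocking
call.

```python
class FrontCache:
    """Answer from a local cache; call dispatch only on a cache miss."""

    def __init__(self, dispatch):
        self._dispatch = dispatch  # callable that does the expensive work
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def handle(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._dispatch(key)  # stand-in for sending work to a child
        self._cache[key] = value
        return value


def slow_fetch(key):
    # Hypothetical placeholder for the I/O-bound fetch done by a worker.
    return f"data-for-{key}"


front = FrontCache(slow_fetch)
first = front.handle("a")   # miss: dispatched to the worker stand-in
second = front.handle("a")  # hit: served from the front process's cache
```

With this layout only the front process ever touches the dict, so no
cross-process sharing or locking of the cache is needed.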