[Twisted-Python] In memory cache in twisted

Waqar Khan wk80333 at gmail.com
Thu Sep 26 21:48:35 MDT 2019


Hi Maarten,
   I think you have hit the nail on the head. I do think this is
plausible, as I have observed that things get better as the size of the
cache increases, which might support your theory.

Could you share a simple example of how to "put a Deferred for the
fetch operation" in the cache? I am really just getting started with
Twisted.
Thanks for all the help.


On Thu, Sep 26, 2019 at 11:40 PM Maarten ter Huurne <maarten at treewalker.org>
wrote:

> On Friday, 27 September 2019 04:38:46 CEST Waqar Khan wrote:
> > Hi,
> >   What's a good way to use a simple dictionary as a cache in the
> > Twisted framework?
> > Basically, I have a callback chain in which I ultimately make a REST
> > call (in a non-blocking way, using treq) to fetch some data. But
> > before I make the call, I check a dictionary to see whether the value
> > is already available. However, I have noticed that the event loop
> > gets pretty busy (sometimes things get stuck and the Twisted server
> > stalls) as soon as I add this logic, which is pretty much:
> >
> > @defer.inlineCallbacks
> > def fetch(key):
> >     if key in cache:
> >         return cache[key]
> >     # else call out to treq to fetch the value
> >     value = yield ...
> >     cache[key] = value
> >     return value
> >
> > This dict can grow to around 50k entries. What's a good way to solve
> > this issue?
>
> If it gets stuck, the cause is probably in the part of the code you
> omitted, so it would help if you elaborated on exactly how the value
> is fetched.
>
> I can see two other problems with this caching mechanism though:
>
> 1. Items are never removed from the cache, so unless there is a limit to
> the number of different keys that can be used, the cache can grow
> indefinitely. You might want something like an LRU cache rather than a
> plain dictionary.
>
> https://docs.python.org/3/library/functools.html#functools.lru_cache
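>
> (functools.lru_cache caches per-call results and knows nothing about
> Deferreds, so for this use case a small hand-rolled bounded mapping
> may fit better. A minimal sketch built on collections.OrderedDict;
> the maxsize is illustrative:)
>
> from collections import OrderedDict
>
> class LRUCache:
>     """Mapping that evicts the least recently used entry when full."""
>
>     def __init__(self, maxsize=50000):
>         self._data = OrderedDict()
>         self._maxsize = maxsize
>
>     def __contains__(self, key):
>         return key in self._data
>
>     def __getitem__(self, key):
>         self._data.move_to_end(key)  # mark as most recently used
>         return self._data[key]
>
>     def __setitem__(self, key, value):
>         self._data[key] = value
>         self._data.move_to_end(key)
>         if len(self._data) > self._maxsize:
>             self._data.popitem(last=False)  # evict the oldest entry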
>
> 2. If a lot of clients are requesting the same thing, you won't see any
> benefits from caching until the first request completes. So you could
> get a pattern like this:
>
> T=0: key A requested, A is not cached, start fetch #1 of A
> T=1: key A requested, A is not cached, start fetch #2 of A
> T=2: key A requested, A is not cached, start fetch #3 of A
> T=3: key A requested, A is not cached, start fetch #4 of A
> T=4: key A requested, A is not cached, start fetch #5 of A
> T=5: fetch #1 of A completes and is added to the cache
> T=6: key A requested, A is cached, return value immediately
>
> In this example, the value for A is fetched 5 times despite the caching
> mechanism. If a fetch takes a long time compared to the rate at which
> requests arrive, this effect gets worse at a quadratic rate: the total
> time spent fetching equals the number of requests that arrive while the
> first fetch is in progress, multiplied by the duration of a single
> fetch.
>
> To avoid this, you could put a Deferred for the fetch operation in the
> cache or in a separate dictionary and if you get another request for the
> same key before the fetch completes, return that Deferred instead of
> starting another fetch.
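>
> For example (a minimal sketch; fetch_value() is a hypothetical
> treq-based helper that returns a Deferred firing with the value).
> Note that each caller gets its own Deferred chained off the shared
> one: handing the very same Deferred to several callers can misbehave,
> because one caller's callbacks would alter the result the next caller
> sees.
>
> from twisted.internet import defer
> from twisted.python import failure
>
> cache = {}  # key -> Deferred that fires with the fetched value
>
> def fetch(key):
>     if key not in cache:
>         d = fetch_value(key)  # hypothetical treq call, returns a Deferred
>
>         def uncache(fail):
>             # Don't cache failures: drop the entry so a later
>             # request retries the fetch.
>             del cache[key]
>             return fail
>
>         d.addErrback(uncache)
>         cache[key] = d
>
>     # Give each caller its own Deferred so one caller's callbacks
>     # cannot change the result the others see.
>     result = defer.Deferred()
>
>     def forward(value):
>         if isinstance(value, failure.Failure):
>             result.errback(value)
>         else:
>             result.callback(value)
>         return value  # leave the cached result untouched
>
>     cache[key].addBoth(forward)
>     return result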
>
> Bye,
>                 Maarten