[Twisted-web] html cache with timeout

Valentino Volonghi aka Dialtone dialtone at divmod.com
Tue Feb 1 13:02:27 MST 2005


On Tue, 1 Feb 2005 20:09:57 +0100, Andrea Arcangeli <andrea at cpushare.com> wrote:
> > Anyway the patch is below:
> 
> Looks a great start.
> 
> I'll give it a spin overnight to see what happens.
> 
> > +_CACHE = {}
> 
> Shouldn't this be stored in the respective classes?

There are MANY ideas on where to put this _CACHE.
jamwt volunteered to write a memcached-like cache (see http://www.danga.com/memcached/),
which will probably be one of the backends.

There are also other problems to solve with this cache, but it works with this patch (which is now committed in the caching branch), and people can test it or propose different behaviours.

> > +def CachedSerializer(original, context):
> > +    cached = _CACHE.get(original.name, None)
> > +    life = now()-original.lifetime
> 
> Can we execute only one single gettimeofday? gettimeofday is one of the
> biggest kernel costs of twisted in general (modulo poll). I will deploy
> initially on x86 (on x86-64 with vsyscalls gettimeofday is zerocost).
> 
> Could you also keep it similar to my patch where a timeout <= 0 means
> "cache forever"?

Yep, this is just a first attempt. Further work will be done on the caching branch.

> > +    if cached and cached[0] > life:
> > +##         print "="*20
> > +##         print cached[0]
> > +##         print life
> > +##         print "="*20        
> > +        yield cached[1]
> > +        return
> 
> Why yield if you return immediately? Why not return cached[1]?

Try it yourself :)

In Python you cannot have a return statement with arguments inside a generator.
CachedSerializer is in fact a generator (because of the yield keyword inside the function body), so the cached value has to be yielded and the function then ended with a bare return.
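
A minimal sketch of that yield-then-return pattern, outside of Nevow. The names cached_serializer, cache and render are hypothetical and only mirror the shape of the patch; the point is that whoever iterates the generator receives the value through yield, and the bare return just stops it early.

```python
def cached_serializer(cache, name, render):
    """Generator that yields a cached result if present, else renders it."""
    cached = cache.get(name)
    if cached is not None:
        yield cached
        return  # bare return: simply ends the generator here
    result = render()
    cache[name] = result
    yield result


cache = {}
first = list(cached_serializer(cache, "foobar", lambda: "<p>hello</p>"))
# The second call finds the entry and never invokes render.
second = list(cached_serializer(cache, "foobar", lambda: "<p>changed</p>"))
```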

> > +    _CACHE[original.name] = (now(), result)
> 
> what is contained in original.name? How to identify exactly which object
> is being cached? (just to understand how should I use this exactly)

original.name is the first argument of the tag instance:

t.cached(name="foobar")

This creates an empty cached tag named foobar. You can also do:

t.cached(name=(IFoo, IBar)) 

as was suggested, if you need it. No check is done on the type of name, but it must be hashable.
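
The hashability requirement comes from the name being used as a dictionary key. A small illustration with a plain dict; the strings "IFoo" and "IBar" stand in for real interface objects, and is_usable_name is a hypothetical helper, not part of the patch:

```python
cache = {}
cache["foobar"] = "rendered page"
cache[("IFoo", "IBar")] = "rendered fragment"  # tuples of hashables work


def is_usable_name(name):
    """Return True if name can be used as a dictionary key."""
    try:
        hash(name)
    except TypeError:
        return False
    return True
```

A list such as ["IFoo", "IBar"] would be rejected by this check, because lists are mutable and therefore not hashable.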

> Do I understand correctly that this more fine-grained cache doesn't obsolete
> the other cache?

I think it does, and it surely will if someone writes the flatsax stuff needed to use it with xhtml templates.

With stan you can do:

    docFactory = loaders.stan(t.cached(name="MainPage", lifetime=10)[t.html[....]])

This does the same thing as the first, older patch. I also get similar performance with the new patch: 26 req/sec, and it shouldn't be any slower.
 
> The other cache is probably the fastest we can get, and it pretty much
> solves my problem for the high traffic part.

I still think 250 req/sec is too much. Are you sure that is not the redirect page in guard?
 
> PS. Still I would like to see compy removed, since not everything will
> be cached. There are parts where I will not cache anything.

I've talked to dp and he said that compy will stay only so as not to depend on twisted, but it will definitely use zope.interface directly if present.


