[Twisted-Python] caching suggestions

Jean-Paul Calderone exarkun at divmod.com
Thu Jun 14 15:43:57 EDT 2007


On Thu, 14 Jun 2007 15:29:08 -0400, Jonathan Vanasco <twisted-python at 2xlp.com> wrote:
>note: this is less about twisted than a project i've built on twisted that
>is having some issues.  everyone here is smart, so i thought this would be
>a good list to ask on.
>
>i've got a small twisted daemon that is working as a proxy for hosting
>content that we store on amazon-s3
>long story short: archive to s3 (redundancy, cheap storage), twisted daemon
>on our network fetches to cache & serve.  we have cheaper bandwidth, plus
>do a lot of file monitoring / name abstraction that amazon's network won't
>support.  currently there are 10k documents on s3, which are accessible via
>150k+ 'keys'.  the cache basically proxies the right doc for each key
>
>I'm running into 2 issues with it:
>         short term- keys are mapped to s3 files via a ton of 'hints' that i
>store in bdb after fetching from postgresql.  the hints need to be
>refreshed every 1-5 minutes or so.  does anyone have a good suggestion on
>how to do that?  i could do this really easily with sqlite, but i need to
>use bdb -- as sqlite isn't nearly fast enough;  while bdb is.
>
>         long term- everything is fine for now -- we only have about 3gb
>of data.  but that's going to be growing to about 20gb soon, and we want to
>limit the cache to an active 10gb using an Adaptive Replacement Cache
>algorithm with a bdb datastore.  has anyone done something similar in
>twisted or python in general?
>
>any input would be appreciated.  thanks!

Sounds like an interesting project.  With regard to bdb, in my experience
the Python bindings are very flaky, and the bdb API isn't very friendly in
the first place.  It is probably possible to write something that works on
bdb with Python, but it's pretty hard.  bdb does handle that scale well enough
though - I have a Twisted-based server running against a 21GB bdb database
at the moment (although we're trying to phase it out in favor of something
based on SQLite ;P).
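
On the size cap, here is a rough sketch of the bookkeeping involved, in case
it helps frame the problem.  To be clear about the assumptions: this is plain
LRU eviction, not Adaptive Replacement Cache, and it keeps everything in an
in-memory OrderedDict rather than bdb.  The class name ByteBoundedLRU and its
methods are made up for illustration and are not part of Twisted or any
library.

```python
from collections import OrderedDict


class ByteBoundedLRU:
    """Hypothetical cache that evicts least-recently-used entries once a
    byte budget is exceeded.  Plain LRU, not ARC; in-memory, not bdb."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> bytes, least recently used first

    def get(self, key):
        data = self._items.get(key)
        if data is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return data

    def put(self, key, data):
        # Replacing an existing entry: release its bytes first.
        if key in self._items:
            self.used_bytes -= len(self._items.pop(key))
        if len(data) > self.max_bytes:
            return  # larger than the whole budget, so never cached
        self._items[key] = data
        self.used_bytes += len(data)
        # Evict from the least-recently-used end until back under budget.
        while self.used_bytes > self.max_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)


# Usage with a tiny 10-byte budget standing in for the 10gb limit.
cache = ByteBoundedLRU(max_bytes=10)
cache.put("a", b"12345")
cache.put("b", b"1234")
cache.get("a")          # touch "a", so "b" becomes the eviction candidate
cache.put("c", b"123")  # 12 bytes total exceeds the budget, so "b" is evicted
```

ARC would replace the single ordering with separate recency and frequency
lists plus ghost entries, but the byte accounting stays much the same.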

Jean-Paul



