[Twisted-Python] twisted.web and MySQLdb

Glyph Lefkowitz glyph at twistedmatrix.com
Wed Oct 29 09:43:36 EST 2003

Nathan Seven wrote:

> Hmmmm I would qualify that- I don't think the
> filesystem is the place to be handling dynamic data.

The filesystem is a fine place.

For example, in the Prevayler persistence model, you just write logfiles 
to disk, and synchronize your state at a checkpoint.  For highly dynamic 
applications, especially ones which require failover (you can replay 
the transaction log live, after all), this works quite well.
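
To make the log-and-checkpoint idea concrete, here is a minimal sketch
in Python.  It is not Prevayler's API (Prevayler is a Java library); the
Journal class, its method names, the file names, and the choice of JSON
are all assumptions made up for illustration.  State lives in memory,
every change is appended to a log before it is applied, and a
checkpoint writes a snapshot so the log can start over.

```python
import json
import os


class Journal:
    """Hypothetical Prevayler-style store: state in memory, changes in a log."""

    def __init__(self, directory):
        # File names are arbitrary choices for this sketch.
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "transactions.log")
        self.state = {}
        self._recover()
        self._log = open(self.log_path, "a")

    def _recover(self):
        # Start from the last checkpoint, if there is one.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.state = json.load(f)
        # Then replay every transaction logged since that checkpoint.
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, txn):
        # The only transaction type here is "set key to value", which is
        # safe to replay more than once.
        key, value = txn
        self.state[key] = value

    def execute(self, key, value):
        # Write the transaction to disk first, then change memory.
        self._log.write(json.dumps([key, value]) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self._apply([key, value])

    def checkpoint(self):
        # Write the snapshot to a temporary file and rename it into
        # place, so a crash never leaves a half-written snapshot.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)
        # The snapshot now covers everything in the log; start it over.
        self._log.close()
        self._log = open(self.log_path, "w")

    def close(self):
        self._log.close()
```

Opening a second Journal on the same directory replays the log on top
of the snapshot, which is the same mechanism a failover node would use
to catch up.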

> Databases were created *specifically* for this
> purpose.

I think that databases were specifically designed to store accounting 
information, actually.

> Sure, storing all your static blobs in your
> database is a really quick way to grind shit to a
> halt, but locking and concurrency?
> If you're doing things properly, and your http server
> is just serving static objects, then these are
> non-issues.

Databases can be amazingly slow, especially if you have a lot of updates 
to do.  (Even a very fast database can be made slow by I/O bottlenecks 
if you are trying to make it remote for scalability reasons.)  This has 
an easy solution: you can cache everything!  Of course, then you need to 
be able to easily access the cache from all of the machines, because it 
may have been updated.  Now you have problems with coherency.  Then you 
need to lock the cache before you read from it, because it could have 
been updated in the meantime.

Pretty soon you're talking to your caching server as if it were a 
database.  This is _great_ if you are Livejournal:

> Yeah- through my line of work I deal with a *lot* of
> different infrastructures.  Everything from "Joe's BBQ
> Sauce Garage" to Amazon.  Literally the only
> organization I can think of that can keep anything
> coherent with MySQL is Livejournal- and then I believe
> only because Brad seems to be a cache-god with
> memcached and such.

because then you don't have to worry about computation, mutable data, 
etc - you're basically just storing data and then spitting it back out, 
and you don't care if the timestamps are a little off.

This is the important point about LAMP and Twisted:

There are applications which can connect to HTTP which are not blogs.

If you are writing a multiplayer game which wants to support lots of 
concurrent users, you can't afford to spawn a thread and do a database 
request every time a player picks something up.  (Python is quite slow 
enough already, thanks.)  You can't just use a cache because the data 
changes _all the time_, and every piece of code that touches that data 
has to see the current value.  Working with your objects directly in 
memory is close to the only option.

If you're writing a real-time financial data system, you do want to use 
a database, but you want to very carefully control your access to it. 
Certainly, you don't want to equate 'web hit' with 'database query', as 
the LAMP model is wont to do.
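
One way to read "carefully control your access" is to serve web hits
from memory and let the database see changes only in batches.  The
sketch below is an assumption about what that could look like, not a
description of any particular system: the QuoteStore class, the quotes
table, and the use of sqlite3 as a stand-in database are all
hypothetical.

```python
import sqlite3


class QuoteStore:
    """Hypothetical store: reads come from memory, writes go out in batches."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS quotes "
            "(symbol TEXT PRIMARY KEY, price REAL)"
        )
        self.latest = {}    # what web requests read from
        self.dirty = set()  # symbols changed since the last flush

    def update(self, symbol, price):
        # Called by the data feed; only memory is touched.
        self.latest[symbol] = price
        self.dirty.add(symbol)

    def get(self, symbol):
        # A web hit lands here and never touches the database.
        return self.latest.get(symbol)

    def flush(self):
        # One transaction for every symbol changed since the last flush,
        # no matter how many updates or web hits happened in between.
        rows = [(s, self.latest[s]) for s in self.dirty]
        with self.conn:
            self.conn.executemany(
                "INSERT OR REPLACE INTO quotes (symbol, price) "
                "VALUES (?, ?)",
                rows,
            )
        self.dirty.clear()
        return len(rows)
```

In a Twisted application, flush() could be scheduled with
twisted.internet.task.LoopingCall, although a blocking database call
like this one would stall the reactor for as long as it runs, so a real
system would more likely go through twisted.enterprise.adbapi.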

Or maybe you're writing an application that has to operate as a 
client-side proxy, and you don't have the leisure of a DBA at every 
desk, so you can't require that an RDBMS gets set up with each 
installation.  This might require some hackish workarounds with the 
filesystem that you'd rather not do, but nevertheless, it's better than 
having the user editing pg_hba.conf themselves.
