[Twisted-web] Generating cache headers for dynamic data

Mary Gardiner mary-twisted at puzzling.org
Sat Aug 20 23:57:14 MDT 2005


I think this is round 2 of this discussion, but I'd appreciate thoughts
anyway.

I have a Nevow application. It has the following properties:
 1. it is the same for all users; there is no user-specific output
 2. part of the data source, and hence some of the pages (say 5-10 out
    of 600-odd), changes about twice a week
 3. the vast majority of pages remain the same for periods of weeks or
    months

As far as I can tell, this makes it a pretty good candidate for the
various cache headers (Last-Modified and ETag). In fact I'm sure it's a
pretty good candidate, because I've seen how much bandwidth the
Googlebot and the RSS aggregators suck if I don't use them.

The problem is giving a sufficient guarantee of byte-equality. As I
understand the purpose of those headers, if a page has a Last-Modified
unchanged from the last time it was accessed, or has an ETag unchanged
from the last time it was accessed, the body of the response should be
guaranteed to be byte-for-byte identical insofar as that is possible.
(I think foom told me that in practice, Apache regards mtime, file size
and inode numbers as a sufficient guarantee of this.)
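
As a rough sketch of that Apache-style check (the helper name
file_validator is made up for illustration, and the exact format Apache
uses may differ), the three stat fields can be folded into one string:

```python
import os

def file_validator(path):
    """Hypothetical helper: summarise a file by inode, size and mtime.

    This is only a stand-in for byte-equality. Two different bodies
    could in principle share all three values.
    """
    st = os.stat(path)
    # Hex fields joined by dashes; the layout is an arbitrary choice.
    return "%x-%x-%x" % (st.st_ino, st.st_size, int(st.st_mtime))
```

If the string is the same as last time, the file is assumed unchanged.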

As best I can tell, a realistic and reasonably bullet-proof way for me
to tell that my output has not changed would be to check for:
 - the same version of Nevow
 - the same version of my Nevow application
 - the same template (as checked by the above logic of Apache's)
 - the same data source (mine are files on disk, so again I can use the
   above logic)
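
Roughly, the four checks above could be combined into a single ETag
like this. It is only a sketch: page_etag and its parameters are
hypothetical names, and the two version strings are passed in by the
caller because I don't yet have a good way to obtain them.

```python
import hashlib
import os

def page_etag(nevow_version, app_version, template_path, data_paths):
    """Hypothetical helper: hash all four ingredients into one ETag."""
    parts = [nevow_version, app_version]
    for path in [template_path] + list(data_paths):
        st = os.stat(path)
        # Inode, size and mtime stand in for the file's contents.
        parts.append("%s:%d:%d:%d" % (path, st.st_ino, st.st_size,
                                      int(st.st_mtime)))
    h = hashlib.sha1()
    for part in parts:
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator so adjacent parts cannot run together
    # ETag values are sent as quoted strings.
    return '"%s"' % h.hexdigest()
```

Any change to a version string, the template or a data file gives a
different ETag, so a conditional request would no longer match.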

So that's all well and good for the last two, but does anyone have a
way to detect the first two other than storing a version variable in
my source code and in the Nevow source code?

-Mary


