mark nottingham

Expires vs. max-age

Tuesday, 15 May 2007

HTTP Caching

I occasionally get a question from readers of the caching tutorial about whether to use the Expires header or Cache-Control: max-age to control a response’s freshness lifetime.

Some people claim that Expires is better, because it’s defined by HTTP/1.0, whereas Cache-Control only came about in HTTP/1.1. Since there are still HTTP/1.0 agents out there, the reasoning goes, this is the safer path.

The problem with that line of reasoning is that HTTP versions aren’t black and white like this; just because something advertises itself as HTTP/1.0, doesn’t mean it doesn’t understand HTTP/1.1 (see RFC2145 for more). In fact, since Cache-Control was one of the earlier mechanisms in HTTP/1.1, virtually every cache implementation out there understands max-age, whatever version of HTTP they advertise.

So, this puts the two on an even footing. What tips the scales in favour of Cache-Control: max-age is its relative simplicity. Consider:

Cache-Control: max-age=3600

as opposed to

Expires: Tue, 15 May 2007 07:19:00 GMT

CC: max-age is just a straight integer number of seconds, while Expires has a somewhat complex date format. From what I’ve seen in various implementations, this makes a difference; even small errors in generating the Expires value (e.g., omitting the leading ‘0’ from the hour) can cause downstream caches to misinterpret it. It happens more often than you think.
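To make the contrast concrete, here is a minimal Python sketch that derives both values from a single lifetime. The helper name freshness_headers and its now parameter are hypothetical, made up for this illustration. The date formatting is handed to the standard library's email.utils.format_datetime rather than assembled by hand, on the assumption that a well-tested formatter is less likely to drop a leading zero.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def freshness_headers(lifetime_seconds, now=None):
    """Return both freshness headers for one lifetime (hypothetical helper)."""
    if now is None:
        # Work in UTC from the start so no time zone conversion is needed later.
        now = datetime.now(timezone.utc)
    expires_at = now + timedelta(seconds=lifetime_seconds)
    return {
        # max-age is just the integer number of seconds.
        "Cache-Control": f"max-age={int(lifetime_seconds)}",
        # Expires needs the full HTTP date; usegmt=True makes the zone read "GMT".
        "Expires": format_datetime(expires_at, usegmt=True),
    }
```

Called with a lifetime of 3600 and a now of 15 May 2007 06:19:00 UTC, this gives the same two header values as the example above.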

Compounding these errors are the caches themselves; in testing a variety of commercial and open source implementations, I found that a large number flout this requirement:

HTTP/1.1 clients and caches MUST treat other invalid date formats, especially including the value “0”, as in the past (i.e., “already expired”).

…which means that any errors you make might have unpredictable results, potentially allowing your response to be cached for longer than you intended.
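For illustration, this is roughly what that requirement asks of a cache, as a minimal Python sketch. The function names expires_to_datetime and is_fresh are hypothetical, and the sketch looks only at Expires; a real cache would also weigh Cache-Control, Date and Age. Parsing uses the standard library's email.utils.parsedate_to_datetime, and anything it rejects is mapped to a time in the past.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def expires_to_datetime(value):
    """Parse an Expires value; anything unparseable counts as already expired."""
    already_expired = datetime.fromtimestamp(0, timezone.utc)
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Invalid date formats, including the value "0", land here.
        return already_expired
    if parsed.tzinfo is None:
        # Assumption for this sketch: a date without a zone is read as GMT.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_fresh(expires_value, now):
    """True only while `now` is before a validly parsed Expires time."""
    return now < expires_to_datetime(expires_value)
```

The design choice is to fail towards "expired": a malformed value costs an extra revalidation, rather than letting the response be cached for longer than intended.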

Furthermore, if you forget to update your Expires time, or get the time zone conversions wrong (source of many an error), you’ll end up with unpredictable results as well.

So, my recommendation is to either use a well-tested library (e.g., mod_expires) to generate both Expires and Cache-Control: max-age for you, or to only generate CC: max-age if you’re doing it yourself, to reduce the chance of messing things up.


7 Comments

PJ said:

Van Jacobson makes some good points about the value of immutability in addressing in his talk about the next level of networking (http://video.google.com/videoplay?docid=-6972678839686672840)

Maybe reason enough? There’s already at least one half-way implementation of some of his ideas: web proxies that set themselves up into a p2p network and share a distributed cache. I forget the name of the implementation though :(

–pj

Wednesday, May 16 2007 at 4:31 AM

Patrick Mueller said:

I did a quick scan through the Cache-Control section of the HTTP 1.1 spec, and didn’t see any mention of how to handle immutable objects. The most recent case I’ve run into are the version-specific javascript and css files that go with some of the larger ajax-y, web 2.0-y frameworks, like yui.

I think there’s value in being able to mark these as immutable, because in fact they won’t ever change. When they do change, a new copy of the files will exist in some other URL on the web. Or you could arrange for that to happen.

What sort of headers would you recommend in this case?

Wednesday, May 16 2007 at 12:22 PM

Jon Hanna said:

Immutability is indeed very useful in a variety of cases and immutable-at-this-version adds to them.

I think there is still a serious advantage in not believing anything is immutable (hence the rules about not advertising more than 1 year before expiry and not believing more than 1 year). RFCs (to use Mark’s example) are immutable by agreed convention, not physical law. Agreed conventions become disagreed-upon conventions and finally obsolete conventions. This might never happen in any one given case, but you can guarantee that some things considered immutable today will be gone in a few years’ time.

As for the risks of setting Expires headers wrong if you roll your own: this is true, but the main risks come down to failure to format dates correctly, and you may well have to use that format in other headers (e.g. Last-Modified). I would say that if you are going to have to write code for that formatting then you might as well use it again in Expires.

Thursday, May 31 2007 at 3:27 AM

Andrew Hsu said:

Sorry to nitpick, but I believe there is a typo in the example you gave where ‘Expires’ occurs twice:

Expires: Expires: Tue, 15 May 2007 07:19:00 GMT

Cheers, Andrew

Friday, September 21 2007 at 10:05 AM

Sam said:

Mark, thanks for the tutorial. Hopefully I didn’t miss this, but something I don’t fully understand is what happens after a max-age is reached? Will the user-agent then start pinging every time to see if-modified? Or will it reset the timer? E.g., say image.gif has a max age of 30 days. UA visits the site every day for 30 days and makes no request because max-age is still fresh. Now on Day 31 it checks and server returns 304 status, not-modified. Does the 30-day max-age “timer” restart or is it now always past the max-age and thus will check each time forward? If the latter, once past, it seems like a fair amount of latency will appear once a bunch of cached objects have passed their max-age. Hopefully that’s not the case.

Thursday, August 20 2009 at 7:37 AM