2008-12-09
This is a set of functional tests for determining how client-side HTTP caches operate.
Note that while they are implemented using XMLHttpRequest (a JavaScript HTTP client), most implementations should use the same cache as for "normal" requests. However, there may be variations.
Each group of tests explains what is being tested and what the implications of failure are. Although many of the tests are automated, some may require user interaction, via a "run test" button. Be sure to follow any instructions carefully.
These tests may have unpredictable results if you have instructed your browser to force-reload the page (e.g., by hitting shift-reload or pressing return in the address field), or if there is a proxy cache between your browser and the server. For best results, clear your cache before loading this page.
See this blog entry for more information.
Cache-Control headers in HTTP requests allow clients to control how downstream caches operate. In APIs like XMLHttpRequest, they can also be used to control how the local (i.e., in-browser) cache operates.
Request cache-control headers can be used by page authors to control the behaviour of the local as well as downstream caches. For example, an author may want to bound the freshness of a response based on the request context, or invalidate the cache from code.
If the browser cache does not pay attention to these directives, it can't be controlled by authors.
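As a rough illustration of what "paying attention" to request directives means, the following sketch models the decision a local cache might make before reusing a stored response. The function names (`parse_directives`, `may_serve_stored`) and the idea of passing the age and freshness lifetime in as plain numbers are assumptions made for this example; only `no-cache` and `max-age` are modelled, and directives such as `max-stale` and `min-fresh` are ignored.

```python
def parse_directives(header_value):
    """Split a Cache-Control value into {name: argument-or-None}.

    Simplification: quoted arguments that contain commas are not handled.
    """
    directives = {}
    for part in (header_value or "").split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('" ') or None
    return directives


def may_serve_stored(request_cache_control, current_age, freshness_lifetime):
    """Return True if the stored response can be used without a round-trip.

    current_age and freshness_lifetime are in seconds.
    """
    directives = parse_directives(request_cache_control)
    if "no-cache" in directives:
        return False  # the caller asked to bypass the stored copy
    max_age = directives.get("max-age")
    if max_age is not None and current_age > int(max_age):
        return False  # older than the caller is willing to accept
    return current_age < freshness_lifetime  # otherwise, ordinary freshness
```

For instance, a stored response that is 10 seconds old with a 60-second lifetime would be reused for a plain request, but not for one that carries `Cache-Control: max-age=5`.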
Some implementations try to automatically bust upstream caches for XMLHttpRequest calls, because many people use XHR as an RPC mechanism. However, doing so takes control away from calling code that might want to take advantage of the cache, and reduces cache efficiency.
Implementations that fail this test have automatically appended Cache-Control request headers.
Validation is one of the primary caching mechanisms in HTTP; it allows a cache to see if it can reuse an entity it already has, by asking the server if it's still fresh.
HTTP defines request headers that can be used to make conditional requests, which validate cached representations against the server. Implementations that fail these tests don't send the appropriate request headers that trigger validation.
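The conditional request headers in question are If-None-Match (paired with a stored ETag) and If-Modified-Since (paired with a stored Last-Modified). A minimal sketch of how a cache might derive them from a stored response follows; the function name `validation_headers` and the dict-based header representation are assumptions for illustration only.

```python
def validation_headers(stored_headers):
    """Build conditional request headers from a stored response's validators."""
    stored = {name.lower(): value for name, value in stored_headers.items()}
    conditional = {}
    if "etag" in stored:
        # Ask the server to answer 304 if the entity tag still matches.
        conditional["If-None-Match"] = stored["etag"]
    if "last-modified" in stored:
        # Ask the server to answer 304 if nothing changed since this date.
        conditional["If-Modified-Since"] = stored["last-modified"]
    return conditional
```

A stored response with neither validator yields an empty dict, which means the cache has nothing to validate with and must fetch the full representation again.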
The 304 Not Modified status code indicates that a cached representation is still fresh. Generally, it isn't useful to expose this to authors; a 200 OK response should instead be constructed from the cache.
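One way to picture this is a small function that sits between the network and the calling code. The sketch below is an assumption-laden illustration: `response_for_caller` and the shape of the `stored` entry (a dict with `headers` and `body`) are hypothetical, and header names are assumed to be already normalised to one case.

```python
def response_for_caller(status, headers, body, stored):
    """Return (status, headers, body) as the calling code should see it."""
    if status == 304 and stored is not None:
        merged = dict(stored["headers"])
        merged.update(headers)  # headers on the 304 replace the stored ones
        return 200, merged, stored["body"]
    # Anything else is passed through unchanged.
    return status, headers, body
```

With this arrangement the author's code only ever sees a 200 with the cached body, whether or not a validation round-trip took place.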
Another key caching technique is allowing the server to specify that a representation is fresh for a given amount of time, so that caches can avoid round-trips to check if it's fresh altogether.
Servers can instruct caches to use a stored response without validation. Additionally, clients can use a heuristic (usually based on the Last-Modified header) if no explicit freshness information is present.
Implementations that fail this test do not take advantage of these hints.
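The sources of freshness information described above can be sketched as a single function that returns a lifetime in seconds. This is a simplified model, not any browser's algorithm: the name `freshness_lifetime` is hypothetical, and the heuristic used here (one tenth of the interval between Last-Modified and Date) is the typical fraction suggested by RFC 2616, which real implementations are free to vary.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def _parse_date(value):
    """Parse an HTTP date; return None if it is missing or malformed."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError, IndexError):
        return None
    # Treat dates without a zone as UTC so that they can be subtracted.
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def freshness_lifetime(headers):
    """Return how long (in seconds) a response may be used without validation."""
    h = {name.lower(): value for name, value in headers.items()}
    # 1. Explicit: Cache-Control: max-age
    for part in h.get("cache-control", "").split(","):
        name, _, arg = part.strip().partition("=")
        if name.lower() == "max-age" and arg.strip().isdigit():
            return int(arg)
    date = _parse_date(h.get("date"))
    if date is None:
        return 0
    # 2. Explicit: Expires minus Date
    if "expires" in h:
        expires = _parse_date(h["expires"])
        if expires is None:
            return 0  # a malformed Expires is treated as already expired
        return max(0, int((expires - date).total_seconds()))
    # 3. Heuristic: a tenth of the time since the last modification
    last_modified = _parse_date(h.get("last-modified"))
    if last_modified is not None:
        return max(0, int((date - last_modified).total_seconds()) // 10)
    return 0
```

A response last modified a day before it was served, with no explicit information, would get a heuristic lifetime of 8640 seconds under this model.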
HTTP requires that URIs with a query string (i.e., those containing a "?") not be cached, unless the server gives explicit freshness information.
Note that failing the first test is an indication that a freshness heuristic may not be used, which may be desirable behaviour for some applications.
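The query-string rule can be expressed as a small predicate. The sketch below only models this one rule; the function names and the `has_explicit_freshness` flag (true when the response carries Expires or Cache-Control: max-age) are assumptions for illustration.

```python
def may_apply_heuristic(url):
    """A freshness heuristic is only allowed for URIs without a '?'."""
    without_fragment = url.split("#", 1)[0]
    return "?" not in without_fragment


def may_treat_as_fresh(url, has_explicit_freshness):
    """Explicit freshness information always suffices; heuristics may not."""
    return has_explicit_freshness or may_apply_heuristic(url)
```

Under this model, `http://example.com/search?q=cache` can only be served from cache without validation when the server has supplied explicit freshness information.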
Certain 3xx redirects are allowed to be cached. 301 Moved Permanently is cacheable by default, while 302 Found and 307 Temporary Redirect both need explicit information. 303 See Other is not cacheable.
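These rules amount to a small lookup table. The following sketch encodes only the four status codes named above; `REDIRECT_RULES` and `redirect_may_be_cached` are hypothetical names, and prohibitions such as Cache-Control: no-store are deliberately not modelled.

```python
# Rules as stated in the text: cacheable by "default", only with
# "explicit" freshness information, or "never".
REDIRECT_RULES = {301: "default", 302: "explicit", 303: "never", 307: "explicit"}


def redirect_may_be_cached(status, has_explicit_freshness):
    """Return True if a redirect with this status code may be cached."""
    rule = REDIRECT_RULES.get(status)
    if rule is None:
        raise ValueError(f"status {status} is not covered by this sketch")
    if rule == "default":
        return True
    if rule == "explicit":
        return has_explicit_freshness
    return False
```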
HTTP distinguishes between private and public caches; a browser cache is private, and should cache responses marked as such. Implementations that fail this test don't recognise that they're a private cache, and therefore can't be targeted as one (which is a useful technique for separating browser-cacheable content from proxy-cacheable content).
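The distinction can be sketched as a storage decision that depends on what kind of cache is asking. The function name `may_store` and the `cache_is_shared` flag are assumptions for this example, and only the `private` and `no-store` directives are considered.

```python
def may_store(response_cache_control, cache_is_shared):
    """Return True if a cache of the given kind may store the response."""
    names = {
        part.strip().partition("=")[0].lower()
        for part in response_cache_control.split(",")
        if part.strip()
    }
    if "no-store" in names:
        return False  # nobody may store it
    if "private" in names and cache_is_shared:
        return False  # only single-user caches, such as a browser's, may
    return True
```

A response carrying `Cache-Control: private, max-age=60` is therefore storable by a browser cache (`cache_is_shared=False`) but not by a proxy.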
HTTP has many caching directives that might conflict, such as HTTP 1.0's Pragma and Expires, as well as HTTP 1.1's Cache-Control: max-age.
Sometimes, it is useful to direct caches that understand HTTP/1.1 caching directives (like Cache-Control) to do one thing, while directing those that don't to do something else. For example, if you're taking advantage of advanced features, you might want to allow more capable devices to cache something, while making sure that older ones don't.
HTTP accommodates this by specifying that the Cache-Control: max-age directive takes precedence over the Expires header; a response that contains both of them should be cached according to the Cache-Control header by a device that understands it, while those that don't will honour the Expires header.
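To make the precedence rule concrete, the sketch below computes the lifetime that two classes of cache would derive from the same response. The function name and its parameters are hypothetical; both inputs are assumed to be already parsed into seconds, with `None` standing for an absent header.

```python
def lifetime_seen_by(understands_cache_control, max_age, expires_minus_date):
    """Freshness lifetime in seconds as computed by one class of cache."""
    if understands_cache_control and max_age is not None:
        return max_age  # max-age takes precedence over Expires
    if expires_minus_date is not None:
        return max(0, expires_minus_date)  # fall back to Expires
    return 0
```

For a response with `Cache-Control: max-age=3600` and an Expires date one second in the past, a cache that understands Cache-Control gets `lifetime_seen_by(True, 3600, -1)`, which is 3600, while an older cache gets `lifetime_seen_by(False, 3600, -1)`, which is 0 and so does not reuse the response.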
There are a number of other situations where directives may conflict; in most cases, caches should follow the most conservative directive (i.e., something that says not to cache) present.
Browsers that don't correctly handle directive precedence will make it difficult to target directives at caches with different levels of conformance.
Tests where there are two results test each combination with a different ordering of headers; the results should be the same.
HTTP allows responses to requests for the same URI to vary based on the values of request headers; this is called server-driven content negotiation. This has special implications for caches, because they usually store representations based on their URIs, but HTTP 1.1 requires them to also consider the content of the Vary header, which indicates what other headers form part of the cache key.
The first test checks to see that a negotiated response will be cached; the second makes sure that two different variants aren't cached as the same thing. The third test sees if the cache is smart enough to ignore the case of the request header-names specified by the Vary response header. Finally, the fourth test specifically checks to see if responses that vary on the Accept-Encoding header (which commonly implements compression in HTTP) work.
This last test is important, because if compressed content isn't cached by the browser, this popular technique may become less useful.
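The behaviour checked by these four tests can be modelled as a cache key that combines the URI with the request headers named by Vary. This is only a sketch: `cache_key` is a hypothetical helper, header values are compared as exact strings, and a Vary value of `*` is treated as never matching.

```python
def cache_key(url, request_headers, vary):
    """Build a key from the URI plus the request headers named in Vary.

    Returns None for 'Vary: *', which can never match a later request.
    """
    if vary.strip() == "*":
        return None
    request = {name.lower(): value for name, value in request_headers.items()}
    # Header names are case-insensitive, so normalise and sort them.
    names = sorted({n.strip().lower() for n in vary.split(",") if n.strip()})
    return (url,) + tuple((name, request.get(name, "")) for name in names)
```

Two requests for the same URI yield the same key only when every header listed in Vary has the same value, so a gzip-encoded variant and an unencoded one are stored separately, regardless of how the header names are capitalised.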
Contrary to conventional wisdom, HTTP does allow caching of the response to a POST request. From RFC2616, section 9.5;
Responses to this method are not cacheable, unless the response includes appropriate Cache-Control or Expires header fields. However, the 303 (See Other) response can be used to direct the user agent to retrieve a cacheable resource.
This means that if a POST response includes Cache-Control or Expires (or, given a strict reading of section 13, even a validator like Last-Modified or ETag), the response can be used to satisfy future GET requests.
For example, a blog entry page could accept comments by taking a POST with the comment text; if the response were cacheable, the commenter would immediately see their comment on the page, even if they reloaded from cache (because their cache would contain the POST response).
Implementations that pass this test will cache a POST response.
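Read literally, the quoted sentence from section 9.5 reduces to a check for the presence of either header. The sketch below does exactly that and nothing more; `post_response_is_cacheable` is a hypothetical name, and the function does not inspect the header values, so a real cache would still need to honour directives such as no-store.

```python
def post_response_is_cacheable(response_headers):
    """Literal reading of RFC 2616 section 9.5 for POST responses."""
    names = {name.lower() for name in response_headers}
    return "cache-control" in names or "expires" in names
```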
Even if an implementation doesn't actively cache POST responses, it needs to invalidate the cache when a POST is made. Otherwise, the wrong response may be sent from cache.
For example, if the blog entry page above used the same URI for POSTing comments and GETting the latest version of the entry, an implementation that doesn't invalidate the cache upon a POST will show an old version of the page (out of cache) upon a subsequent GET.
HTTP talks about this in RFC2616, section 13.10;
Some HTTP methods MUST cause a cache to invalidate an entity. This is
either the entity referred to by the Request-URI, or by the Location
or Content-Location headers (if present). These methods are:
- PUT
- DELETE
- POST
In order to prevent denial of service attacks, an invalidation based
on the URI in a Location or Content-Location header MUST only be
performed if the host part is the same as in the Request-URI.
A cache that passes through requests for methods it does not
understand SHOULD invalidate any entities referred to by the
Request-URI.
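The quoted requirements can be sketched as a function that lists which cache entries to drop after a request. This is an illustrative model under stated assumptions: `uris_to_invalidate` is a hypothetical name, the set of "understood" safe methods is chosen for the example, and the "host part" comparison uses the hostname without the port.

```python
from urllib.parse import urljoin, urlsplit

INVALIDATING_METHODS = {"PUT", "DELETE", "POST"}
UNDERSTOOD_SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}  # assumption


def uris_to_invalidate(method, request_uri, response_headers):
    """Return the URIs whose cache entries should be invalidated."""
    if method in UNDERSTOOD_SAFE_METHODS:
        return []
    if method not in INVALIDATING_METHODS:
        return [request_uri]  # unrecognised method: invalidate the Request-URI
    headers = {name.lower(): value for name, value in response_headers.items()}
    targets = [request_uri]
    request_host = (urlsplit(request_uri).hostname or "").lower()
    for name in ("location", "content-location"):
        if name in headers:
            target = urljoin(request_uri, headers[name])  # resolve relative URIs
            same_host = (urlsplit(target).hostname or "").lower() == request_host
            if same_host and target not in targets:
                targets.append(target)  # other hosts are skipped (DoS protection)
    return targets
```

For a POST to `http://example.com/entry` whose response has `Location: /entry/comments`, both the entry and the comments URI would be invalidated, while a Content-Location pointing at a different host would be ignored.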
Implementations that pass this test will invalidate the appropriate cache entry (the Request-URI, the Content-Location and Location) upon a POST. Implementations that fail this test are not conformant to RFC2616, and will serve the incorrect entry from cache.
Implementations that pass this test will invalidate the appropriate cache entry (the Request-URI, the Content-Location and Location) upon a DELETE. Implementations that fail this test are not conformant to RFC2616, and will serve the incorrect entry from cache.
RFC2616, section 13.10 goes on to say how unrecognised methods should be handled;
A cache that passes through requests for methods it does not understand SHOULD invalidate any entities referred to by the Request-URI.