Caching Entries
Monday, 24 September 2012
One of the changes in Apple's release of iOS6 last week was a surprising new ability to cache POST responses. Lots has been said about this, but some people reading RFC2616 have come away scratching their head about whether this is actually a bug or not. The HTTP spec says this about POST: Responses to this method are not cacheable, unless the response includes appropriate Cache-Control or Expires header fields. Which, on the face of it, seems to say that...
Friday, 21 October 2011
More than ten years ago, I was working at Akamai and got involved in the specification of Edge Side Includes (ESI), sort of a templating language for intermediaries. In that time, interest in ESI has grown, waned and been reborn. As far as I can tell, it's implemented not only by Akamai and Oracle (the main forces behind it), but also in Varnish, Squid, and lots of other places too. Back then, I had a strong suspicion that it'd die because...
Sunday, 28 August 2011
In discussing my whinge about AppCache offline with a few browser vendory folks, I ending up writing down my longstanding wishlist for making browser caches better. Without further ado, a bunch of blue-sky ideas; Cache Contexts Most current HTTP caches — whether browser or proxy — put everything they cache into a single “bucket” that shares configuration and storage space. While that makes sense for a HTTP proxy run by your ISP, it doesn’t make as much for browsers, because...
Sunday, 19 June 2011
HTML5’s AppCache mechanism is one confused little puppy. Purporting to be for taking web applications offline — a compelling and useful thing — it’s more often used by performance-hungry sites that want to use it as an online cache. Unfortunately, this means that those sites will be popping up dialogues like this: Even more unfortunately, if you click “Don’t Allow”, you’ll just be asked to approve it again next time you navigate to the site. And again. And again....
Friday, 27 May 2011
After designing and deploying Cache Channels, it quickly became apparent that one Web cache invalidation mechanism wasn’t able to cover the breadth of use cases. In a nutshell, Cache Channels trades off immediacy for reliability; that is, while cache invalidations don’t take place right away (there’s a 10-30 second window), you know that they’ll be respected, because of how the protocol is designed. That’s great if you need to (for example) invalidate news articles with months of TTL because...
Wednesday, 30 June 2010
Patricia Clausnitzer has kindly translated the Caching Tutorial to Belarusian. Thanks!...
Thursday, 3 June 2010
A while back we used an absurd amount of reward points from our credit card to get some Myer gift certificates, and on the weekend these miraculously turned into a new TV, the Sony 32EX600. Overall, we really like it; while I’m still trying to find the exact recipe to successfully encode video to feed to it via DLNA, it’s beautiful to look at, and (unlike most TVs these days), the UI is a pleasure to use. Except. One...
Wednesday, 10 March 2010
Thomas Hühn has graciously translated the caching tutorial into German. Thanks! See also the Chinese, Czech and French translations. To help the translators keep up with changes, I've started hosting the raw document on Github, which can also be used to log issues....
Friday, 30 October 2009
A long time ago*, the word in high-performance proxy-caching was Inktomi’s Traffic Server. It was so fast it was referred to being “carrier grade” and this could be said without people smirking, and it was deployed by the likes of AOL, when AOL was still how most people accessed the Internet. Then, in 2001, Yahoo! bought Inktomi. They did this because they wanted to be a player in Search, and they happened to get a proxy/caching product for free....
Thursday, 25 June 2009
A (very) long time ago, I wrote the Cacheability Engine to help people figure out how a Web cache would treat their sites. It has a few bugs, but is generally useful for that purpose. However, as I’ve got more involved in using HTTP for non-browser things, it’s become apparent that more than caching is important when you’re examining a resource to see how it behaves; things like partial content, syntax checking and other esoteric but important details. Very...
Wednesday, 17 June 2009
The caching tutorial is now available in Chinese, courtesy of Che Dong (and apologies for taking so long in linking to it!). Norwegian should be coming soon......
Friday, 12 June 2009
Part of my job is maintaining Yahoo!’s build of Squid and supporting its users, which use it to serve everything from the internal Web services that make sites go to serving Flickr’s images. In that process, I often am asked “what about X?”, where X is another caching or load balancing product (yes, Squid can be used as a load balancer). For example, Varnish, or lighttpd. Generally, these comparisons come down to three factors; performance, features and manageability. Almost...
Friday, 29 May 2009
Everybody’s atwitter (yeah, sue me) about the Google Wave developer preview. Lots of new stuff there, but for me the most revealing comment, almost a throwaway, was here: Did we mention we use Squid? In other words, Google’s hot new, Web 2.0-on-steroids HTML5 XMPP++ open social platform buzzword generator works well with, indeed relies upon (for the purposes of making a nice demo) a more than 15-year-old piece of nearly-forgotten (in some quarters) Web infrastructure. That’s pretty cool....
Tuesday, 24 February 2009
There’s a rule of thumb about when a HTTP response can be cached; the Caching Tutorial says: If the response’s headers tell the cache not to keep it, it won’t. If the request is authenticated or secure, it won’t be cached. If no validator (an ETag or Last-Modified header) is present on a response, and it doesn’t have any explicit freshness information, it will be considered uncacheable. And, generally, this is true; most implementations won’t both caching something that...
Monday, 27 October 2008
Ryan Tomayko announces Rack::Cache, a HTTP cache for Ruby’s generic Web API; The basic goal is standards-based HTTP caching that scales down to the early stages of a project, development environments, light to medium trafficked sites, stuff like that. HTTP’s caching model is wildly under-appreciated in the Ruby web app community and my hope is that making its benefits more accessible will lead to wider understanding and acceptance. I commented and, of course, focused solely on one aspect of...
Friday, 4 January 2008
The stale-while-revalidate and stale-if-error extensions aren’t the only fiddling we’ve been doing with the HTTP caching model. Now that Squid 2.7 is starting to see daylight, I can explain about a much more ambitious project — Cache Channels. In HTTP, there are generally two ways to keep something in a cache fresh; using a freshness lifetime (e.g., Cache-Control: max-age), and validation (e.g., If-Modified-Since). Together, they do a really good job of making the Web seem faster and reducing load...
Wednesday, 12 December 2007
We use caching extensively inside Yahoo! to improve scalability, latency and availability for back-end HTTP services, as I’ve discussed before. However, there are a few situations where the plain vanilla HTTP caching model doesn’t quite do the trick. Rather than come up with one-off solultions to our problems, we tried going in the other direction; finding the most general solution that still met our needs, in the hopes of meeting others’ as well. Here are two of them (with...
Tuesday, 7 August 2007
I’ve been hoping to avoid this, but ETags seem to be popping up more and more often recently. For whatever reason, people latch onto them as a litmus test for RESTfulness, as the defining factor of HTTP’s caching model, and much more. So, let me counter: they’re not all that. In fact, there are a number of pitfalls you need to be wary of if you use them. First, depending on how they’re generated, you might find different boxes...
Wednesday, 20 June 2007
A while back I wrote up the state of browser caching, after writing a quick-and-dirty XHR-based test page, with the idea that if people know how their content is handled by common implementations, they’d be able to trust caches a bit more. The other half that they need to know about, of course, is proxy caching; depending on who you listen to, somewhere between 20% and 50% of clients on the internet are behind some kind of proxy, and...
Tuesday, 15 May 2007
I occasionally get a question from readers of the caching tutorial about whether to use the Expires header or Cache-Control: max-age to control a response’s freshness lifetime. Some people claim that Expires is better, because it’s defined by HTTP/1.0, whereas Cache-Control only came about in HTTP/1.1. Since there are still HTTP/1.0 agents out there, the reasoning goes, this is the safer path. The problem with that line of reasoning is that HTTP versions aren’t black and white like this;...
Sunday, 29 April 2007
The QCon presentation (slides) was ostensibly about how we use HTTP for services within Yahoo’s Media Group. When I started thinking about the talk, however, I quickly concluded that everyone’s heard enough about the high-level benefits of HTTP and not nearly enough details of what it does on the ground. So, I decided to concentrate on one aspect of the value that we get from using HTTP for services; intermediation, as an example. If your service is struggling to...
Monday, 21 August 2006
There have been some interesting developments in Web caching lately, from a performance perspective; event loops are becoming mainstream, and there are lots of new contenders on the scene. Fortuitously, I’ve been benchmarking proxies with an eye towards the raw performance of their HTTP stacks (i.e., how well they handle connections and parse headers, serving everything from memory), for work, so that we can select a platform for internal service caching. In particular, I’ve been looking at the maximum...
Tuesday, 16 May 2006
I just finished my XTech presentation, “Web 2.0 on Speed”. here are the slides [pdf]; I’m going to try to s5 them soon. There isn’t much new in this talk; it’s just a synthesis of a few different observations; 1. Static Web servers are fast. You might say ‘duh’, but people often overlook the obvious; with event-looped Web servers like lighttpd (see c10k for more information), you can get 2-3 orders of magnitude more requests/second over LAMP and similar...
Thursday, 11 May 2006
Updated 2006-06-03 One of the big problems that Web developers have with HTTP caching is that they don’t know how the caches behave; while the specs say one thing, the actual behaviour of the cache often significantly deviates — usually because the cache’s developer or operator thinks they can do better. The easiest way to overcome this obstacle is to measure the behaviour of the caches. In particular, thanks to XmlHttpRequest, it’s fairly easy to test a browser’s cache by...
Saturday, 18 February 2006
Have you ever posted a comment to a blog, found it missing, so you re-posted it, only to find two entries? Annoying, huh? Aaron pinged me the other day with this problem, and I responded that the Right way to do this is to POST to the same resource (i.e., the blog entry), so that the POST invalidates the cache. HTTP has this to say about the matter; Some HTTP methods MUST cause a cache to invalidate an entity....
Saturday, 26 November 2005
The first in an occasional series about the real-world benefits of REST and the Web architecture, as applied to HTTP. I used to work for a fairly huge company as a Web/Internet guru. One day, I got sucked into a meeting with a visiting executive who was talking about rolling out a new set of servers to allow customer service reps access internal documentation. The requirement was to make large-ish PDF files available on the internal network world-wide nearly...
Sunday, 22 May 2005
There’s been quite a kerfuffle over Google’s Web Accelerator, because it prefetches Web content. It’s amusing to see these issues recycle over time; in the late nineties, prefetching was one of the biggest areas of research in Web caching. There were lots of papers written (search for “prefetch”), and even a commercial implementation; CacheFlow was the prefetching cache (they called it “active content”), and it was pretty controversial in the industry. After a long period of back-and-forth and considerable...
Thursday, 12 May 2005
I happened to look at the HTTP headers returned from Google News just now (what can I say, I’m a HTTP geek), and I noticed something unusual; Last login: Thu May 12 16:52:59 on console Welcome to Darwin! mnot-laptop:~> telnet news.google.com 80 Trying 64.233.161.147... Connected to news.l.google.com. Escape character is '^]'. HEAD / HTTP/1.1 Host: news.google.com HTTP/1.1 200 OK Set-Cookie: PREF=ID=d33570687b199641:TM=1115954884:LM=1115954884:S=EKAAPop2tSe_wM0T; expires=Sun, 17-Jan-2038 19:14:07 GMT; path=/; domain=.google.com Content-Type: text/html; charset=ISO-8859-1 Server: NFE/0.8 Cache-Control: private, x-gzip-ok="" Content-Length: 0 Date: Fri,...
Sunday, 15 February 2004
I’ve published a revision of the Caching Tutorial for Web Authors and Webmasters, the first non-trivial edit in some time almost since I wrote it in 1998. That said, there aren’t any substantial changes; this is mostly tweaking and incorporation of new information. The bigger modifications include: - Replacing “Object” with “Representation” as appropriate - New information on Gateway Caches and Content Delivery Networks - Incorporation of various corrections and additional information received - Creative Commons licensing I’m sure there...
Saturday, 28 June 2003
I feel compelled to respond to Norm Walsh's thoughts on caching. It's important to distinguish between the capabilities of a specific product (such as WWWoffle) and the technology that it implements (caching). I would agree that the general state of cache implementation leaves much to be desired, both in clients and in proxies. However, I would not write off the technology wholesale. For example, Norm claims that you can't populate a cache. Untrue; it's simple to populate any cache, simply...