Leveraging the Web: Caching

Saturday, 26 November 2005

The first in an occasional series about the real-world benefits of REST and the Web architecture, as applied to HTTP.

I used to work for a fairly huge company as a Web/Internet guru. One day, I got sucked into a meeting with a visiting executive who was talking about rolling out a new set of servers to allow customer service reps to access internal documentation. The requirement was to make large-ish PDF files available on the internal network world-wide nearly instantaneously, with access control.

An external vendor had quoted a solution; it involved rolling out a pair of Windows NT servers (for redundancy) to each location around the world, each with its own database and custom-designed software that client applications on the reps’ desktops would connect to. The whole thing would be tied together with message queues and centrally managed.

Our exec wasn’t happy because the deployment cost for this was huge; developing software, rolling out and maintaining Windows NT boxes with databases to over fifty sites around the world is no picnic. The cost of the servers and software alone was prohibitive, and the ongoing maintenance was a very healthy chunk of change. And the complexity of the proposed system led us to believe that it would be pretty flaky.

Furthermore, he was frustrated because the same information was already available on an internal Web site, but it just wasn’t fast enough for his purposes (after all, if you’re a rep in Dubai, you can’t wait around with a customer on the phone for five minutes while the PDF comes down).

So, when I wondered aloud why they didn’t just use Web caches, he got very interested.

After a prototype using Squid, we got buy-in to go further. The app had some pretty specific requirements; for example, each and every request had to be authenticated, but we still needed to get the PDF from local cache. We took care of that by sending a Cache-Control: public, must-revalidate header on the responses. Then, they wanted the PDF to be in cache, even for the first person to request it. So, we had a small script on the Web server that pushed the PDFs into the caches as they were published, effectively pre-fetching them. They wanted it to be reliable, so we designed a two-level hierarchy with fail-over between both co-located and remote caches. Even in a complete failure, the original Web site could be used, so that the data would still be available (albeit slowly).
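To make the authentication trick a bit more concrete, here is a rough sketch in Python of the idea: the cache keeps the PDF locally, but checks with the origin (passing along the user's credentials) before handing it over. This is a toy model under my own assumptions, not how Squid or the NetCaches were implemented; StoredResponse, make_origin, serve and the token strings are all made-up names, and the sketch simply revalidates on every request rather than modelling freshness lifetimes.

```python
from dataclasses import dataclass


@dataclass
class StoredResponse:
    """A response held by the (toy) shared cache."""
    etag: str
    cache_control: str
    body: bytes


def directives(cache_control):
    """Split a Cache-Control value into a set of lowercase directive tokens."""
    return {part.strip().lower() for part in cache_control.split(",") if part.strip()}


def make_origin(allowed_tokens, current_etag, current_body):
    """Hypothetical origin: authenticate first, then answer the conditional request."""
    def validate(authorization, if_none_match):
        if authorization not in allowed_tokens:
            return 401, b""
        if if_none_match == current_etag:
            return 304, b""          # not modified: no PDF body crosses the WAN
        return 200, current_body     # changed: the full body is sent again
    return validate


def serve(stored, authorization, validate):
    """Answer one request from the toy cache, revalidating with the origin first."""
    if "must-revalidate" in directives(stored.cache_control):
        status, body = validate(authorization, stored.etag)
        if status == 304:
            return 200, stored.body  # authorised, and the PDF comes from the local copy
        return status, body          # a 401, or a fresh 200 from the origin
    return 200, stored.body
```

The point of the sketch is that only a small conditional request travels to the origin; the large body is served locally whenever the origin answers 304.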

We ended up deploying a large-ish number of Network Appliance Netcaches around the world. The NetApps were fantastic; because they were off-the-shelf appliances, they were very easy to configure, and once running, they didn’t require any but the most basic monitoring. The startup cost was, IIRC, nearly an order of magnitude less than the original quote, and the maintenance for the NetApps was, comparatively, a pittance. The project took about six months, start to finish, and that was mostly working out the deal with NetApp and getting the deployment plan together; there wasn’t any development beyond the thirty or so lines of Perl to get the database to ping the caches.
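For a flavour of what that little script did, here is a minimal sketch of the pre-fetch idea, written in Python rather than the original Perl (which I am not reproducing here). It asks each cache to fetch a newly published URL so that the cache stores a copy before any rep needs it. The function names, the opener parameter and the host names in the usage note are hypothetical, and it assumes the caches accept ordinary proxy requests and that the origin lets the script's request through.

```python
import urllib.request


def build_prefetch_request(url, cache_host):
    """Build a GET for `url` that is routed through the proxy at `cache_host`."""
    req = urllib.request.Request(url, method="GET")
    req.set_proxy(cache_host, "http")  # send the request to the cache, not the origin
    return req


def prefetch(url, cache_hosts, opener=urllib.request.urlopen):
    """Request `url` through every cache; return a status or error per cache."""
    results = {}
    for host in cache_hosts:
        req = build_prefetch_request(url, host)
        try:
            with opener(req, timeout=30) as resp:
                resp.read()                   # drain the body so the cache fetches all of it
                results[host] = resp.status
        except OSError as exc:                # URLError and HTTPError are OSError subclasses
            results[host] = exc
    return results
```

Something like prefetch("http://docs.example.internal/manual.pdf", ["cache-dubai.example.internal:3128"]) would then be called whenever a document is published; one unreachable cache does not stop the others from being warmed.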

Our exec was very happy.

I went to headquarters to do the final integration into the Web site, and give a demo or two. At one point, some senior IT execs came in and were very sceptical about the value of caching; while it might do good in tiny, remote offices, it wouldn’t help at headquarters (where they had some impossibly big pipes straight into the Internet). Needless to say, their eyes pretty much popped out of their heads when I showed them the difference between surfing from the net and surfing from the cache, and they were immediate converts.

Lessons Learned

The biggest surprise to many involved was that we were able to scale the Web site out with basically no code, using off-the-shelf components. Caching brought both scalability and reliability to the application very cheaply and easily, despite the requirement for authentication.

It was also quite eye-opening to see how using a message queue and other “Enterprise” mechanisms just plain weren’t necessary, despite experts’ insistence that they were. The constraints of the Web (as REST explains) make it very easy and simple to do very powerful things.

Lastly, this is a nice demonstration that caching isn’t just about saving bandwidth; it’s also about distributing an application, improving reliability and improving user experience.

Mark Nottingham
