mark nottingham

Dev-Friendly Web Caching

Monday, 27 October 2008

HTTP Caching

Ryan Tomayko announces Rack::Cache, an HTTP cache for Ruby’s generic Web API:

The basic goal is standards-based HTTP caching that scales down to the early stages of a project, development environments, light to medium trafficked sites, stuff like that. HTTP’s caching model is wildly under-appreciated in the Ruby web app community and my hope is that making its benefits more accessible will lead to wider understanding and acceptance.

I commented and, of course, focused solely on one aspect of his announcement (performance) while missing the big picture—getting real HTTP caching into developers’ hands in as easy a fashion as possible.

As Ryan points out, this is absolutely crucial, because if they don’t consider it from the start, it can be really hard to tack caching on later (after the resources and their representations and identifiers have been baked).

At Yahoo!, I’ve spent a lot of time working on this problem from the other end: Squid. Even its most fervent admirers (yes, there are a few) will admit that Squid isn’t exactly developer-friendly; it’s aimed more at the sysadmin / network geek crowd. And while the best Web folks have a good understanding of that world (if they don’t come from it), this still leaves out the bulk of people who can benefit from HTTP caching.

Fixing this means packaging up, tweaking and documenting the hell out of Squid so that it’s easy for Yahoos to install and use (and some more of this may see the light of day eventually; after all, I hear that we’re becoming open).

However, the Y! use case is not typical, as Mark occasionally reminds me. Most developers don’t need the massive scalability of the Web just yet; they’re more interested in getting past the first few milestones with very little time or money available. Ryan’s approach is a great way to offer them low-hanging scalability fruit, gently pushing them in the right direction, while giving them a nice migration path if more is needed.

Really great stuff—kudos again, Ryan! Now, if only it were in Python…
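
For the Python-inclined, here is roughly what a scaled-down cache could look like as WSGI middleware. This is a minimal sketch and not how Rack::Cache is built: `TinyCache`, `freshness_lifetime` and the `X-Cache` header are names made up for illustration, and it assumes the wrapped application never uses the `write()` callable. It only looks at `Cache-Control: max-age` on 200 responses to GET requests, and ignores `Vary`, `Authorization`, validators and everything else a real HTTP cache has to care about.

```python
import time


def freshness_lifetime(headers):
    """Seconds a response may be reused, judged only by Cache-Control."""
    for name, value in headers:
        if name.lower() != "cache-control":
            continue
        directives = [d.strip().lower() for d in value.split(",")]
        names = {d.split("=", 1)[0] for d in directives}
        if names & {"no-store", "no-cache", "private"}:
            return 0
        for d in directives:
            if d.startswith("max-age="):
                try:
                    return int(d[len("max-age="):])
                except ValueError:
                    return 0
    return 0


class TinyCache:
    """Hypothetical in-process cache wrapping a WSGI app (illustration only)."""

    def __init__(self, app, clock=time.time):
        self.app = app
        self.clock = clock  # injectable so the cache can be tested
        self.store = {}     # key -> (stored_at, lifetime, status, headers, body)

    def __call__(self, environ, start_response):
        # Only GET is handled here; everything else goes straight through.
        if environ.get("REQUEST_METHOD") != "GET":
            return self.app(environ, start_response)

        key = (environ.get("HTTP_HOST"),
               environ.get("PATH_INFO"),
               environ.get("QUERY_STRING"))
        entry = self.store.get(key)
        if entry is not None:
            stored_at, lifetime, status, headers, body = entry
            age = int(self.clock() - stored_at)
            if age < lifetime:
                extra = [("Age", str(age)), ("X-Cache", "HIT")]
                start_response(status, headers + extra)
                return [body]

        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold on to the status and headers instead of sending them.
            captured["status"] = status
            captured["headers"] = list(headers)

        result = self.app(environ, capture)
        try:
            body = b"".join(result)  # buffer the whole response body
        finally:
            if hasattr(result, "close"):
                result.close()

        status, headers = captured["status"], captured["headers"]
        lifetime = freshness_lifetime(headers)
        if status.startswith("200") and lifetime > 0:
            self.store[key] = (self.clock(), lifetime, status, headers, body)
        start_response(status, headers + [("X-Cache", "MISS")])
        return [body]
```

Wrapping an existing application is then a one-liner, `application = TinyCache(application)`. The point is that the application only has to emit honest `Cache-Control` headers, so the same responses should keep working if the middleware is later swapped for something like Squid.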


4 Comments

Ryan Tomayko said:

Oh, Rack::Cache steals some ideas from Django’s caching framework:

http://docs.djangoproject.com/en/dev/topics/cache/

It’s very similar. Rack is Ruby’s WSGI. I don’t imagine it would be all that hard to extract the Django specific stuff out into a separate library. There may even be a pure, WSGI-based solution out there already.

Tuesday, October 28 2008 at 1:39 AM

Ian Bicking said:

I played around (in Python) a little bit at one time but didn’t finish: http://svn.pythonpaste.org/Paste/CacheMiddleware/trunk and the Repoze guys have done something similar: http://svn.repoze.org/repoze.accelerator/trunk/ (and though not caching, this is kind of related: http://svn.repoze.org/repoze.squeeze/trunk/)

In combination with something like Deliverance (http://deliverance.openplans.org) it is interesting if you put caches at different levels of the page composition. That’s someplace where it’s nice to have a fairly light cache. In the case of Deliverance I might put the WSGI middleware inside the Deliverance process, wrapping the applications that just proxy the HTTP request to other (non-Python) processes.

Tuesday, October 28 2008 at 2:55 AM

Ian Bicking said:

Oh, and related to concurrency, while it’s a little tricky and I haven’t really tried it very much, Spawning (http://pypi.python.org/pypi/Spawning) has the potential to run a synchronous-looking WSGI application/middleware in an async manner. This isn’t applicable to every application (at all), but for particular applications (like a cache) where you pay particular attention, it could be applied. (Though right now while sockets are handled asynchronously by eventlet, the library underlying Spawning, I believe file operations are not.)

Tuesday, October 28 2008 at 5:55 AM

Mark Mansour said:

This reminds me of the Enterprise Java space where developing and testing EJBs was painful and difficult until Spring/Hibernate stepped in and made developing for app servers bearable.

I would guess that Rack::Cache should have a secondary benefit of making developers more literate in good HTTP practices in order to use caching properly. Hopefully this is a win for the interwebs.

Thursday, October 30 2008 at 6:19 AM