mark nottingham

Why ESI is Still Important, and How to Make it Better

Friday, 21 October 2011

HTTP Caching

More than ten years ago, I was working at Akamai and got involved in the specification of Edge Side Includes (ESI), sort of a templating language for intermediaries.

Since then, interest in ESI has grown, waned and been reborn. As far as I can tell, it’s implemented not only by Akamai and Oracle (the main forces behind it), but also in Varnish, Squid, and lots of other places.

Back then, I had a strong suspicion that it’d die because people would see it as locking them into Akamai (or some other vendor). Why, then, is this limited, funny, embarrassingly simple little templating language still around?

In a word, it’s concurrency.

In the last couple of years, it’s become hot to build massively scalable Web servers by re-thinking how they handle concurrency; often using asynchronous, non-blocking single-process servers, rather than threads or multiple processes.

The benefits of this approach have been known for a long time; way before Dan Kegel wrote the C10K page, Web proxy servers like Squid (and its predecessor, Harvest) were using this approach because it’s the only sensible way to scale for them.

However, as folks are finding out when they use newer tools that implement these methods (e.g., Twisted, Node.js), writing event-driven code is something you either love or hate. Many developers can’t stand it, especially for debugging (personally, I love it, but that’s just me).

So, ESI is a way to offer the massive concurrency of non-blocking, asynchronous servers in a way that’s easy to digest. Since fetching a URI doesn’t block, the only overhead is in stitching the page together, and you can control the overhead of that by limiting the language’s capability.
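To make the concurrency point concrete, here is a minimal Python sketch of what an ESI processor does with includes: start every fetch without blocking, then stitch the results into the template. It is an illustration under stated assumptions, not how any particular product works. The fragment store, the `fetch` and `render` helpers and the simulated 0.1 second delay are all hypothetical stand-ins for real non-blocking HTTP requests, and only self-closing `esi:include` elements with a `src` attribute are recognised.

```python
import asyncio
import re
import time

# Hypothetical fragment store standing in for origin servers; a real
# intermediary would issue non-blocking HTTP requests instead.
FRAGMENTS = {
    "/header": "<h1>News</h1>",
    "/body": "<p>Story text</p>",
    "/sidebar": "<ul><li>Links</li></ul>",
}

# Matches only the simple self-closing form used in this sketch.
INCLUDE = re.compile(r'<esi:include src="([^"]+)"\s*/>')


async def fetch(uri, delay=0.1):
    """Simulate a non-blocking fetch; the sleep yields to the event loop."""
    await asyncio.sleep(delay)
    return FRAGMENTS[uri]


async def render(template):
    """Fetch every included URI concurrently, then stitch the page together."""
    uris = INCLUDE.findall(template)
    bodies = await asyncio.gather(*(fetch(uri) for uri in uris))
    fetched = dict(zip(uris, bodies))
    return INCLUDE.sub(lambda match: fetched[match.group(1)], template)


TEMPLATE = (
    '<esi:include src="/header"/>'
    '<esi:include src="/body"/>'
    '<esi:include src="/sidebar"/>'
)


def demo():
    """Render the template and report how long the whole page took."""
    start = time.perf_counter()
    page = asyncio.run(render(TEMPLATE))
    return page, time.perf_counter() - start


if __name__ == "__main__":
    page, elapsed = demo()
    print(page)
    print(f"rendered in {elapsed:.2f}s")
```

Because the three simulated fetches overlap, the total time should stay close to one fetch rather than the sum of three, and the template author never writes any event-driven code.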

This makes ESI a great tool for building highly scalable dynamic Web sites without writing and debugging new code. Win.

Making ESI Better

ESI is, as mentioned, more than a decade old, and the Web has changed a lot in the intervening time. Even putting that aside, ESI isn’t exactly what we’d call Web-friendly. We can do better.

Over that time, I’ve had a number of thoughts about how to improve ESI as a language, which I’ve shared with some interested people privately. One of my back-burner projects has been to implement this, but I have to admit that this isn’t going to happen soon, since I’m busy doing several other things.

Instead, I’m going to dump those ideas here, and hope someone runs with them. Here are a few:

The biggest single way I can see to improve ESI is to make it possible to source variables from a URI. In other words, it should be possible to fetch a URI, parse the response (probably in JSON), and then reference the data returned when evaluating the template.

This would enable some really exciting things. Because variables are now just state, you can do things like cache user preferences – using plain old HTTP caching – and have that state be local to where it’s needed. When you update that state, it can be invalidated. ESI expressions now can have arbitrary, application-relevant input, instead of being limited to a few paltry request headers.

This could be what it looks like:

<esi:load name="user_prefs" src="{request.cookie.userid}"/>
<!-- … -->
<esi:include src="/{user_prefs.top_left_module}"/>

Here, you see some JSON being loaded into the user_prefs variable, from a URI that’s templated using a cookie that identifies the user, to drive how the page loads. This is very similar to a set of techniques I discussed a while back for composing services “RESTfully”, and it still works.
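The mechanics of that idea can be sketched in a few lines of Python. This is only one possible reading of the proposal, not a specification: the `expand` and `load` helpers, the `/prefs/` URI prefix and the in-memory `PREFS_STORE` are assumptions made for illustration, and a real processor would fetch the JSON over HTTP (and cache it) rather than read it from a dictionary.

```python
import json
import re

# Matches dotted references such as {request.cookie.userid}.
PLACEHOLDER = re.compile(r"\{([A-Za-z_][\w.]*)\}")

# Hypothetical JSON documents keyed by URI, standing in for cacheable
# HTTP responses.
PREFS_STORE = {
    "/prefs/alice": '{"top_left_module": "weather"}',
}


def lookup(context, dotted):
    """Walk a dotted name through nested dictionaries."""
    value = context
    for part in dotted.split("."):
        value = value[part]
    return value


def expand(template, context):
    """Replace each {dotted.name} with its value from the context."""
    return PLACEHOLDER.sub(
        lambda match: str(lookup(context, match.group(1))), template
    )


def load(name, src_template, context):
    """Mimic the proposed esi:load: expand the URI, parse JSON, bind a variable."""
    uri = expand(src_template, context)
    context[name] = json.loads(PREFS_STORE[uri])
    return uri


# The request state that the template is evaluated against.
context = {"request": {"cookie": {"userid": "alice"}}}

loaded_from = load("user_prefs", "/prefs/{request.cookie.userid}", context)
include_uri = expand("/{user_prefs.top_left_module}", context)

print(loaded_from)
print(include_uri)
```

The cookie picks the preferences document, and the preferences document picks which module gets included, so the per-user state lives in an ordinary cacheable resource.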

JSON also presents a way to clean up the variable model generally; instead of the random collection of variables, ESI 2.0 could instantiate a request object, with appropriate members like .method, .cookie, .headers, and so forth. It also brings about the possibility of making response attributes available as well, at least in the context of an include.
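As a rough sketch of what such a request object might hold, the Python below builds one from a list of raw header pairs. The `Request` class, its field names beyond `.method`, `.cookie` and `.headers`, and the `build_request` helper are assumptions for illustration; nothing here is defined by ESI.

```python
from dataclasses import dataclass, field
from http.cookies import SimpleCookie


@dataclass
class Request:
    """A hypothetical request object exposed to template expressions."""
    method: str
    uri: str
    headers: dict = field(default_factory=dict)
    cookie: dict = field(default_factory=dict)


def build_request(method, uri, raw_headers):
    """Build a Request from (name, value) header pairs."""
    # Header names are case-insensitive, so normalise them to lower case.
    headers = {name.lower(): value for name, value in raw_headers}
    # Parse the Cookie header into a plain name -> value mapping.
    jar = SimpleCookie()
    jar.load(headers.get("cookie", ""))
    cookie = {key: morsel.value for key, morsel in jar.items()}
    return Request(method, uri, headers, cookie)


request = build_request(
    "GET",
    "/home",
    [("Cookie", "userid=alice; theme=dark"), ("Accept", "text/html")],
)

print(request.method, request.cookie["userid"], request.headers["accept"])
```

Grouping everything under one object gives expressions a single, predictable place to look, in contrast to a flat set of special-purpose variables.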

Going even further, JavaScript presents an opportunity to rally around a common, well-understood syntax for things like variable references, operators, and even common functions (e.g., string manipulation).

The esi:include element desperately needs a timeout parameter, and a sensible means of specifying fallback content (probably as a child of the include element).
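The behaviour being asked for is easy to express in an asynchronous runtime. The Python sketch below shows the idea, assuming a hypothetical `include` helper and a simulated origin; the timeout values and fallback strings are made up for the example, and it handles only timeouts, not other fetch failures.

```python
import asyncio


async def slow_origin(delay, body):
    """Simulate an origin server that takes `delay` seconds to answer."""
    await asyncio.sleep(delay)
    return body


async def include(fetcher, timeout, fallback):
    """Return the fetched fragment, or the fallback if it takes too long."""
    try:
        return await asyncio.wait_for(fetcher, timeout)
    except asyncio.TimeoutError:
        return fallback


async def main():
    # Answers well inside its timeout, so the real fragment is used.
    fast = await include(
        slow_origin(0.01, "<div>ads</div>"), timeout=0.5, fallback=""
    )
    # Takes far longer than its timeout, so the fallback is used.
    slow = await include(
        slow_origin(1.0, "<div>ads</div>"),
        timeout=0.05,
        fallback="<!-- unavailable -->",
    )
    return fast, slow


RESULT = asyncio.run(main())
print(RESULT)
```

A slow fragment then costs the page at most its timeout, instead of holding up the whole response.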

Deeper integration with HTTP is necessary; not only should it be possible to access arbitrary aspects of the incoming request, but it should be possible to affect more of the outgoing response; e.g., the status code. Likewise, finer-grained control over outgoing requests (generated by include as well as load) would be good (e.g., via attributes on the element).

There are lots of smaller, easier wins. Not requiring valid XML is an obvious one; integrating URI Templates is likewise a no-brainer. Cleaning up some of the cruft in the syntax would be nice; there are some elements that people just don’t need in there (e.g., esi:inline, the alt attribute).

Anybody up for it?


Jan Algermissen said:

Great ideas, Mark!

The biggest problem I see is finding the right balance between enabling a useful amount of templating features (you mention string manipulation) on the one hand and shoving in a full blown scripting engine (e.g. JavaScript) on the other.

While limiting the templating capabilities provides for a simpler spec and faster implementations, it can also lead to undesired dependencies between services. For example, service A (sending the ESI template) may require an ESI-included service B to produce representations in a certain way (e.g. send address strings in a certain format) due to limitations in the templating capabilities.


Friday, October 21 2011 at 11:20 AM

Erik Mogensen said:

Varnish implements a tiny (?) subset of ESI (the Good Parts?). I propose that ESI should be stripped down. I think it’s esi:include and esi:remove, based on the most-bang-for-the-buck principle.

inline, try, choose/when, vars and so on can be implemented by an origin server, returning the appropriate esi instructions to the intermediary, so when sponsoring Varnish Cache to implement ESI we chose to focus on the juicy bit, namely include and remove.

Maybe a “surrogate capability” which defined this minimal useful subset would be an idea. ESI-minimal/1.0 or something.

Friday, October 21 2011 at 11:52 AM

Ilya Grigorik said:

Mark, but don’t we effectively get most of the same benefits? It seems like the core benefit of ESI is the fact that it forces you to decompose your page into multiple services. These endpoints, in turn, can implement their own smart caching and can be backed by CDNs, etc. In other words, all we’ve done is we’ve moved the ESI templating to the client.. and that has both pros and cons.

Pro: we know we want and need AJAX, so having an intermediary ESI service means replicating the template in both places, which is not a great experience. Con: the burden is on the client to assemble the page, which means many outbound requests.. but that, while not ideal, seems like a reasonable tradeoff.

In other words, if we know we need to decompose at UI layer, why bother with an intermediary? My own personal objection until recently has been: yes, but all of this JS coordination at UI layer breaks apart when we’re not JS enabled (ex: crawlers). Having said that, using a templating system like Closure (or similar), we can effectively render the same templates on server-side or client-side.. which gets us the benefit of both.

I do still see a place for ESI in specialized use cases.. but for general use, it seems like JSON endpoints + server/client templates is the right answer for the most part?

Saturday, October 22 2011 at 10:40 AM

Stefan Tilkov said:

I’d be extremely interested in this. Currently, ESI is very much tied to caching (even if only in people’s perception), but the more general use case is extremely interesting: the general aggregation of content, in my particular case from a large Web app where each page is dependent on a bunch of loosely coupled modules (I know, sounds suspiciously like “SOA” or “portal”, but anyway). “A templating language for intermediaries” sounds exactly like what the world needs.

Saturday, October 22 2011 at 12:56 PM

Jan Algermissen said:


Mobile devices are also something where ESI is beneficial because a) there can be transformation to representations suited for the device after the ESI processing and b) letting the server determine what processing should happen on the client is problematic for mobile devices. [1]


[1] Incidentally I just came across this piece by Mark (Baker): (from 2007, but spot on as always)

Sunday, October 23 2011 at 7:46 AM

Ilya Grigorik said:

Mark, Jan fair points, I guess I’m just reflecting on how my own thinking has shifted with respect to ESI over the past few years.

While I was a huge fan of the spec, in large part due to the scatter/gather architecture it implied, it does also introduce a pretty high cost: the front-end developer now needs to learn yet another language. Given that this was originally conceived in a server-side context, it seems that recent trends are only making the case worse for ESI.. We want more reactive applications, which means the same architecture but with client-side composition. This may make for sub-optimal load-time performance, but that’s a different story. In any case, I see where you’re coming from, but I’m still of the mind that ESI may be relegated to some niche use-cases. I doubt any attempt at standardization of this would bear much fruit: it seems like if you need ESI, then you’re probably in a rather special case (scale, distribution, etc), at which point a customized solution has likely been built already.

Re: obfuscation / hiding - that’s a fair concern.. you can’t do everything in the client. Having said that, for the majority of use cases, obfuscating the JS source seems to get you 95% of the way there. Many of Google’s services are great examples: in theory, we do have “unofficial APIs”; in practice, relying on obfuscated endpoints and variables is brittle at best, and impractical in real life (been there, done that :))

Jan: It seems like we’re finally moving away from the “mobile” version of the web. All modern smartphones are perfectly capable of rendering the full experience. Granted, I’m not saying that’s necessarily the optimal experience on a much smaller screen, but at least from what I’ve seen.. in practice it yields better results than those uninspiring “mobile versions” of web apps (which are usually thrown together after the fact).

Monday, October 24 2011 at 5:44 AM