mark nottingham

Web API Versioning Smackdown

Tuesday, 25 October 2011

HTTP APIs

A lot of bits have been used over on the OpenStack list recently about versioning the HTTP APIs they provide.

This over-long and rambling post summarises my current thoughts on the topic, both as background for that discussion, as well as for review in the wider community.

The Warm-up: Software vs. Web Versioning

Developers are used to software versioning; e.g., for every release, you bump an identifier. There are usually major versions, minor versions, and sometimes things like package identifiers.

This fine level of granularity is useful to both developers and users; each of these things has precise semantics that helps in figuring out compatibility and debugging.

For example, on my Fedora box, I can do:

cloud:~> yum -q list installed httpd
Installed Packages
httpd.x86_64 2.2.17-1.fc14 @updates

… and I’ll know that Apache httpd version 2.2.17 is installed, and it’s the first package of that version for Fedora 14.

This lets me know that any modules I want to use with the server will need to work with Apache 2.2; and, that if there are security bugs found in httpd 2.2.15, I’m safe. Furthermore, when I install software that depends upon Apache, it can specify a specific version — and even packaging — to require, so that if it wants to avoid specific bugs, or require specific features, it can.
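For example, an RPM package that needs those httpd fixes could carry a spec-file line like this (a hypothetical dependency, matching the package above):

Requires: httpd >= 2.2.16

and the package manager will then refuse to install it against anything older.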

These are good and useful things to use software versioning for; it’s evolved into best practice that’s pretty well-understood. See, for example, Fedora’s package versioning guidelines.

However, they don’t directly apply to versioning on the Web. While there are similar use cases — e.g., maintaining compatibility, enabling debugging, dependency control — the mechanisms are completely different.

For example, if you throw such a version identifier into your URI, like this:

http://api.example.com/v2.2.17-1.fc14/things/foo

then every time you make a minor change to your software, you’ll be minting an entire new set of resources on the Web:

http://api.example.com/v2.2.17-2.fc14/things/foo

Moreover, you’ll still need to support the old ones for old clients, so you’ll have a massive footprint of URIs to support. Now consider what this does to caches in the middle; they have to maintain duplicates of the same thing — because it’s unlikely that foo has changed, but they can’t be sure — and your cache hit rate goes down.

Likewise, anybody holding onto a link from the previous version of the API has to decide what to do with it going forward; while they can guess that there’ll be compatibility between the two versions, they can’t really be sure, and they’ll still need to rewrite a bunch of URIs.

In other words, just sticking software versions into Web URLs removes a lot of the value we get from using HTTP, and if you do this, you might as well be using a ‘dumb’ RPC protocol.

So what does work, on the Web?

The answer is that there is no one answer; there are lots of different mechanisms in HTTP to meet the goals that people have for versioning.

However, there is an underlying principle to almost any kind of versioning on the Web: not breaking existing clients.

The reasoning is simple; once you publish a Web API, people are going to start writing software that relies upon it, and every time you introduce a change, you introduce the potential to break them. That means that changes have to happen in predictable and well-understood ways.

For example, if you start using the Foo HTTP header, you can’t change its semantics or syntax afterwards. Even fixing bugs in how it works can be tricky, because clients will start to work around the bugs, and when you change things, you break the workarounds.

In other words, good mechanisms are extensible, so that you can introduce change without wiping the slate clean; any change that doesn’t fit into an extension needs to use a new identifier, so that it doesn’t confuse clients expecting the old behaviour.

So, if you want to change the semantics of that Foo header, you can either take advantage of extensibility (if it allows it; see the Cache-Control header’s extensibility policy for a great example), or you have to introduce another header, e.g., Foo2.
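Cache-Control’s policy is worth spelling out: recipients must ignore directives they don’t understand, so a new directive can ride alongside a safe fallback. The (hypothetical) community extension from the HTTP spec illustrates this:

Cache-Control: private, community="UCI"

A cache that understands the community extension can relax “private” for that community; every other cache just sees “private” and behaves conservatively.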

This approach extends to lots of other things, whether they be media types, URI parameters, or potentially URIs themselves (see below).

Because of this, versioning is something that should not take place often, because every time you change a version identifier, you’re potentially orphaning clients who “speak” that language.

The fundamental principle is that you can’t break existing clients, because you don’t know what they implement, and you don’t control them. Instead, you need to turn backwards-incompatible changes into backwards-compatible ones.

This implies that API versioning absolutely cannot be tied to software versioning in any way; doing so will needlessly limit (and often break) your clients, and generally upset people.

There’s an interesting effect to observe here, by the way; this approach to versioning is inherently non-linear. In other words, every time you mint a new identifier, you’re minting a fundamentally new thing, whether it be a HTTP header, a format identified by a media type, or a URI. You might as well use “foo” and “bar” as “v1” and “v2”. In some ways, that’s preferred, because people read so much into numbers (especially when there are decimal points involved).

The tricky part, as we’ll see in a bit, is what identifiers you nominate to pivot interoperability around.

An Aside: Debugging with Product Tokens

So, if you don’t put minor version information into URIs, media types and other identifiers, how do you debug when you have an implementation-specific problem? How do you track these minor changes?

HTTP’s answer to this is product tokens. They appear in things like the User-Agent, Server and Via headers, and allow software to identify itself, without surfacing minor versioning and packaging information into the protocol’s “core” identifiers (whether it’s a URI, a media type, a HTTP header, or whatever).

These sorts of versions are free — or even encouraged, modulo the security considerations — to contain fine-grained identifiers for what version, package, etc. of software is running. It’s what they’re for.
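For example (the User-Agent value is the illustrative one from the HTTP spec; the Server value matches the httpd package above):

User-Agent: CERN-LineMode/2.15 libwww/2.17b3
Server: Apache/2.2.17 (Fedora)

Logs and debugging tools can use these to pin down the exact implementation involved, while the interface’s URIs, media types and headers stay stable.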

The Main Event: Resource Versioning

All of that said, the question remains of how to manage change in your Web application’s interface. These changes can be divided into two rough categories; representation format changes and resource changes.

Representation format changes have been covered fairly well by others (e.g., Dave), and they’re both simple and maddeningly complex. In a nutshell, don’t make backwards-incompatible changes, and if you do, change the media type.

JSON makes this easier than XML, because it has both a simpler metamodel, as well as a default mustIgnore rule.
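A minimal sketch of what that rule buys you in practice; the “thing” document and its members are hypothetical:

import json

# A client written against the old format knows about these members; under a
# must-ignore rule, it simply skips anything else in the object, so the server
# can add members without breaking deployed clients.
def parse_thing(doc):
    data = json.loads(doc)
    return {"name": data.get("name"), "size": data.get("size")}

old = '{"name": "foo", "size": 10}'
new = '{"name": "foo", "size": 10, "colour": "blue"}'  # compatible addition

assert parse_thing(old) == parse_thing(new)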

Resource changes are what I’m more interested in here. This is doing things like adding new methods, changing the URIs that clients use (including query parameters and their semantics), and so forth.

Again, many (if not most) changes to resources can be accommodated by turning them into backwards-compatible changes. For example, rather than bumping a version when you want to modify how a resource handles query parameters, you mint a new, sibling resource with a different name that takes the alternate query parameters.
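For example (hypothetical URIs), rather than redefining what “q” means on an existing search resource, you leave it alone and mint a sibling:

http://api.example.com/things/search?q=foo
http://api.example.com/things/filter?colour=blue

Old clients keep working against the first; new clients opt into the second.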

However, there comes a time when you need to “wipe the slate clean.” Perhaps it’s because your API has become overburdened with such add-on resources, or you’ve got some new insights into your problem that benefit from a fresh sheet. Then, it’s time to introduce a new API version (which again, shouldn’t happen often). The question is, “how?”

In this Corner: URI Versioning

The most widely accepted way to version Web API resources currently is in the URI. A typical example might be:

http://api.example.com/v1/things/foo

Here, the first path segment is a major version identifier, and when it changes, everything under it does as well. Therefore, the client needs to decide which version of the API it wants to interact with; there isn’t any correlation between URIs across versions.

So, even if you have:

http://api.example.com/v2/things/foo

There isn’t necessarily any correlation between the two URIs. This is important, because it gives you that clean slate; if there were correlation between v1 and v2 URIs, you’d be tying your hands in terms of what you could do in v2 (and beyond).

You can see evidence of this in lots of popular Web APIs out there; e.g., Twitter and Yahoo.

However, it’s not necessary to have that version number in there. Consider Facebook; their so-called old REST API has been deprecated in favour of their new Graph API. Neither has “v1” or “v2” in them; rather, they just use the hostname to namespace the different interfaces (“api.facebook.com” vs. “graph.facebook.com”). Old clients are still supported, and new clients can get new functionality; they just called their new version something less boring than “v2”.

Fundamentally, this is how the Web works, and there’s nothing wrong with this approach, whether you use “v1” and “v2” or “foo” and “bar” — although I think there’s less confusion inherent in the latter approach.

The Contender: HATEOAS

However, there is one lingering concern that gets tied up into this; people assume — very reasonably — that when you document a set of URIs and ship them as a version of an interface, clients can count on those URIs being useful.

This violates a core REST principle called “Hypertext As The Engine of Application State”, or HATEOAS for short.

RESTafarians have long searched for signs of HATEOAS in Web APIs, and Roy has lamented its absence in the majority of them.

Tying your clients into a pre-set understanding of URIs tightly couples the client implementation to the server; in practice, this makes your interface fragile, because any change can inadvertently break things, and people tend to like to change URIs over time.

In a HATEOAS approach to an API, you’d define everything in terms of media types (what formats you accept and produce) and link relations (how the resources producing those representations are related).

This means that your first interaction with an interface might look like this:

GET / HTTP/1.1
Host: api.example.com
Accept: application/vnd.example.link_templates+json

HTTP/1.1 200 OK
Content-Type: application/vnd.example.link_templates+json
Cache-Control: max-age=3600
Connection: close

{
  "account": "http://accounts.example.com/{account_id}",
  "server": "/servers/{server_id}",
  "image": "https://images.example.com/{image_id}"
}

Please don’t read too much into this representation; it’s just a sketch. The important thing is that the client uses information from the server to dynamically generate URIs at runtime, rather than baking them into the implementations.

All of the semantics are baked into those link relations — they should probably be URIs if they’re not registered, by the way — and in the formats produced. URIs are effectively semantic-free.

This gives a LOT of flexibility in the implementation; the client can choose which resources to use based upon the link relations it understands, and changes are introduced by adding new link relations, rather than new URIs (although that’s likely to be a side effect). The URIs in use are completely under control of the server, and can be arranged at will.

In this manner, you don’t need a different URI for your interface, ever, because the entry point is effectively used for agent-driven content negotiation.
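To make the client’s side of this concrete, here’s a minimal sketch in Python, assuming the hypothetical entry point and representation above, and using the third-party requests and uritemplate libraries:

import requests
from urllib.parse import urljoin
from uritemplate import expand

API = "http://api.example.com/"

# Discover the link templates at runtime instead of hard-coding URIs.
links = requests.get(
    API, headers={"Accept": "application/vnd.example.link_templates+json"}
).json()

# Expand the URI Template for the relation we understand; urljoin copes
# with templates given as relative references, like "server" above.
server_url = urljoin(API, expand(links["server"], server_id="42"))
server = requests.get(server_url).json()

The only things this client hard-codes are media types and link relation names; the server stays free to rearrange its URIs at any time.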

The downsides? This approach requires clients to make requests to discover URIs, and not to take shortcuts. It’s therefore chatty — a fairly damning condemnation.

However, notice the all-important Cache-Control header in that response; it may be chatty without caching, but if the client caches, it’s not that bad at all.
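Where the client stack does have a cache, wiring one in can be cheap; for instance, a sketch using the third-party Python cachecontrol package, which adds HTTP caching to requests sessions (same hypothetical entry point as before):

import requests
from cachecontrol import CacheControl

sess = CacheControl(requests.Session())

# The first GET hits the network; repeats within max-age=3600 are answered
# from the local cache, so URI discovery costs almost nothing per request.
links = sess.get(
    "http://api.example.com/",
    headers={"Accept": "application/vnd.example.link_templates+json"},
).json()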

The main issues with going HATEOAS for your API, then, are the requirements it places upon clients. If client-side HTTP tools were more widely capable, this wouldn’t be a big deal, but currently you can only assume a very low-level, bare HTTP API without caching, so it does place a lot of responsibility on your client developers’ shoulders — not a good thing, since there are usually many more of them than there are server-side developers.

So, there are arguments for and against HATEOAS, and one could say the trade-offs are somewhat balanced; both are at least reasoned positions. However, there’s one more thing…

Enter Extensibility

Extensibility and Versioning are the peanut butter and jelly of protocol engineering. Sure, my kids’ cohort in Australian primary schools are horrified by this combination, but stay with me.

OpenStack has an especially nasty extensibility problem; they allow vendors to add pretty much arbitrary things to the protocol, from new resources to new representations, as well as extensions inside their existing formats.

Allowing such freedom with “baked-in” URIs is hard. You have to carve out extension prefixes to avoid collisions, and then hope that that’s good enough. For example, what if an API uses URIs like this:

http://api.example.com/users/{userid}

and HP wants to add a new subresource to the users collection? Does it become

http://api.example.com/users/hp

? No, that’s bad, because then no userid can be “hp”, and special cases are evil, especially when they’re under the control of others.

You could do:

http://api.example.com/users/ext/hp

and special-case only one thing, “ext”, but that’s pretty nasty too, especially when you can still potentially add “hp” to any point in the URI tree.

Instead, if you take a HATEOAS approach, you push extensibility into link relations, so that you have something like:

GET / HTTP/1.1
Host: api.example.com
Accept: application/vnd.example.link_templates+json

HTTP/1.1 200 OK
Content-Type: application/vnd.example.link_templates+json
Cache-Control: max-age=3600
Connection: close

{
  "users": "http://api.example.com/users/{userid}",
  "hp-user-stuff": "http://api.example.com/users/{userid}/stuff"
}

Now, the implementation has full control over the URIs used for extensions, and it’s responsible for avoiding collisions. All that HP (or anyone else wanting an extension) has to do is mint a new link relation type, and describe what it points to (using existing or new media types).
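In client terms, extension discovery then becomes a check on link relation names rather than URI guessing; a sketch under the same assumptions as the earlier one:

import requests
from urllib.parse import urljoin
from uritemplate import expand

API = "http://api.example.com/"
links = requests.get(
    API, headers={"Accept": "application/vnd.example.link_templates+json"}
).json()

# Look for the extension by its link relation, never by its URI layout.
if "hp-user-stuff" in links:
    url = urljoin(API, expand(links["hp-user-stuff"], userid="alice"))
    stuff = requests.get(url).json()
# If the relation is absent, this deployment lacks the extension, and the
# client falls back to the core resources only.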

This isn’t the whole extensibility story, of course; format extensions are independent of URIs, for example. However, the freedom of extensibility that taking a HATEOAS approach gives you is too good to pass up, in my estimation.

The key insight here, I think, is that URIs are used for so many things — persistent identifiers, cache keys, bases for relative resolution, bookmarks — that overloading them with versioning and extensibility information as well makes them worse for all of their various purposes. By pushing these concerns into link relations and media types using HATEOAS, you end up with a flexible, future-proof system that can evolve in a controllable way, without giving up the benefits of using HTTP (never mind REST).

UPDATE: see more in Evolving HTTP APIs.


17 Comments

Stefan Tilkov said:

Excellent post. Very interesting to see you using the term link relation to refer to something that’s not a link rel=… element.

One question: Where is the must ignore rule for JSON you mention specified?

Tuesday, October 25 2011 at 5:45 AM

isaacs said:

Have you seen how Restify handles versioning? I’d love to hear your thoughts on it.

http://mcavage.github.com/node-restify/restify-versions.7.html

The restify client may send an x-api-version header, which is a semver range (of the sort that npm packages use for their dependencies – it’s using the same code, actually.)

The server then can be started with a specific version configuration, and will only respond to requests that accept that version, responding with a specific x-api-version header (not a range), so that the client knows what it’s getting.

It would be trivial to have two versions of the API running at the same time, and route requests to different app servers based on the API version that they accept, such that a single URL refers to a single conceptual resource, albeit perhaps in slightly different forms.

The principle is quite nice. “Tell me what things you are ok receiving, if I can give it to you, I will, and I’ll tell you what you’re getting.” The whole dance is very simple and http-ish.

Tuesday, October 25 2011 at 6:57 AM

karl dubost said:

There is another issue with “requires clients to make requests to discover URIs, and not to take shortcuts.”

Links in a human-distributed network: humans create bookmarks all the time, create links in pages, etc. And by doing that they distribute and solidify the minted URIs. Once a URI has been distributed, if it is not maintained anymore (because it was generated on the fly as the client explored the API), there will be massive breakage.

For sure, Cool URIs don’t change. :) But given our history of the Web, it seems there is poor understanding of that. That said, I think in many circumstances it is the way to go: in real-time interaction, it addresses resource evolution quite well.

Tuesday, October 25 2011 at 8:16 AM

Jan Algermissen said:

Mark, Karl,

I have been wondering recently[1] on what grounds clients decide which URIs received in a response are bookmarkable.

I think bookmarkability is orthogonal to cacheability, because cacheability refers to the state of the mapping function rather than the suitability of the URI to act as an entry point in general.

Thoughts on [1] very welcome.

[1] http://tech.groups.yahoo.com/group/rest-discuss/message/17846

P.S. @Mark, possible to bribe you to replace ‘HATEOAS’ (shiver) with ‘hypermedia constraint’? I offer a beer (nah, two beers) should we ever meet f2f :-)

Tuesday, October 25 2011 at 11:32 AM

Stefan Tilkov said:

Great discussion. The main point in introducing this extra level of indirection is that the client is no longer forced to assume that all the URIs it interacts with are provided by a single server (or at least a single load balancer) – i.e., if it follows the links, it’s tied to the URIs that are used for the link names, not to some server they’re related to. In addition, link relations (the ones with a “link” element) are a way to modularize hypermedia formats, i.e. extract some behavior common to more than one particular representation.

Wednesday, October 26 2011 at 5:19 AM

Nick Gall said:

Mark,

Great post. It helps make much clearer the mysteries of versioning and extensibility. But it also highlights an issue with link relations that has been bothering me for a while.

All of the semantics are baked into those link relations — they should probably be URIs if they’re not registered, by the way — and in the formats produced. URIs are effectively semantic-free.

The issue is that link relations require names just like everything else. And the semantics of ANY name can change over time, which means that all you’ve done by using link relations is to create another level of indirection (ie another level of naming). And even though indirection can solve all problems, it creates them as well.

I like how you emphasize that link relations should be URIs. Agreed. But note what that means: You’ve simply “baked the semantics” into a NEW set of URIs (the link relation ones) so that the original URIs (the ones you’ll be dereferencing) can be semantic-free. You’ve just shifted the semantics from one set of URIs to another! That’s why I underlined URIs in my quote above.

So now when you need to change the semantics of, say, the “users” link relation, or the “account” link relation, you’re back to square one! Do I create a new link relation “users2”, etc.?

The root problem here is that EVERY interface, web or not, creates some kind of NAMESPACE shared among the entities sharing the interface. The identifiers that constitute this shared namespace have “semantics baked into them”. So the question this always begs is “What should I do when the semantics (expectations) baked into an identifier changes?”

All you’ve done with the use of link relationships is to create a NEW namespace of link relationship URIs, and baked your semantics into those. I.e., you’ve published documentation on your developer web site defining the meaning of each link relationship. But how is that any better than baking your semantics into the original URIs, i.e. the ones that you’ll actually be dereferencing to get representations to drive your application?

In other words, why is a web page documenting the meaning of link relationship URIs any better than a web page documenting the original URIs? Why is the meaning of a link relationship URI, once it’s documented on a web page, any LESS prone to change than the meaning of any other URI?

This seems like such an obvious question that I feel like I must be misunderstanding something because I don’t see anyone asking it. Thanks for any light you can shed on the subject.

– Nick

[PS Your comments system doesn’t seem to like HTML tags like b, strong, cite, etc.]

Wednesday, October 26 2011 at 12:48 PM

Dave Duggal said:

Great post.

I really like how you distill the approach in these two points -

“About using HATEOAS and link relations for versioning: In this manner, you don’t need a different URI for your interface, ever, because the entry point is effectively used for agent-driven content negotiation.”

“By pushing these concerns into link relations and media types using HATEOAS, you end up with a flexible, future-proof system that can evolve in a controllable way, without giving up the benefits of using HTTP (never mind REST).”

We happen to work in a controlled enterprise system with a registry of internally-defined media-types server-side so our job of distributing versions/semantics may be slightly easier than getting people to agree on types out in a greater community. We understand we leave the world of ‘pure’ REST in that regard and describe ourselves as Resource-Oriented.

Our Framework features a Smart Client, with a ‘RESTful’ Agent that performs content-negotiation akin to your description. In our system the goal-oriented Agent fetches and transforms Resources as directed by a reflective protocol (or ‘contract’).

Each of our ‘media-types’ contains the expected syntax and semantics of a resource, and is in an XML-based format, so our agent can adapt its use of a resource based on its link relations (metadata tag, not link rel).

We extend API evolvability with rich governance. Our version control captures deltas for detailed audit history and rollback. This includes the media-type descriptions, so we can encode link relations to automatically point at the description which was active at the time of the resource’s creation (or, conversely, the current applicable version of the media-type definition).

The Framework is in production so we are proof positive that this approach can work well, in fact we use it canonically for real-time app, data and process integration.

Best, Dave

Thursday, October 27 2011 at 5:41 AM

theamiableapi.com said:

A very thoughtful post, thank you.

As a relative newcomer to the world of Web APIs, I’m a bit puzzled by apparent contradictions between theory and practice. On one hand, I’m reading about sensible best practices both in books and online. I think I understand why such best practices exist and how they can help. I can also see that in general people tend to agree with them.

However, when I turn around and look at some actual APIs out there, reality seems quite different. In the concrete case of API versioning, URL-based versioning seems to be the most popular by far, followed by custom version headers as a close second. I don’t remember seeing media type versioning at all, even though this seems to be the one recommended by most.

I did not conduct a formal API survey, of course, just informally looked at a few Web APIs for my own edification. There may also be some bias in my selection, because I picked relatively simple and fairly popular Web APIs to look at.

I’m wondering if you have an explanation, or at least a personal opinion, on why I’m seeing this? My own guess would be that some best practices are avoided in popular Web APIs out of concern not to raise the barrier to entry and impact adoption. This is just a guess, though.

Thursday, October 27 2011 at 12:42 PM

http://openid.open.ac.uk/oucu/jk5837 said:

Hi Mark, thanks for the post, it’s a great summary.

I think one thing is understated, and it’d be a part of the response to Nick Gall’s question: when we bake the semantics into link relationship URIs rather than into the actual URIs used in a particular application, we don’t simply create a new namespace — we create a smaller namespace.

Yes, the issue of changes in that namespace remains, but the changes are less likely, because more thought went into creating the set of link relationships.

Cheers, Jacek Kopecky

Tuesday, November 29 2011 at 11:14 AM

peter dapkus said:

Late to the party, but it seems like there are lots of reasons to have indirection at the front door of your API; I’m just not sure versioning is one of them. The rigidity isn’t forced on the URIs by the designer, but by the client code that’s written against them. Adding a step of indirection just moves the problem around, imho. The client is still going to have the same quirky dependencies on the behavior of the app behind the URI, and now it has to understand the link templates and know how to populate them.

The issue of cacheability seems like a red herring for many APIs, where, by the time something becomes stable enough to be cached, it’s no longer very interesting. I’m not sure how much real harm is done by adding more URIs to your cache.

As a practical matter, on the server side, it’s very easy to see how I would hang a different version of the application code off of a different URL – the server containers make this easy and do some heavy lifting for me (isolating app versions in their own classloaders, for example). It’s certainly possible to do this based on media types, but not as well supported.

I like the idea of having a client and a server that support multiple versions of the API being able to negotiate the best choice. In the big picture, I’d worry less about this kind of client than, say, scripts someone dashes off in a day to solve some problem.

Thursday, July 12 2012 at 3:34 AM

aniket-patil.myopenid.com said:

Hi Mark,

I have a couple of questions.

  1. You suggested two approaches to versioning resources: i) Create a new resource that supports the new behavior. For example, if I have an existing resource called ‘users’ that supports GET, PUT, POST and DELETE, and I need to remove support for creating users through POST, I would create a new resource that supports the other 3 HTTP verbs, e.g. ‘users-2’.

ii) The other approach seems to be: ‘Bump up a version’. Did you mean bumping up the global API version number?

For example, if my current API version is v1, and the URI for users is http://api.example.com/v1/users, once I stop supporting POST for users, is your recommendation to change v1 to v2? The new URI would be http://api.example.com/v2/users

Or did you mean that there should be a separate version number that is tied to the resources themselves (indicating the capabilities of the resource, such as supported HTTP verbs, query params, etc.)? If so, where would this resource version number go? Would it be part of the URL, such as http://api.example.com/v1/users/v2 (which overloads URLs with version information, making them unstable)? Or does it go in a different place?

2) It seems to me that there’s not much point in having a global API version number (for e.g. the v1 in http://api.example.com/v1). In general, you would wait for a significant number of changes to happen to the resources or representation formats, and then increment the API version. However, I’m not sure if there’s any point in having a global version number.

API clients are programmed against the media types and the resource capabilities, both of which are indicated through their own separate version identifiers (e.g. through media types), and so the global version number plays no real part in versioning. Do you see any other use in maintaining a global version number (other than simply providing a name to differentiate an old API from a new one)?

Monday, July 1 2013 at 9:27 AM