Thinking about Namespaces in JSON
Wednesday, 12 October 2011
Since joining Rackspace to help out with OpenStack, one of the hot topics of conversation I’ve been involved in has been extensibility and versioning.
I think most of my readers (yes, all six of you) are fairly familiar with, if not tired of (hi, Dave!), the various arguments and counter-arguments in this space. However, there is one new-ish bit: how to do distributed extensibility in JSON.
That’s because OpenStack’s API allows vendors to add extensions in various ways, in an uncoordinated fashion. And while that’s a well-understood (if still somewhat tricky) problem in XML, it hasn’t been approached at all in JSON, which has fast become the format of choice for data-bearing APIs.
JSON has a head start in that it embodies the mustIgnore rule; if you put extra data in a JSON document (for example, an extra property on an object), all implementations will just ignore it. Great. However, the problem comes in when multiple people want to extend a document, but avoid collisions.
For example, given this straw-man JSON document:
{
"foo": "bar",
"version": 1
}
if both FooCorp and BarProject add a “widget” property, they’ll be fighting over who owns it. Bad luck.
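To make that concrete, here is a minimal Python sketch of both halves of the problem: a consumer that ignores members it doesn't know about, and two uncoordinated extensions reaching for the same name. The helper names (`base_consumer`, `extend`) and the widget values are made up for illustration; nothing here is part of any real API.

```python
import json

BASE = '{"foo": "bar", "version": 1}'


def base_consumer(text):
    """A consumer that only knows the base format; extra members are ignored."""
    doc = json.loads(text)
    return {"foo": doc["foo"], "version": doc["version"]}


def extend(text, extension):
    """Add extension members, refusing to overwrite one that already exists."""
    doc = json.loads(text)
    for name, value in extension.items():
        if name in doc:
            raise KeyError(f"collision on {name!r}")
        doc[name] = value
    return json.dumps(doc)


# FooCorp extends the document first (hypothetical widget shape).
with_foocorp = extend(BASE, {"widget": {"size": 3}})

# mustIgnore in action: the base consumer is unaffected by the extra member.
unchanged = base_consumer(with_foocorp)

# BarProject now wants "widget" too, with a different meaning.
try:
    extend(with_foocorp, {"widget": "blue"})
    collided = False
except KeyError:
    collided = True
```

Without the explicit check in `extend`, the second write would silently replace the first, which is arguably worse than failing.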
So, some way to coordinate these parties and assure that they don’t conflict is necessary. In XML, this is done with Namespaces in XML, and so solutions to this problem are generally called Namespaces too, even though they don’t have to look or work the same way.
Prior Art
I’m not the first person to wonder in this direction, of course.
Yaron made the first proposal, as far as I can tell. His approach looks like this:
{
  "org.goland.schemas.projectFoo.specProposal" : {
    "title": "JSON Extensions",
    "author": { "firstName": "Yaron",
      "com.example.schemas.middleName": "Y",
      "org.goland.schemas.projectFoo.lastName": "Goland"
    }
  }
}
It’s sort of a Java-ish approach, based on the DNS as URIs are, but without the syntactic awkwardness of putting URIs in JSON. He also states that there’s an implicit namespace for descendants; e.g., here, “title” is also in the org.goland.schemas.projectFoo namespace.
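As a rough illustration of that implicit inheritance, the Python sketch below walks a document and reports which namespace each member ends up in. It rests on my own reading of the proposal, not on anything Yaron specifies: the namespace is taken to be everything before the last “.” in a name, and a name with no “.” takes the namespace of its nearest ancestor. The function name `qualified_names` is hypothetical.

```python
def qualified_names(node, inherited=None):
    """Yield (namespace, local_name) for every object member, depth first.

    Assumption: the namespace is the part of the name before the last ".";
    a name without a "." inherits the namespace of its parent member.
    """
    if not isinstance(node, dict):
        return
    for key, value in node.items():
        if "." in key:
            namespace, local = key.rsplit(".", 1)
        else:
            namespace, local = inherited, key
        yield namespace, local
        # Descendants inherit whatever namespace this member resolved to.
        yield from qualified_names(value, namespace)


doc = {
    "org.goland.schemas.projectFoo.specProposal": {
        "title": "JSON Extensions",
        "author": {
            "firstName": "Yaron",
            "com.example.schemas.middleName": "Y",
            "org.goland.schemas.projectFoo.lastName": "Goland",
        },
    }
}

names = list(qualified_names(doc))
```

Under those assumptions, “title”, “author” and “firstName” all land in org.goland.schemas.projectFoo, while “middleName” lands in com.example.schemas.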
There was another proposal in the JSON-schema mailing list in 2008. It looks very, very similar to XML schemas, except that the namespaces, as far as I can figure out, are bound inside the schema itself, rather than the document. It seems to have been shot down, because it required schema parsing to be able to identify things; never a good idea, especially in the JSON world.
Some Observations
Starting with the obvious, I’d say that if you can use JSON without namespaces, you really, really should. In other words, if you really need distributed extensibility, you need something like namespaces, but for all other purposes, they should be avoided like the plague; they make it too complex, and simplicity is the name of the game in JSON.
A bit more subtly, I think this isn’t just a document-by-document decision, but a node-by-node one in the document. I.e., you should identify the specific places in a document that need extensibility and allow namespaces there, but they shouldn’t pollute the rest of the document, if they aren’t needed there.
I suppose what I’m saying is that namespaces should be a purely syntactic convention to avoid collisions where distributed extensibility is allowed, rather than some magical thing that allows you to uniquely and globally identify every bit of data in the document. I know that’s going to rile up some of the linked data and semweb folks, but we’re talking JSON here, not Turtle or RDF.
This implies that Yaron’s inheritance is unnecessary; the very fact that the “title” property is a member of “org.goland.schemas.projectFoo.specProposal” is sufficient to assure lack of collisions (unless he wants to allow extensibility at that level too, in which case they should be explicit at that level).
Another Straw-Man
Given all of that, I wonder if the problem can be simplified enough to make some progress. I think Yaron’s proposal makes a certain amount of sense, with a few modifications:
- JSON-based formats need to define which objects require namespaced members explicitly. I.e., it’s opt-in and constrained to only those nodes nominated for distributed extensibility.
- No inheritance is assumed.
- Non-namespaced property names won’t have the delimiter character in them (here, ‘.’).
- Prefixes are defined by the format; they can either be in the DNS-based style that Yaron advocates, or if there’s some level of coordination, you could set up a registry (JSON-based, of course :) of shorter prefixes.
This would tweak Yaron’s sample to something like (assuming that a registry were used):
{
  "FOO.specProposal" : {
    "title": "JSON Extensions",
    "author": { "firstName": "Yaron",
      "EXAMPLE.middleName": "Y",
      "lastName": "Goland"
    }
  }
}
I like this because it’s not very painful, it doesn’t require schema to process, and it gets the job done; it allows distributed extensibility. The important thing is to stop looking at namespaces as something you should slather over your format like butter — more is better! — and start seeing them as a specialised tool that should only be used when it can do some good.
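To show how little machinery the straw-man needs, here is a minimal Python sketch of a checker for it. Everything in it is an assumption made for illustration: the registry contents, the way extension points are named (as tuples of member names from the root), and the function name `violations`. It also reads “require namespaced members” loosely, allowing namespaced members alongside ordinary ones at an extension point, since the sample above mixes the two under “author”.

```python
DELIMITER = "."


def violations(node, extensible, registry, path=()):
    """Return a list of rule violations found in the object tree under node.

    extensible: set of paths (tuples of member names) where the format
                allows namespaced members.
    registry:   set of registered prefixes.
    Only nested objects are walked; arrays are left alone in this sketch.
    """
    found = []
    if not isinstance(node, dict):
        return found
    for key, value in node.items():
        where = "/".join(path + (key,))
        if DELIMITER in key:
            prefix = key.split(DELIMITER, 1)[0]
            if path not in extensible:
                found.append(f"{where}: namespaced member outside an extension point")
            elif prefix not in registry:
                found.append(f"{where}: unregistered prefix {prefix!r}")
        # No inheritance: each level is checked on its own terms.
        found.extend(violations(value, extensible, registry, path + (key,)))
    return found


# Hypothetical registry and extension points for the sample format.
REGISTRY = {"FOO", "EXAMPLE"}
EXTENSION_POINTS = {(), ("FOO.specProposal", "author")}

sample = {
    "FOO.specProposal": {
        "title": "JSON Extensions",
        "author": {
            "firstName": "Yaron",
            "EXAMPLE.middleName": "Y",
            "lastName": "Goland",
        },
    }
}
ok = violations(sample, EXTENSION_POINTS, REGISTRY)

bad = {"FOO.specProposal": {"BAR.colour": "red"}, "NOPE.widget": 1}
problems = violations(bad, EXTENSION_POINTS, REGISTRY)
```

The sample passes cleanly, while the second document trips both rules: a namespaced member where the format didn't ask for one, and a prefix nobody registered. No schema is consulted at any point; the only inputs are the list of extension points and the registry.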