JSON and XML

Monday, 24 January 2005

I’m intrigued by the JSON effort. While many people (and vendors) have chosen XML for data interchange because it’s not platform- or vendor-specific, these folks have chosen the other path: by leveraging the serialisation of data structures in ECMAScript (née JavaScript), a nearly ubiquitous language that’s on every desktop with a browser, they get an automatic installed base and at least one API for free.

Then, by defining mappings to other languages (e.g., Java, Perl and C#; by coincidence or design, Python doesn’t require anything extra), they suddenly get a data interchange format that’s pretty darn useful for what’s becoming a very common task — turning those browsers into an application platform.

Some XML people will scoff, whilst others will have fear in their eyes; as discussed before, XML isn’t so great for data modeling.

It Always *Starts* Simple…

So, will XML be a distant memory in a few years? Will world-wide inventories of angle brackets shoot up thanks to JSON? Not quite. While on the face of it JSON is a very attractive solution, I have a feeling it’s going to run into a few problems.

First of all, it’s still a tree; there isn’t any built-in way to represent a graph in JSON, so shared or circular references have to be flattened by some private convention. This isn’t a big loss over XML (also a tree), but it does present a problem in some situations. It would be better if a reference mechanism were built in.
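
To make the tree-versus-graph point concrete, here is a minimal Python sketch. The {"ref": id} marker, the "people" table and the resolve() helper are all made up for illustration; nothing like them is defined by JSON, which is rather the point.

```python
import json

# Two records that refer to each other: a graph with a cycle, not a tree.
alice = {"name": "Alice"}
bob = {"name": "Bob", "manager": alice}
alice["reports"] = [bob]


def try_dump(value):
    """Return the JSON text, or None if the encoder refuses the value."""
    try:
        return json.dumps(value)
    except ValueError:  # the json module reports a circular reference this way
        return None


# Hypothetical workaround (not part of JSON): flatten the graph into a table
# keyed by id, and replace each object link with a {"ref": id} marker.
flat = {
    "people": {
        "p1": {"name": "Alice", "reports": [{"ref": "p2"}]},
        "p2": {"name": "Bob", "manager": {"ref": "p1"}},
    }
}


def resolve(doc):
    """Rebuild the in-memory graph from the made-up {"ref": id} markers."""
    people = doc["people"]

    def link(value):
        if isinstance(value, dict) and set(value) == {"ref"}:
            return people[value["ref"]]
        if isinstance(value, list):
            return [link(v) for v in value]
        return value

    for person in people.values():
        for key in list(person):
            person[key] = link(person[key])
    return people


people = resolve(json.loads(json.dumps(flat)))
```

The catch is that both ends of the wire have to agree on the marker convention out of band, which is exactly what a built-in reference mechanism would avoid.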

More seriously, JSON also doesn’t have a language-neutral schema mechanism; while you might be able to describe something in prose, or in a language-specific way, it would be really nice to be able to validate data and generate code, and it’s critical to have well-described interfaces.
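
As an illustration of what describing the data "in a language-specific way" tends to look like, here is a small Python sketch. The ORDER_SHAPE table and the check_shape() helper are hypothetical names invented for this example, not part of any JSON library.

```python
import json

# Hypothetical description of an "order" record: field name -> Python type.
ORDER_SHAPE = {"id": int, "customer": str, "items": list}


def check_shape(text, shape):
    """Return a list of problems; an empty list means the data matched."""
    data = json.loads(text)
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    problems = []
    for field, expected in shape.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


good = check_shape('{"id": 1, "customer": "a", "items": []}', ORDER_SHAPE)
bad = check_shape('{"id": "1", "customer": "a"}', ORDER_SHAPE)
```

This works, but the description lives in Python code; a Java or Perl consumer of the same data can’t reuse it, and nothing can generate code from it.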

Next, JSON’s type system is fairly limited; for example, there are no time or date types. You can say that it’s an integer offset from an epoch, but then you get into implementation-specific concerns. Whoops.
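
A short Python sketch of the date problem follows. The three conventions shown are just common choices picked for illustration; JSON itself endorses none of them, and the variable names are made up.

```python
import json
from datetime import datetime, timezone

posted = datetime(2005, 1, 24, 12, 0, 0, tzinfo=timezone.utc)


def try_dump(value):
    """Return the JSON text, or None if the value has no JSON representation."""
    try:
        return json.dumps(value)
    except TypeError:  # datetime objects are not serialisable by default
        return None


# Convention A: integer seconds since the Unix epoch.
as_seconds = int(posted.timestamp())

# Convention B: integer milliseconds since the same epoch. On the wire it is
# just another number; the receiver cannot tell A from B by looking at it.
as_millis = as_seconds * 1000

# Convention C: an ISO 8601 string. Unambiguous, but to JSON it is only a string.
as_text = posted.isoformat()  # "2005-01-24T12:00:00+00:00"

rejected = try_dump({"posted": posted})
round_trip = json.loads(json.dumps({"posted": as_text}))["posted"]
```

Whichever convention is picked, it has to be documented somewhere outside the data, because nothing in the message says "this value is a date".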

All of these problems can be addressed by extending JSON, which leads us to our final issue: JSON doesn’t have any mechanism for extension or versioning. In other words, how do you change the data structure you’re pushing across the wire over time, whilst still remaining compatible with processors that are expecting or generating the old version? How do you disambiguate fourteen different “item” structures when you want to combine different data sources? For bonus points, what do you do when one of those extensions isn’t valid JavaScript?
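
Here is a rough Python sketch of the "item" collision and of one way a consumer might cope with newer versions. The sample documents, the prefix scheme in qualify() and the KNOWN_V1_FIELDS list are all invented for illustration; they are workarounds agreed between parties, not features of JSON.

```python
import json

# Two unrelated sources that both happen to use the key "item".
catalog = json.loads('{"item": {"sku": "A-1", "price": 5}}')
feed = json.loads('{"item": {"title": "JSON and XML", "link": "http://example.org/"}}')

# Naive merge: the second "item" silently replaces the first.
merged = {**catalog, **feed}


def qualify(doc, prefix):
    """Prefix every top-level key with a made-up source label."""
    return {f"{prefix}:{key}": value for key, value in doc.items()}


# Ad hoc namespacing: both structures survive, but only by private agreement.
combined = {**qualify(catalog, "shop"), **qualify(feed, "news")}

# Versioning by convention: a consumer written for "version 1" reads only the
# fields it knows about and ignores anything a newer producer has added.
KNOWN_V1_FIELDS = ("sku", "price")


def read_v1(text):
    data = json.loads(text)
    return {key: data[key] for key in KNOWN_V1_FIELDS if key in data}


newer = '{"sku": "A-1", "price": 5, "currency": "EUR"}'
seen_by_v1 = read_v1(newer)
```

Ignoring unknown fields keeps the old consumer running, but it can’t tell whether the field it skipped was optional decoration or something that changes the meaning of the rest.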

Don’t Get Cocky, XML

That’s not to say that XML is so hot either. Although these problems have long been recognised in the XML world, XML can’t represent a graph either, and XML Schema is not exactly user-friendly or even implementer-understandable. And while the XML Schema type system is pretty good, extending and versioning XML is still dangerous territory, and Namespaces in XML are trickier than they appear.

So, although JSON clearly has shortcomings and limitations, XML shares some of them, and exacts an arguably high tax for those it doesn’t.

Considering that it’s tangentially associated with oh-so-cool technologies like GMail and Google Suggest, and is a sop to the XML-is-too-slow-and-bloated contingent, I wouldn’t be surprised if, once mature, JSON takes a bite out of a lot of the “low-end” (translate: non-enterprise) projects out there, because XML will fail to justify its cost in non-markup applications. In short, some developers won’t care about the limitations above, because they don’t think they’ll push the envelope that much, or if they do, they can fudge it. Fair enough.

That said, right now I still think of JSON more as an expression of frustration with XML for data modeling and representation than the ultimate solution; while it’s extremely attractive to couple it closely with existing languages, more is required to interchange data robustly between distributed systems. YMMV.

P.S.

The “O” stands for “Object.” Here we go again…

Mark Nottingham
