Friday, 7 April 2006
Are Namespaces (and mU) Necessary?
It’s become axiomatic in some circles — especially in WS-* land, as well as in many other uses of XML — that the preferred (or only) means of offering extensibility is through URI-based namespaces, along with a flag to tell consumers when an extension needs to be understood (a.k.a. mustUnderstand).
The reasoning is that extensibility should be as easy as possible. By leveraging one registry, DNS, you can use URIs to allow anyone to create their own uniquely identified vocabulary, without any co-ordination overhead.
This is often contrasted (and deemed superior) to the approach of the IETF, which uses IANA to manage many a namespace, requiring prospective registrants to jump through a variety of hoops to get in.
I didn’t question the conventional Web wisdom for a long time. Two things make me reconsider it today. The first is this message from Roy Fielding to the atom-syntax mailing list last year, responding to my request for a mustUnderstand flag in Atom:
One problem is that the “must understand” feature is intended to prevent dumb software from performing an action when it doesn’t know how to do it right. In reality, software tends to have bugs and developers tend to be optimistic, and thus there is no way to guarantee the software is going to do it right even if it claims to know how. In the end, we just waste a bunch of cycles on unimplemented features and failed requests.
Another problem is that the features that benefit from a must-understand bit tend to be socially reprehensible (and thus the only way they could be deployed is via artificial demand). As soon as one of those features get deployed, the hackers come out and turn off the “must understand” bit for that feature, defeating the protocol in favor of their own view of what is right on the Internet…
In fact, “must understand” has no value in a network-based application except as positive guidance for intermediaries, which is something that can still be accomplished under mustIgnore with a bit of old-fashioned advocacy.
The second is repeated exposure to the Microformats folks, especially Tantek’s arguments that social co-ordination brings the value in shared data and protocols, not infinite extensibility, which tends to encourage duplication of effort.
Applying this to protocols is interesting. Henrik Frystyk Nielsen has maintained that some of HTTP’s biggest flaws are its lack of namespaces and mandatory extensions. He tried to introduce them in PEP, and when that didn’t get traction he took the idea to SOAP. Don Box picked this thread up recently:
On the topic of extensibility mechanisms and the hell they inherently allow, it’s fun to imagine the world that might have been had Paul Leach and Henrik Frystyk Nielsen been successful in getting HTTP Extension Framework adopted… we might see GET requests that look something like this:
M-GET / HTTP/1.1
Opt: "http://www.xmlsoap.org/ws/reliablemessage"; ns=14
Man: "http://www.xmlsoap.org/ws/security"; ns=15
Man: "http://www.xmlsoap.org/ws/secconv"; ns=16
Man: "http://www.xmlsoap.org/ws/trust"; ns=17
Opt: "http://www.xmlsoap.org/ws/timestamp"; ns=18
15-Signature: "hash=eaffab36ca…, referencedParts=…,"
Had we arrived here instead, would we now be referring to HTTP as the Slip-n-slide to Hades?
The answer, obviously, is “yes.” Roy’s thoughts about forced extensibility being “socially reprehensible” ring very true here; what is bad about the message above isn’t the syntax or the technology, it’s that Microsoft (in this case) is allowed to unilaterally introduce new extensions without engaging the rest of the community in good faith.
In this view, making the points of extensibility into scarce, community-managed resources — e.g., as media types do — is a good thing. It has positive political and social effects; it forces (or at least inclines) people to co-operate, whether they’re a multi-billion dollar behemoth, or a sole engineer who wants his fifteen minutes of fame.
Namespaces aren’t completely evil, of course. If you want to explicitly allow anybody to walk up and add data to your format, they’re a fine way to make sure there’s no ambiguity, and give nice leverage for versioning, and perhaps for separating different concerns. I think this will tend to make sense for formats where truly disconnected, uncoordinated data is collected, like RDF.
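To make the "no ambiguity" point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The two namespace URIs, the element names and the values are all invented for illustration; the only point is that two uncoordinated parties can each add a "title" element to the same document without colliding.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: both namespace URIs are made up for this sketch.
DOC = """<entry xmlns="http://example.org/ns/feed"
       xmlns:shop="http://example.com/ns/shop">
  <title>Namespaces and mU</title>
  <shop:title>Managing Director</shop:title>
</entry>"""

root = ET.fromstring(DOC)

# ElementTree addresses elements as {namespace-uri}localname, so the two
# "title" elements stay distinct even though their local names match.
feed_title = root.find("{http://example.org/ns/feed}title").text
shop_title = root.find("{http://example.com/ns/shop}title").text

print(feed_title)
print(shop_title)
```

Neither party needed a registry entry or the other's permission, which is exactly the property that suits uncoordinated data collection and, as argued next, is less obviously a virtue for protocols.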
However, they don’t automatically make sense for situations where you need tight co-ordination between different entities (e.g., things we tend to call “protocols”); allowing anybody to rock up and extend a protocol with no overhead is inviting interoperability problems and abuse.
I have a harder time coming up with any valid use cases for mustUnderstand. Requiring that unrecognised extensions be ignored (mustIgnore), combined with different identified languages (using namespaces if you must, or better yet, media types), should be enough.
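The difference between the two processing models can be sketched in a few lines of Python. Everything here is hypothetical: the media type, the field names and both function names are assumptions made up for the example, not part of Atom, SOAP or any other real format.

```python
# Hypothetical vocabulary for illustration only.
KNOWN_FIELDS = {"title", "updated", "author"}
SUPPORTED_TYPES = {"application/example+json"}


def process_must_ignore(media_type, fields):
    """Dispatch on the media type, then keep only the understood fields.

    Unrecognised extensions are dropped silently rather than treated
    as errors.
    """
    if media_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported media type: {media_type}")
    return {name: value for name, value in fields.items()
            if name in KNOWN_FIELDS}


def process_must_understand(media_type, fields, mandatory):
    """Reject the whole message if any mandatory extension is unknown."""
    unknown = sorted(set(mandatory) - KNOWN_FIELDS)
    if unknown:
        raise ValueError(f"cannot process mandatory extensions: {unknown}")
    return process_must_ignore(media_type, fields)


message = {"title": "Hello", "x-tracking": "abc"}

# Under mustIgnore the unknown field is skipped and the rest is used.
print(process_must_ignore("application/example+json", message))
```

With the same message, process_must_understand(..., mandatory=["x-tracking"]) raises an error instead, so a single unilateral extension is enough to make the exchange fail. In the mustIgnore version, a genuinely incompatible change would be signalled by a new media type rather than by a flag on an individual extension.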