mnot’s blog

“Design depends largely on constraints.” — Charles Eames

XML Entries

Friday, 13 April 2012

JSON or XML: Just Decide

When people create HTTP APIs, one of the common decisions is about what format to use, usually revolving around “JSON or XML?” The thinking often goes like this: JSON is simple, easy to use, and “cool”; clients using dynamic languages will love it. BUT, many people (especially those using static languages) are invested in XML. So, I’ll just support both! Unfortunately, it’s not that easy; just because HTTP allows you to negotiate for formats doesn’t mean it’s a good idea...
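To make the “I’ll just support both” temptation concrete, here is a minimal Python sketch of serving one resource in either format. The `choose_format` and `render` helpers, the two media types, and the simplified `Accept` parsing (no wildcards) are assumptions for illustration, not anything taken from the entry.

```python
import json
from xml.etree import ElementTree as ET

# Media types this hypothetical API is willing to produce.
SUPPORTED = ("application/json", "application/xml")


def choose_format(accept_header):
    """Return the supported media type with the highest q-value, or None.

    Simplified on purpose: wildcards such as */* are ignored.
    """
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0], 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if media_type in SUPPORTED and q > best_q:
            best, best_q = media_type, q
    return best


def render(data, media_type):
    """Serialise a flat dict as JSON or as a one-level XML document."""
    if media_type == "application/json":
        return json.dumps(data, sort_keys=True)
    root = ET.Element("item")
    for key in sorted(data):
        child = ET.SubElement(root, key)
        child.text = str(data[key])  # numbers become plain text here
    return ET.tostring(root, encoding="unicode")
```

Even in this toy version, the number in the data keeps its type in JSON but turns into untyped text in XML, so the two representations are not quite the same thing; that is one small hint of why supporting both may cost more than a serialisation switch.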

this entry’s page (19 comments)

Monday, 21 January 2008

Watching WADL (and other rambling thoughts)

I’m following the discussion of RESTful Web description in general, and WADL in particular, with both difficulty and interest (see Dare, Patrick and Joe’s thoughts for a nice contrast). Difficulty because there’s so much of it, and it’s hard to give each piece the attention it deserves. Interest because it goes to the heart of the harder parts of REST — e.g., “hypertext as the engine of application state.” On top of that, I was one of the people...

this entry’s page (1 comment)

Friday, 2 November 2007

WADL Documentation XSLT Updated

I've updated the WADL documentation stylesheet, primarily to: fix a bug with finding and displaying XML Schema; make it compatible with xsltproc (and hopefully most other XSLT 1.0 processors that understand EXSLT node-set); and generate valid XHTML. The hard part was getting support for xsltproc; while switching over to the node-set function was easy, it uncovered other bugs around how libxml handles copying namespace nodes; an ugly and tiresome hack was required to work around it, but it seems to work....

this entry’s page

Thursday, 5 April 2007

WWW2007 Developers’ Track

We’ve announced the program for this year’s Developers’ Track, and I’m very excited about the lineup. For example, Ryan Boyd from Google will be presenting about GData right before Pasha Sadri talks about Yahoo! Pipes. These are two cutting-edge uses of feeds, and with a little luck we might even be able to get them to field some joint questions in the middle. Also on the topic of feed syndication, Elias Torres is scheduled to talk about Apache Abdera...

this entry’s page

Wednesday, 7 February 2007


Yahoo! (finally!) released Pipes as a beta today; congrats to the very talented team that put this together. Niall gives the geeks-eye view, and to be clear, this is not going to be the next great consumer Web site; your grandmother is not going to go out and build pipes. However, I do think it’s going to be a big wake-up call for the “Enterprise” software industry. This tool does more to deliver on their promises of non-programmers slicing...

this entry’s page (8 comments)

Thursday, 30 November 2006

Schema for JSON

One of the perceived deficiencies of JSON is that it doesn’t have a schema language. I say “perceived” because the problems that a schema language brings often outweigh the benefits; after all, look at the mess that XML Schema is in. Even that said, schemas are useful for documentation and QA. So, I’m finding the work that Robert Cerny has done very interesting; it’s basically schemas for JSON, in JSON (or very nearly so). For example, here’s the schema...

this entry’s page (12 comments)

Friday, 7 April 2006

Are Namespaces (and mU) Necessary?

It’s become axiomatic in some circles — especially in WS-* land, as well as in many other uses of XML — that the preferred (or only) means of offering extensibility is through URI-based namespaces, along with a flag to tell consumers when an extension needs to be understood (a.k.a. mustUnderstand). The reasoning is that extensibility should be as easy as possible. By leveraging one registry — DNS — you can use URIs to allow anyone to create your own...

this entry’s page (13 comments)

Monday, 23 January 2006

How Web-Ready is XMLHttpRequest?

I’ve been playing around with some ideas that use XMLHttpRequest recently, but I keep on bumping up against implementation inconsistencies on IE vs. Safari vs. Opera vs. Mozilla. Although the interface exposed is pretty much the same, what it does in the background is very different, especially with regard to HTTP. For example, some implementations will handle redirects and cache validation for you, while others will pass through the HTTP status codes, expecting you to pick up the pieces....

this entry’s page (29 comments)

Tuesday, 18 October 2005

XSLT for the Rest of the Web

I’ve raved before about how useful the XSLT document() function is, once you get used to it. However, the stars have to be aligned just so to use it; the Web site can’t use cookies for anything important, and the content you’re interested in has to be available in well-formed XML. While that’s all fine and good on some higher-plane, utopian, RESTful, stateless, DTD- and Schema- described, Cool URIish Web, it’s not that useful on the Web that most of...

this entry’s page (12 comments)

Monday, 5 September 2005

Feed History -04

Feed History draft -04 is out, with the only major change being the replacement of fh:stateful with fh:incremental, with corresponding changes throughout the document, to make the concepts a bit clearer. This revision also makes cardinality, relative URIs and white space handling more explicit, and adds an acknowledgements section as promised. On the implementation front, here’s a quick-n-dirty Python script that demonstrates reconstruction of an incremental feed (RSS or Atom); while it’s more prototype code than something you’d want...
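As a rough sketch of what reconstruction could look like (this is not the script the entry mentions), the idea is to start at the subscription document and keep following links back through older documents until none is left, collecting entries along the way. Documents are modelled here as plain dicts rather than parsed feeds; the `FEEDS` table, its URLs, the `prev` key standing in for a link to the preceding document, and the `reconstruct` function are all hypothetical.

```python
# Hypothetical stand-in for the network: URL -> already-parsed feed document.
FEEDS = {
    "http://example.org/feed": {
        "entries": ["e5", "e4"],
        "prev": "http://example.org/feed/2",
    },
    "http://example.org/feed/2": {
        "entries": ["e4", "e3"],
        "prev": "http://example.org/feed/1",
    },
    "http://example.org/feed/1": {
        "entries": ["e2", "e1"],
        "prev": None,
    },
}


def reconstruct(url, fetch=FEEDS.get):
    """Walk the chain of previous documents and return entry ids, newest first."""
    seen_urls, seen_ids, entries = set(), set(), []
    while url is not None and url not in seen_urls:  # guard against link loops
        seen_urls.add(url)
        doc = fetch(url)
        if doc is None:  # missing document: stop with what we have
            break
        for entry_id in doc["entries"]:
            if entry_id not in seen_ids:  # skip entries repeated across documents
                seen_ids.add(entry_id)
                entries.append(entry_id)
        url = doc["prev"]
    return entries
```

Calling `reconstruct("http://example.org/feed")` on the table above walks three documents and drops the duplicated `e4`. A real client would also need to fetch and parse the documents and decide how to match entries, which this sketch leaves out.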

this entry’s page (1 comment)

Monday, 15 August 2005

Feed History -03

Draft -03 of Feed History: Enabling Stateful Syndication is now available. Significant changes include: added the fh:archive element, to indicate that an entry is an archive; allowed the subscription feed to omit fh:stateful if fh:prev is present; clarified that fh doesn’t add ordering semantics, just allows you to reconstruct state; and cleaned up text, fixed examples, general standards hygiene. See this site’s feed for an example. There’s going to be at least one more draft, as I neglected to acknowledge people who...

this entry’s page

Saturday, 13 August 2005

Adding Semantics to Excel with Microformats and GRDDL

When I worked in the financial industry, I quickly noticed that Excel spreadsheets contain the bulk of the data in the enterprise. It may make IT execs tear their hair out, but having the data nearby and ready for analysis is sloppy, but oh-so-effective. The challenge is to make the data reusable elsewhere. Unfortunately, spreadsheets are a mish-mash of structured but meaningless data; there’s no easy way to tell which columns contain data and which ones are headers. To...

this entry’s page (1 comment)

Wednesday, 10 August 2005

Separating the Data Model from its Serialisation

For some time, I’ve noticed that people defining XML formats spend an inordinate amount of time talking about the structure of the format. This is especially apparent in standards working groups, where hours — no, days — can be spent agonizing over whether to make something an attribute or an element. Part of this is obviously stylistic; people have different thoughts on what makes good XML, and they fight the same battles over and over again. I’ve often thought...

this entry’s page (11 comments)

Friday, 8 July 2005

One Description to Bind them All? Nah.

You can describe just about anything with sufficient precision in plain English, given enough words. In practice, this doesn’t happen; specialised fields — whether science, finance or art — develop specialised jargon as a shorthand for concepts that are well-understood in that field. It gives greater precision, easier flow of ideas, and yes, it raises the bar to entry for newcomers. The trade-off is worth it, usually; although it would be genuinely useful if a layman could understand the...

this entry’s page

Tuesday, 14 June 2005

Getting Rid of QNames in Content

Or, What’s Wrong with XInclude? QNames are evil (at least in content), so I never really liked the WSDL convention of using them to name and refer to constructs. It makes much more sense to refer to things on the Web as TimBL intended; using URIs. Using URIs — including fragment identifiers — to refer to portions of documents is an intuitive, scalable, and much less intrusive way to modularise XML formats. XML Base — being unevil — allows you...

this entry’s page (5 comments)

Tuesday, 24 May 2005

Web Description at the W3C

The W3C has just started a mailing list for discussion of Web description formats; This mailing list is dedicated to discussion of Web description languages based on URI/IRI and HTTP, and aligned with the Web and REST Architecture. Unlike WSDL (Web Services Description Language), such languages are not targeted towards description of Web Services. What’s interesting is that many Web Services people — such as David Orchard, Marc Hadley and myself (although I always think of myself as an...

this entry’s page (2 comments)

Saturday, 21 May 2005

XML Base: Evil?

If you accept that QNames in content are evil, the next logical question is whether XML Base is any better. In fact, if you turn your head a certain way, it appears that there’s very little difference between a default namespace and XML Base. Why? XML Base requires someone to know when element or attribute content is a URI, because it has to be applied to them before they can be used. This leaves you in an uncomfortable spot;...
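The point about needing to know which content is a URI can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample document, the `resolved_hrefs` helper, and the choice of `href` as the URI-valued attribute are all made up, and only an `xml:base` on the root element is honoured.

```python
from urllib.parse import urljoin
from xml.etree import ElementTree as ET

# ElementTree's expanded name for the xml:base attribute.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical document: the same text appears as a URI and as ordinary content.
DOC = """<links xml:base="http://example.org/docs/">
  <link href="spec.html"/>
  <note>spec.html</note>
</links>"""


def resolved_hrefs(xml_text, uri_attrs=("href",)):
    """Resolve the attributes named in uri_attrs against the root's xml:base.

    The caller has to say which attributes hold URIs; nothing in the XML
    itself marks them. Nested xml:base attributes are ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    base = root.get(XML_BASE, "")
    return [
        urljoin(base, el.get(name))
        for el in root.iter()
        for name in uri_attrs
        if el.get(name) is not None
    ]
```

The `note` element carries the same string as the `href`, yet it is left alone, because generic software has no way to tell that it might be a URI unless someone tells it so.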

this entry’s page (5 comments)

Wednesday, 18 May 2005

WADLing towards Web Description

Marc Hadley has released WADL in the wild, and I’m intrigued; based on a first look, I’d say it’s the most promising Web (as opposed to Web Services) description language yet. Why? First of all, it’s very resource-oriented; you can clearly and easily see the delineation between different described resources. It encourages good practices, and supports generative URIs both in query strings and in path segments, which I think is crucial. It’s simple and easy to read. Initial Feedback...

this entry’s page (5 comments)

Tuesday, 17 May 2005

OxygenXML, Now with Visual Schema Editing

OxygenXML 6.0 is out, and it sucks even less. The biggest news is — finally! — a visual Schema editor. This may be the biggest threat yet to Gudge’s job security, as Human Schema Editor. :) I’ve only played with it a bit (as an Eclipse plug-in), but so far, so good; hopefully, the Syncrosoft guys weren’t tempted into XML Spy “compliance”; its visual Schema editor is the root of a lot of problems, and if Oxygen’s implementation is...

this entry’s page

Friday, 29 April 2005

Data Modeling and Abstraction

Today’s release of Tiger includes a new but little-discussed framework for developers, CoreData. What’s most interesting to me is its similarities — and differences — to SDO, IBM and BEA’s* effort to abstract away the specifics of how data is stored. Will we see an über-framework encompassing all of these? Will Apple get on board with IBM and BEA (unlikely, but hey, who knows)? How does it relate to the Semantic Web (which I believe most people should be...

this entry’s page

Sunday, 24 April 2005

Syntax for Distributed Computing

XML is arguably one of the bigger things to come onto industry’s radar for a while, and as a result programming languages (e.g., ECMAScript, Comega, Java) are changing to accommodate it. This isn’t just happening in libraries; the syntax of the languages is changing. This could be just because of the importance of XML, but I also think that it’s because XML is foreign to most programming models; it doesn’t fit well into data structures, objects and functions, and...

this entry’s page (3 comments)

Friday, 1 April 2005

Can Somebody Explain to Me...

RDF has a simple, usable, universal model; everything’s nodes and arcs, so it avoids the problems of the Infoset, which IMO are brought by its complexity and special cases. Years of disquiet about attributes by portions of the XML cognoscenti support this view unintentionally, I think. So, WHY DOES RDF HAVE A SPECIAL CASE, THEREBY LOSING ITS SIMPLICITY? I’m talking about RDF datatypes, of course. As far as I can see, they’re a special case to the data model;...

this entry’s page (4 comments)

Wednesday, 2 March 2005

Using XML in Data-Oriented Applications

So, you’ve got some data that you need to give to somebody else, and you want to use XML to do it; good for you, you’ve seen the light / hopped on the bandwagon / drunk the Kool-Aid. At first glance, this seems like a pretty straightforward task; after all, it’s just angle brackets, right? Not so fast. If you’re the only person who ever has to look at the XML or write software to work with it, you’re...

this entry’s page (5 comments)

Tuesday, 22 February 2005


I love the XSLT document function. With it, you can access the whole Web from a stylesheet; this gives a lot of flexibility, in the right situation. For example, my local library’s online system is based upon iPac (now sold as the Horizon Information Portal, I think), a common packaged library management system. One of its nifty features is letting you keep a list of books (“My List”) that you’d like to eventually check out of the library. In...

this entry’s page (2 comments)

Monday, 24 January 2005


I’m intrigued by the JSON effort. While many people (and vendors) have chosen XML for data interchange because it’s not platform- or vendor-specific, these folks have chosen the other path; by leveraging the serialisation of data structures in ECMAScript (nee JavaScript) — a nearly ubiquitous language, on every desktop that has a browser — they get an automatic installed base and at least one API for free. Then, by defining mappings to other languages (e.g., Java, Perl and C#;...

this entry’s page (16 comments)

Friday, 17 December 2004

Tufte would be Proud

The Australian Bureau of Statistics has released an SVG-based "animated population pyramid" that very nicely visualises the change in that country's population over time.

this entry’s page (2 comments)

Thursday, 5 August 2004

The ‘Document’ in Document-Oriented Messaging

(Another instalment in “XML Heresies.”) One of the foundations of most vendors’ approach to Web services is called document-oriented messaging. This is the notion that interoperability is improved by describing a protocol in terms of the artefacts that are exchanged on the wire, rather than how the code that handles them is written. As far as it goes, that’s good advice. Implementation-specific specifications lead to brittleness, because you can’t swap out the implementation; the message is too tightly coupled to...

this entry’s page (10 comments)

Monday, 26 July 2004

Dictionary as API?

From the Daily Python URL comes another noteworthy API for XML; XMLFragment. I haven’t tried it yet (it doesn’t appear to be separately available, hint, hint), but I like the look of it. There are two interesting things going on here. First of all, XMLFragment basically gives up on modelling the complexity of XML in the language, instead punting to XPath. I think that’s a reasonable choice; it’s arguably more intuitive and simple than anything you could do with an...

this entry’s page (6 comments)

Wednesday, 23 June 2004

XML Language Bindings Done Right

John Schneider was in the office last week and gave me a demo of something he’s been working on for a while, E4X — by far one of the coolest technologies I’ve seen in some time. I think that every language is going to want one when they see this stuff. In a nutshell, E4X is a native XML binding for Javascript (sorry, ECMAScript); it makes XML a first-class datatype, rather than stuffing it into an object model. John explains...

this entry’s page (9 comments)

Friday, 28 May 2004

XML Infoset, RDF and Data Modelling

I’ve been talking with a few people about my previous assertion that the Infoset is a bad abstraction for data modelling, and my subsequent post about the informational properties of the Infoset. The feedback has been positive, especially regarding the notion that the Infoset offers great tools for document markup, but presents more problems than solutions when directly used in non-markup applications; i.e., those that are data-oriented. The best examples of the kind of unneeded complexity I’m talking about are...

this entry’s page (5 comments)

Wednesday, 12 May 2004

Informational Properties of Infosets

Recently, I’ve been thinking about the influences that using the Infoset has on the information you place in it. To put it another way: if you work with XML at the Infoset level, what tools are you given to express information with? As an informational channel, the structures that XML gives you can express pretty much anything, of course, but they lend themselves to some things better than others. As such, using the Infoset encourages data to be moulded to...

this entry’s page (6 comments)

Tuesday, 11 May 2004

OxygenXML is Good Enough

I’ve been playing around with the new OxygenXML 4.0 plug-in for Eclipse M8. Overall, it's very good; much better than the competition, although a lot of the slickness can be attributed to Eclipse. While it isn’t everything I want in an XML editor (don’t get lazy, guys), it’s pretty close, and cross-platform licensing is a bonus, so I’m about to take the plunge, pay the fee and switch from BBEdit, at least for XML tasks. Hints to the Oxygen guys:...

this entry’s page (1 comment)

Friday, 7 May 2004

0.2

To help inform discussion of XOP (and to save Sam the trouble ;), I’ve put together a quick-and-dirty (we’re talking two hours) XOP parser in Python. It isn’t particularly efficient, nor is it well-tested or robust; it’s only to demonstrate how a XOP parser might behave. On the command line, it can regurgitate XOP Packages as XML 1.0 serialisations of the Infoset;

    mnot-laptop:~/Desktop> ./ -t
    <?xml version="1.0" encoding="iso-8859-1"?>
    <soap:Envelope xmlns:soap="" xmlns:xop="" xmlns:xop-mime="">
      <soap:Body>
        <m:data xmlns:m="">
          <m:photo xop-mime:content-type="image/png">
            Ly8gYmluYXJ5IG9jdGV0cyBmb3IgcG5nCg=
          </m:photo>
          <m:sig...

this entry’s page

Wednesday, 5 May 2004


Without pointing fingers, some people have a bee in their collective bonnet about the dangers of allowing binary content to be represented in XML, care of XOP. Others are up in arms about re-inventing HTTP in SOAP, courtesy of the Representation Header. Both of these are products of the XML Protocol WG, of which I’m a member, so I’d like to share my viewpoint (which is not that of either my employer or the working group, etc., ad nauseam). XOP...

this entry’s page (4 comments)

Friday, 9 April 2004

xml:id is Coming

This is a good idea for so many reasons. The media type registration will have to be changed to take advantage of it, but I believe that RFC3023 is under review anyway....

this entry’s page

Sunday, 7 March 2004

The Problem With Infosets

An interesting issue poked its head up at the W3C Technical Plenary last week. XML Protocol (known as SOAP to mere mortals) is defined in terms of XML Infosets — it describes how to move Infosets around and process them, as the basis of Web services. Now, the working group could have chosen to describe SOAP in terms of XML 1.0 angle brackets, but the Infoset provides a nice abstraction; instead of saying “a QName followed by the equals character,...

this entry’s page (3 comments)

Saturday, 7 February 2004

XPointer: Friend or Foe?

One of the uglier corners in the Web architecture is the relationship between fragment ids (the bit of the URI at the end, after the “#”) and content negotiation. In a nutshell, because dereferencing a single URI can return multiple formats, and because the fragID is interpreted by the client based on the format, it’s possible to have a fragID mean wildly different things across representations of a single resource. For example, consider this URI: If both XHTML and...
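A small Python sketch of the mechanics, under stated assumptions: the example URI, the two media types, and the per-format rules in `FRAGMENT_RULES` are hypothetical stand-ins, chosen only to show that the fragment never reaches the server and is interpreted afterwards according to whichever format came back.

```python
from urllib.parse import urldefrag

# Hypothetical per-format rules for what a fragment identifier refers to.
FRAGMENT_RULES = {
    "application/xhtml+xml": "the element whose id attribute is {frag!r}",
    "image/svg+xml": "the SVG element or view named {frag!r}",
}


def split_for_request(uri):
    """Return (what is requested from the server, what stays on the client)."""
    url, fragment = urldefrag(uri)
    return url, fragment


def interpret_fragment(fragment, media_type):
    """Describe what the fragment means for the representation received."""
    rule = FRAGMENT_RULES.get(media_type)
    if rule is None:
        return "no defined meaning for this format"
    return rule.format(frag=fragment)
```

With a hypothetical URI such as `http://example.org/doc#intro`, the request is made for `http://example.org/doc` either way; the same `intro` is then read one way for an XHTML response and another way for an SVG one, and has no defined meaning for formats the client has no rule for.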

this entry’s page (2 comments)

Monday, 12 January 2004

XQuery on the Web

There’s a lot of interest out there about exposing XQuery 1.0 / XPath 1.0 / XPath 2.0 in Web interfaces. On the face of it, this is quite a compelling idea; it allows you to reuse a generic query mechanism (goodness) to access arbitrary data based on the client’s needs (more goodness) and only the bits of data that you want go across the wire (yet more goodness). However, as many have noted, there’s a security problem; if you let...

this entry’s page

Saturday, 6 December 2003

QNames are Evil

How's this analogy: Putting QNames into your XML content is like using TCP packets as delimiters in an application protocol. Both can be technically done, but they force an awareness of the special problems they bring up in software layers and intermediaries that could otherwise function in a generic fashion. Anybody got a better one?...
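One way to see the problem in code: a QName in content can only be interpreted by software that has kept track of the namespace declarations in scope. The Python sketch below is an illustration, not anything from the entry; the two sample documents, the `resolve_qname_content` helper, and its simplification of treating every declaration seen so far as in scope are assumptions.

```python
import io
from xml.etree import ElementTree as ET

# Two hypothetical documents with identical element content ("a:thing")
# but different bindings for the prefix "a".
DOC_ONE = b'<ref xmlns:a="http://example.org/one">a:thing</ref>'
DOC_TWO = b'<ref xmlns:a="http://example.org/two">a:thing</ref>'


def resolve_qname_content(xml_bytes, element_name):
    """Expand a prefix:local value found in element text to {uri}local.

    Simplification: prefix bindings are never popped, so nested
    redeclarations of the same prefix are not handled correctly.
    """
    prefixes = {}
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "end"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload  # namespace declaration seen by the parser
            prefixes[prefix] = uri
        elif payload.tag == element_name:
            prefix, _, local = payload.text.strip().partition(":")
            return "{%s}%s" % (prefixes[prefix], local)
    return None
```

The text `a:thing` is byte-for-byte the same in both documents, yet it expands to two different names, so any layer that copies or rewrites the content without carrying the declarations along can silently change its meaning.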

this entry’s page (3 comments)

Wednesday, 26 November 2003

Hoping for Better XML Editors

I’m getting a few requests for clarification and additional information from 3rd party vendors regarding my previous rant on XML editing. With any luck, XML editing will get much more interesting soon…...

this entry’s page (2 comments)

Thursday, 2 October 2003

Why do XML editors suck so much?

I'm seriously sick of using programs that call themselves "XML editors" because they colourize markup. I'm talking about XML Spy, Oxygen, BBEdit, and thousands of lesser programs. All of them are just glorified text editors - they still operate on the level of characters, not information items. This is what I want to see: Element selection - the primary selection mechanism should be per-element, not per-character. I want soft boundaries on each element - maybe even go so far that...

this entry’s page (11 comments)

Saturday, 14 June 2003


Sean McGrath, Macintouch and others point out OxygenXML, a pretty slick-looking XML editor. Either it's pretty new and only now coming onto the scene, or I've had my head deeper in the sand than is typical. To put it through its paces, I shoved the source for the WS-I Basic Profile through it. Pretty impressive; it does XML Schema validation (and apparently RNG too), as well as XSLT (doesn't seem to recognize the appropriate PI, tho). The place where it...

this entry’s page

Creative Commons