XML Entries
Friday, 13 April 2012
When people create HTTP APIs, one of the common decisions is about what format to use, usually revolving around “JSON or XML?” The thinking often goes like this: JSON is simple, easy to use, and “cool”; clients using dynamic languages will love it. BUT, many people (especially those using static languages) are invested in XML. So, I’ll just support both! Unfortunately, it’s not that easy; just because HTTP allows you to negotiate for formats doesn’t mean it’s a good idea...
Monday, 21 January 2008
I’m following the discussion of RESTful Web description in general, and WADL in particular, with both difficulty and interest (see Dare, Patrick and Joe’s thoughts for a nice contrast). Difficulty because there’s so much of it, and it’s hard to give each piece the attention it deserves. Interest because it goes to the heart of the harder parts of REST — e.g., “hypertext as the engine of application state.” On top of that, I was one of the people...
Friday, 2 November 2007
I've updated the WADL documentation stylesheet, primarily to: fix a bug with finding and displaying XML Schema; make it compatible with xsltproc (and hopefully most other XSLT 1.0 processors that understand EXSLT node-set); and generate valid XHTML. The hard part was getting support for xsltproc; while switching over to the node-set function was easy, it uncovered other bugs around how libxml handles copying namespace nodes; an ugly and tiresome hack was required to work around it, but it seems to work....
Thursday, 5 April 2007
We’ve announced the program for this year’s Developers’ Track, and I’m very excited about the lineup. For example, Ryan Boyd from Google will be presenting about GData right before Pasha Sadri talks about Yahoo! Pipes. These are two cutting-edge uses of feeds, and with a little luck we might even be able to get them to field some joint questions in the middle. Also on the topic of feed syndication, Elias Torres is scheduled to talk about Apache Abdera...
Wednesday, 7 February 2007
Yahoo! (finally!) released Pipes as a beta today; congrats to the very talented team that put this together. Niall gives the geeks-eye view, and to be clear, this is not going to be the next great consumer Web site; your grandmother is not going to go out and build pipes. However, I do think it’s going to be a big wake-up call for the “Enterprise” software industry. This tool does more to deliver on their promises of non-programmers slicing...
Thursday, 30 November 2006
One of the perceived deficiencies of JSON is that it doesn’t have a schema language. I say “perceived” because the problems that a schema language brings often outweigh the benefits; after all, look at the mess that XML Schema is in. Even that said, schemas are useful for documentation and QA. So, I’m finding the work that Robert Cerny has done very interesting; it’s basically schemas for JSON, in JSON (or very nearly so). For example, here’s the schema...
Friday, 7 April 2006
It’s become axiomatic in some circles — especially in WS-* land, as well as in many other uses of XML — that the preferred (or only) means of offering extensibility is through URI-based namespaces, along with a flag to tell consumers when an extension needs to be understood (a.k.a. mustUnderstand). The reasoning is that extensibility should be as easy as possible. By leveraging one registry — DNS — you can use URIs to allow anyone to create your own...
Monday, 23 January 2006
I’ve been playing around with some ideas that use XMLHttpRequest recently, but I keep on bumping up against implementation inconsistencies on IE vs. Safari vs. Opera vs. Mozilla. Although the interface exposed is pretty much the same, what it does in the background is very different, especially with regard to HTTP. For example, some implementations will handle redirects and cache validation for you, while others will pass through the HTTP status codes, expecting you to pick up the pieces....
Tuesday, 18 October 2005
I’ve raved before about how useful the XSLT document() function is, once you get used to it. However, the stars have to be aligned just so to use it; the Web site can’t use cookies for anything important, and the content you’re interested in has to be available in well-formed XML. While that’s all fine and good on some higher-plane, utopian, RESTful, stateless, DTD- and Schema- described, Cool URIish Web, it’s not that useful on the Web that most of...
Monday, 5 September 2005
Feed History draft -04 is out, with the only major change being the replacement of fh:stateful with fh:incremental, with corresponding changes throughout the document, to make the concepts a bit clearer. This revision also makes cardinality, relative URIs and white space handling more explicit, and adds an acknowledgements section as promised. On the implementation front, here’s a quick-n-dirty Python script that demonstrates reconstruction of an incremental feed (RSS or Atom); while it’s more prototype code than something you’d want...
Monday, 15 August 2005
Draft -03 of Feed History: Enabling Stateful Syndication is now available. Significant changes include: added the fh:archive element, to indicate that an entry is an archive; allowed the subscription feed to omit fh:stateful if fh:prev is present; clarified that fh doesn’t add ordering semantics, just allows you to reconstruct state; and cleaned up text, fixed examples, general standards hygiene. See this site’s feed for an example. There’s going to be at least one more draft, as I neglected to acknowledge people who...
Saturday, 13 August 2005
When I worked in the financial industry, I quickly noticed that Excel spreadsheets contain the bulk of the data in the enterprise. It may make IT execs tear their hair out, but having the data nearby and ready for analysis is sloppy, but oh-so-effective. The challenge is to make the data reusable elsewhere. Unfortunately, spreadsheets are a mish-mash of structured but meaningless data; there’s no easy way to tell which columns contain data and which ones are headers. To...
Wednesday, 10 August 2005
For some time, I’ve noticed that people defining XML formats spend an inordinate amount of time talking about the structure of the format. This is especially apparent in standards working groups, where hours (no, days) can be spent agonizing over whether to make something an attribute or an element. Part of this is obviously stylistic; people have different thoughts on what makes good XML, and they fight the same battles over and over again. I’ve often thought...
Friday, 8 July 2005
You can describe just about anything with sufficient precision in plain English, given enough words. In practice, this doesn’t happen; specialised fields — whether science, finance or art — develop specialised jargon as a shorthand for concepts that are well-understood in that field. It gives greater precision, easier flow of ideas, and yes, it raises the bar to entry for newcomers. The trade-off is worth it, usually; although it would be genuinely useful if a layman could understand the...
Tuesday, 14 June 2005
Or, What’s Wrong with XInclude? QNames are evil (at least in content), so I never really liked the WSDL convention of using them to name and refer to constructs. It makes much more sense to refer to things on the Web as TimBL intended; using URIs. Using URIs — including fragment identifiers — to refer to portions of documents is an intuitive, scalable, and much less intrusive way to modularise XML formats. XML Base — being unevil — allows you...
Tuesday, 24 May 2005
The W3C has just started a mailing list for discussion of Web description formats: “This mailing list is dedicated to discussion of Web description languages based on URI/IRI and HTTP, and aligned with the Web and REST Architecture. Unlike WSDL (Web Services Description Language), such languages are not targeted towards description of Web Services.” What’s interesting is that many Web Services people, such as David Orchard, Marc Hadley and myself (although I always think of myself as an...
Saturday, 21 May 2005
If you accept that QNames in content are evil, the next logical question is whether XML Base is any better. In fact, if you turn your head a certain way, it appears that there’s very little difference between a default namespace and XML Base. Why? XML Base requires someone to know when element or attribute content is a URI, because it has to be applied to them before they can be used. This leaves you in an uncomfortable spot;...
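To make the XML Base point concrete, here is a minimal Python sketch. The document, its element names and the example.org URLs are all made up for illustration. The only thing it is meant to show is that the parser hands back relative references untouched; it is the application that has to know `href` holds a URI and apply the base to it, while the identical string in `note` stays plain text.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes xml:base under the reserved XML namespace.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Hypothetical document: "a.xml" appears twice, once as a URI and once as text.
DOC = (
    '<links xml:base="http://example.org/docs/">'
    '<link href="a.xml"/>'
    '<note>a.xml</note>'
    '</links>'
)

root = ET.fromstring(DOC)
base = root.get(XML_BASE)

# The application (not the parser) decides that href is a URI reference
# and must therefore be resolved against the base.
resolved = [urljoin(base, el.get("href")) for el in root.iter("link")]

# Nothing tells a generic processor whether this text is a URI; it is left alone.
note_text = root.find("note").text

print(resolved)
print(note_text)
```

A generic intermediary that moves the `link` element into another document without knowing about `href` would silently change what it points to, which is the same kind of problem QNames in content cause.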
Wednesday, 18 May 2005
Marc Hadley has released WADL in the wild, and I’m intrigued; based on a first look, I’d say it’s the most promising Web (as opposed to Web Services) description language yet. Why? First of all, it’s very resource-oriented; you can clearly and easily see the delineation between different described resources. It encourages good practices, and supports generative URIs both in query strings and in path segments, which I think is crucial. It’s simple and easy to read. Initial Feedback...
Tuesday, 17 May 2005
OxygenXML 6.0 is out, and it sucks even less. The biggest news is — finally! — a visual Schema editor. This may be the biggest threat yet to Gudge’s job security, as Human Schema Editor. :) I’ve only played with it a bit (as an Eclipse plug-in), but so far, so good; hopefully, the Syncrosoft guys weren’t tempted into XML Spy “compliance”; its visual Schema editor is the root of a lot of problems, and if Oxygen’s implementation is...
Friday, 29 April 2005
Today’s release of Tiger includes a new but little-discussed framework for developers, CoreData. What’s most interesting to me is its similarities — and differences — to SDO, IBM and BEA’s* effort to abstract away the specifics of how data is stored. Will we see an über-framework encompassing all of these? Will Apple get on board with IBM and BEA (unlikely, but hey, who knows)? How does it relate to the Semantic Web (which I believe most people should be...
Sunday, 24 April 2005
XML is arguably one of the bigger things to come onto industry’s radar for a while, and as a result programming languages (e.g., ECMAScript, Comega, Java) are changing to accommodate it. This isn’t just happening in libraries; the syntax of the languages is changing. This could be just because of the importance of XML, but I also think that it’s because XML is foreign to most programming models; it doesn’t fit well into data structures, objects and functions, and...
Friday, 1 April 2005
RDF has a simple, usable, universal model; everything’s nodes and arcs, so it avoids the problems of the Infoset, which IMO are brought by its complexity and special cases. Years of disquiet about attributes by portions of the XML cognoscenti support this view unintentionally, I think. So, WHY DOES RDF HAVE A SPECIAL CASE, THEREBY LOSING ITS SIMPLICITY? I’m talking about RDF datatypes, of course. As far as I can see, they’re a special case to the data model;...
Wednesday, 2 March 2005
So, you’ve got some data that you need to give to somebody else, and you want to use XML to do it; good for you, you’ve seen the light / hopped on the bandwagon / drunk the Kool-Aid. At first glance, this seems like a pretty straightforward task; after all, it’s just angle brackets, right? Not so fast. If you’re the only person who ever has to look at the XML or write software to work with it, you’re...
Tuesday, 22 February 2005
I love the XSLT document function. With it, you can access the whole Web from a stylesheet; this gives a lot of flexibility, in the right situation. For example, my local library’s online system is based upon iPac (now sold as the Horizon Information Portal, I think), a common packaged library management system. One of its nifty features is letting you keep a list of books (“My List”) that you’d like to eventually check out of the library. In...
Monday, 24 January 2005
I’m intrigued by the JSON effort. While many people (and vendors) have chosen XML for data interchange because it’s not platform- or vendor-specific, these folks have chosen the other path; by leveraging the serialisation of data structures in ECMAScript (nee JavaScript) — a nearly ubiquitous language, on every desktop that has a browser — they get an automatic installed base and at least one API for free. Then, by defining mappings to other languages (e.g., Java, Perl and C#;...
Friday, 17 December 2004
The Australian Bureau of Statistics has released an SVG-based "animated population pyramid" that very nicely visualises the change in that country's population over time.
Thursday, 5 August 2004
(Another instalment in “XML Heresies.”) One of the foundations of most vendors’ approach to Web services is called document-oriented messaging. This is the notion that interoperability is improved by describing a protocol in terms of the artefacts that are exchanged on the wire, rather than how the code that handles them is written. As far as it goes, that’s good advice. Implementation-specific specifications lead to brittleness, because you can’t swap out the implementation; the message is too tightly coupled to...
Monday, 26 July 2004
From the Daily Python URL comes another noteworthy API for XML; XMLFragment. I haven’t tried it yet (it doesn’t appear to be separately available, hint, hint), but I like the look of it. There are two interesting things going on here. First of all, XMLFragment basically gives up on modelling the complexity of XML in the language, instead punting to XPath. I think that’s a reasonable choice; it’s arguably more intuitive and simple than anything you could do with an...
Wednesday, 23 June 2004
John Schneider was in the office last week and gave me a demo of something he’s been working on for a while, E4X — by far one of the coolest technologies I’ve seen in some time. I think that every language is going to want one when they see this stuff. In a nutshell, E4X is a native XML binding for Javascript (sorry, ECMAScript); it makes XML a first-class datatype, rather than stuffing it into an object model. John explains...
Friday, 28 May 2004
I’ve been talking with a few people about my previous assertion that the Infoset is a bad abstraction for data modelling, and my subsequent post about the informational properties of the Infoset. The feedback has been positive, especially regarding the notion that the Infoset offers great tools for document markup, but presents more problems than solutions when directly used in non-markup applications; i.e., those that are data-oriented. The best examples of the kind of unneeded complexity I’m talking about are...
Wednesday, 12 May 2004
Recently, I’ve been thinking about the influences that using the Infoset has on the information you place in it. To put it another way: if you work with XML at the Infoset level, what tools are you given to express information with? As an informational channel, the structures that XML gives you can express pretty much anything, of course, but they lend themselves to some things better than others. As such, using the Infoset encourages data to be moulded to...
Tuesday, 11 May 2004
I’ve been playing around with the new OxygenXML 4.0 plug-in for Eclipse M8. Overall, it's very good; much better than the competition, although a lot of the slickness can be attributed to Eclipse. While it isn’t everything I want in an XML editor (don’t get lazy, guys), it’s pretty close, and cross-platform licensing is a bonus, so I’m about to take the plunge, pay the fee and switch from BBEdit, at least for XML tasks. Hints to the Oxygen guys:...
Friday, 7 May 2004
To help inform discussion of XOP (and to save Sam the trouble ;), I’ve put together a quick-and-dirty (we’re talking two hours) XOP parser in Python. It isn’t particularly efficient, nor is it well-tested or robust; it’s only to demonstrate how a XOP parser might behave. On the command line, it can regurgitate XOP Packages as XML 1.0 serialisations of the Infoset; mnot-laptop:~/Desktop> ./XopParser.py -t <?xml version="1.0" encoding="iso-8859-1"?> <soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xop="http://www.w3.org/2003/12/xop/include" xmlns:xop-mime="http://www.w3.org/2003/12/xop/mime"> <soap:Body> <m:data xmlns:m="http://example.org/stuff"> <m:photo xop-mime:content-type="image/png"> Ly8gYmluYXJ5IG9jdGV0cyBmb3IgcG5nCg= </m:photo> <m:sig...
Wednesday, 5 May 2004
Without pointing fingers, some people have a bee in their collective bonnet about the dangers of allowing binary content to be represented in XML, care of XOP. Others are up in arms about re-inventing HTTP in SOAP, courtesy of the Representation Header. Both of these are products of the XML Protocol WG, of which I’m a member, so I’d like to share my viewpoint (which is not that of either my employer or the working group, etc., ad nauseam). XOP...
Friday, 9 April 2004
This is a good idea for so many reasons. The media type registration will have to be changed to take advantage of it, but I believe that RFC3023 is under review anyway....
Sunday, 7 March 2004
An interesting issue poked its head up at the W3C Technical Plenary last week. XML Protocol (known as SOAP to mere mortals) is defined in terms of XML Infosets — it describes how to move Infosets around and process them, as the basis of Web services. Now, the working group could have chosen to describe SOAP in terms of XML 1.0 angle brackets, but the Infoset provides a nice abstraction; instead of saying “a QName followed by the equals character,...
Saturday, 7 February 2004
One of the uglier corners in the Web architecture is the relationship between fragment ids (the bit of the URI at the end, after the “#”) and content negotiation. In a nutshell, because dereferencing a single URI can return multiple formats, and because the fragID is interpreted by the client based on the format, it’s possible to have a fragID mean wildly different things across representations of a single resource. For example, consider this URI: http://www.example.org/news#fire If both XHTML and...
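A rough Python sketch of the mechanics behind this, under stated assumptions: the table of per-format meanings is hypothetical and simplified, and the second media type (`application/rdf+xml`) is an illustrative pick of mine, not necessarily the one the post goes on to discuss. The fragment never reaches the server; the client keeps it and interprets it according to whatever format comes back.

```python
from urllib.parse import urldefrag

# Hypothetical, simplified table of what a fragment ID means per media type.
FRAGMENT_SEMANTICS = {
    "application/xhtml+xml": "the element whose id attribute is {frag!r}",
    "application/rdf+xml": "the thing named by rdf:ID {frag!r}",
}


def split_request(uri):
    """Return (URI sent to the server, fragment kept on the client)."""
    url, frag = urldefrag(uri)
    return url, frag


def interpret(uri, media_type):
    """Describe what the fragment identifies for a given response format."""
    _, frag = split_request(uri)
    template = FRAGMENT_SEMANTICS.get(media_type, "undefined for this media type")
    return template.format(frag=frag)


uri = "http://www.example.org/news#fire"
print(split_request(uri))
for media_type in FRAGMENT_SEMANTICS:
    print(media_type, "->", interpret(uri, media_type))
```

Since the server picks the representation without ever seeing `#fire`, it has no way to make sure the fragment means the same thing in each one.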
Monday, 12 January 2004
There’s a lot of interest out there about exposing XQuery 1.0 / XPath 1.0 / XPath 2.0 in Web interfaces. On the face of it, this is quite a compelling idea; it allows you to reuse a generic query mechanism (goodness) to access arbitrary data based on the client’s needs (more goodness) and only the bits of data that you want go across the wire (yet more goodness). However, as many have noted, there’s a security problem; if you let...
Saturday, 6 December 2003
How's this analogy: Putting QNames into your XML content is like using TCP packets as delimiters in an application protocol. Both can be technically done, but they force an awareness of the special problems they bring up in software layers and intermediaries that could otherwise function in a generic fashion. Anybody got a better one?...
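For anyone who hasn't been bitten by this yet, here is a small Python illustration using the standard library's ElementTree. The document, the example.org namespace and the `type` attribute are invented for the purpose. The parser resolves the prefix in the element name, but the same prefix inside attribute content is just a string, so any software that moves or rewrites the element has to know about it specially.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the prefix "a" appears in an element name
# and again inside attribute content.
DOC = """<root xmlns:a="http://example.org/ns/a">
  <a:item type="a:widget"/>
</root>"""

root = ET.fromstring(DOC)
item = root[0]

# The parser expands the prefix in the element name to the namespace URI...
element_name = item.tag

# ...but the QName in content is opaque text; its prefix binding is not applied.
content_qname = item.get("type")

print(element_name)   # {http://example.org/ns/a}item
print(content_qname)  # a:widget
```

Once the element is serialised with a different prefix, or copied somewhere the prefix `a` is bound to something else, the value `a:widget` quietly changes meaning or stops meaning anything.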
Wednesday, 26 November 2003
I’m getting a few requests for clarification and additional information from 3rd party vendors regarding my previous rant on XML editing. With any luck, XML editing will get much more interesting soon…...
Thursday, 2 October 2003
I'm seriously sick of using programs that call themselves "XML editors" because they colourize markup. I'm talking about XML Spy, Oxygen, BBEdit, and thousands of lesser programs. All of them are just glorified text editors - they still operate on the level of characters, not information items. This is what I want to see: Element selection - the primary selection mechanism should be per-element, not per-character. I want soft boundaries on each element - maybe even go so far that...
Saturday, 14 June 2003
Sean McGrath, Macintouch and others point out OxygenXML, a pretty slick-looking XML editor. Either it's pretty new and only now coming onto the scene, or I've had my head deeper in the sand than is typical. To put it through its paces, I shoved the source for the WS-I Basic Profile through it. Pretty impressive; it does XML Schema validation (and apparently RNG too), as well as XSLT (doesn't seem to recognize the appropriate PI, tho). The place where it...