mark nottingham

Use Cases for Web Description Formats

Monday, 14 June 2004

HTTP APIs

One thing about Web description formats that hasn’t seen much discussion yet is how people intend to use them.

The WSDL Working Group has a Usage Scenarios document and a Requirements document, but unfortunately they only talk about the kinds of Web services they want to describe, not how the descriptions themselves are to be used.

So, here are the use cases for Web description formats that I’m aware of, along with examples; some of my conclusions are at the end (skip ahead if you like). Please discuss and/or add to them below.

Server Configuration

Apache’s configuration files — e.g., httpd.conf and .htaccess — are easily the most widely-deployed Web description formats. For example:

DocumentRoot /www/example/htdocs
ServerName example.org
TransferLog /www/example/logs/access_log
<Directory /www/example/htdocs/>
    DAV On
    umask 002
</Directory>
<Location /private>
   AuthType Digest
   AuthName "Share"
   AuthDigestFile /www/example/web_users.digest
   AuthDigestDomain /
   Require valid-user
</Location>
<LocationMatch "/stuff/.*\.gif$">
  Header append Cache-Control max-age=3600
</LocationMatch>

Server configuration formats are easy to distinguish because there is an implied trust relationship between the format and its consumer; in other words, if you have the authority to configure the server, it’s assumed you know what you’re doing. They’re imperative; they say “map this directory to this URL,” “require authentication here,” and “append this header here.”
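To make the imperative flavour concrete, here is a minimal Python sketch of a rule table that is simply obeyed for each request path. It is not how Apache works internally; the `Outcome` class, the `RULES` table and the two patterns are hypothetical, loosely modelled on the "require authentication here" and "append this header here" directives in the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Outcome:
    """What the server has been told to do for one request path."""
    requires_auth: bool = False
    headers: dict = field(default_factory=dict)


def require_auth(outcome):
    # "Require authentication here."
    outcome.requires_auth = True


def append_cache_control(outcome):
    # "Append this header here": add to an existing value rather than replace it.
    existing = outcome.headers.get("Cache-Control")
    value = "max-age=3600"
    outcome.headers["Cache-Control"] = f"{existing}, {value}" if existing else value


# Hypothetical rule table: (pattern, action) pairs, applied in order.
RULES = [
    (re.compile(r"^/private(/|$)"), require_auth),
    (re.compile(r"^/stuff/.*\.gif$"), append_cache_control),
]


def apply_rules(path):
    """Run every matching rule against a fresh outcome; no questions asked."""
    outcome = Outcome()
    for pattern, action in RULES:
        if pattern.search(path):
            action(outcome)
    return outcome


print(apply_rules("/stuff/logo.gif").headers)
```

Nothing in the table is negotiated or checked against the consumer's wishes; whoever writes the rules is trusted, which is the point being made above.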

This one-sidedness leads to a common claim that server configuration is implementation-specific, and therefore isn’t in need of a standard, shared description format. I don’t think this is a good argument, though. There are several scenarios that require de-coupling the server’s configuration from its control mechanism; consider the Atom API and WebDAV, as well as people who want their Web site’s (or services’) configuration to be portable.

Look at it another way; try telling someone how to configure a feature — say, HTTP compression or authentication — on their Web server, and see how quickly you get frustrated. This survey of HTTP server capabilities (very old, but still relevant, unfortunately) is testament to this.

I’ve talked about problems with current approaches to server configuration in the past; part of the solution will need to be a Web description format.

Client Configuration

On the other side, Web clients sometimes need to get configuration information about resources in a manner that’s external to requests to the resources themselves. Sometimes, this is for efficiency, such as with WebDAV multistatus responses, and sometimes it’s because some information is needed before the decision to make a request can be made, such as with robots.txt and P3P.

For example, P3P’s Policy Reference File format allows clients to find out about the different privacy policies associated with different parts of a Web site:

<META xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY-REFERENCES>
    <EXPIRY max-age="172800"/>
    <POLICY-REF about="/P3P/Policies.xml#first">
      <INCLUDE>/*</INCLUDE>
      <EXCLUDE>/catalog/*</EXCLUDE>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#second">
      <INCLUDE>/catalog/*</INCLUDE>
      <METHOD>GET</METHOD>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>

In some ways, client configuration is more straightforward than server configuration; because a client can only conceptualise resources as a structured space of URIs, the description must be in terms of URIs and other aspects of the request, such as HTTP methods.

On the other hand, client configuration is more subtle, because it lacks the implied trust between the format and who produces it. Here, most assertions are about the characteristics of various resources on the site, whether regarding the privacy policies they will follow, what robots should do with them, and so forth; it’s akin to negotiation.
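As a rough illustration of a client consuming such a description, the Python sketch below looks up which policy the reference file above associates with a given path and method. It is a simplification rather than a conforming P3P implementation, and it rests on three assumptions of mine: `*` matches any run of characters, a `POLICY-REF` without `METHOD` applies to every method, and the first applicable `POLICY-REF` in document order wins. The function names are hypothetical, and elements are matched by local name only.

```python
import re
import xml.etree.ElementTree as ET

POLICY_REFERENCE_FILE = """\
<META xmlns="http://www.w3.org/2002/01/P3Pv1">
 <POLICY-REFERENCES>
    <EXPIRY max-age="172800"/>
    <POLICY-REF about="/P3P/Policies.xml#first">
      <INCLUDE>/*</INCLUDE>
      <EXCLUDE>/catalog/*</EXCLUDE>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#second">
      <INCLUDE>/catalog/*</INCLUDE>
      <METHOD>GET</METHOD>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>
"""


def local_name(element):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return element.tag.rsplit("}", 1)[-1]


def children(element, name):
    return [child for child in element if local_name(child) == name]


def wildcard_to_regex(pattern):
    """Translate a pattern in which '*' matches any run of characters."""
    parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + ".*".join(parts) + "$")


def matches_any(elements, path):
    return any(wildcard_to_regex(e.text.strip()).match(path) for e in elements)


def find_policy(xml_text, path, method="GET"):
    """Return the 'about' URI of the first applicable POLICY-REF, or None."""
    root = ET.fromstring(xml_text)
    for ref in root.iter():
        if local_name(ref) != "POLICY-REF":
            continue
        methods = [m.text.strip().upper() for m in children(ref, "METHOD")]
        if not matches_any(children(ref, "INCLUDE"), path):
            continue
        if matches_any(children(ref, "EXCLUDE"), path):
            continue
        if methods and method.upper() not in methods:
            continue
        return ref.get("about")
    return None


print(find_policy(POLICY_REFERENCE_FILE, "/catalog/item1", "GET"))
```

Note that the client only learns what the site asserts about itself; whether to believe it, and what to do when no policy applies, is left to the client.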

Intermediary Configuration

When I worked at Akamai, one of my responsibilities was designing the interfaces that we offered for our customers to control the 10,000 or so intermediaries we put at their disposal. Initially, they had specified control metadata in-URL; this is why so many Akamai URLs look something like this:

http://a772.g.akamai.net/7/772/51/766f27fbad0b26/www.apple.com/t/2003/us/en/i/3.gif

Each one of those path segments is a piece of control information for Akamai’s servers.
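For the curious, a small Python sketch shows how such a URL can be pulled apart. The heuristic is an assumption made purely for illustration (control segments are taken to be everything before the first path segment that looks like a hostname), the function name is hypothetical, and the meaning of the individual control segments is deliberately left uninterpreted.

```python
from urllib.parse import urlsplit


def split_control_segments(url):
    """Split an in-URL-metadata path into (control segments, origin host, origin path).

    Assumption: the origin hostname is the first path segment containing a dot.
    """
    segments = [s for s in urlsplit(url).path.split("/") if s]
    for index, segment in enumerate(segments):
        if "." in segment:
            origin_path = "/" + "/".join(segments[index + 1:])
            return segments[:index], segment, origin_path
    # No hostname-like segment found; treat everything as control data.
    return segments, None, None


control, host, path = split_control_segments(
    "http://a772.g.akamai.net/7/772/51/766f27fbad0b26/www.apple.com/t/2003/us/en/i/3.gif"
)
print(control, host, path)
```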

That worked for a while, but we quickly came across requirements for control mechanisms that were easier to manage and less prone to tampering. After a fair amount of work, we came up with what would become URISpace, along with a vocabulary of controls for Akamai-specific behaviour.

This allowed customers to describe their desired behaviours in an XML file that they could either upload or edit with a Web UI. As a result, it was much easier for them to control the behaviour of an entire site, while still maintaining granularity; instead of setting HTTP response headers or rewriting URIs in HTML, they could simply edit a file and not even touch their production systems.

In many ways, intermediary configuration is similar to that for servers and clients, except that it acts as both. Add to that the fact that intermediaries can represent many parties’ interests; sometimes they’re operating on behalf of the server, sometimes the client, and sometimes the network in between. This can have a large influence on how they’ll behave and what they’ll be willing to do.

Resource Modelling and Implementation

People implementing Web-based servers, clients and intermediaries often want to be able to describe the interface provided.

There are many benefits of doing so; it becomes possible for tools to aid in the modelling and visualisation of Web resources and their state, and frameworks can generate code (both for the server and client sides) to lighten the load on developers.

In comparison to the configuration cases, resource modelling is more about the application-specific semantics, rather than about configuring the standard substrates that they use. In geek-talk, design-time tasks instead of runtime ones.

A nice real-world demonstration of a need here is in Jon Udell’s LibraryLookup system, where he’s gone and hand-crafted an interface into a number of library Web systems. If those libraries had described the interfaces they’ve exposed with a Web description format, they’d have saved Jon a lot of effort.
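To sketch what that might buy a client author, here is a deliberately tiny Python example in which a generic function builds requests from a machine-readable description instead of hand-written, per-site code. Everything in it is hypothetical: the description layout, the `library.example.org` URL, the parameter names and the `build_request` helper are inventions for illustration, not part of LibraryLookup, WSDL or any real catalogue system.

```python
from urllib.parse import urlencode

# Hypothetical description a library might publish for its catalogue search.
CATALOGUE_DESCRIPTION = {
    "base": "http://library.example.org/search",
    "method": "GET",
    "parameters": {
        "isbn": {"required": True},
        "format": {"required": False},
    },
}


def build_request(description, **arguments):
    """Validate arguments against the description and return (method, URL)."""
    parameters = description["parameters"]
    unknown = sorted(set(arguments) - set(parameters))
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    missing = sorted(
        name for name, spec in parameters.items()
        if spec["required"] and name not in arguments
    )
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    # Sort for a stable query string regardless of argument order.
    query = urlencode(sorted(arguments.items()))
    return description["method"], description["base"] + "?" + query


print(build_request(CATALOGUE_DESCRIPTION, isbn="0596000480"))
```

Supporting another library would then mean loading another description rather than writing another adapter.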

WSDL is one obvious solution in this space. I’ll spare you the XML, as it’s quite long and not human-readable, but I am encouraged by progress in the W3C Web Services Description Working Group on WSDL 2.0.

Describing Resource Relationships

The next logical step beyond modelling and configuring the messages that resources accept and produce is describing the relationships between multiple resources, so that Web agents can work with whole Web sites, rather than just one resource at a time.

For example, RDDL gives you a way to talk about the resources associated with a namespace URI:

<rddl:resource xlink:type="simple"
        xlink:title="RDDL Natures"
        xlink:role="http://www.rddl.org/"
        xlink:arcrole="http://www.rddl.org/purposes#directory"
        xlink:href="http://www.rddl.org/natures"
>
<div class="resource">
        <p>It is anticipated that many related-resource
        natures will be well known. A list of well-known
        natures may be found in the RDDL directory <a
        href="http://www.rddl.org/natures">
        http://www.rddl.org/natures</a>.</p>
</div>
</rddl:resource>

In the Web services space, this area seems to be staked out by “Choreography” specifications like BPEL and WS-Choreography. I’m not familiar enough with these specifications, so I can’t comment on how applicable they are to the general Web.

In a more general sense, describing relationships between resources is the domain of the Semantic Web as well, of course.

In many ways, describing the relationships between resources is an extension of the design-time tasks in the previous use case.

Can (and Should) One Format Do it All?

I won’t go as far as to assert that one format can address all of these use cases, but I strongly suspect that there are more formats defined than we need. The use cases here seem to have a lot of common requirements, with only two significant axes of variability:

  1. Who the description is used by (i.e., client, intermediary, server)
  2. When the description is used (e.g., design time, deployment time, runtime)

Interestingly, when you look at how the different formats in wide use (especially WSDL and httpd.conf) cover this continuum, you see that they specialise quite a bit; while httpd.conf concentrates on server-side configuration (only natural), WSDL tends to focus more on client-side stuff, which IMO is short-sighted.

I’m also struck by how the design-time work is so well-modelled by things like WSDL, even while its mechanism dedicated to runtime (the binding) is notoriously difficult to work with, and by how the runtime machinery of pure configuration formats like httpd.conf is so well-tuned to its task, while not addressing modelling at all. Could it be that we’d be able to get the best of both worlds by combining them?

If we could, it would allow you to visualise, design and control your Web resources from one place in a co-ordinated fashion, and save a lot of frustrating incompatibility in the meantime. Done right, it would also enable new applications – like more intelligent offline browsing in clients – and promote good resource modelling. These would be big wins.

Even if we don’t get that far, I’ll argue strongly that we should co-ordinate the vocabularies that we use with them; for example, wouldn’t it be nice to use WebDAV properties in WSDL where appropriate? The same goes for P3P, robots.txt, RDDL and the rest. Even when the horses have already bolted and a format has wide adoption, we can try to rationalise the vocabularies with a mapping, or v2.