mnot’s blog

“Design depends largely on constraints.” — Charles Eames

Saturday, 28 July 2007

URI Templates Redux

URI Templates -01 is now an Internet-Draft.

After sitting on the spec for a while and trying to figure out an elegant solution to the encoding problem, we decided to take the simple route and see how it sticks. The only encoding that the template spec itself does is for characters that are outside the range of those allowed in URIs. Everything else is application-specific.

So, for example, this variable:

foo = "/thing?@ <--stüff-->"

with this template:

http://www.example.com/{foo}

will get expanded to this:

http://www.example.com//thing?@%20%3c--st%c3%bcff--%3e

Notice how the space, angle brackets and non-ASCII characters were encoded, while the forward slash, at sign and question mark were not.
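The rule can be sketched in a few lines of Python. This is a minimal illustration rather than the draft's own algorithm: `expand` and `URI_RESERVED` are names assumed here, a missing variable is assumed to expand to an empty string, and Python's `urllib.parse.quote` emits uppercase hex digits, which RFC 3986 treats as equivalent to the lowercase ones shown above.

```python
import re
from urllib.parse import quote

# Reserved characters from RFC 3986 (gen-delims and sub-delims). They are
# legal in a URI, so this sketch leaves them alone. Unreserved characters
# (letters, digits, "-", ".", "_", "~") are never encoded by quote().
URI_RESERVED = ":/?#[]@!$&'()*+,;="


def expand(template, variables):
    """Replace each {name} with its value, percent-encoding only the
    characters that are not allowed in a URI (hypothetical helper)."""
    def substitute(match):
        # Assumption: an undefined variable expands to the empty string.
        value = variables.get(match.group(1), "")
        return quote(value, safe=URI_RESERVED)
    return re.sub(r"\{([^{}]+)\}", substitute, template)


result = expand("http://www.example.com/{foo}", {"foo": "/thing?@ <--stüff-->"})
print(result)
```

Because `%` is not in the safe set, this sketch also encodes any percent sign it finds in a value.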

If you want additional encoding, this needs to happen before the variable is fed to the template processor. This allows template processing to be simple and deterministic; you always know what you're going to get, and pretty much any legal URI can be templated. However, it does put more responsibility on the application to do the encoding to ensure that the output is a legal URI.

The hope here is that toolkits will develop sensible patterns for this that will enable template definitions to easily trigger the right pre-encoding. The form that this will take is still up in the air, but I think that's OK for now.

There are a few other changes; Joe has a diff on his site; discussion welcome on the URI list.


Filed under: HTTP, Protocol Design, Web, Web Services

5 Comments

Ian Bicking said:

How do you avoid double-quoting? If you want to quote /'s, you can't actually turn them into %2f, because they'll be quoted again as %252f. I don't see any way to force / to become %2f as a result, given a generic URL templating library and any kind of pre-quoting before you pass into that library.
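The double-quoting problem can be reproduced with a small Python sketch. It assumes the template processor percent-encodes `%` along with everything else it escapes; `expand` is a hypothetical stand-in for such a processor, not code from the draft.

```python
import re
from urllib.parse import quote

RESERVED = ":/?#[]@!$&'()*+,;="  # left unencoded by the processor


def expand(template, variables):
    # Hypothetical processor: encodes everything outside the reserved and
    # unreserved sets, and that includes "%".
    return re.sub(
        r"\{([^{}]+)\}",
        lambda m: quote(variables[m.group(1)], safe=RESERVED),
        template,
    )


segment = "a/b"                       # one path segment containing a slash
pre_quoted = quote(segment, safe="")  # the application pre-encodes it: a%2Fb

# Passed as-is, the slash is kept and splits the segment in two.
unquoted_result = expand("http://example.com/{seg}/file", {"seg": segment})

# Pre-encoded, the "%" is encoded a second time.
double_quoted = expand("http://example.com/{seg}/file", {"seg": pre_quoted})

print(unquoted_result)
print(double_quoted)
```

Neither call yields `http://example.com/a%2Fb/file`, which is the output the application wanted.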

Sunday, July 29 2007 at 2:00 PM +10:00

Mark Nottingham said:

Hey Ian,

Good catch. That's what happens when you try to beat the I-D submission deadline :)

I've been thinking of a few ways to do this; it was discussed *way* back and forgotten, IIRC. My current favourite is to exempt the '%' character from auto-escaping, telling people to escape it themselves if it occurs in data. Thoughts?
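A rough Python sketch of that idea, with `expand` again a hypothetical processor: the only change is that `%` joins the set of characters the processor leaves alone, so a literal percent sign in the data has to be escaped by the caller.

```python
import re
from urllib.parse import quote

# Reserved characters plus "%", which this variant passes through untouched.
SAFE = ":/?#[]@!$&'()*+,;=" + "%"


def expand(template, variables):
    # Hypothetical processor that exempts "%" from auto-escaping.
    return re.sub(
        r"\{([^{}]+)\}",
        lambda m: quote(variables[m.group(1)], safe=SAFE),
        template,
    )


# A pre-encoded slash now survives expansion.
slash = expand("http://example.com/{seg}/file", {"seg": quote("a/b", safe="")})

# A literal "%" in the data is passed through and produces a broken escape.
raw = expand("http://example.com/?q={q}", {"q": "100% sure"})

# The caller has to escape "%" first to get a legal result.
escaped = expand("http://example.com/?q={q}", {"q": "100% sure".replace("%", "%25")})

print(slash)
print(raw)
print(escaped)
```

The trade-off is visible in the second call: the processor can no longer tell an intended escape from a stray percent sign.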

Tuesday, July 31 2007 at 7:19 AM +10:00

Ian Bicking said:

Yikes, not quoting % sounds really... eclectic.

Also, I get the impression you want to allow /{path_stuff}/, where path_stuff can be a variable number of segments. The analog would be ?{args}, where args can be multiple query arguments (e.g., args could be q=test&num=10). I don't know if this is an argument that you should allow both kinds of variability or neither. But they seem very similar to me.

I'm inclined to have explicit quoting rules in the template itself. The alternative feels too much like the evil of shell quoting. If you allow both kinds of non-quoting, I think it's a little like the distinction in a Python function call of func(value), func(arg=value), func(*values), and func(**kwargs). That is, the clear syntactic distinction is important and valuable.

How to actually express this, I'm not sure. You could say "put in this variable, but don't quote /, or don't quote & and =". E.g., {path_args:/}. But then how do you put into place a segment with %2f? Maybe this isn't a good line of thought.

Maybe it is better to rely on the object model of the languages to make the distinction clear. That is, pass in a sequence or mapping object for the two cases. The libraries must be intelligent and apply url quoting to the segments/variables before turning the sequence or mapping into a single value to be substituted. You can't really spec this out, because it's up to the individual bindings to figure out what the appropriate kinds of objects are for these cases in those environments.
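In Python, for instance, such a binding might look roughly like the sketch below. Everything here is an assumption about how one library could behave: `expand_value` is a hypothetical name, sequences are treated as path segments and mappings as query arguments, and each piece is quoted before joining.

```python
from collections.abc import Mapping, Sequence
from urllib.parse import quote


def expand_value(value):
    """Quote a template value according to its type (hypothetical binding)."""
    if isinstance(value, str):
        # A plain string is one value: quote everything, including "/".
        return quote(value, safe="")
    if isinstance(value, Mapping):
        # Query arguments: quote names and values, then join with "=" and "&".
        return "&".join(
            f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in value.items()
        )
    if isinstance(value, Sequence):
        # Path segments: quote each one, so a "/" inside a segment becomes %2F
        # while the separators between segments stay literal.
        return "/".join(quote(seg, safe="") for seg in value)
    raise TypeError(f"unsupported template value: {value!r}")


path = expand_value(["docs", "a/b", "index"])
args = expand_value({"q": "test", "num": "10"})
print(path)
print(args)
```

The string check comes first because a Python `str` is itself a sequence.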

Tuesday, July 31 2007 at 8:42 AM +10:00

Mike Schinkel said:

Good call.

Now what about optional variables?

http://lists.w3.org/Archives/Public/uri/2007Jul/0024.html

Tuesday, July 31 2007 at 9:43 PM +10:00

Sam McCall said:

Optional variables seem to be neatly handled by Ian's suggestion, including the distinction between a parameter not given, and one with an explicitly empty value.

Given the template
http://host/path?{params}
{'foo' => 'bar'} produces http://host/path?foo=bar
{'foo' => 'bar', 'opt' => nil } produces http://host/path?foo=bar&opt=
The implementation is elegant - you just escape associative arrays, and lists, in different ways from strings.

But cases like http://host/path?{params}&foo={foo}, http://host/path/{morepath}/file
produce ugliness when the parameters are empty:
http://host/path?&foo=1
http://host/path//file
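These cases can be checked with a short Python sketch. The `render` and `expand` helpers are hypothetical, and the escaping rules are one guess at what type-based expansion could look like, with Python's `None` standing in for `nil`.

```python
import re
from urllib.parse import quote


def render(value):
    # Hypothetical type-based escaping: mappings become query arguments,
    # lists become path segments, None becomes an empty value.
    if value is None:
        return ""
    if isinstance(value, dict):
        return "&".join(f"{quote(k, safe='')}={render(v)}" for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return "/".join(quote(s, safe="") for s in value)
    return quote(str(value), safe="")


def expand(template, variables):
    return re.sub(
        r"\{([^{}]+)\}",
        lambda m: render(variables.get(m.group(1))),
        template,
    )


given = expand("http://host/path?{params}", {"params": {"foo": "bar"}})
empty_opt = expand("http://host/path?{params}", {"params": {"foo": "bar", "opt": None}})

# Empty collections leave their surrounding delimiters behind.
stray_amp = expand("http://host/path?{params}&foo={foo}", {"params": {}, "foo": 1})
stray_slash = expand("http://host/path/{morepath}/file", {"morepath": []})

for url in (given, empty_opt, stray_amp, stray_slash):
    print(url)
```

The last two results show the leftover `&` and `/`: the delimiters live in the template, so the expansion of an empty value cannot remove them.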

Thursday, August 2 2007 at 4:40 PM +10:00

Creative Commons