mnot’s Web log

“Design depends largely on constraints.” - Charles Eames

Friday, 4 July 2008

The WS-Empire Strikes Back... feebly

Here’s a gem on a little-used mailing list:

As most of you know, over the last several years fairly good progress has been made on standardizing Web services. Many Web services specifications have, in fact, been standardized in W3C (i.e. SOAP 1.2, WSDL 2.0, WS-Addressing, WS-Policy, etc). There is still some work to be done.

Accessing data about a resource through Web services is an area of the Web services architecture that has yet to be fully realized. Some good work has already been done to date, however, some pieces of the overall puzzle are still waiting to be completely standardized.

[…]

We believe that four specifications, in particular, work together to provide mechanisms for accessing and manipulating the XML representation of a resource as well as any metadata associated with that resource. The four specifications are:

To this end, we recommend that the W3C create a new Working Group (with the suggested name of “Web Services Resource Access Working Group”) to standardize the four specifications mentioned above.

Right… So they need a protocol to access resources on the Web (this is Web services, after all…). Quite a puzzle indeed; what to do? This certainly isn’t possible on an Enterprise scale today.

My first concern was that Big Vendors and the W3C are still trying to replace HTTP with SOAP, but then I realised that there’s a far greater risk (because it’s more probable that it’ll actually happen); if they charter this group, they’re risking waking Mark Baker from his well-deserved hibernation. The fools!

Thursday, 22 May 2008

The Pitfalls of Debugging HTTP

Some folks at work were having problems debugging HTTP with LWP’s command-line GET utility; it turned out that it was inserting Link headers — HTTP headers, mind you — for each HTML <link> element present.

Blurgh.

This brought to mind some other peculiarities that can make debugging HTTP more complex and REALLY ANNOYING…

Curl automagically adds a Pragma: no-cache to requests, so that you don’t have to worry about scaling the Web or getting decent performance.

LiveHTTPHeaders shows you headers, but through the lens of Mozilla’s header parsing and processing, not what’s on the wire.

Even Wireshark can’t be completely trusted; it will remove the \r\n from the Content-Length header (because it’s parsing the message for its own purposes), making you think that the sender has header delimitation bugs.

Anybody have another?

All of this is why I tend to use telnet when debugging HTTP. Sometimes tcpflow helps too.
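
For those who'd rather script it than type into telnet, here's a minimal sketch in Python that does much the same thing: it writes the request bytes to a plain socket and hands back the response bytes untouched, so nothing gets added or reinterpreted along the way. The function names (`build_request`, `fetch_raw`, `split_head`) and the choice of request headers are made up for illustration, and it only speaks plain HTTP on a TCP port (no TLS).

```python
import socket


def build_request(host, path="/", extra_headers=None):
    """Build the exact bytes of a bare HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close, so we can read until EOF
    ]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Each header line ends in CRLF; an empty line ends the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_raw(host, port=80, path="/", timeout=10.0):
    """Send the request over a plain TCP socket and return the raw response bytes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def split_head(raw):
    """Split raw response bytes into (status line, header lines, body), with no cleanup."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.split(b"\r\n")
    return status_line, header_lines, body
```

Something like `split_head(fetch_raw("example.com"))` then gives you the status line and header lines exactly as the server sent them, bytes and all, which is the point of the exercise.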

Thursday, 15 May 2008

Atom gets a new audience

Huh. The Atom Format RFC has been out for a while, and as one of the authors, I get the odd mail now and again asking a question or just saying “thanks.”

In the last week or two, however, there’s been a bit of an uptick, and all of these e-mails contain the word “Astoria.”

As a Unix/Open Source developer, I find it easy to forget that there’s a whole other world out there that’s much bigger than the one I live in (just look at the job ads anywhere but Silicon Valley). However, one of my big takeaways from the whole SOA experience was that the big enterprise vendors will do anything to court those developers, developers, developers.

If this is indicative, I think we may be in for something pretty big.

Wednesday, 2 April 2008

Moving the Goalposts: “Use” Patents and Standards

It’s become quite fashionable for large IT shops to give blanket Royalty-Free licenses for implementation of “core” technologies, such as XML, Web Services and Atom. I’ll refrain from linking to any of them, as the purpose of this post* is not to pick on any single one**.

Rather, it’s to call attention to a blind spot. IT folk see these licenses, nod their heads in relief, and assume that all is well; they can use this technology in their projects without fear of at least a handful of big, bad companies coming to get them.

That’s not the case.

You see, most of these licenses are restricted to the implementation of this technology, not its use. This clears the people who actually write the code that implements the [ XML, Web Services, Atom ] parsers, processors and tools, but it doesn’t help the folks that use those things.

I should point out that this isn’t limited to these one-off “commitments”; the vaunted W3C Patent Policy says:

“Essential Claims” shall mean all claims in any patent or patent application in any jurisdiction in the world that would necessarily be infringed by implementation of the Recommendation.

In other words, people who implement XML parsers are free from worrying about any W3C Member coming after them, while the people who use those parsers are out in the cold; if you’re using XML for healthcare, the same folks who make those commitments can have “XML in healthcare” patents and come after you.

In effect, the vendors are pooling together their IP and giving each other free cross-licenses on chosen technologies — calling a truce, if you like — but not including their users. A rapacious vendor could hoard IP relating to the use of a technology, push it as a standard to get wide adoption, and then cherry-pick cashed-up users to get the most revenue from patent enforcement.

Is it on purpose? In most cases I doubt it, but that’s the outcome, and it’s important for people to understand it. Can we do much about it? Probably not, unless somebody gets really generous and starts a patent pool. Sorry to end on a down note, but patents are involved, after all…

* Since it’s a sensitive subject, I should point out that I’m not wearing any of my various hats when writing this post; it just reflects my personal thoughts, and I am not a lawyer, or *shudder* a patent lawyer, so take it with a huge grain of salt.

** No, the timing is not purposeful; recent events merely reminded me that I wanted to post this.

Thursday, 20 March 2008

Moving Beyond Methods in REST

Having complained before about the sad state of HTTP APIs, I’m somewhat happy to say that people seem to be getting it, producing more capable server-side and client-side tools for exposing the full range of the protocol; some frameworks are even starting to align object models with resource models, where HTTP methods map to method calls on things with identity. Good stuff.

However, something’s been bugging me for a long time about this. While there’s a nice internal logic to mapping HTTP methods to object methods, it doesn’t realise the power of having generic semantics.

Consider a resource;

class Person(Resource):

    def GET(self):
        # do acls...
        # get the representation out of some persistent store
        # translate to the format asked for
        return representation

    def PUT(self, representation):
        # do acls...
        # translate the representation to the appropriate format
        # put the representation into some persistent store
        # cook up a status message
        return status_representation

    def DELETE(self):
        # do acls...
        # delete the resource from some persistent store
        # cook up a status message
        return status_representation

    def POST(self, representation):
        # application-specific processing goes here
        return representation

In this interface, GET, PUT and DELETE all have well-defined semantics. So well-defined that they really shouldn’t need application-specific code; after all, they’re just manipulating state in well-known ways.

In fact, I’d posit that you can specify the behaviour of any RESTful resource by describing a) the processing that POST does, and b) any side effects of PUT and DELETE.

There are lots of caveats around that, of course. You need to define access control for the methods, and specify if authentication is required. You need to specify the formats that the application can work with (potentially with differing answers for input and output, and possibly with appropriate translations). You’ll need to specify the processing that happens around query parameters (e.g., in filtering output for GET).

The thing is, none of those have an implementation that’s specific to this particular resource; instead, they’re better abstracted out, so that the implementation looks something like this;

class PersonML(Format):
    # media type -> names of the (to, from) translation methods
    translations = {
        'application/xml': ('to_xml', 'from_xml'),
        'application/json': ('to_json', 'from_json'),
    }

    def to_xml(self, native_input):
        # do whatever you've got to do
        return xml_output

    def from_xml(self, xml_input):
        # do whatever you've got to do
        return native_output

    ...

@store_type("mysql")  # tell the Resource what implements GET, PUT and DELETE
@acl("choose your ACL poison")  # tell who / when access is allowed, per-method and finer-grained
class Person(Resource):
    store_format = PersonML

    def POST(self, representation):
        # operate on the store...
        return representation

    def PUT_effect(self, representation):
        # called IFF the presented representation is storable,
        # but before it is available; raising an exception will back it out
        return status_representation

I haven’t incorporated a way to handle query parameters here, but you get the idea.
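
To make that a bit more concrete, here's a rough, self-contained sketch of what the generic half might look like, using an in-memory dict as the store and JSON as the only format. Everything in it is an assumption for illustration rather than an existing framework: the `Resource` base class, the way the `POST` and `PUT_effect` hooks are wired up, and the status codes it returns. ACLs, format translation and query parameters are left out entirely.

```python
import json


class Resource:
    """Generic GET/PUT/DELETE over a key/value store; subclasses only add hooks."""

    def __init__(self, store, key):
        self.store = store  # any dict-like object, standing in for a real persistence layer
        self.key = key      # the resource's identity within the store

    def GET(self):
        if self.key not in self.store:
            return 404, None
        return 200, json.dumps(self.store[self.key])

    def PUT(self, representation):
        try:
            state = json.loads(representation)
        except ValueError:
            return 400, None  # the representation isn't storable
        existed = self.key in self.store
        try:
            self.PUT_effect(state)  # side-effect hook; raising backs the PUT out
        except Exception:
            return 409, None  # nothing has been stored yet, so nothing to undo
        self.store[self.key] = state
        return (204 if existed else 201), None

    def DELETE(self):
        if self.key not in self.store:
            return 404, None
        del self.store[self.key]
        return 204, None

    def POST(self, representation):
        return 405, None  # no generic meaning; this is the application-specific part

    def PUT_effect(self, state):
        pass  # default: no side effects


class Person(Resource):
    """The only application-specific code: a PUT side effect and POST processing."""

    def PUT_effect(self, state):
        if "name" not in state:
            raise ValueError("a person needs a name")

    def POST(self, representation):
        # Hypothetical processing: append a note to the stored person.
        if self.key not in self.store:
            return 404, None
        self.store[self.key].setdefault("notes", []).append(representation)
        return 200, json.dumps(self.store[self.key])
```

With that, `Person({}, "/people/bob").PUT('{"name": "Bob"}')` creates the resource without `Person` containing a line of storage code; swapping the dict for something backed by a database or the filesystem shouldn't touch `Person` at all.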

The tantalising part of this approach is that it can be implemented close to your persistence layer; all you need are hooks in the right places for side effects and POST processing. In fact, as long as you’re willing to be flexible on consistency, you can almost do it with mod_rewrite (calling a completely separate script as well as PUTting/DELETEing the state doesn’t yet seem to be possible; ping me if you can figure out how), and stuff like Apple’s FSEvents looks very, very interesting in this light.

Is anybody aware of anything along these lines out there in an existing tool or framework? I’ve been meaning to write some code along these lines for some time; if you'd like to help out, please drop me a line.

The other place that this view has impact is in describing RESTful applications, if you believe in doing such things. WADL, for example, gives GET, PUT and DELETE equal weight with POST, whereas I’ve always suspected it would be more elegant if you took the abstraction up a notch and talked about state, rather than methods.


