sparta.py 0.4: Data Binding for RDF in Python

Saturday, 15 May 2004

After a short pause (OK, nearly three years), I’ve released version 0.4 of sparta.py.

Sparta is a simple API for RDF that binds RDF nodes to Python objects and RDF arcs to attributes of those Python objects. As such, it can be considered a “data binding” from RDF to Python.
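To make the idea concrete, here is a minimal sketch of what "nodes as objects, arcs as attributes" can look like. It is not Sparta's code and does not use rdflib; the `Node` class, the plain set of tuples standing in for a store, and the use of bare strings as predicate names are all assumptions made for illustration.

```python
# Illustrative only: a toy triple store and node wrapper, not Sparta's code.
class Node:
    """Wraps one RDF subject; arcs leaving it appear as attributes."""

    def __init__(self, triples, subject):
        # Bypass our own __setattr__ for internal bookkeeping.
        object.__setattr__(self, "_triples", triples)
        object.__setattr__(self, "_subject", subject)

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for arc names.
        if name.startswith("_"):
            raise AttributeError(name)
        values = [o for (s, p, o) in self._triples
                  if s == self._subject and p == name]
        if not values:
            raise AttributeError(name)
        # An arc can occur more than once, so always return a list.
        return values

    def __setattr__(self, name, value):
        # Assigning an attribute adds an arc (a triple) to the store.
        self._triples.add((self._subject, name, value))


triples = set()  # hypothetical stand-in for a real RDF store
bob = Node(triples, "http://example.org/bob")
bob.name = "Bob"
bob.knows = "http://example.org/alice"
```

Reading `bob.name` here gives back a list of values, since RDF allows a property to occur more than once on the same node. A real binding also has to map attribute names to full property URIs, which this sketch skips.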

For an example of its use, see spartaTest.py and its output.

New in this Version

This version is based on rdflib, which makes it much easier to install and simpler to use. A side effect of this is that you can instantiate a ThingFactory with a generic rdflib Store, and still access the state of the Store through its usual methods.

Preliminary support for automatic typing has also been added; if a property has an rdf:type property, Sparta will convert its values to and from the appropriate Python type. So far, this only works for a subset of the XML Schema datatypes. See spartaTest.py for an example of this.
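As a rough illustration of this kind of conversion, the sketch below maps a handful of XML Schema datatype URIs to Python types and back. The table of types and the `to_python` / `to_lexical` helper names are assumptions for the example; they are not taken from Sparta, and the subset Sparta actually supports may differ.

```python
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# A small, assumed subset of XML Schema datatypes and their Python types.
TO_PYTHON = {
    XSD + "string": str,
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    XSD + "double": float,
    XSD + "boolean": lambda lexical: lexical in ("true", "1"),
}


def to_python(lexical, datatype):
    """Convert a literal's lexical form to a Python value if the type is known."""
    convert = TO_PYTHON.get(datatype)
    # Unknown datatypes fall through unchanged, as plain strings.
    return convert(lexical) if convert else lexical


def to_lexical(value):
    """Convert a Python value back to a (lexical form, datatype URI) pair."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return ("true" if value else "false", XSD + "boolean")
    if isinstance(value, int):
        return (str(value), XSD + "integer")
    if isinstance(value, Decimal):
        return (str(value), XSD + "decimal")
    if isinstance(value, float):
        return (repr(value), XSD + "double")
    return (str(value), XSD + "string")
```

With a table like this, the binding layer can hand the caller an `int` when the store holds an `xsd:integer` literal, and write the right datatype back when the caller assigns one.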

As before, this software is experimental; it is not complete, has not been tested well, and may contain bugs. The API may (and probably will) change in the future.

The Back Story

I’ve updated this software for two reasons: to support the API style that I prefer, and to prove a point about RDF vs. XML vis-à-vis data binding.

The original version of Sparta, in 2001, was intended to be a demonstration of a simple API for RDF. Aaron Swartz picked up the ball and came up with TRAMP, an alternate style for doing this, and then xmltramp, which takes a similar approach to XML.

I liked TRAMP at first, because it treated the properties of a node as a dictionary, which seemed like goodness because a datatype has a more constrained and well-known interface than an arbitrary object.

However, after following this path to its logical conclusion with my HTTP header parsing library (still a work in progress), I’ve become convinced that a) an object’s attributes are also a dictionary (at least in Python), and b) less line noise is inherently more developer-friendly. I expect that Aaron has thoughts on this as well, but I’d characterise any difference between us as merely stylistic.
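Point a) is easy to see in plain Python. In this small sketch the `Header` class and its attribute names are made up for the example; the behaviour shown is that of ordinary Python instances.

```python
class Header:
    """Hypothetical container; any ordinary class behaves the same way."""
    pass


h = Header()
h.content_type = "text/html"       # attribute style
print(vars(h))                     # {'content_type': 'text/html'}

vars(h)["content_length"] = 1024   # dictionary style, same underlying storage
print(h.content_length)            # 1024
```

Both spellings reach the same data; the attribute form just has fewer brackets and quotes.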

xmltramp is a different beast; because the underlying data model is an Infoset, it is much more complex, and I don’t think that any such API is a really good way to work with data, unless you’re forced to by an existing format that doesn’t have a data model above the Infoset (see recent discussion regarding this).

I point this out not to trash xmltramp (it’s very good to have, if you need that sort of thing), but to show that if your underlying data model is based upon the Infoset, it’s difficult to keep all of the information therein and still provide an intuitive interface. If the data model is simple — e.g., RDF — it’s very easy and intuitive to work with it, and to retain all of the relevant information. Things like typing and validation during data binding become absolutely dirt simple.

Mark Nottingham