revised April 23, 2003
Tarawa is a Web server API, similar to Java's Servlet interface; it provides an abstraction of the Web server that allows you to easily write Web applications, without knowing the details of the HTTP protocol. Unlike other HTTP APIs, Tarawa is designed to encourage applications to align their object model with the interface they expose as Web resources. It does this by allowing you to model resources and representations as objects and HTTP methods as object methods.
This tutorial shows how to use Tarawa to build both simple and complex Web applications. For more information and an exact specification of the interface, see the API definition.
The simplest demonstration of Tarawa's capabilities is to handle an HTTP
GET request with a plain text "Hello, World!" response. It also shows the
basic structure of a Tarawa Web application. Here's the code for a file named
hello:
 1 #!/usr/bin/python
 2
 3 from tarawa import Resource, CGIServer
 4
 5 class HelloWorld(Resource):
 6     def GET(self, query):
 7         """Content-Type: text/plain"""
 8         return "Hello, World!"
 9
10 server = CGIServer(HelloWorld, "http://www.example.com/hello")
11 server.serve()
Let's go through this line by line.
Line 1 is the normal way to invoke the Python interpreter
(at least on Unix platforms). Note that the "#!/usr/bin/env
python" form is not used; although it is usually preferable, it
does not work well with CGI scripts (this example is wrapped in a CGI
script).
Line 3 imports the Resource and CGIServer classes from the tarawa package.
Now it gets interesting.
Line 5 defines a class, HelloWorld, that extends the Resource class from the tarawa package. This is how all Web resources are created in Tarawa; the Resource class is never used directly (in object-oriented parlance, it is an abstract base class).
According to RFC 2396, which specifies URIs and URLs, a resource is "anything that has identity." Tarawa is designed to model the things that your application is made up of both as resources, to align with the Web architecture, and as objects, to align with object-oriented programming.
A Tarawa Resource, when instantiated by a Server, should be considered to be the embodiment of the Web resource identified by its URI; although it may be destroyed immediately after serving a request, it may serve many requests (see below).
Line 6 defines GET as a method of HelloWorld. This method will be called every time the Web application receives an HTTP GET request for a URI bound to the HelloWorld object. It takes one argument, query, that we don't use (yet).
Line 7 is the documentation string for the GET method. In Tarawa, method docstrings are used both to provide documentation for people and to associate metadata with the method, for use by the Tarawa engine.
Here, we associate a text/plain Content-Type with representations that this method returns. We can include any other HTTP header beginning with "Content-" here; these headers are all entity headers, which describe the payload of the HTTP message (the content of the body). Any headers that you include in the docstring like this are sent along with all representations generated by the method; they're also used by Tarawa internally to properly implement the details of the protocol.
The docstring can also contain normal documentation after the metadata; to do so, separate them with a blank line.
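To make the docstring convention concrete, here is a minimal sketch of how metadata and documentation could be separated. It is not Tarawa's own code; the helper name split_docstring and the Content-Language header in the example are assumptions made for illustration, and only the Python standard library is used.

```python
import inspect
import re

def split_docstring(func):
    """Return (headers, notes) parsed from a function's docstring.

    Hypothetical helper: header lines come first, and anything after the
    first blank line is treated as ordinary documentation.
    """
    doc = inspect.getdoc(func) or ""
    parts = re.split(r"\n\s*\n", doc, maxsplit=1)
    headers = {}
    for line in parts[0].splitlines():
        name, sep, value = line.partition(":")
        if sep:  # ignore lines that are not "Name: value"
            headers[name.strip()] = value.strip()
    notes = parts[1].strip() if len(parts) > 1 else ""
    return headers, notes

def GET(self, query):
    """Content-Type: text/plain
    Content-Language: en

    Return a greeting as plain text.
    """
    return "Hello, World!"

headers, notes = split_docstring(GET)
print(headers)  # {'Content-Type': 'text/plain', 'Content-Language': 'en'}
print(notes)    # Return a greeting as plain text.
```

The blank line is the only separator the sketch relies on, which matches the rule described above.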
Line 8 returns the "Hello, World!" string,
which will be used as the body of the HTTP response.
Line 10 instantiates a new Server, specifically a CGIServer. Tarawa's Server class defines the interface that Resources attach to; subclasses of Server then can implement the Server interface for a particular server environment. In this case, CGIServer allows Tarawa Web applications to operate as Common Gateway Interface (CGI, one of the most prevalent Web server APIs) applications. Similarly, Server can be subclassed to run Tarawa Web applications as Web server plugins (e.g., mod_perl, mod_python, ISAPI) or as standalone Web servers.
When instantiating a Server, two arguments need to be passed; the first is
the root Resource class for the Server, while the second is the corresponding
URI (absolute or relative, as desired) for that resource. Here, we tell our
Server to serve the HelloWorld class at the
http://www.example.com/hello URI.
Line 11 calls the serve() method of the Server instance. This sets the machinery of Tarawa in motion. Depending on the type of Server used, calling serve() may handle one or many requests. For example, a CGIServer will handle one request per serve(), because a CGI script itself is called once per request. A Server that enables use of a Web server plugin (like mod_python) may call serve() once when the server is started, and handle many requests subsequently to save computing resources (because the Resources only have to be instantiated once per serve()).
Let's personalize our application; instead of saying "Hello, World!", we can call the user by name if we know it. URIs have query components that allow data to be passed to resources, and Tarawa makes the query available as a parameter (called "query") of the method that handles the request.
def GET(self, query):
    """Content-Type: text/plain"""
    return "Hello, %s!" % query.get('name', ['World'])[0]
Now the query is used to communicate the name of the person to say hello
to; if none is present, the response defaults to "Hello, World!" So,
accessing the URI http://www.example.com/hello?name=Bob would result in
"Hello, Bob!"
Note that query is a dictionary, keyed on the query arguments, whose
values are lists of the query values. This allows URIs like
http://www.example.com/hello?name=Bob&name=Mary to be
represented as {'name': ['Bob', 'Mary']}. If you've programmed CGI
in Python before, this is the same data structure you're used
to.
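The shape of the query dictionary can be reproduced without Tarawa. The sketch below uses urllib.parse from the Python 3 standard library (the modern home of the CGI-style query parser); the helper names query_dict and greeting are illustrative only, and the assumption is that Tarawa hands the method a dictionary of this form.

```python
from urllib.parse import urlsplit, parse_qs

def query_dict(uri):
    """Return the query component of a URI as a dict of lists."""
    return parse_qs(urlsplit(uri).query)

def greeting(query):
    # Take the first "name" value, falling back to "World" when none is given.
    return "Hello, %s!" % query.get("name", ["World"])[0]

both = query_dict("http://www.example.com/hello?name=Bob&name=Mary")
print(both)            # {'name': ['Bob', 'Mary']}
print(greeting(both))  # Hello, Bob!
print(greeting(query_dict("http://www.example.com/hello")))  # Hello, World!
```

Because every value is a list, the [0] index is needed even when only one name is supplied.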
Although GET is the most commonly used HTTP method, it is not the only one. Tarawa natively supports the HTTP GET, PUT, DELETE and POST methods, through methods with appropriate names. The GET and DELETE methods are passed a query argument (as explained above); PUT and POST are passed both a query argument and a representation argument. The latter is an instance of the Representation class, which allows you to access the request entity headers and body.
Methods that accept a request representation (e.g., POST and PUT) should include an Accept header that describes the type(s) of representation they're willing to take. For example, "text/xml" says that the method it's attached to only accepts that type; "text/*" allows any text type, and "*/*" allows any type whatsoever. [NOT IMPLEMENTED - use Accept? Alternates? Private? ]
POST differs from the other methods in that the name of the method contains the media type of the request entity; this allows the Tarawa engine to dispatch the request to a method that understands that media type. This is done by appending the media type to the POST method name and converting any non-alphanumeric characters to underscores; e.g., "text/rdf+xml" becomes "text_rdf_xml", so its handler is named POST_text_rdf_xml. If an appropriately named method is not found, an UnsupportedMediaType Status Exception (see below) will be returned.
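The naming rule can be sketched in a few lines of Python. This is an illustration of the convention as described, not Tarawa's dispatcher: post_method_name, find_post_handler and the Feed class are hypothetical, the UnsupportedMediaType class here is a local stand-in, and the sketch assumes a bare media type with no parameters such as charset.

```python
import re

def post_method_name(media_type):
    """Build the handler name for a POST request entity of this media type."""
    # Replace every non-alphanumeric character with an underscore.
    return "POST_" + re.sub(r"[^A-Za-z0-9]", "_", media_type)

class UnsupportedMediaType(Exception):
    """Local stand-in for the Status Exception of the same name."""

def find_post_handler(resource, media_type):
    """Look up the POST handler on a resource, or raise if there is none."""
    handler = getattr(resource, post_method_name(media_type), None)
    if handler is None:
        raise UnsupportedMediaType(media_type)
    return handler

class Feed:
    def POST_text_rdf_xml(self, query, representation):
        return "accepted"

print(post_method_name("text/rdf+xml"))
# POST_text_rdf_xml
print(post_method_name("application/x-www-form-urlencoded"))
# POST_application_x_www_form_urlencoded
print(find_post_handler(Feed(), "text/rdf+xml")({}, None))
# accepted
```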
For example, this method handles the ubiquitous
application/x-www-form-urlencoded media type (commonly generated
by HTML forms), updates a fictitious database with the submitted values, and
returns a status message:
def POST_application_x_www_form_urlencoded(self, query, representation):
    """Content-Type: text/html
    Accept: application/x-www-form-urlencoded
    """
    form = cgi.parse_qs(representation.body)
    db.update(form)
    return "<html><h1>Updated!</h1></html>"
[ TODO: method hack ]
Most Web applications require more than a single resource, because they're made up of more than one thing that requires identity. Tarawa accommodates this by allowing Resources to declare a relationship to one or more child resources (which themselves are Resources). For example, a HomePage Resource might have a ContactInfo subresource. In Tarawa, this would look like:
class ContactInfo(Resource):
    def GET(self, query):
        """Content-Type: text/plain"""
        return """Give us a call at 555-1212"""

class HomePage(Resource):
    children = { 'contact': ContactInfo }
    def GET(self, query):
        """Content-Type: text/html"""
        return """<p>Go to our <a href="contact/">Contact page</a>.</p>"""
The children attribute is a dictionary, keyed on the URI path
component of the child resource, with a class as the value. Note that it
isn't an instance; Tarawa will instantiate the child Resource when it's first
called.
For example, if our HomePage Resource above were bound to the URI 'http://www.example.com/', it would have a child resource, 'http://www.example.com/contact', that is a ContactInfo resource.
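As a rough sketch of how a path could be mapped onto the children dictionaries, consider the following. The Resource class here is a bare stand-in rather than Tarawa's, resolve is a hypothetical helper, and treating '/contact' and '/contact/' identically is a simplification made only for this example.

```python
class Resource:
    """Stand-in base class; not Tarawa's real Resource."""
    children = {}

class ContactInfo(Resource):
    pass

class HomePage(Resource):
    children = {"contact": ContactInfo}

def resolve(root_class, path):
    """Walk the children dictionaries and return the Resource class for path."""
    current = root_class
    for segment in path.split("/"):
        if not segment:
            continue  # skip empty segments from leading or trailing slashes
        current = current.children[segment]  # KeyError if no such child
    return current

print(resolve(HomePage, "/").__name__)         # HomePage
print(resolve(HomePage, "/contact").__name__)  # ContactInfo
```

The lookup returns classes, not instances, mirroring the note above that instantiation is left to the engine.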
[ TODO: /foo and /foo/ relationships ] [ TODO: ref attribute ]
Child resources allow you to assign identity - and therefore a URI - to every important aspect of your application at design time. However, this information may only be available at runtime; for example, if your application is a gateway to a database, you might want to assign identity to each table in the database. Tarawa accommodates this with the getChild() method.
The getChild() method works by intercepting requests that don't map to an assigned subresource and dynamically instantiating them into a nominated Resource. For example:
class Table(Resource):
    def GET(self, query):
        """Content-Type: text/plain"""
        return """This is the %s table of the database...""" % self.name

class Database(Resource):
    def getChild(self, name):
        return Table(name)
    ...
Here, if Database() maps to http://www.example.com/database,
requests to http://www.example.com/database/users will cause an
instance of Table to be created. The Table knows its own identity by
examining the .name attribute, which in this case will be 'users'.
Just as the children attribute contains Resource subclasses, not instances, getChild() should return a subclass of Resource to be instantiated by Tarawa. [ is this still true? ]
Tarawa uses a "200 OK" HTTP status code when a method exits normally (i.e., by returning a string). However, if a method doesn't catch an exception that is raised, a "500 Internal Server Error" status code will be used for the response. For example:
def GET(self, query):
    """Content-Type: text/html"""
    file = open("/tmp/foo")
    return file.read()
Here, if the file "/tmp/foo" doesn't exist, an IOError
exception will be raised, which Tarawa will serve with a 500 status code,
along with some text explaining the error.
Methods can also raise special exceptions that Tarawa knows about called Status Exceptions; this allows them to respond properly to errors and also to redirect clients, using the appropriate HTTP status codes. Many Status Exceptions will be raised automatically by the Tarawa engine, but some, like Gone and NotFound (which indicate missing resources) should be raised by the application.
import http.statusMessage as status

def GET(self, query):
    """Content-Type: text/html"""
    try:
        file = open("/tmp/foo")
    except IOError:
        raise status.Gone()
    return file.read()
Here, if the file "/tmp/foo" doesn't exist, a 410 Gone HTTP status code will be returned.
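The three outcomes described so far (a normal return, a Status Exception, and an uncaught exception) can be summarized in a small self-contained sketch. None of this is Tarawa's implementation: the StatusException base class, its code and reason attributes, and the respond function are assumptions chosen for illustration.

```python
class StatusException(Exception):
    """Stand-in for a Status Exception; attribute names are assumed."""
    code = 500
    reason = "Internal Server Error"

class Gone(StatusException):
    code = 410
    reason = "Gone"

def respond(method, query):
    """Call a resource method and return (status line, body)."""
    try:
        return "200 OK", method(query)
    except StatusException as exc:
        # The application chose the status code deliberately.
        return "%d %s" % (exc.code, exc.reason), exc.reason
    except Exception as exc:
        # Anything the method did not handle becomes a 500 response.
        return "500 Internal Server Error", "%s: %s" % (type(exc).__name__, exc)

def ok(query):
    return "Hello, World!"

def gone(query):
    raise Gone()

def broken(query):
    raise IOError("No such file")

print(respond(ok, {})[0])      # 200 OK
print(respond(gone, {})[0])    # 410 Gone
print(respond(broken, {})[0])  # 500 Internal Server Error
```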
Each Status Exception has a default representation that can be changed by instantiating a new one. For example:
def GET(self, query):
    """Content-Type: text/html"""
    try:
        file = open("/tmp/foo")
    except IOError:
        gone = status.Gone()
        gone.headers['Content-Type'] = "text/plain"
        gone.body = "Not here any more"
        raise gone
    return file.read()
Note that the HTTP headers in the documentation string aren't applied to representations returned by Status Exceptions. Therefore, you'll need to explicitly set the Content-Type.
Status Exceptions can also be used to redirect clients:
def GET(self, query):
    """Content-Type: text/plain"""
    redirect = status.MovedPermanently()
    redirect.headers['Location'] = "http://www.example.com/other"
    raise redirect
For a complete listing of Status Exceptions and their default values, see [XXX]. Note that the OK Status Exception (which has a 200 HTTP status code) can be used when you need to dynamically set entity headers (e.g., when you don't know the Content-Type of the response ahead of time).
Our "Hello, World!" example returned plain, unadorned text. However, the Web uses hypertext to present more interesting, linked views of resources. We could change the GET method to return HTML, but some users still might want to get the plaintext version.
HTTP provides a solution to this problem through content negotiation. Content negotiation allows resources to return alternate versions of a resource's representations (known as variants) based on preferences expressed by the client (with the Accept request header).
To effect content negotiation in our example Web application, we only need to base the HelloWorld class on NegotiatedResource, and add a new method:
class HelloWorld(NegotiatedResource):
    def GET(self, query):
        ...
    def text_plain_TO_text_html(self, body):
        """Content-Type: text/html"""
        return "<html><body><p>%s</p></body></html>" % body
Now, a GET request with an Accept header that states a preference for text/html will get that type; although the GET() method produces text/plain, this method is automatically called (based upon the request preferences) to transform text/plain into text/html.
Content negotiation methods are applied to all responses as appropriate, no matter what method (or exception) generates the response. They use the same naming convention for media types that POST methods use, in the form media_type_TO_media_type.
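A minimal sketch of how such a conversion method might be located and applied follows. It is not NegotiatedResource itself: mangle, negotiate and the plain Hello class are hypothetical, and falling back to the original representation when no converter exists is an assumption of this example rather than documented Tarawa behaviour.

```python
import re

def mangle(media_type):
    """Turn a media type into its method-name form, e.g. text/html -> text_html."""
    return re.sub(r"[^A-Za-z0-9]", "_", media_type)

class Hello:
    def text_plain_TO_text_html(self, body):
        return "<html><body><p>%s</p></body></html>" % body

def negotiate(resource, produced, wanted, body):
    """Return (media type, body), converting when a suitable method exists."""
    if produced == wanted:
        return produced, body
    name = "%s_TO_%s" % (mangle(produced), mangle(wanted))
    converter = getattr(resource, name, None)
    if converter is None:
        return produced, body  # assumed fallback: serve what was produced
    return wanted, converter(body)

print(negotiate(Hello(), "text/plain", "text/html", "Hello, World!"))
# ('text/html', '<html><body><p>Hello, World!</p></body></html>')
print(negotiate(Hello(), "text/plain", "text/plain", "Hello, World!"))
# ('text/plain', 'Hello, World!')
```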
[ TODO: type hack ]
Sometimes, it won't be clear from a client's preferences what type should be returned through content negotiation. For example, your application might be capable of producing image/gif and image/jpeg, but the only stated preference is for image/*.
Tarawa allows the Web application itself to state preferences that are used in these cases. QValues are numeric values from 0.0 to 1.0 that indicate preference; 1.0 is the highest preference, while 0.0 will never be served.
So, if GIFs are more costly to produce, you can associate a lower QValue with them than with JPEGs.
def image_png_TO_image_gif(self, body):
    """Content-Type: image/gif; q=0.5"""
    return Image.toGif(body)

def image_png_TO_image_jpeg(self, body):
    """Content-Type: image/jpeg; q=0.7"""
    return Image.toJpeg(body)
Here, image/gif has a QValue of 0.5, while image/jpeg has a QValue of 0.7. If a request comes in that states only a preference for image/*, or does not state any preference, the image/jpeg response will be served.
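The selection rule can be illustrated with a short sketch. The function choose_variant is hypothetical and deliberately simplified: it takes a single media range rather than a full Accept header, ignores client-side q parameters, and is not how Tarawa necessarily implements the choice.

```python
def choose_variant(variants, accept="*/*"):
    """Pick the media type with the highest server-side QValue.

    variants maps media types to QValues; accept is a single media range
    such as "image/*" (a simplification of the real Accept header).
    """
    wanted_type, _, wanted_subtype = accept.partition("/")
    best = None
    for media_type, q in variants.items():
        main, _, sub = media_type.partition("/")
        if wanted_type not in ("*", main):
            continue
        if wanted_subtype not in ("*", sub):
            continue
        if q <= 0.0:
            continue  # a QValue of 0.0 is never served
        if best is None or q > variants[best]:
            best = media_type
    return best

variants = {"image/gif": 0.5, "image/jpeg": 0.7}
print(choose_variant(variants, "image/*"))    # image/jpeg
print(choose_variant(variants, "image/gif"))  # image/gif
print(choose_variant(variants))               # image/jpeg
```

An explicit request for image/gif still succeeds; the QValues only break ties that the client leaves open.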
Some formats aren't easily transformed; for example, text/plain can only have very generic markup added to it. To accommodate complex transformations, the primary methods (e.g., GET, POST_*, PUT, DELETE) can generate representations that use a private media type with a QValue of 0.0. This guarantees that the private media type will never be served to a client, but it can be intercepted by content negotiation methods to produce the final type.
[ MORE ]
Resources that need to keep state between invocations - whether it's the actual state kept by the resource, or an abstract representation, or a connection to a database that contains that state - should manage it by overriding the getState() and setState() methods, which are called by Tarawa on object instantiation and destruction, respectively.
This allows Tarawa to reuse Resource instances for multiple requests when possible.
[ TODO ]
Thanks to Mark Baker, Mike Ciavarella and Aaron Swartz for their comments, suggestions and extensive review. Any design flaws, errors or omissions are the author's, not theirs.

This work is licensed under a Creative Commons License.