document(Web)

Tuesday, 22 February 2005

I love the XSLT document() function. With it, a stylesheet can pull in XML from anywhere on the Web, which gives you a lot of flexibility in the right situation.

For example, my local library’s online system is based upon iPac (now sold as the Horizon Information Portal, I think), a common packaged library management system. One of its nifty features is letting you keep a list of books (“My List”) that you’d like to eventually check out of the library. In conjunction with Jon Udell’s LibraryLookup bookmarklet, you can shift from keeping books in your Amazon shopping cart and buying them to keeping them in your library’s list and borrowing them. Cool.

That said, the data isn’t exactly in the format I want; to figure out what’s available at my local branch, I have to click through to each book and scan a list of branches. What I’d really like is a single page listing the books that are on my list, at my local branch, and currently available. While we’re at it, I’d like to have it on my phone, so I know where to go in the stacks when I’m looking for something to read.

A bit of Googling turns up a nifty feature in iPac: if you append ‘GetXML=true’ to the URL’s query arguments, you get back an XML representation of the page’s underlying data. Unfortunately, iPac doesn’t use HTTP authentication; it has a form login and then gives you a session ID. Luckily, the session ID is in the URL, not in a cookie.
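As a rough illustration of the URL handling involved, here is a minimal Python sketch (outside XSLT, purely to show the string manipulation). The host, path, and the `session` parameter name are assumptions made up for the example; only ‘GetXML=true’ comes from iPac as described above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_get_xml(url: str) -> str:
    """Return the URL with GetXML=true added to its query arguments."""
    parts = urlsplit(url)
    # Drop any existing GetXML argument so the function is idempotent.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "GetXML"]
    query.append(("GetXML", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def session_id(url: str, param: str = "session"):
    """Return the session ID carried in the URL's query, or None.

    The parameter name "session" is an assumption; check what your
    own library's system actually puts in its URLs.
    """
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True)).get(param)


# Hypothetical page URL, for illustration only.
page = "http://library.example.org/ipac20/ipac.jsp?session=ABC123&menu=mylist"
xml_url = with_get_xml(page)
sid = session_id(page)
```

Because the session ID travels in the query string, anything that can build a URL can carry it from one request to the next, with no cookie handling needed.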

Enter document(). Because everything’s in the URL, it’s possible to use XSLT to log in, get a session ID, get My List, find availability information for each individual book, and then log out. All in one stylesheet.

Here’s the stylesheet and an example of a document to run it against; you’ll need to supply a username and password, as well as a search URI and the name of the branch you’re interested in. Needless to say, this is incredibly iPac-specific, and it may need tweaking even for other iPac systems.

You can also see a snapshot of the books that are both currently interesting to me and available at the Burlingame library.

One caution: this is pretty resource-intensive on the library’s servers, because it has to check each book’s availability with a separate GET. I’ve got mine running as a cron job only every few days, so it won’t stress them. For the same reason, I wouldn’t suggest running it as a client-side stylesheet.

Mark Nottingham
