RSS History Thoughts

Mark Nottingham

revised: April 27, 2003

Introduction

Most people use RSS not only to see which news stories, weblog entries or other items are currently available, but also to keep a history of those items over time, so that they can be revisited and referred to later. When properly used, RSS also lets us track news even when we are not paying attention; we can "catch up" on a channel's old items when we return.

This works because the list of items a feed has contained can be reconstructed. Although there are several ways to communicate how often the feed must be polled to ensure that no items are missed (e.g., HTTP Cache-Control, RSS 2.0's ttl element and mod_syndication), there is not yet any means of communicating how the items should be used to reconstruct the feed.

The history module is designed to allow feeds to dictate how this should happen, so that aggregators don't have to resort to guessing how to reconcile changes in the feed.

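In particular, this module allows content providers to tell aggregators and other RSS consumers explicitly whether a history of the channel should be kept and, if so, how newly fetched items should be reconciled with the items already seen.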

Notational Conventions

This document assumes the following namespace declaration:

  xmlns:h="http://mnot.net/rss/history/"

The h:history Element

The h:history element is suitable for use as a module in both RSS 1.x and RSS 2.x-style documents. It must be a child of the RSS channel element, and must contain exactly one of the child elements defined below. Unrecognized element or attribute children must be ignored.

Defined h:history Children

h:none

When this element is present as a child of h:history, it indicates that no history of the channel should be kept; each representation of it constitutes the full feed. Although links and metadata previously conveyed may (or may not) still be valid, the feed is no longer considered to contain them if they do not appear in the most recent representation of the channel.

For example, a feed containing a list of stock quotes that the user has subscribed to should use h:none, so that items that the user unsubscribes from will properly disappear from the list. Likewise, a feed of the Top 25 best-selling books might also use h:none, so that there are always exactly 25 items in the feed.
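
The h:none behaviour can be sketched in a few lines of Python. The function name merge_none and the representation of items as dictionaries are assumptions made for illustration; they are not part of the module.

```python
def merge_none(stored, fetched):
    """Sketch of h:none: no history is kept.

    Whatever was stored before is dropped, and the most recent
    representation of the channel becomes the whole feed.
    """
    return list(fetched)


# A stock-quote feed after the user unsubscribes from one symbol.
previous = [{"title": "AAA 10.00"}, {"title": "BBB 20.00"}]
latest = [{"title": "AAA 10.50"}]
feed = merge_none(previous, latest)
```

After the call, feed holds only the items in the latest representation, so the unsubscribed quote disappears.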

h:overwrite

When this element is present as a child of h:history, it indicates that when an item that already exists in the feed appears again later, it should be overwritten with the newer item.

Here, "overwritten" means that the old item will be completely replaced; none of its metadata will survive in the feed.

An item is considered to match an existing item when they have character-for-character matching guid elements (RSS 2.x), rdf:about attributes (RSS 1.x), or link elements (if neither of the former elements is present), unless other explicit information regarding item matching is available (e.g., an extension to the history module, or another RSS module).
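
As a rough illustration of this matching rule, the following Python sketch derives an identifier from a parsed item. The function name item_key and the choice to tag each identifier with its kind are assumptions of this sketch; the rule itself only requires that the values match character for character, so no whitespace or case normalization is applied.

```python
import xml.etree.ElementTree as ET

RDF_ABOUT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}about"
RSS1_NS = "http://purl.org/rss/1.0/"


def item_key(item):
    """Return a (kind, value) identifier for an item, or None if it has none."""
    # RSS 2.x: the guid element takes precedence.
    guid = item.findtext("guid")
    if guid is not None:
        return ("guid", guid)
    # RSS 1.x: the rdf:about attribute on the item.
    about = item.get(RDF_ABOUT)
    if about is not None:
        return ("rdf:about", about)
    # Fallback: the link element (un-namespaced in RSS 2.x, namespaced in RSS 1.x).
    link = item.findtext("link") or item.findtext(f"{{{RSS1_NS}}}link")
    if link:
        return ("link", link)
    return None


def same_item(a, b):
    """Two items match when both have an identifier and the identifiers are equal."""
    key = item_key(a)
    return key is not None and key == item_key(b)
```

An aggregator that has other explicit matching information available would consult that first and fall back to a function like this one.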

For example, a Weblog feed where items are often edited should use h:overwrite, so that edited items are properly re-integrated into the feed, in place.
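
A minimal sketch of the h:overwrite merge follows. It assumes that each item is a dictionary whose "id" entry already holds the matching identifier, that identifiers are unique within one representation, and that lists are ordered newest first; the text above does not say where previously unseen items go, so placing them in front is a choice made for this sketch.

```python
def merge_overwrite(stored, fetched):
    """Sketch of h:overwrite for lists of item dicts, newest first."""
    fetched_by_id = {item["id"]: item for item in fetched}
    stored_ids = {item["id"] for item in stored}
    # A matching item replaces the old one completely, in place;
    # nothing from the old item's metadata is carried over.
    merged = [fetched_by_id.get(item["id"], item) for item in stored]
    # Items never seen before are placed in front (assumed ordering).
    fresh = [item for item in fetched if item["id"] not in stored_ids]
    return fresh + merged


stored = [
    {"id": "b", "title": "B", "note": "from the old copy"},
    {"id": "a", "title": "A"},
]
fetched = [
    {"id": "c", "title": "C"},
    {"id": "b", "title": "B (edited)"},
]
merged = merge_overwrite(stored, fetched)
```

Here the edited item "b" keeps its position in the history, while the "note" entry from its old copy is gone.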

h:add

When this element is present as a child of h:history, it indicates that all new items should be appended to the feed, except where the item has been identified as overlapping the last representation of the feed.

Overlapping items are those in the new representation up to and including the item that matches the latest item previously seen in the feed; these items should be discarded from the new representation, and the remaining items added to the feed.

For example, a comments or list-of-links feed where a link may be referred to more than once in the lifetime of a feed, but with different metadata or by different people, should use h:add, so that individual items are not lost.

Note that use of h:add is not necessary unless the feed cannot guarantee the uniqueness of the item identifier (i.e., the guid element or rdf:about attribute, as appropriate).
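
One way to read the h:add rule is sketched below. It assumes that both lists are in chronological order (oldest first, so a typical newest-first feed would be reversed before the call) and that each item is a dictionary with an "id" entry holding its identifier. When several new items match the latest item seen, this sketch cuts at the last of them; the text above does not settle that case.

```python
def merge_add(stored, fetched):
    """Sketch of h:add for lists of item dicts in chronological order."""
    if not stored:
        return list(fetched)
    latest_seen = stored[-1]["id"]
    # Find the end of the overlap: everything up to and including the
    # item that matches the latest item seen previously is discarded.
    cut = 0
    for index, item in enumerate(fetched):
        if item["id"] == latest_seen:
            cut = index + 1
    # The remaining items are appended, even if their identifiers repeat
    # identifiers that are already in the history.
    return stored + fetched[cut:]


stored = [{"id": "a"}, {"id": "b"}, {"id": "c"}]
fetched = [{"id": "b"}, {"id": "c"}, {"id": "d"}, {"id": "b"}]
history = merge_add(stored, fetched)
```

In this example the second reference to "b" is kept as a separate entry rather than being folded into the first. If no new item matches the latest item seen, nothing is treated as overlap and every new item is appended.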

Example

...
  <channel>
    <title>Bob's Big Boy Specials for Today</title>
    <link>http://bobs.example.com/todays-specials.rss</link>
    <h:history xmlns:h="http://mnot.net/rss/history/">
      <h:none/>
    </h:history>
   ...
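
To show how a consumer might read this element, here is a small Python sketch that uses the standard library's xml.etree.ElementTree. The function name history_policy is hypothetical, and returning None when there is not exactly one defined child is an assumption about how an aggregator might fall back to its default behaviour.

```python
import xml.etree.ElementTree as ET

HISTORY_NS = "http://mnot.net/rss/history/"
DEFINED_CHILDREN = ("none", "overwrite", "add")


def history_policy(channel):
    """Return "none", "overwrite" or "add", or None if no usable h:history exists."""
    prefix = f"{{{HISTORY_NS}}}"
    history = channel.find(prefix + "history")
    if history is None:
        return None
    found = []
    for child in history:
        tag = child.tag
        # Unrecognized children are ignored; only defined ones count.
        if isinstance(tag, str) and tag.startswith(prefix):
            name = tag[len(prefix):]
            if name in DEFINED_CHILDREN:
                found.append(name)
    # Exactly one defined child is required.
    return found[0] if len(found) == 1 else None


document = """<rss version="2.0"><channel>
  <title>Bob's Big Boy Specials for Today</title>
  <link>http://bobs.example.com/todays-specials.rss</link>
  <h:history xmlns:h="http://mnot.net/rss/history/">
    <h:none/>
  </h:history>
</channel></rss>"""
policy = history_policy(ET.fromstring(document).find("channel"))
```

For the channel above, policy is the string "none".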

Extending the History Module

New element and attribute children for the h:history element may be defined, so long as they do not change the meaning or operation of those already defined. For example, a "not" element that reverses the meaning of its sibling elements would not be a permissible extension.

Such extensions should use their own namespace URI, unless their definition is coordinated with the owner of this module's namespace.