mark nottingham

Try This RSS Experiment

Tuesday, 12 April 2005

Web Feeds

Way back when I put the first Atom drafts together, I included a placeholder for a section that I hoped would allow reconstruction of feed state. Presently, this often isn’t necessary, because you have to be away for a seriously long time (e.g., on vacation) before you actually miss anything. However, I’d put forth that this state of grace is going to become increasingly rare.

Why? While your average, computer-obsessed geek — the main audience for syndication so far — won’t notice missing items because they obsessively check their feeds, more casual users will notice it as syndication goes prime-time. Furthermore, as people syndicate more and more types of information, it’s more likely that some feeds will change quickly enough that there’ll be problems.

In other words, if you happen to look away for too long you miss information, essentially making the channel leaky. To that end, I put together a proposal and a demonstration feed (in fact this very blog’s feed, dear reader), in the hopes of convincing people that this is a real issue. Silence ensued, and the ATOMPUB WG declined my proposal.

I wasn’t happy with that, but what to do? Rather than tilt at windmills, I’d like to try an experiment.

The Experiment

Set up two RSS aggregators. Get them both up-to-date on your usual selection of RSS feeds, and mark everything read.

Then turn one off.

Leave it that way for a day; i.e., have one aggregator running for 24 hours, the other dormant.

Now, fire the dormant aggregator up, let it sync, and look at the difference between them. It represents the updates, news and blog entries you miss when you’re offline for a day.

Now try it with a three-day gap (if this seems like unrealistic test conditions to you, please get professional help quickly).
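
If both aggregators can export the IDs (or links) of the entries they hold, the comparison boils down to a set difference. Here is a minimal Python sketch; the helper name and the entry IDs are made up for illustration, and how you get the IDs out of your aggregator is left open.

```python
def missed_entries(always_on_ids, dormant_ids):
    """Return the entry IDs the always-on aggregator saw but the dormant one never did."""
    return sorted(set(always_on_ids) - set(dormant_ids))


# Hypothetical data: the always-on aggregator saw five entries; the dormant
# one had e1 before it was switched off and picked up e4 and e5 on waking.
always_on = ["e1", "e2", "e3", "e4", "e5"]
dormant = ["e1", "e4", "e5"]

print(missed_entries(always_on, dormant))  # ['e2', 'e3']
```

Whatever ends up in that list is what the gap cost you.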

My predictions:

  1. You won’t lose many (or any) entries from slow-moving news and information sources, or from all but the most prolific blogs.
  2. You might lose some entries from faster, stream-of-consciousness blogs (e.g., Dave Winer), mailing list feeds (Yahoo! Groups) and aggregated or republishing feeds (Planetizen, GridSkipper, craigslist, Technorati, RSSJobs).
  3. You’re pretty much guaranteed to lose entries from high-volume sources (e.g., Slashdot).
  4. If you’re subscribed to a monitoring feed (like pair.com’s system status feed or BT-EFNET’s feed), it totally depends on what happens on that day.
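
The reasoning behind these predictions can be sketched with a toy model. It rests on two simplifying assumptions of mine: that a feed document only carries its newest N entries, and that posts arrive at a steady rate. The function name and every number below are illustrative, not measurements.

```python
from collections import deque


def entries_missed(posts_per_day, window_size, gap_days):
    """Count the entries a reader never sees after being offline for gap_days.

    Assumes the feed keeps only its newest `window_size` entries and that
    posts arrive at a steady rate (both are simplifications).
    """
    feed = deque(maxlen=window_size)  # older entries fall off the end
    total_posted = int(posts_per_day * gap_days)
    for entry_id in range(total_posted):
        feed.append(entry_id)
    # On return, the reader can fetch only what is still in the window.
    return total_posted - len(feed)


print(entries_missed(posts_per_day=2, window_size=15, gap_days=3))   # 0
print(entries_missed(posts_per_day=25, window_size=15, gap_days=1))  # 10
print(entries_missed(posts_per_day=25, window_size=15, gap_days=3))  # 60
```

In this model nothing is lost until the number of posts during the gap exceeds the window; after that, every additional post pushes one unseen entry out of the feed.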

I’ll be back soon with my results.


8 Comments

Randy Charles Morin said:

Why does RSS have to be this big giant Inbox with items read and unread? We have email for that. RSS is about community and micro-content. You subscribe to x feeds because you find them interesting and want to read more. If they were urgent (need to read them all), then an RSS client is not the solution, rather, email is.

Tuesday, April 12 2005 at 10:00 AM

Bill de hOra said:

“In other words, if you happen to look away for too long you miss information, essentially making the channel leaky.”

To solve this problem in a system management scenario (lots of entries being generated) we dropped HTTP altogether and used XMPP. The problem was that if we missed a system’s ‘FATAL: omigod, omigod, look what happened’ message because something was up, down or whatever during that time, that would be very bad. By the time you went through the rm scenarios, widening and closing feed windows, widening and closing feed polling times, or splitting feeds based on severity/class, it was so much easier to use a different protocol and have the nodes push out data in blocks of N entries (i.e., we cared a lot more about entries than feeds, and ordering by severity was trivial).

It’s worth trying out Atom over something other than HTTP for a bit to see how you feel about the idea of a ‘feed’ - I think me and Bob Wyman have decided a feed reflects more on HTTP than Syndication. Maybe add an IM client to your experiment.

“Silence ensued, and the ATOMPUB WG declined my proposal.”

I thought I was +1 on that. Apologies if I wasn’t.

Tuesday, April 12 2005 at 10:09 AM

Robert Sayre said:

Hmm. Your proposal concerned a couple of link relations, right? Those would be easy to add to the format at any time, and… Blogger and 6A have both asked for similar functionality on the protocol side. Seems like more of a server layout and protocol problem, anyway.

Tuesday, April 12 2005 at 10:39 AM

Andrew Ho said:

Hi,

I’d just like to say that I think this is a great idea. Sometimes, I’m away for several days, or don’t have an internet connection, and I don’t like to lose information about what’s happened over the past few days. I’ve considered several solutions (such as bloglines), but none of them have been ultimately satisfactory.

I’ll be watching this space, and hope for great things :).

– Andy

Wednesday, April 20 2005 at 5:20 AM

Robert Sayre said:

Well, wouldn’t it be nice if you didn’t need to know about the protocol document to perform any of the protocol’s read operations? That’s my thinking. View-source on a couple of link relations should be enough to pick it up.

Wednesday, April 27 2005 at 8:25 AM