mnot’s Web log

“Design depends largely on constraints.” — Charles Eames

Tuesday, 12 April 2005

Try This RSS Experiment

Way back when I put the first Atom drafts together, I included a placeholder for a section that I hoped would allow reconstruction of feed state. Presently, this often isn’t necessary, because you have to be away for a seriously long time (e.g., on vacation) before you actually miss anything. However, I’d put forth that this state of grace is going to be increasingly unlikely.

Why? While your average, computer-obsessed geek — the main audience for syndication so far — won’t notice missing items because they obsessively check their feeds, more casual users will notice it as syndication goes prime-time. Furthermore, as people syndicate more and more types of information, it’s more likely that some feeds will change quickly enough that there’ll be problems.

In other words, if you happen to look away for too long you miss information, essentially making the channel leaky. To that end, I put together a proposal and a demonstration feed (in fact this very blog’s feed, dear reader), in the hopes of convincing people that this is a real issue. Silence ensued, and the ATOMPUB WG declined my proposal.

I wasn’t happy with that, but what to do? Rather than tilt at windmills, I’d like to try an experiment.

The Experiment

Set up two RSS aggregators. Get them both up-to-date on your usual selection of RSS feeds, and mark everything read.

Then turn one off.

Leave it that way for a day; i.e., have one aggregator running for 24 hours, the other dormant.

Now, fire the dormant aggregator up, let it sync, and look at the difference between them. It represents the updates, news and blog entries you miss when you’re offline for a day.

Now try it with a three-day gap (if this seems like unrealistic test conditions to you, please get professional help quickly).
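
The comparison step can be sketched in a few lines. This is only a minimal illustration, assuming each aggregator can export what it has seen as (feed, entry ID) pairs; that export format, the function name and the sample data are all hypothetical, not something any particular aggregator provides.

```python
from collections import defaultdict


def missed_entries(always_on, dormant):
    """Group entries the always-on aggregator saw but the dormant one never did."""
    seen_by_dormant = set(dormant)
    missed = defaultdict(list)
    for feed, entry_id in always_on:
        if (feed, entry_id) not in seen_by_dormant:
            missed[feed].append(entry_id)
    return dict(missed)


# Hypothetical exports: (feed name, entry ID) pairs from each aggregator.
always_on = [
    ("busy-feed", "a1"),
    ("busy-feed", "a2"),
    ("busy-feed", "a3"),
    ("quiet-blog", "b1"),
]
dormant = [
    ("busy-feed", "a3"),
    ("quiet-blog", "b1"),
]

result = missed_entries(always_on, dormant)
for feed, entry_ids in result.items():
    print(f"{feed}: missed {len(entry_ids)} entries")
```

Feeds that do not appear in the result lost nothing during the gap.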

My predictions:

  1. You won’t lose many (or any) entries from slow-moving news and information sources, or from all but the most prolific blogs.
  2. You might lose some entries from faster, stream-of-consciousness blogs (e.g., Dave Winer), mailing list feeds (Yahoo! Groups) and aggregated or republishing feeds (Planetizen, GridSkipper, craigslist, Technorati, RSSJobs).
  3. You’re pretty much guaranteed to lose entries from high-volume sources (e.g., Slashdot).
  4. If you’re subscribed to a monitoring feed (like pair.com’s system status feed or BT-EFNET’s feed), it totally depends on what happens on that day.
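
These predictions follow from a simple model, sketched here under assumptions of my own rather than anything measured: a feed document shows only its most recent N entries, so whatever is published beyond those N while you are away has scrolled off by the time you poll again. The posting rates and window sizes below are made-up illustrations.

```python
def entries_missed(published_during_gap, window_size):
    """Entries pushed out of the feed window before the next poll."""
    return max(0, published_during_gap - window_size)


# Hypothetical feeds: (entries published per day, entries kept in the feed document).
feeds = {
    "slow news source": (3, 15),
    "high-volume source": (25, 15),
}

for gap_days in (1, 3):
    for name, (per_day, window) in feeds.items():
        lost = entries_missed(per_day * gap_days, window)
        print(f"{gap_days}-day gap, {name}: {lost} missed")
```

A monitoring feed does not fit this model well, since its posting rate depends entirely on what happens that day.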

I’ll be back soon with my results.


Filed under: Syndication

discussion of this entry

Randy Charles Morin said…

Why does RSS have to be this big giant Inbox with items read and unread? We have email for that. RSS is about community and micro-content. You subscribe to x feeds because you find them interesting and want to read more. If they were urgent (need to read them all), then an RSS client is not the solution, rather, email is.

Tuesday, April 12 2005 at 10:00 AM +10:00

Bill de hOra said…

"In other words, if you happen to look away for too long you miss information, essentially making the channel leaky."

To solve this problem in a system management scenario (lots of entries being generated) we dropped HTTP altogether and used XMPP. The problem was that if we missed a system's 'FATAL: omigod, omigod, look what happened' message because something was up, down or whatever during that time, that would be very bad. By the time you went through the rm scenarios, widening and closing feed windows, widening and closing feed polling times, or splitting feeds based on severity/class, it was so much easier to use a different protocol and have the nodes push out data in blocks of N entries (i.e., we cared a lot more about entries than feeds, and ordering by severity was trivial).

It's worth trying out Atom over something other than HTTP for a bit and seeing how you feel about the idea of a 'feed' - I think me and Bob Wyman have decided a feed reflects more on HTTP than Syndication. Maybe add an IM client to your experiment.

"Silence ensued, and the ATOMPUB WG declined my proposal."

I thought I was +1 on that. Apologies if I wasn't.

Tuesday, April 12 2005 at 10:09 AM +10:00

Mark Nottingham said…

Randy — I don’t have to give my e-mail address to consume an RSS feed; this is just one of the critical differences between the two media. RSS is about “community and micro-content” to you, but not to many — or even most — of its users.

Bill — apologies; you did +1 it, but the WG as a whole didn’t have much to say, unfortunately.

Tuesday, April 12 2005 at 10:37 AM +10:00

Robert Sayre said…

Hmm. Your proposal concerned a couple link relations, right? Those would be easy to add to the format at any time, and... Blogger and 6A have both asked for similar functionality on the protocol side. Seems like more of a server layout and protocol problem, anyway.

Tuesday, April 12 2005 at 10:39 AM +10:00

Andrew Ho said…

Hi,

I'd just like to say that I think this is a great idea. Sometimes, I'm away for several days, or don't have an internet connection, and I don't like to lose information about what's happened over the past few days. I've considered several solutions (such as Bloglines), but none of them have been ultimately satisfactory.

I'll be watching this space, and hope for great things :).

-- Andy

Wednesday, April 20 2005 at 5:20 PM +10:00

Mark Nottingham said…

For those interested, in one day, I missed:

- no personal Weblog entries
- about eight Slashdot entries
- a couple on the New York Times homepage
- about five entries in each age.com.au feed (Melbourne newspaper)
- a large number in my del.icio.us inbox
- some programs on versiontracker.com

and a few others here and there.

Sunday, April 24 2005 at 9:54 AM +10:00

Mark Nottingham said…

Robert —

The only problem I have with that is that AFAIK so far, I don’t need to know about the protocol document to consume an Atom document; it’s only when you want to manipulate a feed that you have to work on that side of the house.

That said, it’s good to hear that others want this too.

Tuesday, April 26 2005 at 9:51 PM +10:00

Robert Sayre said…

Well, wouldn't it be nice if you didn't need to know about the protocol document to perform any of the protocol's read operations? That's my thinking. View-source on a couple of link relations should be enough to pick it up.

Wednesday, April 27 2005 at 8:25 AM +10:00

Creative Commons License