mark nottingham

The Syndication Sky is Falling!

Tuesday, 18 May 2004

Web Feeds

A few people got together in NYC this morning to talk about Atom going to the W3C. One part of the minutes of this discussion raised my eyebrows a fair amount:

sr: […] Lots of people are saying RSS won’t scale. Somebody is going to say I told you so.
bw: Werner Vogels at Cornell has charted it out. We’re at the knee of the curve. I don’t think we have 2 years.
sr: I have had major media people who say, until you solve this, I’m not in.
bw: However good the spec is, unless we deal with the bag issues, it won’t matter. There are fundamental flaws in the current architecture.

Fundamental flaws? Wow, I guess I should remind the folks at Google, Yahoo, CNN and my old colleagues at Akamai that what they’re doing is fundamentally flawed; the Web doesn’t scale, sorry.

I guess I’ll also have to tell the people at the Web caching workshops that what they do is futile, and those folks doing Web metrics are wasting their time. What a shame.

I don’t mean to pick on Sam and Bob specifically here; the minutes may not have caught the context, and if people held me to every casual comment that came from my mouth, I’d be in a world of pain. However, I do mean to pick on the general notion that the Web can’t scale enough for syndication’s purposes; the Web provably does scale, and like gangbusters.

Economics of Scale

The fact is, the only limitation on the scalability of syndication using current Web technology — i.e., HTTP — is economic, and it’s not a great one. If you want to serve feeds to two billion people, you’re going to need the infrastructure to handle it, and the Web has well-established and surprisingly cost-effective ways to do this; either choose a hosting firm, or use a content delivery network.
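To make the "it's only economics" point concrete, here is a rough back-of-envelope sketch of how much outbound transfer a polled feed generates. Every number in it is an assumption chosen for illustration, not a measurement: the subscriber count, polling rate, feed size, the share of polls answered with a small "not modified" response, and the size of that response are all hypothetical, as is the function name `monthly_transfer_gb`.

```python
def monthly_transfer_gb(subscribers, polls_per_day, feed_bytes,
                        not_modified_ratio=0.0, not_modified_bytes=300, days=30):
    """Estimate monthly outbound transfer for one polled feed, in GB (10**9 bytes).

    All inputs are illustrative assumptions:
      subscribers         number of clients polling the feed
      polls_per_day       how often each client polls
      feed_bytes          size of a full feed response
      not_modified_ratio  fraction of polls answered with a short 304 response
      not_modified_bytes  assumed size of such a short response (headers only)
    """
    polls = subscribers * polls_per_day * days
    full_responses = polls * (1 - not_modified_ratio) * feed_bytes
    short_responses = polls * not_modified_ratio * not_modified_bytes
    return (full_responses + short_responses) / 1e9


# Hypothetical feed: 10,000 subscribers polling hourly, 30 kB per full response.
print(monthly_transfer_gb(10_000, 24, 30_000))                          # every poll gets the full feed
print(monthly_transfer_gb(10_000, 24, 30_000, not_modified_ratio=0.9))  # most polls get a 304
```

Under these made-up inputs the first case works out to 216 GB a month, and the second to roughly a tenth of that. The exact figures matter less than the shape of the calculation: the cost grows linearly with subscribers and polling rate, and it is the sort of number a hosting plan or a CDN contract is priced against.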

Switching to pub/sub isn’t a magic bullet; you still need to size your infrastructure appropriately, and you’ll need to keep and manage a heck of a lot of state about your subscriptions, along with all of the problems that brings. Do we have a great body of experience doing Internet-scale pub/sub? No, so you’ll need to be willing to be a guinea pig as well.

For example, I helped Adam Bosworth set up his blog, and was very concerned about the scalability issues, as he’s a well-respected figure in the industry. As a result, he’s got an account at pair.com that costs some small number of dollars a month, and is massively oversized for his traffic, I’d wager.

If that’s not enough, Speedera, Akamai, Digital Island and a host of other CDNs will sell you high-performance bandwidth at very reasonable rates.

Bad Reasons to Change the Web Architecture

But wait, there’s more. “Media people” want to have their cake and eat it too. It’s not good enough that they’re getting an exciting, new and viable (as compared to e-mail) channel to eyeballs; they also have to throw their weight around to reduce their costs with a magic wand.

What a horrible reason to foist new protocols, new software, and added complexity upon the world.

Good Reasons to do Pub/Sub

That’s not to say that there aren’t good reasons to do pub/sub. In particular, if timely event notification is necessary — to the point that polling is impractical — pub/sub has a lot to offer. This isn’t the case with syndication in the vast majority of scenarios; do you need to know your headlines as they happen, up to the millisecond? No? Then why would you use pub/sub?
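Polling need not be expensive, because HTTP already lets a client ask "has this changed since I last looked?". The sketch below shows one way an aggregator might poll with a conditional GET, using only the Python standard library. The function names and the `opener` parameter (there to make the logic testable without a network) are my own inventions for illustration, and the code assumes the server sends `ETag` or `Last-Modified` validators.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, etag=None, last_modified=None):
    """Build a GET request that lets the server skip the body if nothing changed."""
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    return request


def poll_feed(url, etag=None, last_modified=None, opener=urllib.request.urlopen):
    """Poll a feed once.

    Returns (body, etag, last_modified). body is None when the server
    answers 304 Not Modified, in which case the old validators are kept.
    """
    request = build_conditional_request(url, etag, last_modified)
    try:
        with opener(request, timeout=10) as response:
            return (response.read(),
                    response.headers.get("ETag"),
                    response.headers.get("Last-Modified"))
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return None, etag, last_modified
        raise
```

A caller would store the returned validators next to each subscription and pass them back on the next poll, so an unchanged feed costs a few hundred bytes of headers rather than the whole document.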

In time, I fully expect that a Web-friendly (read: built on top of REST) pub/sub mechanism will take root and flourish; when it does, RSS and the rest of the world can take advantage of it as they see fit. In the meantime, I see no reason to foist yet another problem onto the backs of the people doing Atom standardisation; they’ve got enough on their plate as it is.

What Would Werner Do?

Reading Werner’s roll-up post about his investigations in this area, I don’t think he’d disagree with any of this (but I’d love to hear his thoughts). The problem isn’t the architecture, it’s how we use it.

In particular, the approach to aggregation that software takes needs to be rethought; e.g., if it blocks every time it fetches a representation of a feed, it certainly won’t scale.
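As a minimal sketch of what "not blocking on every fetch" could look like, the following fetches a list of feeds concurrently with a thread pool from the Python standard library, so one slow server does not hold up the rest. This is only one possible approach (an event loop would do as well); the function names, the `fetch_one` parameter and the worker count are assumptions made for the example.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(url, timeout=10):
    """Fetch one feed and return its body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_workers=8):
    """Fetch many feeds concurrently.

    Returns a dict mapping each URL to its body, or to the exception
    raised while fetching it, so one failing feed does not stop the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # record the failure and carry on
                results[url] = exc
    return results
```

With this shape, the wall-clock time for a polling round is closer to that of the slowest feed in each batch of workers than to the sum of all the fetches.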

In this sense, we do need to break new ground, because we’re moving beyond the “browser” model — which is an incredibly healthy thing to do.


6 Comments

Sam Ruby said:

Before you scoff, it is worth noting that “bw” works on/for http://www.pubsub.com/. They have published http://www.pubsub.com/REST/. Care to critique it? It might be flawed, but it appears to be “bw’s” honest attempt to do EXACTLY what you are suggesting is ultimately what will need to be done.

Tuesday, May 18 2004 at 2:26 AM

Sam Ruby said:

“What Would Werner Do?”

http://www.imc.org/atom-syntax/mail-archive/msg03798.html

Tuesday, May 18 2004 at 5:49 AM

Bob Wyman said:

By the way… For at least one example of someone doing “interesting things” with the PubSub REST API, take a look at:

http://www.estey.com/archives/000431.html

This service is built on the current REST API. Clearly, there are alternative ways to implement it. But REST works too.

bob wyman

Wednesday, May 19 2004 at 1:06 AM

Bob Wyman said:

As Sam points out, I am the “bw” in the meeting notes that you discuss here. Also, please note that the REST publishing interface described at http://pubsub.com/REST/ is a “beta” interface at best. We have intentionally not gone to all the effort needed to make it as good as we can. We want to hear critiques of it. The reason for providing it was to spark some thinking in the community on the role of REST delivery for syndication. (It has many advantages and many limitations… Firewalls and NAT are a real problem…)

You asked: “Do we have a great body of experience doing Internet-scale pub/sub?” No. We don’t. Very few people have tried it. Here at PubSub.com, we’re doing our best to figure out what it means to be Internet Scale since that is what we think our system is… So far, everything is working as we expected. That may change… As you suggest yourself, it is time that we move beyond the “browser” model. Pubsub is one of a number of technologies that should be considered in this move.

bob wyman

Wednesday, May 19 2004 at 12:07 PM