Re: [syndication] blogs and syndication
My take on this is that RSS is good at distributing lists of
references to resources (in the URI, URL sense), along with metadata
(title, abstract, etc.). This is a good fit for news sites.
Blogs tend to want to syndicate their content, in the purest sense of
the word; i.e., they want to get it out there and re-incorporated
into other resources. While RSS talks about resources, they're more
interested in shoving around small chunks of hypertext.
These are two fundamentally different distribution models, which
has a number of implications. RSS doesn't like hypertext because
the idea is that I might want to see the updates it brings me in a
number of different ways (e.g., through small devices), and I might
want a machine to process the link lists to do some interesting
things.
I'm not sure if this should lead to two formats or not (there are
commonalities in the problem, certainly), but perhaps it should be
considered; should there be a sending-links-around and a
syndicating-html-chunks branch of RSS? Maybe it would be good if
content authors as well as consumers had an easy way to think about
and recognize the difference...
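As a rough illustration of the distinction, here is a minimal Python
sketch of how a consumer might tell the two kinds of item apart. The
function name classify_item and the rule it applies (title plus link
means a reference, description alone means syndicated content) are
assumptions made up for this example; no RSS spec defines them.

```python
import xml.etree.ElementTree as ET


def classify_item(item_xml: str) -> str:
    """Guess which distribution model a single RSS <item> follows.

    Hypothetical heuristic: an item with both <title> and <link> is a
    reference to a resource; an item with only <description> is a chunk
    of content being syndicated directly.
    """
    item = ET.fromstring(item_xml)
    has_title = bool((item.findtext("title") or "").strip())
    has_link = bool((item.findtext("link") or "").strip())
    has_description = bool((item.findtext("description") or "").strip())

    if has_title and has_link:
        return "reference"
    if has_description:
        return "content"
    return "unknown"


# Illustrative items only; the URLs are placeholders.
news_item = (
    "<item><title>Example headline</title>"
    "<link>http://example.com/story/1</link>"
    "<description>Short abstract.</description></item>"
)
blog_item = (
    "<item><description>&lt;p&gt;A whole post, markup and all.&lt;/p&gt;"
    "</description></item>"
)

print(classify_item(news_item))
print(classify_item(blog_item))
```

A real feed would be messier than this (a title that is just a date,
a link that points at the whole site), so treat the rule as a starting
point rather than a reliable test.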
On Mon, Sep 10, 2001 at 06:24:50PM +0100, Julian Bond wrote:
> I'm hitting a problem that I feel the need to share. It's all related to
> blogs and how they can or should be syndicated. And then how this
> relates to RSS.
>
> News sites typically contain articles produced reasonably often. Each
> article usually breaks down into 4 main elements.
> - A Title or headline of about 40-50 characters, designed
> to attract attention.
> - An Abstract paragraph of 200-300 characters that explains what the
> article is about.
> - A Body that contains the main text and graphics of the article
> - A Link that is a fairly permanent URL for the article.
>
> This is a pretty good match to an RSS item with Title=<title>,
> Abstract=<description>, Link=<link>. There's quite a lot of consensus
> out there in examples of RSS and this is the way most news sites produce
> their files. It's common for the Abstract to contain one or more <a href
> links and it's useful therefore for the <description> to contain those
> same links.
>
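The mapping described above can be sketched in a few lines of Python.
This is only an illustration: news_item_to_rss is a hypothetical
helper, the URLs are placeholders, and letting the XML library
entity-escape the <a href> markup in the description is one possible
way to carry the links, not something the message above prescribes.

```python
import xml.etree.ElementTree as ET


def news_item_to_rss(title: str, abstract: str, link: str) -> str:
    """Build one RSS <item> from the Title / Abstract / Link of an article."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # Markup inside the abstract is escaped on output (&lt;a href=...),
    # so the element stays well-formed XML and the link text survives.
    ET.SubElement(item, "description").text = abstract
    return ET.tostring(item, encoding="unicode")


xml_text = news_item_to_rss(
    "Example headline",
    'An abstract with <a href="http://example.com/ref">a link</a> in it.',
    "http://example.com/story/1",
)
print(xml_text)
```

A reader that parses the item back gets the original abstract,
including its link, out of the description text.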
> This is also an easy layout for an RSS reader or aggregator. Lists of
> items can be displayed as a box of titles where each title is a link to
> the page with the full text. Description can be left out for a condensed
> box, or left in for a larger and more complete display. Taking the
> slashdot.org home page as an example, The full display is like the
> centre main section, the condensed display is like the RHS "Older Stuff"
> list.
>
> But now what about blogs? There are numerous examples of what are
> actually news sites that use blog technology as a quick and easy
> publishing route. Of course, blogs get used for a lot of other things
> besides. But a typical blog item only contains the abstract/body.
> There's no title and frequently no permalink to the item or even a page
> that the item will always be on. Blog systems also typically have an html
> templating system so even though the html is generally consistent for
> each item, there is no consistency between sites. Some blogs don't even
> have the concept of an item, the closest being a whole day.
>
> Then with the two main sources of blogs, Radio/Manila and Blogger, we
> get another set of issues. Radio/Manila has got 3 syndication formats in
> use with Scripting news, RSS 0.91 (with <desc> html stripped) and 0.92
> (with no <title> or <link>) while Blogger has none. Various people
> including myself have tried to hack an RSS feed on top of Blogger but
> it's awkward and a kludge.
>
> Now if I've built an aggregator (which I have) that takes some news RSS
> and some blog output and tries to display it in either a full or
> condensed display I've got a series of problems.
> - Blogs that treat a whole page as an item make it impossible to check
> for dupes so you see a new copy for every edit.
> - The lack of title means you have to synthesize one.
> - The lack of agreement about how to use <link>, or the complete lack of
> it, means it's hard to get consistency about what happens when the user
> clicks on a link.
> - Getting html stripped out of some of the <description>s means you lose
> the links that the blogmeister patiently added.
> - Bad html code or bad underlying code (like the infamous
> "title="Permanent link to ") have a habit of screwing up the display.
> - And then, many blogmeisters have little or no understanding of what is
> producing their RSS and not much more about what is good or bad html. So
> suggesting to them that there's something wrong is met with a blank
> stare.
>
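Two of the problems in that list, the missing title and the duplicate
check, can at least be approximated in an aggregator. The Python below
is a sketch under assumptions: the helper names, the 50-character
limit (taken from the 40-50 character headline figure above) and the
choice of SHA-1 are all illustrative. Note the limit of the approach:
a hash of the text only catches exact repeats, so a whole-day item
that gets edited still shows up as new unless it carries a stable
link.

```python
import hashlib
import re
from html.parser import HTMLParser
from typing import Optional


class _TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def plain_text(html_fragment: str) -> str:
    """Drop the tags and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(html_fragment)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


def synthesize_title(description_html: str, max_len: int = 50) -> str:
    """Use the first words of the item text as a stand-in title."""
    text = plain_text(description_html)
    if len(text) <= max_len:
        return text
    # Cut at the last space inside the limit so no word is split.
    return text[:max_len].rsplit(" ", 1)[0] + "..."


def dedupe_key(link: Optional[str], description_html: str) -> str:
    """Prefer the item's link as its identity; otherwise hash its text."""
    if link:
        return "link:" + link
    digest = hashlib.sha1(plain_text(description_html).encode("utf-8"))
    return "text:" + digest.hexdigest()
```

An aggregator would keep the keys it has already seen and skip any
incoming item whose key is in that set.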
> At which point this is turning into a rant. I can always just give up on
> reading the output of sites that don't produce manageable RSS, but this
> seems an enormous shame when their output is actually worth reading. I'm
> not sure what to do about this beyond airing my frustration here and
> hassling the individuals involved. I think the technical and social
> problems are solvable but it's going to take a little commitment both
> from them and from the rest of us. It feels to me that the Syndic8
> project will have to devote as much time to trying to fix the RSS we
> have and get blog sites to produce something useful, as to evangelizing
> RSS to people who've never heard of it.
>
> So here's some real world examples of what I'm talking about. Any ideas
> on how to deal with this will be gratefully received!
>
> http://www.evhead.com - Evan Williams' site from blogger.com - No RSS and
> a fairly complicated template with permalinks. I had to write a custom
> parser with code to come up with
> http://www.newsisfree.com/sources/info/2373/ It's not perfect but it
> almost works.
>
> http://www.boingboing.net - Blogger powered - They've implemented the
> <span class= kludge, but their item template is complex and the
> heuristics don't work well. I ended up using another custom parser to
> get http://www.newsisfree.com/sources/info/2376/ Like Evhead, it's not
> perfect.
>
> http://doc.weblogs.com - Horribly broken RSS at
> http://doc.weblogs.com/xml/rss.xml with an unusable title and the whole
> day packed into one item.
>
> http://blackholebrain.editthispage.com/xml/rss.xml - one item per day
> with all the links stripped out. And because they edit frequently, my
> view of it contains most of the intermediate copies, e.g.
> http://www.voidstar.com/module.php?mod=import&op=feed&id=32
>
> http://glennf.com/blog - Scripting News format only from the blog. But
> with "RSS feed" next to the XML glyph. And http://glennf.com/ uses the
> <span class= kludge *and* has a link to Radio. I'd suggest that Glenn
> should have talked to the Cluetrain people when he met them, but that
> would be unkind.
>
> And finally a moderate success story.
> http://blog.org/ - Blogger powered - David installed the <span class=
> kludge in 5 minutes and the output came out not great, but serviceable.
> http://www.newsisfree.com/sources/info/2271/
>
> Oh, linkrot!
>
> --
> Julian Bond email: julian_bond@voidstar.com
> CV/Resume: http://www.voidstar.com/cv/
> WebLog: http://www.voidstar.com/
> HomeURL: http://www.shockwav.demon.co.uk/
> M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433
> ICQ:33679568 tag:So many words, so little time
--
Mark Nottingham
http://www.mnot.net/