Re: [syndication] Aggregate Multiple Feeds



On Thu, 22 Aug 2002, gnumember wrote:

> I would like all the stories to appear in chronological order
>
> Is this possible, if so how.
>

It's certainly possible.

I've got a script[1] I wrote to do exactly that on www.indymedia.org,
maintaining a list of the most recently posted headlines from the network.
Feel free to use it as a starting point, but note that it assumes RSS 1.0
with dc:date info. (If you want to spit RSS back out you'll need my patched
XML::RSS[2], which includes a small encoding patch.)

The one trick when sorting by publish time is finding when an item was
published :)

In RSS .9 you have no information to make this calculation, except, maybe,
the Last-Modified info when you request the file.  And then you need to
keep an old version of the file cached, so you can do diffs.
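
A rough sketch of that Last-Modified trick, in Python: anything in the fresh
fetch that wasn't in the cached copy gets stamped with the response's
Last-Modified time. The function name and the (title, link) tuples are made
up for illustration, and it assumes an item's link is stable enough to act
as its identity.

```python
from email.utils import parsedate_to_datetime

def stamp_new_items(cached_links, fetched_items, last_modified):
    """Return (link, title, seen_at) for items absent from the cached copy.

    cached_links  -- set of item links from the previously cached feed
    fetched_items -- list of (title, link) tuples from the fresh fetch
    last_modified -- raw Last-Modified header value, or None if missing
    """
    if last_modified is None:
        return []  # nothing to go on; the caller has to fall back to fetch time
    seen_at = parsedate_to_datetime(last_modified)
    return [(link, title, seen_at)
            for title, link in fetched_items
            if link not in cached_links]
```

The stamp is only an upper bound on when the item was really published, and
it is no more precise than how often you poll the feed.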

With RSS .9x you've got pubDate and lastBuildDate on the channel, and
then in later versions on items as well. (Which version adds these?  Are
they optional? I don't know as much about these middle versions.)

And with RSS 1.0 you have the optional dc:date on channel and on items.

So merely answering the question "when were these stories published?" in
order to sort them chronologically can be tricky.
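
To make the fallback order concrete, here is a minimal Python sketch, not
the code from any of the scripts linked in this message. It tries a date on
the item first (dc:date, then pubDate), then falls back to the channel
(dc:date, pubDate, lastBuildDate), and reports which one it used. The
function names are invented for the example, and treating a date with no
timezone as UTC is an assumption.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

DC_DATE = "{http://purl.org/dc/elements/1.1/}date"

def local(tag):
    """Strip any XML namespace so RSS 1.0 and 0.9x tags compare alike."""
    return tag.rsplit("}", 1)[-1]

def parse_date(elem):
    """Parse dc:date (ISO 8601 style) or pubDate/lastBuildDate (RFC 822)."""
    text = (elem.text or "").strip()
    try:
        if elem.tag == DC_DATE:
            when = datetime.fromisoformat(text.replace("Z", "+00:00"))
        else:
            when = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        return None
    # Assumption: a date with no zone is treated as UTC.
    return when if when.tzinfo else when.replace(tzinfo=timezone.utc)

def first_date(parent, names):
    """Return the first parseable date among parent's direct children."""
    for name in names:
        for child in parent:
            if local(child.tag) == name:
                when = parse_date(child)
                if when is not None:
                    return when
    return None

def item_dates(xml_text):
    """Yield (title, date, source) per item; source is where the date came from."""
    root = ET.fromstring(xml_text)
    channel = next((e for e in root.iter() if local(e.tag) == "channel"), None)
    feed_date = None
    if channel is not None:
        feed_date = first_date(channel, ["date", "pubDate", "lastBuildDate"])
    for item in root.iter():
        if local(item.tag) != "item":
            continue
        title = next((c.text for c in item if local(c.tag) == "title"), None)
        when = first_date(item, ["date", "pubDate"])
        if when is not None:
            yield title, when, "item"
        elif feed_date is not None:
            yield title, feed_date, "channel"
        else:
            yield title, None, "unknown"
```

Anything that comes back tagged "channel" only tells you when the feed was
last built, so every item in that feed ends up with the same timestamp.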

At Indymedia we solve the problem by insisting that each site include
dc:date for each item in their feed, but that doesn't work as well when
you don't have a relationship with the feed provider.
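
Once every item does carry dc:date, the merge itself is short. The sketch
below is a Python illustration of the idea, not the Perl script in [1]: it
takes RSS 1.0 documents that have already been fetched, drops any item
without a usable dc:date, and returns the newest headlines first. The name
merged_headlines and the limit parameter are hypothetical, and dates with
no timezone are assumed to be UTC.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {"rss": "http://purl.org/rss/1.0/",
      "dc": "http://purl.org/dc/elements/1.1/"}

def merged_headlines(feeds, limit=20):
    """Merge RSS 1.0 documents into one newest-first headline list.

    feeds -- iterable of RSS 1.0 XML strings, already fetched
    Returns a list of (date, title, link) tuples.
    """
    headlines = []
    for xml_text in feeds:
        root = ET.fromstring(xml_text)
        for item in root.findall("rss:item", NS):
            stamp = item.findtext("dc:date", "", NS).strip()
            try:
                when = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
            except ValueError:
                continue  # no usable dc:date, so the item cannot be placed
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)  # assumption: UTC
            headlines.append((when,
                              item.findtext("rss:title", "", NS),
                              item.findtext("rss:link", "", NS)))
    # Sort on the date alone so ties never fall through to comparing titles.
    headlines.sort(key=lambda h: h[0], reverse=True)
    return headlines[:limit]
```

Normalizing everything to timezone-aware dates before sorting matters here,
since feeds from different sites will report different offsets.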


I've also (for a different project[3]) been playing with code to identify
when a feed was most recently published based on the above heuristics.
Extending it to return the item, not just the date, would be trivial.

http://protest.net/~kellan/daterss.py

Apologies up front if it doesn't quite work; I just extracted the code from
being deeply entangled with a bunch of unrelated stuff.  Also, it relies
on mnot's RSS and isodate classes[4].

1. http://protest.net/~kellan/aggregate_imc_rss.pl.txt
2. http://protest.net/~kellan/XML-RSS-patched.tar.gz
3. http://laughingmeme.org/archives/000034.html#000034
4. http://www.mnot.net

hope that helps,
kellan