
Re: [syndication] Aggregating your global content output into your blog



>I routinely post to many different content systems on the web. I'd like
>to aggregate all this into my blog. I suspect that I'm not the only
>person who would like to do this.[1]

I've done this with RSS feeds, under these assumptions:

 * your blog runs Movable Type (MT) and accepts the metaWeblog API.
 * all feeds have <dc:date> values.
 * you run the script once a night.
 * you can run Perl and install the modules the script requires.

The script is at http://disobey.com/d/code/myrssmerger.pl. You
can see an example of its output in the two items on May 27th:

  http://www.disobey.com/dnn/2003/05/index.shtml#001498

Here's a quick run-through of its use:

 ./myrssmerger.pl --server http://disobey.com/cgi-bin/mt/mt-xmlrpc.cgi \
                  --username morbus --password HAAHAHAH -blogid 1 \
                  --showcategories

The output is:

 ----------------------------------------------------------------------
  The following blog categories are available:

  1: Disobey Stuff
  2: The Idiot Box
  3: CHIApet
  4: Friends O' Disobey
  5: Stalkers O' Morbus
  6: Morbus Shoots, Jesus Saves
  7: El Casho Disappearo
  8: TechnOccult
  9: Potpourri
  10: Collected Nonsensicals

 Category ID's can be used for --catid or -c.
 ----------------------------------------------------------------------

If you have no categories, the script tells you so. When you're actually
posting to the blog, the category is optional: to post into "Disobey
Stuff", I'd use either -c 1 or --catid 1. To post with no category, I'd
leave the option off.
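
For anyone curious what --showcategories and --catid might translate to
on the wire, here is a minimal Python sketch. It is not the Perl script
itself, which may well work differently. It assumes Movable Type's
XML-RPC methods mt.getCategoryList, metaWeblog.newPost,
mt.setPostCategories and mt.publishPost; list_categories and post_item
are hypothetical helper names.

```python
import xmlrpc.client  # used to build the real server proxy (see note below)


def list_categories(server, blogid, username, password):
    """Return (id, name) pairs for the blog's categories."""
    cats = server.mt.getCategoryList(blogid, username, password)
    return [(c["categoryId"], c["categoryName"]) for c in cats]


def post_item(server, blogid, username, password, title, body, catid=None):
    """Create an entry, optionally file it in a category, then publish it."""
    entry = {"title": title, "description": body}
    # Save as a draft first (publish=False) so the category is in place
    # before the entry is published.
    postid = server.metaWeblog.newPost(blogid, username, password, entry, False)
    if catid is not None:
        categories = [{"categoryId": str(catid), "isPrimary": True}]
        server.mt.setPostCategories(postid, username, password, categories)
    server.mt.publishPost(postid, username, password)
    return postid
```

To try it against a real blog, you would pass in something like
xmlrpc.client.ServerProxy("http://disobey.com/cgi-bin/mt/mt-xmlrpc.cgi")
as the server argument, along with your own blog ID and credentials.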

Here's one way I use this script. I want to take all the different RSS
feeds that have my data in them (from Gamegrene, from ORA, etc.) and have
them all synchronized into my primary blog, DNN. So, I run this through
cron every night:

 ./myrssmerger.pl --server http://disobey.com/cgi-bin/mt/mt-xmlrpc.cgi \
                  --username morbus --password HAAHAHAH -blogid 1 \
                  --catid 1 http://gamegrene.com/index.xml

In this case, I'm saying "OK, every night, check that URL for entries
posted today, and if you see any, post them to 'Disobey Stuff', referenced
as category ID 1." I do something similar for my O'Reilly blog, only using
--catid 8. Output from the command above looks like this:

 ----------------------------------------------------------------------
 Downloading RSS feed at http://gamegrene.com/index.xml...
  Publishing item: 'RPG, For Me'.
  Skipping (failed date check): 'Just Say No To Powergamers'.
  Skipping (failed date check): 'Every Story Needs A Soundtrack'.
  Skipping (failed date check): 'The Demise of Local Game Shops'.
  Skipping (failed date check): 'Death Of A Gaming System'.
  Skipping (failed date check): 'What Do You Do With Six Million Elves?'.
 ----------------------------------------------------------------------
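
The date check above boils down to comparing each item's <dc:date>
against today's date. Here is a rough Python sketch of that idea, not
the script's actual Perl. It assumes W3CDTF dates (YYYY-MM-DD,
optionally followed by a time) and ignores timezones; posted_on is a
made-up name, and the sample values are invented for illustration.

```python
from datetime import date


def posted_on(dc_date, day):
    """Return True if a W3CDTF <dc:date> string falls on the given day."""
    try:
        # Only the leading YYYY-MM-DD matters here; time and timezone
        # offset are deliberately ignored in this sketch.
        return date.fromisoformat(dc_date.strip()[:10]) == day
    except ValueError:
        # Missing or unparseable dates never pass the check.
        return False


# Invented sample values; a nightly run would pass date.today() instead.
run_day = date(2003, 5, 27)
for stamp in ("2003-05-27T09:15:00-05:00", "2003-05-20T18:00:00-05:00", ""):
    verdict = "publish" if posted_on(stamp, run_day) else "skip"
    print(f"{stamp or '(no date)'}: {verdict}")
```

Ignoring the timezone keeps the sketch short, but it means an item
stamped late at night in another timezone could land on the "wrong" day
relative to your server.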

You can pass multiple URLs too. Say I've got thirty friends, and I want to
check all their RSS feeds for new entries posted today. Any new entries
should be posted to my blog under category ID 4 ('Friends O' Disobey'):

 ./myrssmerger.pl --server http://disobey.com/cgi-bin/mt/mt-xmlrpc.cgi \
                  --username morbus --password HAAHAHAH -blogid 1 \
                  --catid 4 http://gamegrene.com/index.xml \
                  http://researchbuzz.com/researchbuzz.rss \
                  http://camworld.com/index.rdf

The shortened output looks like:

 ----------------------------------------------------------------------
 Downloading RSS feed at http://gamegrene.com/index.xml...
  Skipping (failed date check): 'RPG, For Me'.
  Skipping (failed date check): 'Just Say No To Powergamers'.
  Skipping (failed date check): 'Every Story Needs A Soundtrack'.
 ----------------------------------------------------------------------
 Downloading RSS feed at http://camworld.com/index.rdf...
  Publishing item: 'Trinity's Hack from Matrix Reloaded'.
  Skipping (failed date check): 'Siberian Desktop'.
  Skipping (failed date check): 'The Sweet Hereafter'.
 ----------------------------------------------------------------------
 Downloading RSS feed at http://researchbuzz.com/researchbuzz.rss...
  Skipping (no description/date): 'Northern Light Coming Back?'.
  Skipping (no description/date): 'This Week in LLRX'.
 ----------------------------------------------------------------------
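
The three outcomes in that output (publish, failed date check, no
description/date) can be reproduced with a small per-item
classification. The Python sketch below is only an approximation of
what the Perl script might do: it assumes items carry <title>,
<description> and <dc:date>, works on either RSS 1.0 or 2.0 by ignoring
element namespaces, and uses an invented sample feed. classify and
local_name are hypothetical names.

```python
import xml.etree.ElementTree as ET

DC_DATE = "{http://purl.org/dc/elements/1.1/}date"

# Invented three-item feed, one item per outcome.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
  <item><title>Fresh</title><description>New today</description>
        <dc:date>2003-05-27T10:00:00-05:00</dc:date></item>
  <item><title>Stale</title><description>Older</description>
        <dc:date>2003-05-20T10:00:00-05:00</dc:date></item>
  <item><title>Bare</title></item>
</channel></rss>"""


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def classify(feed_xml, today):
    """Return (title, status) pairs; today is a 'YYYY-MM-DD' string."""
    results = []
    for el in ET.fromstring(feed_xml).iter():
        if local_name(el.tag) != "item":
            continue
        title = desc = dc_date = ""
        for child in el:
            text = (child.text or "").strip()
            if local_name(child.tag) == "title":
                title = text
            elif local_name(child.tag) == "description":
                desc = text
            elif child.tag == DC_DATE:
                dc_date = text
        if not desc or not dc_date:
            status = "Skipping (no description/date)"
        elif not dc_date.startswith(today):
            status = "Skipping (failed date check)"
        else:
            status = "Publishing item"
        results.append((title, status))
    return results


for title, status in classify(SAMPLE, "2003-05-27"):
    print(f"  {status}: '{title}'.")
```

Looping this over each URL on the command line (after downloading it)
would give the per-feed blocks shown above.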

Finally, you can add a --filter option to the command line. The
following will only post entries that match "perl":

 ./myrssmerger.pl --server http://disobey.com/cgi-bin/mt/mt-xmlrpc.cgi \
                  --username morbus --password HAAHAHAH -blogid 1 \
                  --catid 4 --filter "perl" http://camworld.com/index.rdf

As such, it skips over Cam's latest entry:

  Skipping (failed filter): 'Trinity's Hack from Matrix Reloaded'.
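
A filter like this can be as simple as a pattern match against each
item's text. The Python sketch below is a guess at the idea, not the
script's code: whether the real --filter is a regular expression, is
case-sensitive, or looks at the title, the description, or both is an
assumption here. passes_filter is a made-up name.

```python
import re


def passes_filter(pattern, title, description):
    """Return True if the pattern occurs in the title or the description."""
    # Assumption: treat the filter as a case-insensitive regular expression
    # searched across both fields.
    haystack = f"{title}\n{description}"
    return re.search(pattern, haystack, re.IGNORECASE) is not None
```

With this version, passes_filter("perl", "Trinity's Hack from Matrix
Reloaded", "") is false, so that entry would be skipped, while any item
mentioning "Perl" in either field would pass.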

So you could, for instance, pass 30 URLs on the command line, all
filtered on the word "perl" and all filed under a category called
"Perl". As long as the RSS feeds have <dc:date> entries, new items
matching those requirements will be posted.

-- 
Morbus Iff ( softcore vulcan pr0n rulezzzzz )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Tech: http://www.oreillynet.com/pub/au/779 - articles and weblog
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus