
Re: Finding Feeds

To: syndication@yahoogroups.com
Subject: Re: Finding Feeds
From: "Bill Kearney" <wkearney99@hotmail.com>
Date: Wed, 03 Oct 2001 03:33:47 -0000
In-reply-to: <20011002195229.A1430@mnot.net>
User-agent: eGroups-EW/0.82

> (and enough metadata, like title, etc. can
> usually be extracted from the target if necessary).

Yes, let's eliminate the need to scrape at all.  If you want the 
data, just ask the source for it.  Yes, this would require an 
interactive source to query.  For static blogs this would not work 
without some help.

Theory has it that a site storing the data in XML and simply applying 
browser-compatible XSLT against it might allow this to happen rather 
easily.  However, the CPU load from all that dynamic processing 
really doesn't seem worth it.  Leaving it static and waiting for the 
request seems like a lot less load on the system.

> However, I can see how this would be useful for a lot of other uses
> of RSS, such as those where you're actually shoving the content
> around (e.g., WebLogs, etc.). Of course, if the metadata were in the
> target, the link would be a good means of identification, but that
> assumes that the target and the metadata are authored by the same
> person; often not the case with WebLogs, etc.

I strongly disagree about the reliability of links.  They're not 
reliable, as many sites have shown, and not only because content goes 
offline.  A site dying is a problem, of course, but developmental 
changes to a site often break hierarchical web links merely on the 
whims of developers.  Let's not require them to keep anything other 
than an item ID and a single URI with a known set of parameters.  That 
way they can move things around to their heart's content.
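
As a rough sketch of the "item ID plus single URI" idea, assuming a 
hypothetical resolver at http://example.com/item that takes made-up 
"id" and "format" parameters (none of these names come from any spec), 
the lookup URI could be built and parsed like this:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver endpoint. It is the only URI the site has to keep
# stable; everything behind it can be reorganised freely.
RESOLVER = "http://example.com/item"


def item_uri(item_id, fmt="rss"):
    """Build the lookup URI for one item from its ID and a known parameter set."""
    # "id" and "format" are assumed parameter names, chosen for illustration.
    return RESOLVER + "?" + urlencode({"id": item_id, "format": fmt})


def item_id_from_uri(uri):
    """Recover the item ID from a lookup URI built by item_uri()."""
    return parse_qs(urlsplit(uri).query)["id"][0]


uri = item_uri("2001-10-02/42")
print(uri)
print(item_id_from_uri(uri))
```

The point is only that the client stores the ID and never a path into 
the site's hierarchy.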

This might also be a way to persist data even after it's gone from 
the web.  If the content provider dies but someone else maintains a 
database of its items, requests could be redirected to that 
repository.  The client interface could be told to perform that 
redirect.
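
One way the client side of that fallback might look, as a sketch only: 
the "origin" and "archive" stores below are plain dictionaries standing 
in for real lookups, and the resolve() helper is hypothetical.

```python
def resolve(item_id, sources):
    """Return (source_name, item) from the first source that still has the item.

    `sources` is an ordered list of (name, lookup) pairs; each lookup takes an
    item ID and returns the item, or None (or raises LookupError) if missing.
    """
    for name, lookup in sources:
        try:
            item = lookup(item_id)
        except LookupError:
            continue  # this source cannot answer; try the next one
        if item is not None:
            return name, item
    return None  # nobody holds the item any more


# Stand-ins for real services: the provider has lost everything,
# but a third-party repository kept a copy keyed by the same item ID.
origin = {}
archive = {"42": {"title": "Finding Feeds"}}

sources = [("origin", origin.get), ("archive", archive.get)]
print(resolve("42", sources))
```

Because the item is looked up by ID rather than by link, the client 
doesn't care which of the two ends up answering.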

> (/me still wonders if this kind of confusion could be avoided if we
> used different terms for the different uses (linking and content) of
> RSS...)

Terminology confusion in RSS?  Say it ain't so!  <grin>

Try reading some of the other formats.  I understand what they're up 
to but it's a lot to wrap your head around.  These so-called formats 
all have sixteen different names for the same thing.

Anyway, it would be a good start to be able to find a single item 
based on some uniquely identifying bit of info and extract its 
greater structure.
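
For what it's worth, here is a minimal sketch of that lookup against a 
feed document.  It assumes each item carries an identifier in a child 
element, here a made-up <id> element that is not part of any RSS 
version, and the sample feed and find_item() name are invented for 
illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample feed; the <id> element is an assumption, not standard RSS.
FEED = """<rss version="0.91"><channel>
<title>Example</title>
<item><id>a1</id><title>First</title><link>http://example.com/old/first.html</link></item>
<item><id>b2</id><title>Second</title><link>http://example.com/second.html</link></item>
</channel></rss>"""


def find_item(feed_xml, item_id, id_tag="id"):
    """Return the matching item's child elements as a dict, or None if absent."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        if item.findtext(id_tag) == item_id:
            # Flatten the item's direct children into tag -> text.
            return {child.tag: (child.text or "") for child in item}
    return None


print(find_item(FEED, "b2"))
```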

-Bill

References:
- Re: [syndication] Re: Finding Feeds
  - From: Mark Nottingham <mnot@mnot.net>
