
Re: [syndication] Re: Finding Feeds



Gotcha. That sounds interesting. It wasn't obvious to me, I think,
because my main interest in RSS is for 'traditional' URL
linking/headline fetching, etc., where the metadata is really only
useful for finding the link; after you've found an interesting link,
the URL is an adequate identifier for e-mailing/shoving around to
other devices, etc. (and enough metadata, like title, etc. can
usually be extracted from the target if necessary).

However, I can see how this would be useful for a lot of other uses
of RSS, such as those where you're actually shoving the content
around (e.g., WebLogs, etc.). Of course, if the metadata were in the
target, the link would be a good means of identification, but that
assumes that the target and the metadata are authored by the same
person; often not the case with WebLogs, etc.

In a way, what you want to do is turn the items in RSS into Resources
in their own right, so that you can talk about them. 
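
A minimal sketch of what treating an item as its own resource could
look like, using the (feed URI, item URI) tuple and the id idea from
the quoted message below. It assumes a plain, un-namespaced RSS 0.9x
document; the id attribute on <item> is hypothetical (it is not part
of RSS), and the sample feed and function names are made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the id attributes are an assumption, not RSS.
SAMPLE_FEED = """<rss version="0.91">
  <channel>
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Sample channel</description>
    <item id="5">
      <title>Item five</title>
      <link>http://example.com/item5.html</link>
    </item>
    <item id="6">
      <title>Item six</title>
      <link>http://example.com/item6.html</link>
    </item>
  </channel>
</rss>"""


def find_item_by_link(feed_xml, item_uri):
    """Return the first <item> whose <link> equals item_uri, or None."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        if (item.findtext("link") or "").strip() == item_uri:
            return item
    return None


def find_item_by_id(feed_xml, item_id):
    """Return the <item> carrying the (hypothetical) id attribute, or None."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        if item.get("id") == item_id:
            return item
    return None


def item_xml(item):
    """Serialise a single item back to XML text."""
    return ET.tostring(item, encoding="unicode")
```

Looking up by link returns only the first match, which is the
'overwrite' problem in another guise: two items pointing at the same
URI can't be told apart this way. Looking up by id only works for as
long as the server keeps the ids stable.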

(/me still wonders if this kind of confusion could be avoided if we
used different terms for the different uses (linking and content) of
RSS...)

Cheers,



On Wed, Oct 03, 2001 at 02:19:00AM -0000, Bill Kearney wrote:
> > > My point here was being able to find THIS ONE ITEM in a feed.  
> > > Presuming you're viewing a single item then the meta-data would 
> > > indicate where to find this ONE item in a feed.  If you were 
> > > looking at a number of items, a la Slashdot's opening screen, then
> > > you'd most likely want the entire feed (or just the currently
> > > viewed scope).
> > 
> > I am missing your point ;)  Couldn't a particular item be identified
> > by the tuple of the feed URI and the item URI, like
> > 
> >  ( "http://example.com/feed.rss", "http://example.com/item5.html" )
> > 
> > ? This might be serialised in the HTML for the page something like
> > 
> >   <link rel="rss-feed" href="http://www.example.com/feed.rss">
> >   <link rel="rss-item" href="http://example.com/item5.html">
> > 
> > so you could then find the rss-item in the rss-feed.
> > 
> > Or is it a matter of just putting IDREFs in the RSS, like
> > 
> > <item id="5">
> >   ...
> > </item>
> > 
> > and then linking to http://example.com/feed.rss#5
> > (yeah, yeah, this should be XPointer, but the idea is there)
> > 
> > If you identify by URI, it's guaranteed unique, except that if you
> > have multiple items referring to the same URI in the feed, they'll
> > 'overwrite' each other. If you use IDREFs, it's unique within the
> > *current* view of the feed, but not outside of it (unless the server
> > guarantees them to be unique over time, like an ETag in HTTP).
> > 
> > > The missing link in my idea is that feeds don't generally support 
> > > the idea of one item being located this way.  Take it one step 
> > > further and give me a way to grab the XML data for the item instead
> > > of the HTML presentation.
> > 
> > Can you give a real-world example of how you'd want to use this? I
> > might be confusing a few different threads here...
> 
> Let's take it from the top, shall we?  You're looking at a page that 
> contains news items.  You'd like to know if it has a feed available.  
> Something in the page is available for your enlightened browser to 
> determine the feed source.   That would satisfy your browser idea and 
> I think it's a good one.
> 
> To go another step, let's say you wanted to redirect one of the items 
> on the page to some other destination.  Be that destination a mail 
> message, a blog, an instant message, aggregator or some other 
> destination.  How do you extract that message without resorting to 
> some very imprecise scraping?  This is where I'd like to be able to 
> get 'back' to the source of the data using a programmatic interface.  
> In order to keep the web pages 'lightweight' it seems like it would 
> be better for this not to be a mass of meta-data tacked into the 
> HTML.  I'd like the page to have an XML source URI and an item 
> identifier to be applied against it.
> 
> One reason for wanting this is to get at more data than might be 
> shown in the HTML view.  Say I'm on a WAP phone, for example, and I 
> want to redirect the full feed material to an aggregator.  Scraping 
> just the HTML would leave out a LOT of material.  Sending it as 
> meta-data would overwhelm the memory in the phone (AvantGo limits, 
> anyone?).  But if the phone supported detecting this source meta-data 
> and had a way to push that, well, then we'd be getting somewhere!
> 
> Don't let the data die.  
> 
> Stop pushing it into presentation formats only to be poorly scraped.  
> Provide a way to get back to the original data in an XML format.
> 
> You raise a good point about the identifier not being feed-oriented.  
> Making it unique within the feed seems like a reasonable start.  
> Forcing global uniqueness doesn't seem necessary and would probably 
> irritate too many people.  Using a GUID isn't hard but that's another 
> debate.
> 
> This wouldn't be very hard to code.  Most of the stuff is coming out 
> of databases now.  They've usually got some form of ID on the 
> record.  Put that into a meta-data structure inside the HTML.  Put 
> the source URI in the page itself.  Put the item ID in a meta-tag 
> with the item.  Browsers support scripting that can extract this data.
> 
> Then put a SOAP interface or even a simple CGI in place that will 
> return the data in XML.
> 
> -Bill Kearney
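
A rough sketch of the client side of the above, assuming the page
carries the two <link> elements suggested earlier in the thread. The
rel values "rss-feed" and "rss-item" are only that proposal, not an
established convention, and the class and function names here are
made up for illustration.

```python
from html.parser import HTMLParser


class FeedLinkParser(HTMLParser):
    """Collect the href of <link> elements with the proposed rel values."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel, href = attr.get("rel"), attr.get("href")
        # "rss-feed" / "rss-item" are the hypothetical rel values from the thread.
        if rel in ("rss-feed", "rss-item") and href:
            # Keep the first occurrence of each rel value.
            self.links.setdefault(rel, href)


def feed_reference(html_text):
    """Return (feed URI, item URI) found in the page; either may be None."""
    parser = FeedLinkParser()
    parser.feed(html_text)
    return parser.links.get("rss-feed"), parser.links.get("rss-item")


# Hypothetical page using the example URIs from the thread.
PAGE = """<html><head>
<link rel="rss-feed" href="http://www.example.com/feed.rss">
<link rel="rss-item" href="http://example.com/item5.html">
</head><body>...</body></html>"""
```

A device would only need to hold on to the returned pair and hand it
to something with more memory, which can then fetch the feed and pick
out the item; no scraping of the presentation is involved.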

-- 
Mark Nottingham
http://www.mnot.net/