
A little different approach to discovery



Hi folks.
I don't pretend to know very much about this problem, but here's a little
something I've written that talks about solving syndication from a different
direction:

http://weblogs.asp.net/rosherove/posts/31987.aspx



Regards,

Roy Osherove

www.iserializable.com

> -----Original Message-----
> From: Julian Bond [mailto:julian_bond@voidstar.com]
> Sent: Wednesday, October 15, 2003 12:56 PM
> To: syndication@yahoogroups.com
> 
> More thoughts on mypublicfeeds.
> 
> - I wonder if it would help to separate the discovery part of this from
> the format part and try to solve them independently.
> 
> - I was thinking about use cases. A classic one for me is this. I was
> looking at Zdnet UK, wireless section.
> (http://news.zdnet.co.uk/communications/wireless/). I'm sure I remember
> reading that there was some RSS on zdnet somewhere but I don't know
> where, so what should I do next? There's no obvious XML link on the
> visible page. So do I "view source" and look for <link> and/or start
> looking for likely files containing lists in:-
> http://news.zdnet.co.uk/communications/wireless/
> http://news.zdnet.co.uk/communications/
> http://news.zdnet.co.uk/
> http://www.zdnet.co.uk/
> 
> In the absence of a *really widely* implemented standard like robots.txt
> looking for files will just give me loads of 404s. Incidentally, there's
> no robots.txt or favicon.ico in any of those zdnet directories either.
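The "view source and look for <link>" fallback described above can be sketched roughly like this. Note the sample markup and feed URL are invented for illustration; as the paragraph says, the real zdnet pages carried no such element:

```python
from html.parser import HTMLParser

class FeedLinkFinder(HTMLParser):
    """Collect <link rel="alternate"> feed pointers from an HTML head."""
    FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        if a.get("rel") == "alternate" and a.get("type") in self.FEED_TYPES:
            self.feeds.append((a.get("type"), a.get("href")))

# Hypothetical page source -- the zdnet pages discussed above did not
# actually carry this markup.
sample = """<html><head>
<link rel="alternate" type="application/rss+xml" href="/wireless/index.rss">
</head><body>...</body></html>"""

finder = FeedLinkFinder()
finder.feed(sample)
print(finder.feeds)
```

The point of the use case stands: this only works when the publisher put the pointer there in the first place.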
> 
> - There are enough different reasons for having lists, and enough
> different things to list, that it feels to me like we need to solve this
> generally and not just for RSS. Even for RSS, I feel the need to spec
> which flavour each entry refers to.
> 
> - I think I've got three or four things I'd like to put in the header of
> every html file. They're all optional and there might be more than one
> of each.
> 1) Here's a pointer to a single alternate version of the same content
> 2) Here's a pointer to a machine-readable file containing lists of
> alternate versions of this content or related content. eg RSS0.92,
> RSS1.0, RSS2.0, Atom, WML, Author's FOAF, Assorted metadata
> 3) Here's a pointer to a machine-readable file containing lists of files
> related to this section of the site
> 4) Here's a pointer to a machine-readable file containing lists of files
> related to this whole site.
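The four pointer kinds above might render as four <link> elements in the head. This sketch generates them; every rel value and content type here is an invented placeholder, since none of this is standardized:

```python
# Hypothetical rel/type values -- nothing below is a real standard; the
# tuples just mirror the four pointer kinds listed above.
pointers = [
    # 1) a single alternate version of this content
    ("alternate", "application/rss+xml", "/wireless/index.rss"),
    # 2) a list of alternate/related versions of this content
    ("alternates-list", "application/x-feedlist", "/wireless/feeds.list"),
    # 3) a list of files related to this section of the site
    ("section-list", "application/x-feedlist", "/wireless/section.list"),
    # 4) a list of files related to the whole site
    ("site-list", "application/x-feedlist", "/site.list"),
]

head_lines = ['<link rel="%s" type="%s" href="%s">' % p for p in pointers]
print("\n".join(head_lines))
```

All four are optional and repeatable, which is why a list comprehension over a table of tuples is the natural shape.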
> 
> 1) needs more work on identifying the type of the target file. Not type
> as in text/plain vs text/xml, but type as in RSS0.92 vs FOAF vs Atom.
> 
> 2), 3) and 4) need work on the markup approach and standard. I don't
> think any of RDFS:seeAlso, OPML or OCS are actually good enough or
> complete enough yet. If this is going to be general it needs to solve a
> whole load of cases now and it absolutely must handle new file types.
> And it had better be really simple to parse and produce too. There's
> going to be a temptation to start adding all sorts of meta-meta-data
> about each entry. Please resist this. It should be a simple list of file
> type, name, URI. Any additional meta-data about each file should be
> contained in the files themselves and available by collecting them.
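A "simple list of file type, name, URI" really can be this small to parse. The type tokens and URLs below are made up; the point is that one line per entry, three fields, needs no schema machinery:

```python
# Deliberately minimal list format: one entry per line, three
# tab-separated fields (type, name, URI). Type tokens are hypothetical.
raw = ("rss2.0\tWireless news\thttp://news.zdnet.co.uk/wireless/index.rss\n"
       "foaf\tAuthor\thttp://example.org/author.foaf\n"
       "atom\tWireless news\thttp://news.zdnet.co.uk/wireless/atom.xml")

entries = [tuple(line.split("\t")) for line in raw.splitlines()]
for ftype, name, uri in entries:
    print(ftype, uri)
```

Any richer per-file metadata stays in the target files themselves, exactly as argued above.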
> 
> 3) and 4) are actually about metadata describing the directory
> structure. I think there is a case here for the W3C to come up with a
> standard way for this to be found and created. If they haven't already.
> It feels like there's a case for a standard file with a standard name in
> each directory. This would actually help robots because it could contain
> sitemap lists of pages.
> 
> However, I think we actually already have a standard here. And that's
> that web requests to directories with no file name should return
> something via http and with a mime type. Either a web page, an index, a
> 404, a graphic, or whatever. Now if the returned doc is of type text/html
> then we're back to <link>.
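That fallback rule ("a directory request returned something; only if it's text/html do we go back to scanning for <link>") is one conditional. A rough sketch, using a crude regex scan purely for illustration:

```python
import re

LINK_RE = re.compile(r'<link\b[^>]*>', re.IGNORECASE)

def links_from_directory_response(content_type, body):
    """Sketch: a request to a bare directory URL returned *something* over
    HTTP with a mime type. Only if it is text/html is there any point
    scanning the body for <link> elements."""
    if content_type.split(";")[0].strip() != "text/html":
        return []
    return LINK_RE.findall(body)

print(links_from_directory_response(
    "text/html; charset=utf-8",
    '<head><link rel="alternate" href="/index.rss"></head>'))
```

Everything else the server might return (an index listing, a 404, a graphic) simply yields an empty result.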
> 
> Aside: I wonder if creating new http headers is out of the question? ;-)
> 
> So. I think we need the following:-
> 1) Some standards for specifying target file content type in <link>
> 2) As well as RSSx.xx, Atom, NITF, FOAF etc, create some content types
> for cases 1,2,3,4 above.
> 3) A defined way of creating new content types.
> 4) A standard file format for lists of <link>s. This needs to include a
> section which is metadata about this list. We can probably do this in a
> way that the entries can be inserted into all sorts of other types of
> files.
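One way point 4) could look: a metadata section about the list itself, followed by the entries, simple enough to embed in other files. Every section name and field below is invented for illustration:

```python
# Hypothetical list-of-links file: nothing here is a real standard.
doc = """\
[meta]
title: ZDNet UK wireless feeds
updated: 2003-10-15
[entries]
rss2.0 | Wireless news | http://news.zdnet.co.uk/wireless/index.rss
atom | Wireless news | http://news.zdnet.co.uk/wireless/atom.xml
"""

meta, _, entries = doc.partition("[entries]\n")
entry_rows = [[field.strip() for field in line.split("|")]
              for line in entries.strip().splitlines()]
print(entry_rows)
```

The parse is two lines of code, which is roughly the bar "really simple to parse and produce" implies.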
> 
> Once we've got this far, then we can move to stage 5) viz. evangelising;
> writing toolkits; writing apps; writing validators; arguing about common
> locations and file names; arguing about whether it should be xml or RDF
> or both; arguing about what it all means; and all that other stuff we're
> so good at.
> 
> This seems to me to be a bare minimum. Once we've done that, if the
> fixed-filename camp want to create these files with a fixed filename on
> their webservers, then they can go ahead. Just as long as they do <link> too.
> The fixed-filename standard can then succeed or fail on its own merits
> without killing the file format standard in the process.
> 
> --
> Julian Bond Email&MSM: julian.bond@voidstar.com
> Webmaster:              http://www.ecademy.com/
> Personal WebLog:       http://www.voidstar.com/
> M: +44 (0)77 5907 2173   T: +44 (0)192 0412 433
> 
> 
> 