Re: [syndication] discovery vs information
> (http://news.zdnet.co.uk/communications/wireless/). I'm sure I remember
> reading that there was some RSS on zdnet somewhere but I don't know
> where, so what should I do next? There's no obvious XML link on the
> visible page. So do I "view source" and look for <link> and/or start
> looking for likely files containing lists in:-
Nah, most aggregators now support detecting this sort of thing automatically,
either via a JavaScript bookmarklet or an actual browser add-in.
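For anyone wondering what those tools actually sniff for, it's the feed
autodiscovery convention: a link element in the page's head section, along
these lines (the title and href here are invented for illustration):

    <head>
      <link rel="alternate" type="application/rss+xml"
            title="Site headlines" href="http://example.org/index.xml" />
    </head>

The bookmarklet just grabs the href and hands it to the aggregator.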
> - There's enough different reasons for having lists and enough different
> things to list that it feels to me like we need to solve this generally
> and not just for rss. Even for rss, I feel the need to spec which
> flavour each entry refers to.
Uh oh, is that the sound of a can of worms being opened?
I don't disagree with you on the idea of a more robust pointer to more robust
data. I'm just not sure we're in a position to tackle that Gordian knot. After
all, that's what the semantic web guys have been plodding along toward. What
/will/ help is encouraging the idea that data inside the head section can
actually have machine-processable value. Get publishers to start paying more
attention to this and things like the semantic web will become a lot more
obvious.
I'd almost favor promoting the idea of a discovery mechanism based on something
like XPath. But I'm not at all confident people have a good enough grasp on it.
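To make that concrete, the entire discovery rule could be as small as a single
XPath expression the aggregator evaluates against the page. A sketch of the
idea, not a proposal:

    //link[@rel='alternate' and @type='application/rss+xml']/@href

Flexible, sure, but it asks a lot more of implementors than a fixed rel/type
pair does.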
> There's going to be a temptation to start adding all sorts of meta-meta-data
> about each entry. Please resist this. It should be a simple list of file
> type, name, URI. Any additional meta-data about each file should be
> contained in the files themselves and available by collecting them.
+1 to that!
> 3) and 4) are actually about metadata describing the directory
> structure. I think there is a case here for the W3C to come up with a
> standard way for this to be found and created. If they haven't already.
> It feels like there's a case for a standard file with a standard name in
> each directory. This would actually help robots because it could contain
> sitemap lists of pages.
No, I disagree. If a standard link tag exists it frees implementors to use
whatever their system best supports. Look at ETags: why would you use them when
timestamps exist? Well, if you're on a box that doesn't have a reliable clock
you'd be stuck. It's that same sort of idea. Best to encourage the idea that
machine processing is /the/ way to learn this, and start people off by parsing
the head section.
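For anyone who hasn't used them, the ETag exchange looks like this (the tag
value is made up); note that nothing in it depends on a clock:

    First fetch:
      GET /index.xml HTTP/1.1

      HTTP/1.1 200 OK
      ETag: "v1-abc"

    Later revalidation:
      GET /index.xml HTTP/1.1
      If-None-Match: "v1-abc"

      HTTP/1.1 304 Not Modified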
> However, I think we actually already have a standard here. And that's
> that web requests to directories with no file name should return
> something via http and with a mime type. Either a web page, an index, a
> 404, a graphic or whatever. Now if the returned doc is of type text/html
> then we're back to <link>
>
> Aside: I wonder if creating new http headers is out of the question? ;-)
Aiiieeeeee!!! Run away! I've never seen effective examples where the current
HTTP spec comes up lacking so badly that new headers are required. What I /have/
seen is a failure to understand existing HTTP features, and the horrible state
of implementations in the field. Things like content negotiation could
certainly be pressed into service here. But, unfortunately, a great many sites
have NO access to the configuration pieces necessary to make that happen. See
the perma-threads on redirecting feeds for the backstory here.
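For the record, here's roughly what pressing content negotiation into service
would look like (the path and q-value are invented):

    GET /news/ HTTP/1.1
    Host: example.org
    Accept: application/rss+xml, text/html;q=0.5

    HTTP/1.1 200 OK
    Content-Type: application/rss+xml

The server hands back whichever representation the client prefers. Trouble is,
wiring that up usually means touching server config most authors can't reach.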
> So. I think we need the following:-
> 1) Some standards for specifying target file content type in <link>
> 2) As well as RSSx.xx, Atom, NITF, FOAF etc, create some content types
> for cases 1,2,3,4 above.
> 3) A defined way of creating new content types.
> 4) A standard file format for lists of <link>s. This needs to include a
> section which is metadata about this list. We can probably do this in a
> way that the entries can be inserted into all sorts of other types of
> files.
Yep, as I suggested, using <link rel="something" type="something" href="URI"/>
should neatly handle this. The only issues are what text to stuff in the 'rel'
attribute and what range of types to promote. I'd go with using a spec's
namespace URI here, or fall back to a MIME type should the spec be unable to
provide one.
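Concretely, something like this (the hrefs are invented): the first entry
identifies RSS 1.0 by its namespace URI, the second falls back to a MIME type
for a flavour that doesn't have one:

    <link rel="alternate" type="http://purl.org/rss/1.0/" href="/index.rdf" />
    <link rel="alternate" type="application/rss+xml" href="/index.xml" />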
As for new content types, the issue is only whether those content types
effectively promote themselves to their desired audience. If someone wants to
come up with 'application/widget+xml' as an alternative format then it's going
to be up to them to promote its use. The only issue for us here is making sure
the spec allows for extensibility. This was one of the most significant
limitations of RSS 0.9x: its failure to even /allow/ for namespaces directly
impeded extensibility. Let's not make that mistake again.
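By contrast, a spec that allows namespaces lets extensions ride along without
touching the core format. A trimmed sketch, with a hypothetical widget module:

    <rss version="2.0" xmlns:widget="http://example.org/widget/1.0/">
      <channel>
        <item>
          <title>An item</title>
          <widget:size>large</widget:size>
        </item>
      </channel>
    </rss>

Parsers that don't know the widget namespace just skip those elements.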
> Once we've got this far, then we can move to stage 5) viz. evangelising;
> writing toolkits; writing apps; writing validators; arguing about common
> locations and file names; arguing about whether it should be xml or RDF
> or both; arguing about what it all means; and all that other stuff we're
> so good at.
Yep.
> This seems to me to be a bare minimum. Once we've done that if the fixed
> filename camp want to create these files with a fixed filename on their
> webservers, then they can go ahead. Just as long as they do <link> too.
> The fixed filename standard can then succeed or fail on its own merits
> without killing the file format standard in the process.
While that logic is sound, the unintended consequences make it unacceptable.
-Bill Kearney