RE: [RSS-DEV] Cross-Site Unique ID Formats? And ID Debating...
Wow, I get on a plane for a few hours and lots happens :-).
Ideally, I would like the IDs to be unique across time and space, like
a Microsoft COM GUID. These are 128 bits long and are, as the docs say,
"unique to a high degree of certainty." It would be cool if this
was included somewhere within the RSS <channel> tag, and if there was
a web service that the site author would use to assign it.
However, I do not think that there is a totally automated way to make
sure that the same (leaving out the definition of "same" for the moment)
GUID is assigned to different URLs that are aliases to the same info.
Human judgement is needed to look at the info contained and pointed
to by two or more candidate URLs and to decide if they are the same or
not. So this service would have to have the ability to say "I'll get
back to you on that."
For a moment I thought about a SOAP service that accepted a candidate
URL, checked it against a master list, and then either returned an
existing GUID or created and returned a new one. But this takes the
human judgement out of the loop, and then we end up needing an alias
list again to say that "these two GUIDs are the same."
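To make the lookup-or-create idea and the alias problem concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the GuidRegistry class, its method names, and the in-memory dictionaries standing in for the master list. It is not a proposed service interface, and it uses random 128-bit UUIDs in place of COM GUIDs.

```python
import uuid


class GuidRegistry:
    """Hypothetical in-memory stand-in for the master list of channel URLs."""

    def __init__(self):
        self._by_url = {}  # candidate URL -> GUID assigned when first seen
        self._alias = {}   # GUID -> GUID it was later judged identical to

    def lookup_or_create(self, url):
        """Return the GUID for a URL, minting a new 128-bit one if unseen."""
        if url not in self._by_url:
            self._by_url[url] = str(uuid.uuid4())
        return self.canonical(self._by_url[url])

    def declare_same(self, guid_a, guid_b):
        """Record a human decision that two GUIDs name the same channel."""
        a = self.canonical(guid_a)
        b = self.canonical(guid_b)
        if a != b:
            self._alias[b] = a

    def canonical(self, guid):
        """Follow alias links until reaching the surviving GUID."""
        while guid in self._alias:
            guid = self._alias[guid]
        return guid
```

The automated part only ever sees URLs, so two URLs for the same info get two GUIDs until someone calls declare_same(). That call is the alias list coming back in through the side door.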
For me, this all starts and ends as a quality of service issue. I want
to give my users a unique list of content. If they are regular readers
of, say, "CNET", and they've done some customization to that channel
within my program, then I don't want them to be confused if the source
of that channel changes, say, from scraped data (once available from
the late Internet Alchemy) to Moreover data, to what could at some time
become a direct feed. I've actually got lots of stuff in the works on
the customization front (none of which I want to talk about yet), and
it is possible that a user could productively spend 10 or 15 minutes
on customization. The unique ID is what allows me to do this, and to
not have them lose their work when the data source or the source's
name changes. I cannot index the contributions by "CNET", and I cannot
index them to the URL which is supplying the content. Neither is truly
fixed.
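A small Python sketch of why the index has to be the ID rather than the name or the URL. The Channel and UserSettings classes and their fields are hypothetical, made up for this illustration, and the URLs are placeholders; this is not how my program is actually structured.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    channel_id: str   # the stable unique ID (assumed to be assigned elsewhere)
    name: str         # display name, may change
    source_url: str   # where the content comes from today, may change


@dataclass
class UserSettings:
    # Customizations are keyed by channel_id only, never by name or URL.
    by_channel: dict = field(default_factory=dict)

    def customize(self, channel, **prefs):
        """Merge new preferences into whatever is stored for this channel."""
        self.by_channel.setdefault(channel.channel_id, {}).update(prefs)

    def prefs_for(self, channel):
        """Return the stored preferences, or an empty dict if there are none."""
        return self.by_channel.get(channel.channel_id, {})


cnet = Channel("channel-0001", "CNET", "http://scraper.example/cnet.rss")
settings = UserSettings()
settings.customize(cnet, max_items=5, hide_images=True)

# The source and the name both change; the user's work is still found.
cnet.source_url = "http://feeds.example/cnet.xml"
cnet.name = "CNET News"
kept = settings.prefs_for(cnet)
```

Had the dictionary been keyed on name or source_url, the last lookup would have come back empty.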
I never defined "same" before. This is where things get hairy. For the
work I am doing, I look at the content returned and make a judgement
call. Identical content in two different formats (e.g. <rss> and <moreover>)
is the same. Different human languages (English vs. French) are not.
Parameterized URLs which specify different item counts to return are
the same. I make this call from the user's point of view.
Jeff;
-----Original Message-----
From: Morbus Iff [mailto:morbus@disobey.com]
Sent: Monday, February 26, 2001 7:13 PM
To: rss-dev@yahoogroups.com
Subject: RE: [RSS-DEV] Cross-Site Unique ID Formats? And ID Debating...
>Right - all I meant is that the creator of an xml channel either in terms of
>an aggregated topic from multiple sources (in which case a third party) a
>channel of commentary (weblog style) or an original source (in which case
>the ID should relate to the original publisher via a namespace or some other
>and not the third party who may be creating the xml channel) should not
>necessarily be who the namespace relates to. This way you do not get
>duplicates and can just use a namespace without a timestamp.
I don't think I'm understanding then - "the creator of an xml channel"?
So, we'd be forcing the end user to decide an id for themselves?
How do you feel about the "xmlUrl as unique id" idea?
--
Morbus Iff
Here we have One DimensionalMorbus - Flatter than
_____ Brooke Shields, able to be ignored for days at a time,
slower than a Microsoft Slug. Defender of AOL users, Bill
Gates and other one dimensional life forms who mutter
"I don't get it..."
-03--- <\/> ---- <http://www.disobey.com/> --- Bad Ascii, Short Notice ----