RE: [syndication] Re: New poll for syndication
Yes!
Let's focus on the use of syndicated content.
We can do so on this list or on another one.
Let's talk about what we are doing now and what
we can do in the future. Perhaps this will bridge
those of us doing cool stuff now and those of us
planning to do something cool in the future, and
let us find some common ground driven by real
needs of real applications.
Let's find some issues and solve them in a clean
fashion.
I'm wondering if one of the issues here is that
some of the "namespace folks" have some kind of
future vision that the rest of us have yet to
get. Not because they have hoarded it up and kept
it a secret, but perhaps because they've looked
at the current level of "cool" and seen where it
can go next -- if only RSS has "amazing new
feature X". I'm not accusing anyone of anything
here.
So let me talk about what I am doing, right here,
right now. Headline Viewer inhales content from
over 2600 sites. We ship it with 536 sites built in,
and the user can choose to load in Userland's
service list (1687 more providers), the xmlTree
list (366 more when loaded after Userland's), and
30-50 more providers from GrokSoup.
On any given day, there is always something broken.
Servers are sometimes down or busy. By far the
more difficult problem to deal with is the fact
that much of the XML is in bad shape. Here are
some real-world examples:
* I've had to drop at least one site (www.allusb.com)
because the XML parser could not digest the
non-standard encoding attribute in their XML.
Repeated messages to the site over the course
of several months failed to elicit a response.
* Even the very simple character encoding rules of
XML cause problems. You'd all be amazed at how
many sites suddenly become dysfunctional when
"AT&T" is prominent in the news. This is compounded
by the fact that the news scrolls by quickly
enough that troublesome headlines are often gone
between the time that I get a report and the time
I can investigate.
* Many sites can't manage to include the simple
88x31 "button" <imageurl> without fouling it up.
The Dire Straits Lyrics archive includes a
320x442 picture of Mark Knopfler in there. It's a
nice picture, but it definitely does not follow
the rules. Lots of sites have no image, so we
spend time digging them up.
* There is not much of a consensus on the use of
<title> vs. <description>. Headline Viewer
uses description if present, and then defaults
to title. But a fair number of providers put
weird meta-info in the description, so I store
and respect a "use title" flag for each built-in
news provider.
* Some sites accidentally spew debugging info into
their XML. Don't laugh, it's happened more than
once.
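Two of the problems above (the bare "&" and the title
vs. description choice) are easy to sketch in code. What
follows is only a rough illustration in Python, not how
Headline Viewer actually does it. The function names, the
retry-once strategy, and the sample feed are all made up
for the example, and the repair only handles stray
ampersands; an undefined entity such as &nbsp; would
still fail to parse.

```python
import re
import xml.etree.ElementTree as ET

# An "&" that does not begin an entity or character reference
# such as &amp;, &#38; or &#x26;.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xml_text):
    """Replace stray "&" characters with "&amp;" (hypothetical helper)."""
    return BARE_AMP.sub("&amp;", xml_text)


def parse_feed_leniently(xml_text):
    """Try a strict parse first; retry once after repairing ampersands."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        return ET.fromstring(escape_bare_ampersands(xml_text))


def headline_text(item, use_title=False):
    """Prefer <description>, fall back to <title>.

    use_title stands in for the per-provider "use title" flag
    described above; the parameter name is an assumption.
    """
    title = (item.findtext("title") or "").strip()
    description = (item.findtext("description") or "").strip()
    if use_title or not description:
        return title
    return description


# Made-up feed fragment with an unescaped "&" in the title and
# meta-info (rather than a headline) in the description.
broken = (
    "<rss><channel><item>"
    "<title>AT&T in the news</title>"
    "<description>id=42 src=wire</description>"
    "</item></channel></rss>"
)

root = parse_feed_leniently(broken)
item = root.find("./channel/item")
print(headline_text(item))                  # description wins by default
print(headline_text(item, use_title=True))  # provider flagged "use title"
```

Parsing strictly first means a well-formed feed is never
touched; the regex repair only runs on input the parser
has already rejected.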
Now how could things be improved? I've got to get
to sleep, but let me rattle off a few ideas:
1. Categorization. This is a rat's nest. Some
sites want thousands of categories. I want
10-20. I want them to reflect the kinds of
things that users want (Business, Technical,
Sports, etc.)
2. More widespread use of service lists like
that found on Userland. eGroups can now
generate RSS. So we feature (in a release
that will go out the door in a day or two)
drag-drop eGroup integration. Subscribe
to an eGroup, drop any URL that mentions the
group name on Headline Viewer, and wham, you
can read the list headlines. Way cool. But
it would be cooler to get a list of mailing
lists from eGroups, let the user subscribe
to them, etc.
3. More content. We've but scratched the surface
here. I want content in all sorts of languages
for all sorts of topics. I want the NY Times
bestsellers as an RSS file, and I want
eBay categories in RSS. I want press releases,
I want regional info for places I've never
heard of.
4. More metainfo so that I can more easily track
down those who generate bad RSS (and help
them). I think that this has been proposed.
Excuse me if it's there already; it's late and I
am tired.
5. More awareness of the whole syndication concept.
Imagine how much great content we would have
if we could position syndication as a form of
site advertising? Give out your headlines for
free, get visitors to read the articles. Not a
bad deal at all.
6. Unique site IDs. This is a messy problem. What
I have found is that the same content has been
registered for syndication under multiple URLs.
Sometimes this reflects evolution, perhaps from
a sub-domain on a free site to a true top-level
domain. Other times the site can generate content
in several forms that I have to consider equivalent
for my purposes. Moreover.com, for example, can emit RSS or
their own <moreover> format. I use the <moreover> form,
but the serviceList at Userland includes the
RSS form.
To detect and eliminate the duplication
that this causes, I've built and maintain an
alias list (http://www.vertexdev.com/chv_aliases.xml).
Take a look at this to understand the problem.
The first entry contains 7 names for the same
content. I build this list semi-automatically but
I have to do sufficient manual checking that I
cannot see this scaling to accommodate, say, 50K
providers. Headline Viewer loads this list and
uses it to avoid duplicating built-in providers
with those loaded from the service lists.
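To make the alias idea concrete, here is a minimal sketch
in Python of how an alias list can be used to drop
duplicates when service lists are loaded on top of the
built-in providers. It does not read the real
chv_aliases.xml (its format is not described here); the
alias groups are plain lists, and every URL and function
name below is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase scheme and host, drop a trailing slash and any fragment."""
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/"), parts.query, ""))


def build_alias_index(alias_groups):
    """Map every known alias to one canonical URL (the first in its group)."""
    index = {}
    for group in alias_groups:
        canonical = normalize(group[0])
        for url in group:
            index[normalize(url)] = canonical
    return index


def merge_providers(builtin, loaded, index):
    """Keep all built-in providers; add loaded ones only if they are new."""
    def canonical(url):
        key = normalize(url)
        return index.get(key, key)

    seen = {canonical(url) for url in builtin}
    merged = list(builtin)
    for url in loaded:
        key = canonical(url)
        if key not in seen:
            seen.add(key)
            merged.append(url)
    return merged


# Hypothetical data: one site reachable under three names.
alias_groups = [[
    "http://news.example.com/rss.xml",
    "http://members.example.org/~news/rss.xml",
    "http://news.example.com/moreover.xml",
]]
builtin = ["http://news.example.com/moreover.xml"]
loaded = ["http://NEWS.example.com/rss.xml",
          "http://other.example.net/feed.xml"]

merged = merge_providers(builtin, loaded, build_alias_index(alias_groups))
print(merged)
```

The lookup itself is cheap; the part that does not scale
is building and checking the alias groups by hand, which
is why a unique site ID supplied by the provider would
help.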
Gack, I've written a lot. I hope this is some good
food for thought. I'm really looking forward to
some productive discussion and forward motion.
Jeff;
Jeff Barr - Home: 425-836-5624 Office: 425-936-3098
mailto:jeff@vertexdev.com
http://www.vertexdev.com/~jeff
http://jeffbarr.editthispage.com/
4610 191st Place NE. Redmond, WA
-----Original Message-----
From: Dave Winer [mailto:dave@userland.com]
Sent: Wednesday, September 13, 2000 9:50 PM
To: syndication@egroups.com
Cc: Lynn Siprelle
Subject: Re: [syndication] Re: New poll for syndication
Maybe some people don't want to contribute to standards, maybe they just
want to get more hits for their sites. I think we should start a new mail
list with a charter that puts the focus on use of this stuff, not on a
standards process. Just speaking for myself, this is exactly what I want to
get away from. I've had enough standing at the fork. Aaron, you got
everything you wanted, go create something new with the RSS 1.0 people.
Thanks. Dave