[syndication] RSS vs. HTML Bandwidth and "Scalability"...
Morbus Iff writes:
> Initially, I was "hey! what's the problem? people care more about
> your content than the pretty design! be happy!".
Do we have any stats on end-users actually using RDF? I haven't looked
at mine in a while, but I'd bet that syndication formats cause me to
serve proportionately more data than I have users actually consuming
that syndicated data.
It's hard to tell now that the web aggregators (my.netscape, the
UserLand tool that provided a web interface) aren't as heavily used,
but my guess is that at the peak I had 5 distinct end-users of my RDF
content. The last time I tried to make a detailed study of it, I
guessed circa 1500 distinct readers of my HTML renderings of the same
content who return at least weekly.
> a) Embed the time limit in the RSS file.
HTTP already has provision for this: the Expires header (and, in
HTTP/1.1, Cache-Control: max-age) lets a server say how long a
response stays fresh, so the time limit rides along with the feed
without changing the format.
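A sketch of what those response headers might look like; the times
and the six-hour window are made up:

    HTTP/1.1 200 OK
    Date: Mon, 06 Aug 2001 11:00:00 GMT
    Last-Modified: Mon, 06 Aug 2001 09:30:00 GMT
    Expires: Mon, 06 Aug 2001 17:00:00 GMT
    Cache-Control: max-age=21600
    Content-Type: text/xml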
> b) Check the HTTP headers from the server.
All my pages for which modification means something intelligent
(including CGI/mod_perl-served pages) send a "Last-Modified" header
for "HEAD" (and "GET", but that's ancillary to the discussion)
requests.
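So a well-behaved client can check freshness with a cheap HEAD; a
sketch, with a hypothetical host and feed path:

    HEAD /index.rdf HTTP/1.1
    Host: example.org

    HTTP/1.1 200 OK
    Last-Modified: Mon, 06 Aug 2001 09:30:00 GMT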
Furthermore, all my server-parsed HTML and all static pages (including
my RDF content) support If-Modified-Since. And for static data (which
RDF/RSS/whatevertheheckyouwannacallit content almost always is), this
is the default for most web servers.
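Which means a conditional GET costs next to nothing when nothing has
changed; continuing the hypothetical feed above:

    GET /index.rdf HTTP/1.1
    Host: example.org
    If-Modified-Since: Mon, 06 Aug 2001 09:30:00 GMT

    HTTP/1.1 304 Not Modified

The 304 carries no body, so the server gets off with a few hundred
bytes instead of re-sending the whole feed.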
Since Perl's LWP has hooks for If-Modified-Since, and most other HTTP
libraries probably do as well, there's no reason not to support it in
clients; see the sketch below.
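A minimal sketch of a polite fetcher using LWP's mirror(), which sends
If-Modified-Since based on the local copy's mtime; the agent string,
URL, and filename are placeholders:

    use strict;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new(agent => 'polite-rss-fetcher/0.1');

    # mirror() sends If-Modified-Since based on the mtime of the
    # local file, and only rewrites the file on a 200 response.
    my $res = $ua->mirror('http://example.org/index.rdf', 'index.rdf');

    if ($res->code == 304) {
        # Not Modified: headers only, no body crossed the wire.
        print "Feed unchanged since last fetch.\n";
    }
    elsif ($res->is_success) {
        print "Fetched a fresh copy of the feed.\n";
    }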
Clients/spiders/whatever that aren't making effective use of these
should be taken out and thrashed.
> c) Implement server control - block repetitive ip's
> on a cron'd schedule and allow them back in when
> the going gets happy.
I don't want to have to micromanage my servers. I don't want to have
bars on my windows. I want to build a community where people don't
abuse my resources as a matter of course but are free to use them when
it isn't a drain on me.
Let's try encouraging spider and client writers to do the right thing
first. If we can't get them to abide by the existing HTTP standards,
then we can take further action (I'm a big fan of torches and
pitchforks, but I'm amenable to other ways to make examples of repeat
offenders that don't take advantage of existing mechanisms. Hot
stones, the rack, crucifixion, I'm not picky...)
Dan