Jeremy Zawodny wrote:
I now agree that #1 is not worth it. HOWEVER, I would like to pursue the possible use of robots.txt as suggested by Chad. Does anyone here know any robots.txt "experts"--people who were involved with the discussions back then? We should try to get an authoritative answer on whether it is (1) possible to do what we want there, and (2) not completely unreasonable to do it. If the robots.txt option doesn't work out, then it doesn't work out. We tried.
The robotstxt.org [0] site contains notes from a 1996 meeting on
spidering [1]. They include the following tantalizing tidbit:
    These are issues recommended for future standards discussion that
    could not be resolved within the scope of this workshop.
    ...
    * ways of advertising content that should be indexed (rather
      than just restricting content that should not be indexed)
    ...
Looks like it's time to break some new ground, in an area where
the original folks thought it would eventually go.
Jeff;
[0] - http://www.robotstxt.org/
[1] - http://www.robotstxt.org/wc/meta-notes.html