RE: [syndication] The RSShovah Witnesses...
Hi All,
> We need some volunteers for door-knockers and for scrapers.
Sorry to be coming in a little late to this, but can I sign up? ;-)
> We need a list of sites, and then some state indicators:
Ok, I can start with Ireland, and if the UK is not going to be covered by
anyone else I could do that also.
That kind of geographical approach seems a logical way for me to get
started. Does that sound ok to everyone else?
> We could keep this list in XML form (under CVS control) on a server
> somewhere, and format it with XSL to produce a "status report".
I'll leave the format of that file to the more technically able members of
this list ;-)
> At this point, we also need to answer some questions:
> 1. What is the process we take when someone requests a feed?
> Possible answer: contact the provider with a prewritten
> email, wait a week for a response, send another email
> if no response, wait a week, if no response, scrape and
> report.
This sounds fine.
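For anyone who wants it spelled out, here is a rough Python sketch of those
steps. The function name, its arguments and the returned strings are all made
up for illustration; only the two emails and the one-week waits come from the
proposal above.

```python
from datetime import date, timedelta

WAIT = timedelta(days=7)  # "wait a week", as in the proposal


def next_action(emails_sent, last_email_on, today, provider_replied=False):
    """Return the next step for one feed request (hypothetical helper).

    emails_sent:   number of prewritten emails already sent (0, 1 or 2)
    last_email_on: date of the most recent email, or None if none sent yet
    today:         the date on which we are checking the request
    """
    if provider_replied:
        # Any response from the provider takes the request out of this loop.
        return "handle reply"
    if emails_sent == 0:
        return "send first email"
    if today - last_email_on < WAIT:
        return "wait"
    if emails_sent == 1:
        return "send second email"
    # Two emails sent and a full week of silence after the second one.
    return "scrape and report"


print(next_action(1, date(2000, 1, 1), date(2000, 1, 8)))
```

What "handle reply" means is left open here on purpose, since that depends on
what the provider says.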
> 2. Should we announce the scraped feeds to the site in question?
> I worry that some sites will be "hey! stop that! take it
> down!". Whilst one route, we respect their wishes at the
> expense of wasted time and disgruntled requestors, the other
> route we're being "sneaky".
Perhaps we should start out announcing it to them and see how it goes. Can
we also assume that if a provider requests that we stop scraping at any
point, we do that?
> 3. If we do this off a CVS, the code for custom scrapes should
> also be thrown on the CVS, along with any libraries and
> required code.
Ok, I might be misunderstanding the question here, but I'm presuming this
means that anyone who scrapes is open-sourcing all the code they use to do
so. If that is the case, I'm not sure it should be enforced, because that
might discourage some people from becoming scrapers.
> 4. Should fulfilled requests XML be kept in a legacy file for
> safe keeping?
Ok, I'll leave this to the more technically able also ;-)
> 5. Will there be a definitive announce source for our scraped
> feeds? I recommend Jeff's Manilla site.
That sounds fine. Perhaps an RSS feed could be made available as well, so that
others who want to announce new feeds can do so from their own sites.
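As a very rough idea of what such an announce feed might look like, here is a
small Python sketch that builds one. The channel title, the example.org
addresses and the function name are placeholders I have assumed, not anything
agreed on the list, and the RSS 0.91 version number is just one possible choice.

```python
import xml.etree.ElementTree as ET


def build_announce_feed(feeds):
    """Build a minimal RSS document announcing newly scraped feeds.

    feeds: list of (title, url) pairs, one per scraped feed.
    Returns the feed as an XML string.
    """
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    # Placeholder channel details (assumed, not agreed on the list).
    ET.SubElement(channel, "title").text = "Newly scraped feeds"
    ET.SubElement(channel, "link").text = "http://example.org/scraped/"
    ET.SubElement(channel, "description").text = "Announcements of new scraped feeds"
    for title, url in feeds:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")


print(build_announce_feed([("Example Times", "http://example.org/times.rss")]))
```

Anyone could then pick up that one file and list the new feeds on their own
site.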
Alis