Re: How to scrape?
Alis Marsden <alis@purplepages.ie> wrote:
> I'm guessing I'd do it by spidering the pages somehow but there really
> doesn't seem to be much information about how it could be done on the web.
The way I do it is to load the text of the page into memory, then use
regular expressions to extract the proper information. Then I just spit it
out as RSS.
If you'd like some Tcl code to do this, I can send you some.
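
For illustration only, here is a rough sketch of that load / regex / RSS
approach. It is written in Python, not Tcl, and it is not the code offered
above. The headline pattern (an <h2> wrapping a link), the function names,
and the example.com URLs are all assumptions; a real page would need its
own regular expression.

```python
import html
import re
from urllib.request import urlopen
from xml.sax.saxutils import escape

# Hypothetical pattern: assumes each headline on the page looks like
# <h2><a href="URL">Title</a></h2>. Adjust it for the page being scraped.
ITEM_RE = re.compile(r'<h2><a href="([^"]+)">([^<]+)</a></h2>', re.IGNORECASE)


def fetch(url):
    """Load the text of the page into memory."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")


def extract_items(page):
    """Use the regular expression to pull out (link, title) pairs."""
    return [
        (html.unescape(link), html.unescape(title).strip())
        for link, title in ITEM_RE.findall(page)
    ]


def to_rss(channel_title, channel_link, items):
    """Spit the extracted items out as a minimal RSS 2.0 document."""
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        '<rss version="2.0">',
        "<channel>",
        "<title>%s</title>" % escape(channel_title),
        "<link>%s</link>" % escape(channel_link),
        "<description>Scraped from %s</description>" % escape(channel_link),
    ]
    for link, title in items:
        lines.append("<item>")
        lines.append("<title>%s</title>" % escape(title))
        lines.append("<link>%s</link>" % escape(link))
        lines.append("</item>")
    lines.append("</channel>")
    lines.append("</rss>")
    return "\n".join(lines)
```

To use it on a live page, something like
to_rss("Example", "http://example.com/", extract_items(fetch("http://example.com/")))
would print-ready the feed text. Regular expressions are brittle against
markup changes, so the pattern usually needs updating whenever the site is
redesigned.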
--
[ Aaron Swartz | me@aaronsw.com | http://www.aaronsw.com ]