basi
Hello,
I'm looking for a screen scraper that will extract the text content of
HTML pages and save it to files. I have looked at Mechanize and
Rubyful_Soup, but modifying them to save just the text content to a
file is a bit over my head. (I'm a researcher trying to use Ruby for
real-world text analysis tasks while learning Ruby at the same
time.) The levels of usage I'd love to have (choosy beggar):
- the program prompts me for the URL to scrape and the file name to save the text into, or
- I edit the program to enter the URL and the file name.
Of course, a program that, given a URL, would walk down the links, open
the pages, and save the text contents to a file would be ... that would
be a commercial product. Is there one?
Thanks!
basi
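
To make the request concrete, here is a rough sketch of the simplest level described above (supply a URL and a file name, get the page text saved). It uses only Ruby's standard library and strips tags with regular expressions, which is a crude stand-in for a real HTML parser and not how Mechanize or Rubyful_Soup work. The method names html_to_text and scrape_to_file are made up for illustration, and the URL and file name in the usage line are placeholders.

```ruby
require 'open-uri'
require 'cgi'

# Hypothetical helper: reduce an HTML string to its visible text.
# Regex-based stripping is an approximation and will miss edge cases
# that a real HTML parser would handle.
def html_to_text(html)
  text = html.gsub(%r{<(script|style)\b.*?</\1>}mi, ' ') # drop script/style blocks
  text = text.gsub(/<!--.*?-->/m, ' ')                   # drop HTML comments
  text = text.gsub(/<[^>]+>/, ' ')                       # drop remaining tags
  text = CGI.unescapeHTML(text)                          # decode &amp;, &lt;, etc.
  text.gsub(/\s+/, ' ').strip                            # collapse whitespace
end

# Hypothetical helper: fetch one page and save its text to a file.
def scrape_to_file(url, filename)
  html = URI.open(url, &:read)
  File.write(filename, html_to_text(html))
end

# Runs only when the script is given exactly two arguments.
if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
  scrape_to_file(ARGV[0], ARGV[1])
end
```

Saved as, say, scrape.rb, this would be run as `ruby scrape.rb http://example.com/ example.txt`. Walking down the links from the starting page is not covered here.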