Dan Cuddeford
Hello all,
I was pushed towards Ruby by a friend. I'm used to the usual
shell scripting and was told this would be much more powerful /
graceful / easier.
It all looks very exciting, but I'm having trouble finding the
easiest way to implement something quite simple.
I'm used to using wget to crawl some of my sites down to a certain
depth. wget http://www.digg.com -r -l 2 will dig down two layers from
the front page (following the links).
I can't find an easy way of doing this in Ruby. Open-uri doesn't seem
to support recursive following. I've looked at pulling down the HTML,
parsing out the links, and feeding them back to open-uri, but there
doesn't seem to be an easy way of doing that either.
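To show what I mean, here's the kind of recursive fetcher I've been trying to write (just a rough sketch: URI.open comes from open-uri, the href regex is a stand-in for a real HTML parser, and the depth handling is my guess at what wget's -l does):

```ruby
require 'open-uri'
require 'uri'
require 'set'

# Pull absolute link targets out of an HTML string.
# The regex is only a sketch; a proper HTML parser would be sturdier.
def extract_links(html, base)
  html.scan(/href\s*=\s*["']([^"']+)["']/i).flatten.map do |href|
    URI.join(base, href).to_s rescue nil
  end.compact
end

# Recursively fetch pages up to `depth` levels deep,
# roughly like `wget -r -l depth`.
def crawl(url, depth, seen = Set.new)
  return seen if depth < 0 || seen.include?(url)
  seen << url
  html = URI.open(url).read
  extract_links(html, url).each { |link| crawl(link, depth - 1, seen) }
  seen
rescue StandardError
  seen  # skip pages that fail to fetch or parse
end
```

(On older Rubies, open-uri extended Kernel#open instead of providing URI.open.) Is there a library that already does this, or is hand-rolling it the usual approach?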
Another thing I would like to do is pull down other elements from the
HTML, such as images. I explored the HTML-parsing libraries, but they
all seem to be geared towards manipulation rather than downloading
content for manipulation later.
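For the images, this is roughly what I had in mind (again only a sketch: the img/src regex and the "images" directory name are my own placeholders, not anything from a library):

```ruby
require 'open-uri'
require 'uri'
require 'fileutils'

# Pull absolute image URLs out of an HTML string.
# Regex-based, like the link extraction; a real parser would be safer.
def image_sources(html, base)
  html.scan(/<img[^>]+src\s*=\s*["']([^"']+)["']/i).flatten.map do |src|
    URI.join(base, src).to_s rescue nil
  end.compact
end

# Fetch a page and save every image it references into `dir`.
def download_images(url, dir = 'images')
  FileUtils.mkdir_p(dir)
  html = URI.open(url).read
  image_sources(html, url).each do |img|
    name = File.basename(URI(img).path)
    next if name.empty?
    File.binwrite(File.join(dir, name), URI.open(img).read)
  rescue StandardError
    next  # skip images that fail to download
  end
end
```

Is there a nicer way than regexes to get at the img tags without a full manipulation-oriented toolkit?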
Thanks for any help you can offer.
Dan