Extracting the URL from an Hpricot element


Nikita Ratlos

I want to get a list of URLs from a webpage as follows:

First I create the Hpricot element as follows:
doc = Hpricot(open(searchurl))

links = doc/"//html//body//div[6]//div[2]//a[@id='p-1']"

Next I want to append the URLs to an array as such:

results << links.map.each{|link| puts link.attributes['href'] }

The line prints out the URLs just as I need them, but then it
puts the whole HTML link elements into the results array.

Any ideas how to get the URLs (without the HTML) into my results array?
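
One possible approach, sketched under assumptions: keep `puts` out of the block (it returns nil, not the URL), let `map` collect the `href` values, then append them to `results`. So the sketch runs without Hpricot or a network connection, `Link` below is a hypothetical stand-in for the elements that `doc/"..."` returns, and the example.com URLs are made up. With real Hpricot elements, only the `map` and `concat` lines should be needed.

```ruby
# Hypothetical stand-in for the Hpricot elements in `links`,
# so this sketch runs without Hpricot or network access.
Link = Struct.new(:attributes)
links = [
  Link.new({ 'href' => 'http://example.com/a' }),
  Link.new({ 'href' => 'http://example.com/b' })
]

results = []

# map with a block returns the block's values, so return the href itself.
# No puts inside the block: puts returns nil, not the URL.
urls = links.map { |link| link.attributes['href'] }

# concat appends each URL individually; << would push the
# whole array into results as a single nested element.
results.concat(urls)

# Print the URLs afterwards if you still want to see them.
puts results
```

If `results` really should hold one nested array per page, `results << urls` would do that instead of `concat`.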
 
