Extracting the URL from an Hpricot element

Thread starter Nikita Ratlos
Start date Dec 10, 2008

I want to get a list of URLs from a webpage.

First I create the Hpricot document and search it for the links:
doc = Hpricot(open(searchurl))

links = doc/"//html//body//div[6]//div[2]//a[@id='p-1']"

Next I want to append the URLs to an array, like this:

results << links.map.each{|link| puts link.attributes['href'] }

This line prints the URLs just as I need them, but it then
puts the whole HTML link elements into the results array.

Any ideas on how to get only the URLs (without the HTML) into my results array?
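
One possible approach, sketched under some assumptions: `puts` returns nil, so a block that ends in `puts` never hands the href back to `map`, and `<<` pushes a single object rather than appending each item. The sketch below has the block return the href and uses `concat` to append the values. `FakeLink` is a hypothetical stand-in that only mimics `attributes['href']`, so the snippet runs without the gem; with Hpricot, `links` would be the result of the `doc/...` search above. The example URLs are made up.

```ruby
# Hypothetical stand-in for an Hpricot element: it only mimics
# element.attributes['href'] so the example runs without the gem.
FakeLink = Struct.new(:attributes)

# Made-up sample data; in the real script this comes from doc/"...".
links = [
  FakeLink.new({ 'href' => 'http://example.com/a' }),
  FakeLink.new({ 'href' => 'http://example.com/b' })
]

results = []

# Let the block's last expression be the href itself, so map
# collects the URL strings instead of whatever puts returns.
urls = links.map { |link| link.attributes['href'] }

# concat appends each URL; << would push the whole array as one item.
results.concat(urls)

# Print separately if the console output is still wanted.
urls.each { |url| puts url }
```

If `results` is only ever filled from this one search, assigning `results = links.map { ... }` directly would also work.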
