Arun Kumar
Hi,
I'm new to Ruby and my company has given me an assignment in Ruby. It is
about HTML extraction. It works fine except for some sites, such as
http://www.youtube.com and http://www.gmail.com, where I get the errors
'400 Bad Request' and 'getaddrinfo: Name or service not known
(SocketError)' respectively. I have read that this may be because the URL
is being redirected, but I'm not sure about it. My code for HTML
extraction is:
require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'dbi'
puts "Enter domain name :"
domain = gets
# concatenating 'http://www.' with the domain to build the URL to open
url = "http://www."+domain
document = open(url)
# getting the final URL of the site (base_uri reflects any redirect)
url2 = document.base_uri.to_s
Can anybody please help? It is urgent. I'll be really grateful to those
who reply.
Regards,
Arun Kumar
Attachments:
http://www.ruby-forum.com/attachment/3450/htmlParse.rb
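For anyone who wants to reproduce this, here is a minimal sketch of a more defensive version. It rests on two assumptions that the post does not confirm: that the trailing newline which gets leaves on the input contributes to the errors, and that following redirects by hand with Net::HTTP makes each hop easier to inspect. The helper names build_url and fetch_following_redirects are hypothetical and are not part of htmlParse.rb.

```ruby
require 'uri'
require 'net/http'

# Hypothetical helper: turn raw user input into a clean URL string.
def build_url(raw_domain)
  # gets keeps the trailing "\n", so strip whitespace first
  domain = raw_domain.to_s.strip
  # tolerate input that already carries a scheme or a leading "www."
  domain = domain.sub(%r{\Ahttps?://}i, '').sub(/\Awww\./i, '')
  raise ArgumentError, 'empty domain' if domain.empty?
  "http://www.#{domain}"
end

# Hypothetical helper: fetch a URL, following at most `limit` redirects.
# Returns the final URL and the final response object.
def fetch_following_redirects(url, limit = 5)
  raise 'too many redirects' if limit.zero?
  uri = URI.parse(url)
  response = Net::HTTP.get_response(uri)
  if response.is_a?(Net::HTTPRedirection)
    # Location may be relative, so resolve it against the current URL
    next_url = URI.join(url, response['location']).to_s
    fetch_following_redirects(next_url, limit - 1)
  else
    [uri.to_s, response]
  end
end

# Example use (needs network access, so it is left commented out):
#   puts 'Enter domain name :'
#   final_url, response = fetch_following_redirects(build_url($stdin.gets))
#   puts "#{final_url} returned #{response.code}"
```

Wrapping the call in a rescue for SocketError would also let the script report an unresolvable host name instead of aborting.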