ask josephsen
Hi NG
I'm making a program to crawl the internet. It works by retrieving all links
in a page, downloading the page of each link and again retrieving all the
links. (If there are better ways, I'd like to hear them.)
My problem is relative links (like "../../wohoo.asp"). What is the smartest
way to get the full URL (http://www.xyz.com/wohoo.asp)? Do I have to parse
the relative link in relation to the URL of the page where it was found
and then concatenate the two? Does anyone know how other search engines and
crawlers walk the net?
Thanks
../ask
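
For reference, resolving a relative link against the URL of the page it was found on does not usually need hand-written parsing. The following is only a minimal sketch in Python, assuming the standard library's urllib.parse.urljoin and html.parser are acceptable. The names LinkCollector and extract_links and the example page URL are hypothetical, and the sketch ignores any <base href> tag in the page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the page the links were found on
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin handles "../", "/path" and already-absolute links.
                absolute = urljoin(self.base_url, value)
                # Drop any "#fragment" so the same page is not queued twice.
                self.links.append(urldefrag(absolute).url)


def extract_links(base_url, html):
    """Return the absolute URLs of all links found in html (hypothetical helper)."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    # Assumed example: the relative link was found on this page.
    page_url = "http://www.xyz.com/a/b/page.asp"
    print(urljoin(page_url, "../../wohoo.asp"))
```

A crawler built on this would typically keep a queue of URLs still to fetch and a set of URLs already seen, and add each result of extract_links to the queue only if it is not in the set.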