Web Page Downloader


Chase Preuninger

I want to write a program that downloads web pages and replaces all
the relative URLs with absolute ones.

E.g., files/banner.jpg gets replaced by http://www.mysite.com/files/banner.jpg

Where would I have to look in a page to find the URLs that need to be
replaced?
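One answer in sketch form: URLs are not only in href. They also show up
in src on img, script, and iframe, in action on forms, in srcset, in
url(...) inside CSS, and elsewhere. Below is a minimal Python sketch,
using only the standard library, that collects URL-bearing attributes
and resolves them against a base URL; the attribute set and the base URL
are illustrative assumptions, not an exhaustive list.

    from html.parser import HTMLParser
    from urllib.parse import urljoin

    # URL-bearing attributes to scan. Illustrative, not exhaustive:
    # srcset, CSS url(...), and <meta http-equiv="refresh"> would also
    # need handling in a complete tool.
    URL_ATTRS = {"href", "src", "action", "background", "data", "poster"}

    class URLCollector(HTMLParser):
        def __init__(self, base_url):
            super().__init__()
            self.base_url = base_url
            self.found = []  # (tag, attribute, original, absolute)

        def handle_starttag(self, tag, attrs):
            for name, value in attrs:
                if name in URL_ATTRS and value:
                    # urljoin leaves absolute URLs alone and resolves
                    # relative ones against the base.
                    self.found.append(
                        (tag, name, value, urljoin(self.base_url, value)))

    p = URLCollector("http://www.mysite.com/index.html")
    p.feed('<img src="files/banner.jpg"><a href="http://example.com/">x</a>')
    for tag, attr, rel, absolute in p.found:
        print(f"<{tag} {attr}>: {rel} -> {absolute}")

Run on the example above, files/banner.jpg resolves to
http://www.mysite.com/files/banner.jpg, while the already-absolute link
is left unchanged.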
 

dorayme

Chase Preuninger said:
I want to write a program that downloads web pages and replaces all
the relative URLs with absolute ones.

E.g., files/banner.jpg gets replaced by http://www.mysite.com/files/banner.jpg

Where would I have to look in a page to find the URLs that need to be
replaced?

Good question; I don't know if there is a general answer. I know that I
can often do it by search and replace, targeting any href=" that does
not have http:// straight after the ".
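That search-and-replace idea can be written down as a small Python
sketch. The site root in BASE is an assumed value for illustration, and
naive prefixing like this is fragile (it misses src=, single-quoted
values, ../ paths, protocol-relative //host URLs, and any <base href>),
but it captures the rule of targeting href=" not followed by http://.

    import re

    BASE = "http://www.mysite.com/"  # assumed site root, for illustration

    def absolutize_hrefs(html):
        # Match href="..." where the value does not already start with
        # http:// or https://, and prefix it with the site root.
        pattern = re.compile(r'href="(?!https?://)([^"]*)"')
        return pattern.sub(lambda m: 'href="' + BASE + m.group(1) + '"', html)

    print(absolutize_hrefs('<a href="files/page.html">x</a>'))
    # -> <a href="http://www.mysite.com/files/page.html">x</a>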
 

dorayme

Ed Jay said:
dorayme scribed:


Then do a general search and replace? :)

No, that is not right either, because there is no such thing as a
general search here. You have to be specific; hence the problem. I am
not meaning to be awkward here, Ed, it just comes naturally. <g>
 

Chase Preuninger

I was talking about something that downloads a web page so that it
still works fine in a browser, which means replacing any references to
external resources.
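A hedged sketch of that whole pipeline: fetch a page, rewrite href,
src, and action values to absolute URLs with urljoin so the saved copy
still points at the live site's resources, and write it out. The
function name here is made up for illustration, and a real tool would
use an HTML parser rather than a regex and would also handle CSS,
srcset, and <base href>.

    import re
    import urllib.request
    from urllib.parse import urljoin

    def download_absolutized(page_url, out_path):
        # Fetch the page and decode it using the declared charset, if any.
        with urllib.request.urlopen(page_url) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            html = resp.read().decode(charset)

        def fix(match):
            attr, quote, value = match.group(1), match.group(2), match.group(3)
            # urljoin resolves relative values against the page URL and
            # leaves absolute ones untouched.
            return attr + "=" + quote + urljoin(page_url, value) + quote

        # Rewrite href=, src=, and action= values (double- or single-quoted).
        html = re.sub(r'\b(href|src|action)=(["\'])(.*?)\2', fix, html)

        with open(out_path, "w", encoding="utf-8") as f:
            f.write(html)

    # download_absolutized("http://www.mysite.com/index.html", "offline.html")

This keeps the page rendering in a browser while you are online; making
it work fully offline would additionally mean downloading each
referenced resource and pointing the attributes at the local copies
instead.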
 

dorayme

Chase Preuninger said:
I was talking about something that downloads a web page so that it
still works fine in a browser, which means replacing any references to
external resources.

To take the first part of what you want: do you mean it will work fine
offline, starting with a cleared cache, and continue to work fine
offline for all the other pages on the website?

Different browsers have different abilities to save web pages and sites.
With old Mac IE you could specify the level of links you wanted
preserved, and it would prepare a file that worked entirely offline to
the depth wanted. It was a proprietary MS method. Basically it saves a
page *with* all the images and other stuff (all the information goes
into the offline file), then goes to the links on it and does the same
at those (online) pages, and so on to the depth specified. You get the
lot and can view it offline later. It worked *quite* well.

In Safari, you can save a page but not deeper, and it resolves the URLs
on that page so that if you are viewing the offline file, it will reach
the online links OK, provided you are online at the time or have the
page cached. It also looks pretty proprietary.

Firefox is more straightforward and transparent, in that you get an HTML
file plus a folder created alongside it with the images and other
resources downloaded to your machine for that one page.

I'd better stop in case no one is reading me...
 
