How do I download only the web pages I want using HTTrack?

Disco

Penang said:
Dear all,
What settings must I use in HTTrack to get it to download ONLY the Jan 1
to Jan 30 links on http://202.186.86.35/english/jan2002.asp and nothing
else?

It looks like it is only getting files that have relative URIs. The Jan 1 - Jan
30 files have URIs like href="/blah/blah/blah.ext", whereas other URIs have
href=http://www.freakmeout.example.com/whatthe.freakinfreakshowfreakhead.ext.

Maybe try looking for something such as "Only download file on the web site"
or "Do not download external files" in the help of the application.

Also, it may be because you are going to the IP address (202.186.86.35) instead
of the domain (www.freakenfreakfreaker.example.com).
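
To make the relative-versus-absolute point concrete, here is a rough Python sketch (not how HTTrack itself decides, just an illustration using the standard library). It resolves every href against the start page and keeps only the links whose host matches the start URL. The sample HTML and its paths are made up for the example. Note that when the start URL uses the IP address, a link written with the site's domain name compares as a different host and is treated as external.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Start page from the question; everything is resolved against it.
START_URL = "http://202.186.86.35/english/jan2002.asp"


class LinkCollector(HTMLParser):
    """Collects the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_host_links(html, base_url=START_URL):
    """Return absolute URLs whose host equals the host of base_url."""
    parser = LinkCollector()
    parser.feed(html)
    base_host = urlsplit(base_url).hostname
    kept = []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # relative URIs inherit the base host
        if urlsplit(absolute).hostname == base_host:
            kept.append(absolute)
    return kept


# Hypothetical page content, only for illustration.
SAMPLE_HTML = """
<a href="/english/news/jan01.asp">Jan 1</a>
<a href="jan02.asp">Jan 2</a>
<a href="http://www.example.com/other.html">Elsewhere</a>
"""

if __name__ == "__main__":
    for link in same_host_links(SAMPLE_HTML):
        print(link)
```

The comparison is a plain string match on the host, so the IP address and the domain name never match each other even if they point at the same server.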
 
Penang

Sid Ismail said:
Look in the help of the program HTTrack.

Sid


I did. I tried all the different settings. I even turned off the
"robots.txt" setting.

But HTTrack still won't do a simple thing like getting the Jan 30 to
Jan 1 links from the page.

Instead, it got all kinds of unrelated pages AWAY from the
http://202.186.86.35/english/jan2002.asp page.

I just don't know what to do next.

Does anyone have any suggestions?

Thanks!
 
