How do I download only the webpages I want using HTTrack?

Discussion in 'HTML' started by Penang, Jun 25, 2003.

  1. Penang

    Penang Guest

    Penang, Jun 25, 2003
    #1

  2. Penang

    Sid Ismail Guest

    On 25 Jun 2003 01:26:11 -0700, (Penang) wrote:

    > Dear all,
    >
    > I am trying to download all the subordinate files of
    > http://202.186.86.35/english/jan2002.asp
    >
    > I am using HTTrack to do the job.
    >
    > What setup I must use in HTTrack to get it to download ONLY the Jan 1
    > to Jan 30 links on http://202.186.86.35/english/jan2002.asp and not
    > any other ?



    Look in the HTTrack help.

    Sid
    Sid Ismail, Jun 25, 2003
    #2

  3. Penang

    Disco Guest

    Penang wrote:
    > Dear all,


    > What setup I must use in HTTrack to get it to download ONLY the Jan 1
    > to Jan 30 links on http://202.186.86.35/english/jan2002.asp and not
    > any other ?


    Looks like it is only getting files that have relative URIs. The Jan 1 - Jan
    30 files have URIs like href="/blah/blah/blah.ext", whereas other URIs have
    href=http://www.freakmeout.example.com/whatthe.freakinfreakshowfreakhead.ext

    Maybe try looking for something such as "Only download files on the web site"
    or "Do not download external files" in the help of the application.

    Also, it may be because you are going to the IP address (202.186.86.35)
    instead of the domain (www.freakenfreakfreaker.example.com).
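
To illustrate the relative-versus-absolute distinction above, here is a minimal Python sketch of a "stay on the same host" filter. It is not how HTTrack works internally, only an illustration of the idea. The sample HTML, the file names (`jan01.asp`, `jan02.asp`) and the helper names (`LinkCollector`, `same_host_links`) are made up for the example; only the page URL comes from the thread.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_host_links(page_url, html):
    """Return absolute URLs for the links that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    kept = []
    for href in collector.hrefs:
        # urljoin turns relative hrefs into absolute URLs on the page's host
        # and leaves already-absolute hrefs unchanged.
        absolute = urljoin(page_url, href)
        if urlparse(absolute).netloc == host:
            kept.append(absolute)
    return kept


# Hypothetical page content: two relative links and one external link.
SAMPLE_HTML = """
<a href="/english/jan01.asp">Jan 1</a>
<a href="jan02.asp">Jan 2</a>
<a href="http://www.example.com/other.html">Elsewhere</a>
"""

links = same_host_links("http://202.186.86.35/english/jan2002.asp", SAMPLE_HTML)
for link in links:
    print(link)
```

Note that the comparison is on the literal host string, so a link written with the domain name would count as external when the start URL uses the IP address, which is the situation described above.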
    Disco, Jun 26, 2003
    #3
  4. Penang

    Penang Guest

    Sid Ismail wrote:
    > On 25 Jun 2003 01:26:11 -0700, (Penang) wrote:
    >
    > > Dear all,
    > >
    > > I am trying to download all the subordinate files of
    > > http://202.186.86.35/english/jan2002.asp
    > >
    > > I am using HTTrack to do the job.
    > >
    > > What setup I must use in HTTrack to get it to download ONLY the Jan 1
    > > to Jan 30 links on http://202.186.86.35/english/jan2002.asp and not
    > > any other ?
    >
    > Look in the HTTrack help.
    >
    > Sid



    I did. I tried all the different settings. I even turned off the
    "robots.txt" setting.

    But HTTrack still won't do a simple thing like getting the Jan 30 to
    Jan 1 links from the page.

    Instead, it got all sorts of unrelated pages AWAY from the
    http://202.186.86.35/english/jan2002.asp page.

    I just don't know what to do next.

    Does anyone have any suggestions?

    Thanks !
    Penang, Jun 26, 2003
    #4