JS Web Robot

Discussion in 'Javascript' started by Paul Dennis, Jan 10, 2004.

  1. Paul Dennis

    Paul Dennis Guest

    Hi,

    I'm trying to write a web robot using JavaScript.
    Its objective would be to surf around and look
    for patterns in the way web pages link to each
    other or in the text they contain. Data would be
    returned in a web box which could later be copied
    into another application.

    That's not too tough a challenge. I can make a
    JS application surf around my hard drive or
    web site with ease. I simply load an HTML page into
    a second window and wait for the document
    readyState to be complete, then grab the
    document.links array and point the window
    at a new location. Off it goes.
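
    The loop described above could be sketched roughly as follows. This is
    only an illustration, not the poster's actual code: the names
    collectLinks, nextTarget and crawl are made up for the example, and the
    polling interval is an arbitrary assumption. The two helper functions
    are plain JavaScript; crawl is the browser-only glue that polls the
    second window.

```javascript
// Return the href of every entry in a document's links collection,
// with any #fragment removed so the same page is not queued twice.
function collectLinks(doc) {
  return Array.from(doc.links, (link) => link.href.split('#')[0]);
}

// Queue every link not seen before, then return the next URL to visit,
// or null when nothing is left.
function nextTarget(state, links) {
  for (const href of links) {
    if (!state.seen.has(href)) {
      state.seen.add(href);
      state.queue.push(href);
    }
  }
  return state.queue.length > 0 ? state.queue.shift() : null;
}

// Browser-only glue (hypothetical): point a second window at a page, poll
// until its document is complete, record the links, then move on.
function crawl(win, startUrl, onPage, pollMs = 250) {
  const state = { seen: new Set([startUrl]), queue: [] };
  let current = startUrl;
  let previousDoc = win.document; // the document about to be replaced
  win.location.href = current;

  const timer = setInterval(() => {
    const doc = win.document; // this access fails once the window shows another site
    if (doc === previousDoc || doc.readyState !== 'complete') return;
    const links = collectLinks(doc);
    onPage(current, links);
    current = nextTarget(state, links);
    if (current === null) {
      clearInterval(timer);
      return;
    }
    previousDoc = doc;
    win.location.href = current;
  }, pollMs);
  return timer;
}

// Possible use in a browser (the URL is a placeholder):
// crawl(window.open('about:blank'), 'http://example.com/index.html',
//       (url, links) => console.log(url, links.length));
```

    As long as every page comes from the same server as the script, the
    poll succeeds; the access to win.document is the step that fails as
    soon as the second window leaves that server.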

    But when it tries to surf from my drive to
    my web site, or from my web site to another
    web site, it gets an error. It crashes the first
    time it tries to check the readyState of a
    document from a different server.

    I think that maybe JS has been designed to foil
    attempts to build web robots with it. If so, is there
    any way around it? Or maybe I'm just missing a
    critical JS detail or two. So, does anyone know
    what's going on here? Can anyone help me out?

    -Paul Dennis.
     
    Paul Dennis, Jan 10, 2004
    #1

  2. "Paul Dennis" writes:

    > But when it tries to surf from my drive to
    > my web site, or from my web site to another
    > web site, it gets an error. It crashes the first
    > time it tries to check the readyState of a
    > document from a different server.
    >
    > I think that maybe JS has been designed to foil
    > attempts to build web robots with it.


    The browser security model has. Under the same-origin policy, a script
    that tries to access the content of a page from a different domain is
    stopped the hard way: the access fails with a security error.
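
    As a rough illustration of that rule, the sketch below compares two
    URLs by scheme, host and port, and wraps the readyState check so that a
    blocked access is reported instead of crashing the script. The function
    names sameOrigin and readyStateOrNull are hypothetical, and the
    comparison is a simplification: browsers differ in how they treat
    file: URLs.

```javascript
// True when both URLs share scheme, host and port, which is the
// comparison the browser makes before letting a script read another page.
function sameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}

// Read the readyState of another window's document, returning null
// when the browser refuses the access instead of letting the error escape.
function readyStateOrNull(win) {
  try {
    return win.document.readyState;
  } catch (err) {
    return null; // most likely a page from a different server
  }
}
```

    A robot could call sameOrigin before following a link and skip the
    targets it will not be allowed to read, or treat a null from
    readyStateOrNull as the signal to back up and try another link.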

    > If so, is there any way around it?


    Not one that works in every browser, but if it only has to run in your
    own browser, you might be able to give it extended permissions. If the
    browser is IE, you can look into HTML Applications (google for
    "HTML application HTA").

    /L
    --
    Lasse Reichstein Nielsen
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
     
    Lasse Reichstein Nielsen, Jan 10, 2004
    #2