Newbie LWP question - simulate browser?

Discussion in 'Perl Misc' started by philthym, May 10, 2004.

  1. philthym

    philthym Guest

    Hi

    As the title suggests, I am a Perl newbie. I am trying to monitor a
    remote site and would like to time how long it takes to return all
    objects on the page, i.e. the HTML and all the associated GIFs, bits
    of JavaScript, Java applets and so on.

    Here is the code as it stands today:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::Simple;
    use Time::HiRes qw(gettimeofday);

    my $URL = "http://www.xyz.com/index.html";

    # In scalar context gettimeofday returns floating-point seconds
    my $start   = gettimeofday();
    my $timenow = localtime();

    # get() returns undef if the request fails
    my $HomePage = get($URL);

    if (defined $HomePage && $HomePage =~ /String/) {
        my $elapsed = gettimeofday() - $start;
        print "$timenow Page retrieved in $elapsed seconds\n";
    }
    else {
        print "$timenow Page not retrieved\n";
    }

    I'm not sure I understand the whole LWP/get thing! What I'm wondering
    is whether this request causes the web server to return all
    objects or just the HTML itself. If it returns everything, then
    is the second timestamp taken after all objects have been returned? In
    other words, does this code do what I want it to? If not, any ideas
    how I would achieve my aim, please?

    Any help would be greatly appreciated.

    Thanks

    Phil
     
    philthym, May 10, 2004
    #1

  2. Sherm Pendley

    philthym wrote:

    > I'm not sure I understand the whole LWP/get thing! What I'm wondering
    > is whether this request causes the web server to return all
    > objects or just the HTML itself.


    It does *exactly* what you ask it to, no more - it fetches index.html.
    Parsing the HTML, extracting the <img ...> elements from it, and making
    additional requests to the server to fetch the images they point to, will
    require additional code.

    Have a look at HTML::Parser - it's a good place to start.
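
    To make those steps concrete, here is a minimal sketch rather than a
    finished monitor. It assumes HTML::LinkExtor (which ships with the
    HTML::Parser distribution) and LWP::UserAgent are installed. The
    names embedded_urls, time_page and %embedded are made up for
    illustration. Resources are fetched one after another, so the total
    is only a rough figure compared with a browser that fetches in
    parallel. Applets, images referenced from CSS, and anything a script
    would request when executed are not covered.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use Time::HiRes qw(time);

# Tags and attributes that a browser fetches automatically.
# Ordinary hyperlinks (<a href>) are deliberately left out.
# Note: every <link href> is counted, not only stylesheets,
# because HTML::LinkExtor does not pass the rel attribute along.
my %embedded = (
    img    => ['src'],
    script => ['src'],
    link   => ['href'],
    embed  => ['src'],
    frame  => ['src'],
    iframe => ['src'],
);

# Return the unique absolute URLs of embedded resources in $html,
# resolving relative links against $base.
sub embedded_urls {
    my ($html, $base) = @_;
    my (%seen, @urls);
    my $extor = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            my $wanted = $embedded{$tag} or return;
            for my $name (@$wanted) {
                next unless defined $attr{$name};
                my $url = "$attr{$name}";    # stringify the URI object
                push @urls, $url unless $seen{$url}++;
            }
        },
        $base,
    );
    $extor->parse($html);
    $extor->eof;
    return @urls;
}

# Fetch the page and then each embedded resource in turn.
# Returns (elapsed seconds, successful fetches), or an empty list
# if the page itself could not be retrieved.
sub time_page {
    my ($url) = @_;
    my $ua    = LWP::UserAgent->new(timeout => 30);
    my $start = time();
    my $page  = $ua->get($url);
    return unless $page->is_success;

    my $html  = $page->decoded_content // $page->content;
    my $count = 1;
    # $page->base takes redirects and <base href> into account
    for my $resource (embedded_urls($html, $page->base)) {
        $count++ if $ua->get($resource)->is_success;
    }
    return (time() - $start, $count);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $timenow = localtime();
    my ($elapsed, $count) = time_page($ARGV[0]);
    if (defined $elapsed) {
        printf "%s %d objects retrieved in %.3f seconds\n",
            $timenow, $count, $elapsed;
    }
    else {
        print "$timenow Page not retrieved\n";
    }
}
```

    Saved under a name of your choosing (say timepage.pl), it would be
    run as: perl timepage.pl http://www.xyz.com/index.html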

    sherm--

    --
    Cocoa programming in Perl: http://camelbones.sourceforge.net
    Hire me! My resume: http://www.dot-app.org
     
    Sherm Pendley, May 10, 2004
    #2

  3. Joe Smith

    Joe Smith Guest

    philthym wrote:

    > As the title suggests, I am a Perl newbie. I am trying to monitor a
    > remote site and would like to time how long it takes to return all
    > objects on the page, i.e. the HTML and all the associated GIFs, bits
    > of JavaScript, Java applets and so on.


    It's one thing to fetch a JavaScript file. It is quite another to fetch
    the things that would be requested had the JavaScript been executed.
    For that, you need a proxy that logs the requests from a real browser.

    There are several, including the "Web Scraping Proxy"
    http://www.research.att.com/~hpk/wsp/

    -Joe
     
    Joe Smith, May 10, 2004
    #3
  4. philthym

    philthym Guest

    Sherm Pendley wrote:
    > philthym wrote:
    >
    > > I'm not sure I understand the whole LWP/get thing! What I'm wondering
    > > is whether this request causes the web server to return all
    > > objects or just the HTML itself.
    >
    > It does *exactly* what you ask it to, no more - it fetches index.html.
    > Parsing the HTML, extracting the <img ...> elements from it, and making
    > additional requests to the server to fetch the images they point to, will
    > require additional code.
    >
    > Have a look at HTML::Parser - it's a good place to start.
    >
    > sherm--


    Thanks Sherm, I thought it would be something like that. I'll check out HTML::Parser.

    Regards

    Phil
     
    philthym, May 11, 2004
    #4
