LWP user agent grabs the intermediate wait page after POST instead of the actual result page

Discussion in 'Perl Misc' started by bhabs, Feb 12, 2008.

  1. bhabs

    bhabs Guest

    Hi,

    I wrote a small LWP-based Perl program that searches for air fares on a
    travel website using POST.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $web_browser = LWP::UserAgent->new();
    # Follow redirects issued in response to POST as well as GET/HEAD.
    push @{ $web_browser->requests_redirectable }, 'POST';
    $web_browser->timeout(300);

    my $web_response = $web_browser->post(
        'http://blabla.com/travel/InitialSearch.do',
        [
            'fromCity' => 'SFO',
            'toCIty'   => 'CVG',
            # .... the rest of the fields occur here
        ],
    );

    die "Error: ", $web_response->status_line()
        unless $web_response->is_success;

    # content() returns the body as a single string, so a scalar is enough.
    my $content = $web_response->content;
    print $content;

    When I print the content, I see the "intermediate" wait page (the one that
    displays a progress bar using JavaScript; I compared the content with
    "view source" in Internet Explorer). I am unable to capture the final air
    fare page. The website takes time to do the search before it displays the
    air fare result page. How do I make my program wait for the actual result
    instead of grabbing the intermediate response?

    Could anyone please help me on this?

    Regards,
    bhabs
    bhabs, Feb 12, 2008
    #1

  2. Ben Morrow

    Ben Morrow Guest

    Re: LWP user agent grabs the intermediate wait page after POST instead of the actual result page

    Quoth Christian Winter:
    > bhabs wrote:
    > > I wrote a small LWP based perl program to search the air fare from a
    > > travel website using POST.
    > >
    > [...code snipped]
    > >
    > > When I print the content, I see the "intermediate" wait page (where it
    > > displays the progress bar using javascript.... => I matched the
    > > content with the "view source" from IExplorer)
    > > I am unable to capture the final air fare page. It takes time for the
    > > website to do the search and then display the air fare result page.
    > > How do I make my program wait for the actual result and not grab the
    > > intermediate response.
    >
    > You have to simulate what the browser does, and from your
    > description, this is most likely a repeated Ajax request
    > to the server. Analyze the behaviour of the JavaScript
    > and see how it fetches the progress state and what it
    > does once the result is calculated, then craft those
    > actions yourself. Your best chance to see exactly what is going
    > on in the background is with a network sniffer like Wireshark,
    > or a browser plugin like Firefox's Live HTTP Headers.


    Or http://www.research.att.com/sw/tools/wsp/ , which will write a Perl
    script to make the appropriate requests for you.
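
    A minimal sketch of the "simulate what the browser does" approach
    described above, assuming the wait page works by re-requesting some
    status URL until the fares are ready. The helper name wait_for_result,
    its options, the status URL and the marker text are all hypothetical;
    the real requests have to be read off the site's JavaScript or a
    traffic log.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Fetch $url repeatedly until the body no longer looks like the wait page.
# How to recognise the wait page depends on the site, so the caller passes
# a code ref (is_wait_page) that returns true for the intermediate page.
sub wait_for_result {
    my ($ua, $url, %opt) = @_;
    my $max_tries    = $opt{max_tries} // 30;   # give up after this many polls
    my $delay        = $opt{delay}     // 2;    # seconds between polls
    my $is_wait_page = $opt{is_wait_page} or die "is_wait_page is required\n";

    for (1 .. $max_tries) {
        my $response = $ua->get($url);
        die "Error: ", $response->status_line, "\n"
            unless $response->is_success;

        my $html = $response->decoded_content;
        return $html unless $is_wait_page->($html);

        sleep $delay;
    }
    die "No result after $max_tries attempts\n";
}

# Hypothetical usage; the status URL and the marker text are placeholders:
#
#   my $ua = LWP::UserAgent->new(cookie_jar => {});   # keep session cookies
#   $ua->post('http://blabla.com/travel/InitialSearch.do', [ ... ]);
#   print wait_for_result(
#       $ua, 'http://blabla.com/travel/HYPOTHETICAL_STATUS_URL',
#       is_wait_page => sub { $_[0] =~ /progress/i },
#   );
```

    The in-memory cookie jar in the usage comment is there because a site
    like this probably ties the search to a session; whether it does is
    also an assumption.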

    Ben
    Ben Morrow, Feb 12, 2008
    #2

  3. Tad J McClellan

    Re: LWP user agent grabs the intermediate wait page after POST instead of the actual result page

    Christian Winter wrote:
    > bhabs wrote:
    >> I wrote a small LWP based perl program to search the air fare from a
    >> travel website using POST.
    >>
    > [...code snipped]
    >>
    >> When I print the content, I see the "intermediate" wait page (where it
    >> displays the progress bar using javascript.... => I matched the
    >> content with the "view source" from IExplorer)
    >> I am unable to capture the final air fare page. It takes time for the
    >> website to do the search and then display the air fare result page.
    >> How do I make my program wait for the actual result and not grab the
    >> intermediate response.
    >
    > You have to simulate what the browser does, and from your
    > description, this is most likely a repeated Ajax request
    > to the server. Analyze the behaviour of the JavaScript
    > and see how it fetches the progress state and what it
    > does once the result is calculated, then craft those
    > actions yourself. Your best chance to see exactly what is going
    > on in the background is with a network sniffer like Wireshark,

    I like the Web Scraping Proxy for this; it logs the traffic in
    the form of LWP Perl code:

    http://www.research.att.com/sw/tools/wsp/

    > or a browser plugin like Firefox's Live HTTP Headers.
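
    Once the browser's traffic has been captured with one of these tools,
    it can help to log what the script itself sends and receives, so the
    two can be compared side by side. The sketch below uses LWP::UserAgent's
    add_handler hooks for that; the helper name make_logging_agent and the
    log line format are made up for illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that reports every request and response to $log,
# a code ref that receives one line of text per event.
sub make_logging_agent {
    my ($log) = @_;
    my $ua = LWP::UserAgent->new;

    # Runs just before each request goes out.
    $ua->add_handler(request_send => sub {
        my ($request) = @_;
        $log->(sprintf "> %s %s", $request->method, $request->uri);
        return;    # returning nothing lets LWP send the request as usual
    });

    # Runs once each response has been received in full.
    $ua->add_handler(response_done => sub {
        my ($response) = @_;
        $log->(sprintf "< %s", $response->status_line);
        return;
    });

    return $ua;
}

# Hypothetical usage: print the log lines to STDERR.
#
#   my $ua = make_logging_agent(sub { print STDERR "$_[0]\n" });
#   $ua->post('http://blabla.com/travel/InitialSearch.do', [ ... ]);
```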



    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
    Tad J McClellan, Feb 13, 2008
    #3