Understanding Mechanize

Discussion in 'Perl Misc' started by Bryan Balfour, Aug 19, 2005.

  1. Hi, I'm new to Perl and have been trying to get into using
    WWW::Mechanize on Windows XP. I'm stuck on using it to submit a
    username and password. The following snippet of code shows what I'm
    trying to do:

    use WWW::Mechanize;
    use HTML::TreeBuilder;

    # $url, $userName, $password and $cookieJar are set earlier in the script.
    my $mech = WWW::Mechanize->new(autocheck  => 1,
                                   cookie_jar => $cookieJar);
    $mech->get($url);
    $mech->follow_link(text => "Begin");
    if ($mech->content() =~ m/Please log in\./)
    {
        print("***** Please log in. *****\n");
        outputContent($mech->content(), "Login.html");
        $mech->set_visible($userName, $password);
        my $response = $mech->click();
        if ($response->content() =~ m/Please try again/)
        {
            print("Logon details incorrect\n");
        }
        outputContent($response->content(), "Loggedin.html");
    }

    # Parse $content as HTML and write the re-serialized page to $fileName.
    sub outputContent
    {
        my ($content, $fileName) = @_;
        my $tree = HTML::TreeBuilder->new;
        $tree->ignore_ignorable_whitespace(0);
        $tree->no_space_compacting(1);
        $tree->parse($content);
        $tree->eof;    # signal end of document so the tree is complete
        open(my $out, '>', $fileName) or die "Can't write: $!";
        print $out $tree->as_HTML;
        close($out);
        $tree->delete;
    }

    I'm deliberately setting an incorrect password, so I expect to see the
    messages:

    ***** Please log in. *****
    Logon details incorrect

    I'm getting the first one but not the second. Conclusion: I've got a
    problem with the 'set_visible' or 'click' methods.

    The two files written by outputContent are identical; both contain the
    HTML of the page for entering username and password. Looking at these
    two input fields, they are NOT defined inside a form but in a table.
    (Is the HTML treated as one form by default?)

    I've tried using other Mechanize methods such as:

    $mech->form_number(1);
    $mech->field('user', $userName);
    $mech->field('password', $password);
    my($response) = $mech->click('submit');

    but get the same result. (Curiously, the page contains only one form at
    the beginning but that is hidden. The above code generates no error
    messages so what form is form_number(1)?)

    Am I missing a step as the 'click' method is not returning a new
    response but the old one?

    I'd appreciate your comments, hints etc.

    Bryan
    Bryan Balfour, Aug 19, 2005
    #1
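One way to answer the "what form is form_number(1)?" question is to list every form the parser finds, with its inputs. The following is a minimal sketch, not the original poster's code: the HTML string, the field names and the example.com base URL are all hypothetical stand-ins for the real login page. It uses HTML::Form, the module WWW::Mechanize relies on for form handling.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical page: a hidden-only form first, then the login form in a table.
my $html = <<'HTML';
<html><body>
<form name="state" action="/state" method="get">
  <input type="hidden" name="token" value="abc">
</form>
<form name="login" action="/login" method="post" onsubmit="return doSubmit()">
  <table><tr><td>
    <input type="text" name="user">
    <input type="password" name="password">
    <input type="submit" name="btn" value="Login">
  </td></tr></table>
</form>
</body></html>
HTML

# Return one summary line per form, numbered from 1 like form_number().
sub describe_forms {
    my ($page, $base) = @_;
    my @lines;
    my $n = 0;
    for my $form (HTML::Form->parse($page, $base)) {
        $n++;
        my @inputs = map { ($_->name // '(unnamed)') . ':' . $_->type }
                     $form->inputs;
        push @lines, sprintf("form %d: %s %s [%s]",
                             $n, $form->method, $form->action,
                             join(', ', @inputs));
    }
    return @lines;
}

print "$_\n" for describe_forms($html, 'http://example.com/');
```

With a live page already fetched, the equivalent check would be `$_->dump for $mech->forms;`, which prints each form as Mechanize sees it.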

  2. Brian Wakem (Guest)

    Bryan Balfour wrote:

    > Am I missing a step as the 'click' method is not returning a new
    > response but the old one?
    >
    > I'd appreciate your comments, hints etc.
    >
    > Bryan



    Most WWW::Mechanize problems are caused by javascript on the page you are
    playing with. Does it use any javascript in the form or upon submission?


    --
    Brian Wakem
    Email: http://homepage.ntlworld.com/b.wakem/myemail.png
    Brian Wakem, Aug 19, 2005
    #2

  3. Brian Wakem wrote:
    > Most WWW::Mechanize problems are caused by javascript on the page you are
    > playing with. Does it use any javascript in the form or upon submission?
    >

    Yes, it's full of it. For example:

    <script language="javascript">
    <!--
    if(window!=top)
    top.location.href=location.href;
    // -->
    </script>
    <script language="javascript">
    <!--
    function doSubmit() {if(document.forms[0].btn.value == "" ) {return false;} else { return true; } }// -->
    </script>
    <script language="JavaScript">
    function userMessage(message)
    {
    document.write("<table border='0' cellpadding='0' cellspacing='1' bgcolor='#535353' width='695' height='40' align='center'>\n");
    document.write("<tr><td style='border-left: 1px solid black; border-right: 1px solid black; border-bottom: 1px solid black' valign='center' align='center' colspan='3' bgcolor='#ffffff'>\n");
    document.write("<font class='cn'>"+message+"</font>\n");
    document.write("</td></tr>\n");
    document.write("</table>\n");
    }
    document.cookie = "Enabled=true";
    var cookiesEnabled = false;
    var cookieValid = document.cookie;
    if (cookieValid.indexOf("Enabled=true") != -1) cookiesEnabled = true;
    else cookiesEnabled = false;
    //cookiesEnabled = false;
    </script>

    What sort of things should I be looking for?

    I appreciate your help....

    Bryan
    Bryan Balfour, Aug 19, 2005
    #3
  4. Brian Wakem (Guest)

    Bryan Balfour wrote:

    > Brian Wakem wrote:
    >> Most WWW::Mechanize problems are caused by javascript on the page you are
    >> playing with. Does it use any javascript in the form or upon submission?
    >>

    > Yes, it's full of it. For example:
    >
    > <script language="javascript">


    > </script>
    >
    > What sort of things should I be looking for?
    >
    > I appreciate your help....
    >
    > Bryan



    WWW::Mechanize does not 'do' javascript, so you are screwed. Well not quite,
    you have to work out what the javascript does to the form params and
    construct your own POST request accordingly.


    --
    Brian Wakem
    Email: http://homepage.ntlworld.com/b.wakem/myemail.png
    Brian Wakem, Aug 19, 2005
    #4
  5. > WWW::Mechanize does not 'do' javascript, so you are screwed. Well not quite,
    > you have to work out what the javascript does to the form params and
    > construct your own POST request accordingly.
    >

    I was afraid that would be the case. Looks like I'll need to learn a
    bit of javascript.

    As a thought, do you know if there are any tools around that would
    enable me to run the html I've captured (or the actual html) and show
    me what's posted? (Ideally single stepping with watches.)

    Thanks for taking the time and trouble of replying. No doubt I'll be
    posting again soon as I delve into POST requests etc.

    Bryan
    Bryan Balfour, Aug 19, 2005
    #5
  6. Brian Wakem (Guest)

    Bryan Balfour wrote:

    >> WWW::Mechanize does not 'do' javascript, so you are screwed. Well not
    >> quite, you have to work out what the javascript does to the form params
    >> and construct your own POST request accordingly.
    >>

    > I was afraid that would be the case. Looks like I'll need to learn a
    > bit of javascript.
    >
    > As a thought, do you know if there are any tools around that would
    > enable me to run the html I've captured (or the actual html) and show
    > me what's posted? (Ideally single stepping with watches.)
    >
    > Thanks for taking the time and trouble of replying. No doubt I'll be
    > posting again soon as I delve into POST requests etc.
    >
    > Bryan



    I use Firefox with the Live HTTP Headers extension -
    http://livehttpheaders.mozdev.org/ - which shows all the communication
    going back and forth. Just submit the form in the browser and watch what
    is sent. Then emulate this with $mech->post:

    push @{ $mech->requests_redirectable }, 'POST'; # in case of 302 redirect

    $mech->post( $url,
        Content => [
            "username" => "user",
            "password" => "pass",
            "etc"      => "etc",
        ],
    );

    WWW::Mechanize is a subclass of LWP::UserAgent, so the LWP::UserAgent
    methods (such as requests_redirectable) are available too. Read more at
    http://cpan.uwinnipeg.ca/htdocs/libwww-perl/LWP/UserAgent.html


    --
    Brian Wakem
    Email: http://homepage.ntlworld.com/b.wakem/myemail.png
    Brian Wakem, Aug 19, 2005
    #6
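Before sending a hand-built POST, it can help to print the request and compare it with what the browser capture shows. The sketch below is only an illustration: the URL, the field names and the 'Login' value are hypothetical and should be taken from the captured traffic. The `btn` field is included on the assumption that it matters, because the page's doSubmit() refuses to submit when `btn` is empty.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Hypothetical URL; use the real form action from the captured traffic.
my $login_url = 'http://example.com/login';

# Build (but do not send) a form-encoded POST request.
sub build_login_request {
    my ($url, $user, $pass) = @_;
    return POST($url, [
        user     => $user,
        password => $pass,
        btn      => 'Login',   # assumed value; doSubmit() only checks it is non-empty
    ]);
}

my $req = build_login_request($login_url, 'someuser', 'wrongpass');

# Show the request line, headers and encoded body for comparison.
print $req->as_string;
```

Once the printed request matches the browser's, it can be sent with `$mech->request($req)`, since WWW::Mechanize accepts an HTTP::Request object there.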
  7. "Brian Wakem" <> wrote in message
    news:...
    > I use firefox with the Live HTTP Headers plugin -
    > http://livehttpheaders.mozdev.org/ - which will show all the communication
    > going back and forth. Just submit the form and watch what happens. Then
    > emulate this with $mech->post


    You can also use the web scraping proxy to catch all the HTTP traffic:
    http://www.research.att.com/~hpk/wsp/
    Brian Helterline, Aug 19, 2005
    #7
  8. Guest

    Bryan Balfour wrote:
    > > WWW::Mechanize does not 'do' javascript, so you are screwed. Well not quite,
    > > you have to work out what the javascript does to the form params and
    > > construct your own POST request accordingly.
    > >

    > I was afraid that would be the case. Looks like I'll need to learn a
    > bit of javascript.
    >
    > As a thought, do you know if there are any tools around that would
    > enable me to run the html I've captured (or the actual html) and show
    > me what's posted? (Ideally single stepping with watches.)
    >
    > Thanks for taking the time and trouble of replying. No doubt I'll be
    > posting again soon as I delve into POST requests etc.
    >


    You can also see what LWP is doing behind the curtain by adding:


    use LWP::Debug qw(+);


    hth,
    --
    Charles DeRykus
    , Aug 19, 2005
    #8
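As far as I know, later libwww-perl releases deprecate LWP::Debug, so the line above may print nothing on a current install. A rough equivalent, sketched here on the assumption that a handler-capable LWP::UserAgent is installed, is to register handlers that dump each request and response. No request is made in this sketch; the commented-out `get` and its `$url` are placeholders.

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new(autocheck => 1);

# Print each request just before it is sent ...
$mech->add_handler(request_send  => sub { shift->dump; return });
# ... and each response once it has been fully received.
$mech->add_handler(response_done => sub { shift->dump; return });

# From here on, every fetch shows its headers and the start of its body:
# $mech->get($url);
```

The handlers return nothing on purpose: a `request_send` handler that returns a response object would short-circuit the real request.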
  9. Thanks for that. I'll give it a go.

    P.S. Thanks too to all the others who took the trouble of replying with
    their suggestions.

    Bryan
    Bryan Balfour, Aug 20, 2005
    #9