LWP::UserAgent::request: Simple response: Not Found

Discussion in 'Perl Misc' started by Phil Powell, Feb 18, 2008.

  1. Phil Powell

    Phil Powell Guest

    I am using WWW::Mechanize to scrape https://blah.foo.www.domain.com/main/index.html,
    a secured page that can only be reached after Netegrity
    Siteminder-based authentication via https://blah.foo.www.domain.com/registration/login.html

    This is what normally happens if you go through it manually:

    1) You type in https://blah.foo.www.domain.com/main/index.html
    2) You are not detected as logged in, so you're automatically
    redirected to https://blah.foo.www.domain.com/registration/login.html
    3) You fill out an online form with your username and password
    4) You submit, which takes you to https://blah.foo.www.domain.com/login/login.fcc
    (Siteminder), which checks your entry for authentication
    5) You are successfully logged in, so you go on to
    https://blah.foo.www.domain.com/main/index.html

    I have a Perl script that should perform all of steps 1 - 5. However, it
    breaks at step 5 and never reaches the intended URL because it receives a
    404 Object Not Found error. In addition, my LWP debug output shows
    "LWP::UserAgent::request: Simple response: Not Found".

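    When a Siteminder login ends in a 404, one thing that can be inspected is
    where the agent intends to send the client after login. In the debug
    output further down, the login redirect carries that destination in a
    TARGET query parameter, percent-encoded and prefixed with $SM$. Below is a
    minimal sketch for decoding it using only core Perl. The helper name
    decode_sm_target is hypothetical, and the sample URL is an illustrative
    stand-in modeled on the log, not a verbatim copy of it.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the TARGET parameter out of a login redirect URL,
# percent-decode it, and strip the "$SM$" prefix seen in the debug output.
sub decode_sm_target {
    my ($url) = @_;
    my ($query) = $url =~ /\?(.*)\z/s or return;
    for my $pair (split /&/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $value && $name eq 'TARGET';
        $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # percent-decode
        $value =~ s/\A\$SM\$//;                          # drop the $SM$ prefix
        return $value;
    }
    return;
}

# Illustrative URL only (assumption); the real one is wrapped across log lines.
my $redirect = 'https://blah.foo.www.domain.com/registration/login_1?'
             . 'TYPE=33554432&METHOD=GET&'
             . 'TARGET=$SM$https%3a%2f%2fblah%2efoo%2ewww%2edomain%2ecom%2fmain%2f%2findex%2ehtml';

my $target = decode_sm_target($redirect);
print "Post-login target: $target\n";
print "Note: target path contains a doubled slash\n"
    if $target =~ m{[^:]//};
```

    Whether a doubled slash in the target matters for this particular server
    is not something the log settles; the sketch only makes the value easy to
    read and compare against the URL that finally returns the 404.
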
    Here is how I'm trying to do it using Perl:

    #!/strawberry/perl/bin/perl

    use strict;
    use warnings;
    use HTTP::Cookies;
    use HTTP::Headers;
    use WWW::Mechanize;
    use LWP::Debug qw(+);

    require "C:\\Documents and Settings\\me\\Desktop\\subs.pl";

    if (my $err = read_cfg("C:\\Documents and Settings\\me\\Desktop\\file.cfg")) {
        print(STDERR $err, "\n");
        exit(1);
    }

    my $mech = WWW::Mechanize->new();
    $mech->agent_alias('Windows IE 6');
    $mech->cookie_jar(HTTP::Cookies->new(autosave => 1));
    $mech->add_header('UID' => $CFG::cfg{'USERNAME'}, 'cn' =>
    $CFG::cfg{'CN'});

    # FIRST PART: GO TO INDEX WHICH WILL REDIRECT YOU TO LOGIN AS YOU HAVEN'T YET
    my $response = $mech->get($CFG::cfg{'URL'});
    die 'Error at ', $CFG::cfg{'URL'}, "\n", $response->status_line, "\nAborting"
        unless $response->is_success;
    print "\n\nSUCCESS data type: ", $response->content_type, "\n\n";

    # SECOND PART: LOGIN FROM LOGIN PAGE WITH USERNAME AND PASSWORD PROVIDED
    $mech->submit_form(
        form_number => 1,
        fields      => {
            USER     => $CFG::cfg{'USERNAME'},
            PASSWORD => $CFG::cfg{'PASSWORD'},
        },
    ); # WILL SUBMIT TO /registration/login_2 TO BUNDLE USERNAME/PASSWORD TO SEND TO SITEMINDER AGENT

    # THIRD PART: GO TO /registration/login_2 TO HANDLE LOGIN REQUEST WITH FORM ENTRIES
    my $content = $mech->content;
    my $urlHeader = $CFG::cfg{'URL_HEADER'};
    $content =~ s[(href=['"]?)(/main)][$1$urlHeader$2]isg;
    $mech->update_html($content);
    $mech->submit(); # WILL SUBMIT TO SITEMINDER AGENT AND RETURN VERIFIED

    $content = $mech->content;
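
    The link-rewriting step in the third part depends on binding the
    substitution to the page content with =~, so that the string is edited in
    place. A minimal sketch of that step in isolation is below. The value of
    $url_header and the HTML fragment are made-up stand-ins for
    $CFG::cfg{'URL_HEADER'} and the real page, which are not shown in this
    post.

```perl
use strict;
use warnings;

# Assumed stand-in for $CFG::cfg{'URL_HEADER'}.
my $url_header = 'https://blah.foo.www.domain.com';

# Made-up fragment of a page whose links are site-relative.
my $content = q{<a href="/main/index.html">Home</a> <a HREF='/main/help.html'>Help</a>};

# =~ binds the substitution to $content and edits it in place;
# the return value is the number of replacements made.
my $count = ($content =~ s[(href=['"]?)(/main)][$1$url_header$2]ig);

print "$count link(s) rewritten\n";
print "$content\n";
```

    Checking the replacement count is a cheap way to confirm that the pattern
    matched anything at all before the modified HTML is handed back to
    update_html.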

    ------------------------------------------------------------------------------------------------------------------------------------------

    However, it never successfully gets to step 5. Here is the output, which
    will hopefully explain why:

    C:\>perl "C:\\Documents and Settings\\me\\Desktop\\file.pl"
    LWP::UserAgent::new: ()
    LWP::UserAgent::request: ()
    HTTP::Cookies::add_cookie_header: Checking blah.foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .com for cookies
    LWP::UserAgent::send_request: GET https://blah.foo.www.domain.com/main/index.html
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::protocol::http::request: ()
    LWP::UserAgent::request: Simple response: Found
    LWP::UserAgent::request: ()
    HTTP::Cookies::add_cookie_header: Checking blah.foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .com for cookies
    LWP::UserAgent::send_request: GET https://blah.foo.www.domain.com/registration/login_
    1?
    TYPE=33554432&REALMOID=06-3fc2790b-96fd-0006-0000-1e2300001e23&GUID=&SMAUTHREA
    SON=0&METHOD=GET&SMAGENTNAME=$SM$NKtSpmLX2VhxLeg5Fc91DK221%2bP2Wf
    %2bMwczu%2fbNNC
    LE%3d&TARGET=$SM$https%3a%2f%2fblah%2efoo%2ewww%2edomain%2ecom%2fmain
    %2f
    2f%2findex%2ehtml
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::protocol::http::request: ()
    LWP::protocol::collect: read 469 bytes
    LWP::UserAgent::request: Simple response: Found
    LWP::UserAgent::request: ()
    HTTP::Cookies::add_cookie_header: Checking blah.foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .com for cookies
    LWP::UserAgent::send_request: GET https://blah.foo.www.domain.com/registration/login_
    1/?
    TYPE=33554432&REALMOID=06-3fc2790b-96fd-0006-0000-1e2300001e23&GUID=&SMAUTHRE
    ASON=0&METHOD=GET&SMAGENTNAME=$SM$NKtSpmLX2VhxLeg5Fc91DK221%2bP2Wf
    %2bMwczu%2fbNN
    CLE%3d&TARGET=$SM$https%3a%2f%2frup2%2edev1%2eprime%2eirs%2ecom
    %2fsemail%2fviews
    %2f%2findex%2ehtml
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::protocol::http::request: ()
    LWP::protocol::collect: read 4096 bytes
    LWP::protocol::collect: read 284 bytes
    LWP::protocol::collect: read 4096 bytes
    LWP::protocol::collect: read 619 bytes
    HTTP::Cookies::extract_cookies: Set cookie JSESSIONID => 0000CNUANTLZENJHYTKEG3SUSXQ:10i31rv29
    LWP::UserAgent::request: Simple response: OK


    SUCCESS data type: text/htmlcharset=ISO8859-1

    LWP::UserAgent::request: ()
    HTTP::Cookies::add_cookie_header: Checking blah.foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: - checking cookie path=/
    HTTP::Cookies::add_cookie_header: - checking cookie JSESSIONID=0000CNUANTLZENJHYTKEG3SUSXQ:10i31rv29
    HTTP::Cookies::add_cookie_header: it's a match
    HTTP::Cookies::add_cookie_header: Checking .foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .com for cookies
    LWP::UserAgent::send_request: POST https://blah.foo.www.domain.com/registration/login_2/0,,,00.html
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::protocol::http::request: ()
    LWP::protocol::collect: read 815 bytes
    LWP::UserAgent::request: Simple response: OK
    LWP::UserAgent::request: ()
    HTTP::Cookies::add_cookie_header: Checking blah.foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: - checking cookie path=/
    HTTP::Cookies::add_cookie_header: - checking cookie JSESSIONID=0000CNUANTLZENJHYTKEG3SUSXQ:10i31rv29
    HTTP::Cookies::add_cookie_header: it's a match
    HTTP::Cookies::add_cookie_header: Checking .foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .com for cookies
    LWP::UserAgent::send_request: POST https://blah.foo.www.domain.com/login/login.fcc
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::protocol::http::request: ()
    HTTP::Cookies::extract_cookies: Set cookie FORMCRED => +Jy1O0pLR06/i8PvjzNJjBkjQgIms8tauo/Va9iPbL7dVx2DsUD2UTEg1ebOw+yVzRzlV3t7ziD8EFjzHX1WYBO50h2gCkRGgk2Z1BVCFFJ97ixaop+sW3F39bQWPGpanf6nDrJYkzY=
    LWP::UserAgent::request: Simple response: Found
    LWP::UserAgent::request: ()
    HTTP::Cookies::add_cookie_header: Checking blah.foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: - checking cookie path=/
    HTTP::Cookies::add_cookie_header: - checking cookie JSESSIONID=0000CNUANTLZENJHYTKEG3SUSXQ:10i31rv29
    HTTP::Cookies::add_cookie_header: it's a match
    HTTP::Cookies::add_cookie_header: Checking .foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: - checking cookie path=/
    HTTP::Cookies::add_cookie_header: - checking cookie FORMCRED=+Jy1O0pLR06/i8PvjzNJjBkjQgIms8tauo/Va9iPbL7dVx2DsUD2UTEg1ebOw+yVzRzlV3t7ziD8EFjzHX1WYBO50h2gCkRGgk2Z1BVCFFJ97ixaop+sW3F39bQWPGpanf6nDrJYkzY=
    HTTP::Cookies::add_cookie_header: it's a match
    HTTP::Cookies::add_cookie_header: Checking foo.www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking www.domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking domain.com for cookies
    HTTP::Cookies::add_cookie_header: Checking .com for cookies
    LWP::UserAgent::send_request: GET https://blah.foo.www.domain.com/main/index.html
    LWP::UserAgent::_need_proxy: Not proxied
    LWP::protocol::http::request: ()
    LWP::protocol::collect: read 42 bytes
    HTTP::Cookies::extract_cookies: Set cookie FORMCRED =>
    HTTP::Cookies::extract_cookies: Set cookie smretry_cookie => true
    HTTP::Cookies::extract_cookies: Set cookie SMSESSION =>
    NffP0fewmFwhTPvRVx/EXE1H
    5236J2xl/olTx6HGAv5Hs3pVASSK6Px4J/9kEX5VuYN/f
    +XgOIDojsKNrCqFZZK4vPxFtWBWIuifF/n2
    VWi0h02TP0TvDPhSQ63uzau1cHPygGFSBSA3qLl+g0S3s/llOT27KR6/
    Vz35ddJWxiNHkU4b6f7G6tll
    /dKJjH7STmJ5stZNWIglZu7IAgRcnn5S7txqNynNkn9XevKI60q/
    yMqTf1oIb5ZrMHRQbXpB3SEkBFyf
    COC9AR7/e2+U3qyMSAJ6p6I0iKNB/
    GHzwWztJ6pEc1OJNLPE78BV9rXcom0FjWjOopWq3rrGormWKZPa
    tVY2K3+GeUfn/5ygqlTzjUycF+8c3v6J/PPGCZ+3bFQvIc8vxtXomCtkTQ4cSnUk/
    mR2+MndMD5hK6Wc
    SxdaktyUEIF7d621AEvtxXFYVI98cb5a25l6jLWaVrMHN76wPns
    +DkitKgwswXyrgCNbGAFEjrgzBMbq
    qeFfzDOJXVUvsTuVXPKD2LJYcN4HwaoK1h3Nw+4NUhHGcz9tvisv20zGMMKIkyIMl/
    LXVwe8vQ0MCme4
    0WFcIzkToDXZsSe7E7Mhk7D9t6u/SE2oeFheS07qyFW217MGTsOgbnN80Mv0N/
    g6TsKxTOROB3QTuo43
    rXdOmH9zlmxtwP4uJ13KvN8JDtEdj5MK/
    kN8ePQtuMK240f3vfcYqzI33nXnfhI2N6hkqo6yyqGjW0dO
    RQAMYn5/
    qsI0yEbFf2h1oVECf8SMYF2XKEz32uMH9vwPC14FwGJPaVeOgB9QMrFgVFDyzf2xhrODtxcI
    Py+oMGxROADxOo1ibIrBEwBKlJlg7moahW0OnZRSF16n8R2yQu6ctuVy2xSGAErXEbWQA
    +YeeI+iaavK
    GaHyqNaYxv5M95SXg9GPj5H3oD2z1hyouU9JvUTsTGBdM8w2QwVCXcFmarP0Hb9s66YjAJiFPCvNztK8
    mwATiJvARlJV76TfVzu86BHD4+sPIWlyoehHDUSoPSaSlU/
    SwWV05gFqgx8rNg26zmAchudkqZXFMLx8
    JdK80rs8x4JaBiIhNzfXyYzKo/qfm2T4X7D3E4smxdqWF9fzTaXd9ke5Wp/
    iCDq9HfEbr0PTBXDpLl8u
    YdISKfD9alvHIop8d3JDucRMswbtx3K0Mgx+rQAvKMscAlvHVK/
    TzZX6QBRUdvpzl1oWtO3bn3HNjdSv
    AFVK7dZCTZVcZw==
    HTTP::Cookies::extract_cookies: Set cookie smretry_cookie => true
    LWP::UserAgent::request: Simple response: Not Found
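
    A trace this long is easier to follow when reduced to one line per
    request. Below is a minimal sketch that pairs each send_request line with
    the "Simple response" line that follows it, using only core Perl. The
    helper name summarize_trace is hypothetical, the sample lines are taken
    from the last two requests in the output above, and any log lines that
    were wrapped by the terminal would need to be rejoined before being fed
    in.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce LWP::Debug output to "METHOD URL -> status" pairs.
sub summarize_trace {
    my (@lines) = @_;
    my (@trace, $pending);
    for my $line (@lines) {
        if ($line =~ /send_request:\s+(\w+)\s+(\S+)/) {
            my ($method, $url) = ($1, $2);
            $url =~ s/\?.*//s;              # drop the query string for readability
            $pending = "$method $url";
        }
        elsif ($line =~ /Simple response:\s+(.+?)\s*\z/ && defined $pending) {
            push @trace, "$pending -> $1";
            undef $pending;
        }
    }
    return @trace;
}

my @log = (
    'LWP::UserAgent::send_request: POST https://blah.foo.www.domain.com/login/login.fcc',
    'LWP::UserAgent::_need_proxy: Not proxied',
    'LWP::UserAgent::request: Simple response: Found',
    'LWP::UserAgent::send_request: GET https://blah.foo.www.domain.com/main/index.html',
    'LWP::protocol::collect: read 42 bytes',
    'LWP::UserAgent::request: Simple response: Not Found',
);

print "$_\n" for summarize_trace(@log);
```

    Read this way, the sample shows the POST to login.fcc answered with a
    redirect (Found) and the follow-up GET of /main/index.html answered with
    Not Found, which is the step 5 failure described above.
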
    Phil Powell, Feb 18, 2008