Logging into and parsing a website using Perl

Discussion in 'Perl Misc' started by Antwerp, Feb 16, 2005.

  1. Antwerp

    Antwerp Guest

    Hi,

    I'm trying to create a Perl script that will log into a website (the login
    form uses POST), navigate to several pages, and append the (HTML) content parsed
    from those pages to a separate log file. I'm not very familiar with this aspect
    of Perl, and have been having some trouble POSTing the form data
    while using cookies to log in.

    Visiting the site automatically redirects you to a login page. Once you fill
    out the login form and click the submit button, you are redirected to the main
    site index. The login form uses cookies to establish identity.

    I've searched through several Google resources, and have built upon what
    I've read. I *believe* I am now storing the cookies I come across when loading
    the page (using a cookie jar). However, this doesn't seem to be allowing me to
    log on to the secure areas of the site. I suspect this is because I am not
    properly sending the appropriate cookies with every request, or else am not
    properly POSTing the login form data, or otherwise not 'following' the redirect
    to the secure area of the site.

    At this point, I am just trying to get my script to log in to the
    page (using my credentials), and then display the site index.

    I would appreciate it if someone could offer me some insight and direction into using the
    appropriate modules, or else point out any flaws in my code (included below).

    ------START CODE------

    #!/usr/bin/perl -w

    use LWP::Simple;
    use LWP::UserAgent;

    use HTML::TokeParser;
    use HTML::Parser;

    use HTTP::Request::Common;
    use HTTP::Cookies;

    use POSIX;

    #----Variables----#
    $t_url='http://www.memberplushq.com/pe/register/include/processlogin.jsp';
    # This is where the log in form is located - once logged in, you are
    # redirected to the secured content (below).
    $s_url='http://www.memberplushq.com/pe/index.jsp';
    # This is where the secured content is located. If you aren't logged in, you
    # are redirected to the above.

    $login='My_username';
    $password='My Password';
    $submit_value='Login';
    #----/Variables----#

    #----User Agent Config----#
    $ua = LWP::UserAgent->new;
    $ua->cookie_jar(HTTP::Cookies->new(file => "cookies.txt", autosave => 1,
    ignore_discard => 1));
    #----/User Agent Config----#

    #----Really posting my buttons----#
    $content = $ua->request(POST $t_url , [ login_name => $login , password =>
    $password , loginSubmit => $submit_value ] );
    $ua->request(POST $t_url , [ login_name => $login , password => $password ,
    loginSubmit => $submit_value ] );
    #----/Really pressing my buttons----#

    #----Completing----#
    print "$content";
    #----Completing----#

    ------END CODE------

    As you can tell, I am *trying* to get through to the secure site. However, this
    is proving to be somewhat interesting.

    I would appreciate any guidance you can offer,

    AntWerp
    Antwerp, Feb 16, 2005
    #1

  2. Peter Scott

    Peter Scott Guest

    In article <a4yQd.6434$>,
    "Antwerp" <> writes:
    >Hi,
    >
    > I'm trying to create a Perl script that will log into a website (the login
    >form uses POST), navigate to several pages, and append the (HTML) content parsed
    >from those pages to a separate log file. I'm not very familiar with this aspect
    >of Perl, and have been having some trouble POSTing the form data
    >while using cookies to log in.


    Way too much work to go to. Get WWW::Mechanize from CPAN and you can get
    rid of most of your code.
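
For illustration, a minimal sketch of what the WWW::Mechanize version might look like. The index URL and the field names `login_name` and `password` are taken from the script in the original post; that the login page really exposes a form with those fields is an assumption, and `fetch_index` is a hypothetical helper name. Mechanize keeps an in-memory cookie jar and follows the redirect after the POST on its own, which covers the two things the original script was struggling with.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# URL from the original post; requesting it while logged out is expected
# to redirect to the login page.
my $index_url = 'http://www.memberplushq.com/pe/index.jsp';

# Hypothetical helper: log in and return the HTML of the site index.
sub fetch_index {
    my ($login, $password) = @_;

    # autocheck => 1 makes any failed request die with an error message.
    my $mech = WWW::Mechanize->new( autocheck => 1 );

    # Ask for the protected page; we should end up on the login form.
    $mech->get($index_url);

    # Pick the form that contains these fields, fill it in, and submit it.
    # Field names are assumed to match the ones used in the original script.
    $mech->submit_form(
        with_fields => {
            login_name => $login,
            password   => $password,
        },
    );

    # With the session cookie now in the jar, fetch the index again.
    $mech->get($index_url);
    return $mech->content;
}

# Run as: perl login.pl USERNAME PASSWORD
print fetch_index(@ARGV) if @ARGV == 2;
```

From there, `$mech->get(...)` on each further page and appending `$mech->content` to a log file would cover the "navigate to several pages" part.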

    --
    Peter Scott
    http://www.perlmedic.com/
    http://www.perldebugged.com/
    Peter Scott, Feb 16, 2005
    #2

  3. Antwerp

    Antwerp Guest

    Thank you for your excellent suggestion.

    :)

    Altrus


    "Peter Scott" <> wrote in message
    news:7RHQd.407246$Xk.367396@pd7tw3no...
    : In article <a4yQd.6434$>,
    : "Antwerp" <> writes:
    : >Hi,
    : >
    : > I'm trying to create a Perl script that will log into a website (the login
    : >form uses POST), navigate to several pages, and append the (HTML) content parsed
    : >from those pages to a separate log file. I'm not very familiar with this aspect
    : >of Perl, and have been having some trouble POSTing the form data
    : >while using cookies to log in.
    :
    : Way too much work to go to. Get WWW::Mechanize from CPAN and you can get
    : rid of most of your code.
    :
    : --
    : Peter Scott
    : http://www.perlmedic.com/
    : http://www.perldebugged.com/
    Antwerp, Feb 16, 2005
    #3
  4. Antwerp

    Antwerp Guest

    Thank you :)

    AntWerp


    "Bill Segraves" <> wrote in message
    news:JISQd.2909$...
    : "Antwerp" <> wrote in message
    : news:a4yQd.6434$...
    : > Hi,
    : >
    : > I'm trying to create a Perl script that will log into a website (the login
    : > form uses POST), navigate to several pages, and append the (HTML) content parsed
    : > from those pages to a separate log file. I'm not very familiar with this aspect
    : > of Perl, and have been having some trouble POSTing the form data
    : > while using cookies to log in.
    : >
    : > Visiting the site automatically redirects you to a login page. Once you fill
    : > out the login form and click the submit button, you are redirected to the main
    : > site index. The login form uses cookies to establish identity.
    : >
    : > I've searched through several Google resources, and have built upon what
    : > I've read. I *believe* I am now storing the cookies I come across when loading
    : > the page (using a cookie jar). However, this doesn't seem to be allowing me to
    : > log on to the secure areas of the site. I suspect this is because I am not
    : > properly sending the appropriate cookies with every request, or else am not
    : > properly POSTing the login form data, or otherwise not 'following' the redirect
    : > to the secure area of the site.
    : >
    : > At this point in time, I am just trying to get my script to log in to the
    : > page (using my appropriate credentials), and then display the site index.
    : >
    : > If someone could please offer me some insight and direction into using the
    : > appropriate modules
    :
    : <snip>
    :
    : See Randal L. Schwartz' excellent article "Automatically Testing a Form":
    :
    : http://www.stonehenge.com/merlyn/WebTechniques/col43.html
    :
    : for something to help you get started.
    :
    : In addition, you should review the documentation for LWP::UserAgent,
    : HTTP::Request::Common, and HTTP::Cookies to see how to perform the steps you
    : wish to perform after you login.
    :
    : Good luck!
    : --
    : Bill Segraves
    :
    :
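
For anyone who would rather stay with the three modules named in the quoted reply, here is a rough sketch of the same login using LWP::UserAgent, HTTP::Request::Common and HTTP::Cookies directly. The URLs and field names come from the script in the first post; `login_request` and `fetch_index` are hypothetical helper names, and whether the site needs anything beyond these three fields is not known. The two points worth noting are that LWP does not follow redirects after a POST unless told to, and that `request` returns an HTTP::Response object, so the body has to be extracted before printing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;
use HTTP::Request::Common qw(POST);

# Both URLs are taken from the script in the original post.
my $login_url = 'http://www.memberplushq.com/pe/register/include/processlogin.jsp';
my $index_url = 'http://www.memberplushq.com/pe/index.jsp';

# Hypothetical helper: build the form-encoded login POST.
sub login_request {
    my ($login, $password) = @_;
    return POST(
        $login_url,
        [
            login_name  => $login,
            password    => $password,
            loginSubmit => 'Login',
        ],
    );
}

# Hypothetical helper: log in, then return the HTML of the site index.
sub fetch_index {
    my ($login, $password) = @_;

    my $ua = LWP::UserAgent->new;

    # One in-memory jar on the agent; cookies are stored and resent automatically.
    $ua->cookie_jar( HTTP::Cookies->new );

    # By default only GET and HEAD redirects are followed; allow POST as well.
    push @{ $ua->requests_redirectable }, 'POST';

    my $res = $ua->request( login_request($login, $password) );
    die 'Login failed: ' . $res->status_line . "\n" unless $res->is_success;

    $res = $ua->get($index_url);
    die 'Fetch failed: ' . $res->status_line . "\n" unless $res->is_success;

    # Return the page body rather than the response object itself.
    return $res->decoded_content;
}

# Run as: perl login.pl USERNAME PASSWORD
print fetch_index(@ARGV) if @ARGV == 2;
```

If the login still fails, printing `$res->as_string` for each response shows the status line, headers and any `Set-Cookie` values, which usually makes it clear which step is going wrong.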
    Antwerp, Feb 17, 2005
    #4