Password-protected login problem



jim simpson

I am trying to automate logging in to a number of password-protected
financial sites and downloading my statements from them. I've succeeded
with about half of them. In each case, I started with the HTML for the
login page, stripped it down to just the login FORM, and logged in using
that abbreviated login page in IE. Once I get that shortened form working,
I write a Perl script using the same INPUT fields and ACTION URL as that
form. In about half the cases that works fine; in the other half I'm unable
to log in, and a "The page cannot be displayed" page is returned.
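
The step of copying INPUT fields and the ACTION URL by hand can also be done by letting HTML::Form read the stripped-down page and build the request. The following is a minimal sketch only: the URL and field names are hypothetical placeholders, not taken from any real site, and it assumes the HTML::Form module is installed.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for a stripped-down login page; the URL and
# field names are illustrative only.
my $html = <<'HTML';
<form method="post" action="https://example.com/login">
  <input type="hidden" name="user" value="xxxxxx">
  <input type="hidden" name="pass" value="xxxxxx">
  <input type="submit" name="submit" value="&nbsp;Log in&nbsp;">
</form>
HTML

# parse() takes a base URI so that relative ACTION URLs can be resolved.
my ($form) = HTML::Form->parse( $html, 'https://example.com/' );

# click() builds the HTTP::Request that pressing the named submit
# button would produce; nothing is sent over the network here.
my $request = $form->click('submit');

print $request->method, ' ', $request->uri, "\n";
print $request->content, "\n";
```

The resulting request object can be passed to `$browser->request($request)`. Because the values come from the parsed HTML rather than from retyped strings, entities such as `&nbsp;` are decoded the way an HTML parser decodes them.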

For example, the following script fails to log in to the Scudder website
at http://www.myscudder.com/t/index.jhtml. You can see below the abbreviated
(form-only) login page, which gets me logged in just fine.

It appears that something else must be submitted for those sites
where this procedure fails to log in.
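
One way to check what a script really submits, so it can be compared with what the browser sends, is to capture the outgoing request with an LWP::UserAgent handler. This is a sketch under stated assumptions: the URL and field names are hypothetical, and the handler returns a stub response so that nothing is actually sent.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

my $browser = LWP::UserAgent->new;
my $sent    = '';

# request_send handlers run just before the request is handed to the
# protocol layer. Returning an HTTP::Response short-circuits the
# network, which keeps this sketch offline; return nothing instead to
# let the request go out for real.
$browser->add_handler(
    request_send => sub {
        my ($request) = @_;
        $sent = $request->as_string;
        return HTTP::Response->new( 200, 'OK (stub, nothing was sent)' );
    }
);

# Hypothetical URL and field names, for illustration only.
$browser->post(
    'http://example.com/login',
    [ user => 'xxxxxx', submit => '&nbsp;Log in' ],
);

# Request line, headers and the URL-encoded form body.
print $sent;
```

In the dump, a value written as `&nbsp;` goes out as those literal characters, percent-encoded, whereas a browser decodes the entity to a non-breaking space before submitting. Whether that difference matters to a given server is not something this thread establishes.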

For obvious reasons I have obscured the User Name and the Password. I
realize someone trying to help will be handicapped by not being able to
access the site.

Can someone please help me determine what else is required for the several
sites where this procedure has failed to log in?

Thanks,

Jim

########################################################
#! perl -w

use strict;

use LWP;
use HTTP::Cookies;

my $https_loginform =
    'https://investments.scudder.com/t/index.jhtml?_DARGS=/t/home/customize_areas%2Fhome_login.jhtml';

my $https_user = 'xxxxxx';
my $https_pass = 'xxxxxx';

my $browser = LWP::UserAgent->new();
$browser->protocols_allowed( ['https'] );
$browser->cookie_jar(
    HTTP::Cookies->new( file => 'abccookies.txt', autosave => 1 ) );

# Following is the "abbreviated" login page, which works fine to get
# me logged in with a browser.
#
#<html>
# <head>
# <title>Scudder Login</title>
# </head>
# <body>
# <form method="post"
#   action="https://investments.scudder.com/t/index.jhtml?_DARGS=/t/home/customize_areas%2Fhome_login.jhtml">
# <input type="hidden" name="/direct/login/LoginForm.login"
#   value="xxxxxx">
# <input type="hidden" name="_D:/direct/login/LoginForm.login" value=" ">
# <input type="hidden" name="/direct/login/LoginForm.password"
#   value="xxxxxx">
# <input type="hidden" name="_D:/direct/login/LoginForm.password"
#   value=" ">
# <input type="submit" name="submit"
#   value="&nbsp;&nbsp;Log in&nbsp;&nbsp;">
# <input type="hidden" name="_D:submit" value=" ">
# <input type="hidden" name="/direct/login/LoginForm.successUrl"
#   value="/t/login/ruby_slippers.jhtml">
# <input type="hidden" name="_D:/direct/login/LoginForm.successUrl"
#   value=" ">
# <input type="hidden" name="/direct/login/LoginForm.profileResetUrl"
#   value="/t/index.jhtml?content=/t/login/reset_password.jhtml">
# <input type="hidden"
#   name="_D:/direct/login/LoginForm.profileResetUrl" value=" ">
# <input type="hidden" name="/direct/login/LoginForm.alternateUrl"
#   value="/t/index.jhtml?content=/t/registration/index.jhtml">
# <input type="hidden" name="_D:/direct/login/LoginForm.alternateUrl"
#   value=" ">
# <input type="hidden" name="/direct/login/LoginForm.errorUrl"
#   value="/t/index.jhtml?content=/t/login/returnlogin.jhtml">
# <input type="hidden" name="_D:/direct/login/LoginForm.errorUrl"
#   value=" ">
# </form>
# </body>
#</html>
#
# End of abbreviated login page.

# Submit the same name/value pairs as the abbreviated form above.
my $response = $browser->post($https_loginform,
    [
        '/direct/login/LoginForm.login'              => $https_user,
        '_D:/direct/login/LoginForm.login'           => ' ',
        '/direct/login/LoginForm.password'           => $https_pass,
        '_D:/direct/login/LoginForm.password'        => ' ',
        'submit'                                     => '&nbsp;&nbsp;Log in&nbsp;&nbsp;',
        '_D:submit'                                  => ' ',
        '/direct/login/LoginForm.successUrl'         =>
            '/t/login/ruby_slippers.jhtml',
        '_D:/direct/login/LoginForm.successUrl'      => ' ',
        '/direct/login/LoginForm.profileResetUrl'    =>
            '/t/index.jhtml?content=/t/login/reset_password.jhtml',
        '_D:/direct/login/LoginForm.profileResetUrl' => ' ',
        '/direct/login/LoginForm.alternateUrl'       =>
            '/t/index.jhtml?content=/t/registration/index.jhtml',
        '_D:/direct/login/LoginForm.alternateUrl'    => ' ',
        '/direct/login/LoginForm.errorUrl'           =>
            '/t/index.jhtml?content=/t/login/returnlogin.jhtml',
        '_D:/direct/login/LoginForm.errorUrl'        => ' ',
    ] );

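if ($response->is_error())
{
    printf "Error on POST: %s\n", $response->status_line;
}
else
{
    # Save whatever the server returned so it can be viewed in a browser.
    open my $fh, '>', 'abclogin.html' or die "Cannot write abclogin.html: $!";
    print $fh $response->content;
    close $fh;
}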
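
One thing worth checking with a script like the one above: `is_error` is true only for 4xx and 5xx responses, and by default LWP::UserAgent follows redirects for GET and HEAD requests only. If a login POST is answered with a redirect to a success page, the script would save the redirect response itself. Whether that is what happens on this site is a guess. The sketch below only shows the setting involved and assumes nothing about the site.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $browser = LWP::UserAgent->new;

# Allow redirects to be followed after POST as well as after the
# default GET and HEAD.
push @{ $browser->requests_redirectable }, 'POST';

# Show which request methods may now be redirected.
print join( ', ', @{ $browser->requests_redirectable } ), "\n";

# After a real request, the chain of responses can be inspected with:
#   for my $r ( $response->redirects, $response ) {
#       printf "%s %s\n", $r->code, $r->request->uri;
#   }
```

Printing `$response->code` before saving the content also shows whether the saved page came from a 200 or from a 3xx response.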
 

jim simpson

Bob, thanks a lot for your response.

I'll try your Netscape suggestion and look into "Web Scraping Proxy". I had
already checked that no JavaScript is involved in this site; however, it is
involved in some of the others.

Thank you very much.

Jim
 
