How to browse using urllib2 and cookielib the correct way

jnair

Hi,

I am using Python 2.4 with "urllib2" and "cookielib".
In line "5" below I provide my credentials to
log in to a web site. The first attempt fails,
judging from the output of line "6".
I try again and the second time I succeed, judging
from the output of line "8".

Using the "twill" module (http://www.idyll.org/~t/www-tools/twill/) instead,
the login succeeds on the first attempt (lines "10" to "19").

Can anybody explain why, in the first case, I need two attempts?


1) >>> import urllib2, urllib, cookielib
2) >>> cj = cookielib.CookieJar()
3) >>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
4) >>> data = urllib.urlencode({"username": "user", "password": "****"})
5) >>> fp = opener.open("https://rhn.redhat.com/rhn/LoginSubmit.do", data)
6) >>> fp.url
'https://rhn.redhat.com/rhn/ReLogin.do?url_bounce=/network/index.pxt'
7) >>> fp = opener.open("https://rhn.redhat.com/rhn/LoginSubmit.do", data)
8) >>> fp.url
'https://rhn.redhat.com/network/index.pxt'



#twill module

10) >>>from twill.commands import *
11) >>>go("https://rhn.redhat.com")
12) ==> at https://rhn.redhat.com
13) 'https://rhn.redhat.com'
14) >>>fv("2","username","eipl")
15) >>>fv("2","password","ensim1234")
16) >>>submit()
17) Note: submit is using submit button: name="None", value=" Sign In "
18) >>>url('https://rhn.redhat.com/network/index.pxt')
19) 'https://rhn.redhat.com/network/index.pxt'

Thanks
Jitu
 
Edward Elliott

Can anybody explain why, in the first case, I need two attempts?

I would guess it's because Red Hat requires your browser to submit a session
cookie with the login form. In the urllib2 example, the first request you
make submits the login form data directly. Since it's your first hit
on their site, you don't have a cookie yet. Someone browsing interactively
would at least load the login page before submitting it.

Your twill example takes care of this by requesting a page before trying to
log in.

That would be my guess.
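
If that guess holds, the first POST in the question fails but its response still sets the cookie, which the CookieJar then sends on the second attempt. A minimal sketch of the workaround, under that assumption: fetch any page with the same cookie-aware opener before posting the form. It uses the Python 3 module names (urllib.request and http.cookiejar, which replaced urllib2 and cookielib). The helper names make_opener and login are made up for illustration, and none of this has been checked against the Red Hat site.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_opener():
    """Build a cookie-aware opener, like lines 2-3 of the question."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login(opener, landing_url, login_url, fields):
    """GET a page first, then POST the login form with the same opener.

    Returns the final URL after redirects (what fp.url shows in the question).
    """
    # Step 1: plain GET. If the server sets a session cookie here,
    # HTTPCookieProcessor stores it in the jar.
    opener.open(landing_url).close()

    # Step 2: POST the form fields. The opener attaches any stored
    # cookies to this request automatically.
    data = urllib.parse.urlencode(fields).encode("ascii")
    response = opener.open(login_url, data)
    try:
        return response.url
    finally:
        response.close()
```

With the URLs from the question, this would be called as login(opener, "https://rhn.redhat.com", "https://rhn.redhat.com/rhn/LoginSubmit.do", {"username": "user", "password": "****"}). Looping over the jar afterwards and printing each cookie's name is a quick way to see whether the first GET really did set a session cookie.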
 
John J. Lee

Edward Elliott said:
I would guess it's because Red Hat requires your browser to submit a session
cookie with the login form. In the urllib2 example, the first request you
make submits the login form data directly. Since it's your first hit
on their site, you don't have a cookie yet. Someone browsing interactively
would at least load the login page before submitting it.

Your twill example takes care of this by requesting a page before trying to
log in.

That would be my guess.

Uh, yeah you're right actually. Forget what I said about Refresh...


John
 
