cookielib incorrectly escapes cookie

  • Thread starter: BJörn Lindqvist

BJörn Lindqvist

Hello,

I have some very serious trouble getting cookies to work. After a lot
of work (urllib2 is severely underdocumented, arcane and overengineered
btw) I'm finally able to accept cookies from a server. But I'm still
unable to return them to a server. Specifically, the script I'm trying
to write logs on to a server, gets a session cookie and then tries to
access a secure page using the same session cookie. But the cookie
header cookielib produces is very different from the header it
received.

This example demonstrates it:

import cookielib
import urllib
import urllib2

# Install an opener that can handle cookies
policy = cookielib.DefaultCookiePolicy(rfc2965 = True)
cj = cookielib.CookieJar(policy)
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
urllib2.install_opener(opener)

# Login to the server
url = "http://server/dologin.htm"
params = urllib.urlencode({"user" : "foo", "pass" : "pwd"})

req = urllib2.Request(url, params)
handle = urllib2.urlopen(req)

# So far so good, the request should have added a cookie to the jar.
# The cookie header looks like this:
#
# Set-Cookie: SessionId="acf941a1fb4895ed"; Version=1; Path=/
#
assert len(cj) == 1

# Hack around the code in cookielib.CookieJar._cookie_attrs(),
# specifically the part that quotes, search for "quote_re" in the
# file cookielib.py.
cj._cookies_for_request(req)[0].version = 0

# Now request a secure page from the server
req = urllib2.Request("http://server/secure.htm")
cj.add_cookie_header(req)
handle = urllib2.urlopen(req)

# Here is where it doesn't work unless the hack is applied. The cookie
# header that is sent without the hack looks like this:
#
# Cookie: $Version=1; SessionId=\"66b908e5025d93ed\"; $Path="/"
#
# It is not accepted by the server, probably because the SessionID
# string is wrong.
 

John J. Lee

BJörn Lindqvist said:
I have some very serious trouble getting cookies to work. After a lot
of work (urllib2 is severely underdocumented, arcane and overengineered
btw) I'm finally able to accept cookies from a server. But I'm still

And a good day to you too ;-)

In passing, there's a new HOWTO document on urllib2 here, which you
may find helpful:

http://svn.python.org/view/python/trunk/Doc/howto/urllib2.rst?rev=46062&view=markup


Doesn't seem to be part of the build process yet, so not available yet
in nicely-formatted HTML form on the python.org website -- I guess
it's included in HTML format in 2.5 beta1, though.

Note that that document is substantially rewritten over the version
that was originally on Michael's web site, from which the HOWTO
originally came. I haven't checked whether the version on Michael's
website has been updated recently, so use the version linked to above
instead.

unable to return them to a server. Specifically, the script I'm trying
to write logs on to a server, gets a session cookie and then tries to
access a secure page using the same session cookie. But the cookie
header cookielib produces is very different from the header it
received.

Well (sigh), I didn't make all that up, you know ;-) Believe it or
not, that's what's supposed to happen if you send Version=1 cookies
(though few browsers ever supported it). In case it's your own
server, I should note that I don't know of any reason for an internet
server ever to send Version=1 cookies, given what the majority of
browsers actually do. However, since the cookie protocols (plural)
are, in practice, ill-defined (which is no individual's fault,
really), cases that work in popular browsers should usually be fixed.

Please test to make sure your problem goes away with Python 2.5 beta1:
I believe this bug is already fixed. Please do try it though: it's
unlikely that anybody else has tested the fix. I think beta2 is due
on Wednesday 12th, so it's advisable to get in quick if you want this
to work in 2.5 (please Cc: me personally to let me know whether it
works for you).

Note that it should work for you in Python 2.5 if and only if (not
rfc2965 or rfc2109_as_netscape) is true, where rfc2109_as_netscape and
rfc2965 are constructor arguments of DefaultCookiePolicy. To
understand why (on some level, anyway), read the in-development docs
for DefaultCookiePolicy here:

http://docs.python.org/dev/lib/module-cookielib.html
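For anyone who wants to see the effect of those two constructor arguments without a live server, here is a minimal sketch. It assumes Python 3, where cookielib lives on as http.cookiejar and urllib2 as urllib.request; the FakeResponse class and the www.example.com URLs are hypothetical stand-ins, and the Set-Cookie value is the one quoted earlier in the thread. It feeds the same Version=1 cookie through three policies and prints the Cookie header each one would send back.

```python
from email.message import Message
from http.cookiejar import CookieJar, DefaultCookiePolicy
from urllib.request import Request


class FakeResponse:
    """Hypothetical stand-in for a server response; CookieJar only calls info()."""

    def __init__(self, set_cookie):
        self._headers = Message()
        self._headers["Set-Cookie"] = set_cookie

    def info(self):
        return self._headers


def cookie_header_for(policy):
    """Store the Version=1 cookie from the thread, then build the Cookie header."""
    jar = CookieJar(policy)
    login = Request("http://www.example.com/dologin.htm")
    jar.extract_cookies(
        FakeResponse('SessionId="acf941a1fb4895ed"; Version=1; Path=/'), login)
    secure = Request("http://www.example.com/secure.htm")
    jar.add_cookie_header(secure)
    return secure.get_header("Cookie")


# Default policy: rfc2965 is off, so the cookie is treated as a Netscape cookie.
netscape_style = cookie_header_for(DefaultCookiePolicy())
# RFC 2965 switched on: the cookie keeps Version=1 and its value gets re-quoted.
rfc2965_style = cookie_header_for(DefaultCookiePolicy(rfc2965=True))
# RFC 2965 on, but RFC 2109 cookies explicitly downgraded to Netscape handling.
downgraded = cookie_header_for(
    DefaultCookiePolicy(rfc2965=True, rfc2109_as_netscape=True))

for label, header in [("default", netscape_style),
                      ("rfc2965=True", rfc2965_style),
                      ("rfc2965=True, rfc2109_as_netscape=True", downgraded)]:
    print(label, "->", header)
```

If the behaviour discussed in this thread still holds, only the middle case should produce the $Version=1 header with backslash-escaped quotes; the other two should send the value back as it was received.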


Thanks for the report.

If you'd like a better workaround than the one you have for older
Pythons, I'll be happy to post one if you'll test this with 2.5 (no
good deed goes unpunished ;-)


[...]
# Here is where it doesn't work unless the hack is applied. The cookie
# header that is sent without the hack looks like this:
#
# Cookie: $Version=1; SessionId=\"66b908e5025d93ed\"; $Path="/"
#
# It is not accepted by the server, probably because the SessionID
# string is wrong.

There is a bug here, I think: I think the quoting is indeed incorrect,
but probably not for the reason you expect (also, on a separate point,
the funny-looking $Version and $Path are at least strictly correct,
and for example my copy of the "lynx" browser does send them). I
won't try to explain the details here.

Since the fix would likely be complicated and risky, and of benefit
only in very unusual circumstances, I don't intend to fix it at this
stage of the Python release process. It will not affect you when
using Python 2.5, as long as (not rfc2965 or rfc2109_as_netscape) is
true (see above for the definition of those names). That's true by
default in 2.5, so all you should need is:

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
opener.open("http://www.example.com/")


(Unless you want to get at the CookieJar, e.g. to load and save
cookies, in which case go ahead and override the default CookieJar by
passing one to the HTTPCookieProcessor as you do in the code you
posted.)
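As a rough illustration of the load-and-save case, the sketch below passes a file-backed jar to HTTPCookieProcessor and round-trips one cookie through disk. It assumes Python 3 (http.cookiejar and urllib.request instead of cookielib and urllib2); FakeResponse, the temporary file name and the example.com URL are hypothetical, and the jar is filled by hand here only so that the example runs without a network.

```python
import os
import tempfile
from email.message import Message
from http.cookiejar import LWPCookieJar
from urllib.request import HTTPCookieProcessor, Request, build_opener


class FakeResponse:
    """Hypothetical stand-in for a server response; CookieJar only calls info()."""

    def __init__(self, set_cookie):
        self._headers = Message()
        self._headers["Set-Cookie"] = set_cookie

    def info(self):
        return self._headers


# A file-backed jar, handed to the processor instead of the default CookieJar.
path = os.path.join(tempfile.mkdtemp(), "cookies.lwp")
jar = LWPCookieJar(path)
opener = build_opener(HTTPCookieProcessor(jar))  # opener.open(...) would fill the jar

# Offline substitute for a real login request: put one session cookie in the jar.
jar.extract_cookies(FakeResponse("SessionId=acf941a1fb4895ed; Path=/"),
                    Request("http://www.example.com/dologin.htm"))

# Session cookies are marked "discard", so they are only written out on request.
jar.save(ignore_discard=True)

# Later (for instance in another run), load the cookies back from the same file.
restored = LWPCookieJar(path)
restored.load(ignore_discard=True)
print([(cookie.name, cookie.value) for cookie in restored])
```

MozillaCookieJar could be used the same way if a cookies.txt style file is preferred.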

I also note that you're adding an HTTPCookieProcessor, *and* also
calling .add_cookie_header(). HTTPCookieProcessor's job is to call
.add_cookie_header() / .extract_cookies() for you (even on redirects,
where you never get the opportunity to do it "manually"). You never
need to call those functions yourself if using urllib2.
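To make that division of labour concrete, here is a small offline sketch, assuming Python 3 (urllib.request and http.cookiejar in place of urllib2 and cookielib). It calls the processor's http_response and http_request hooks directly, which is what the opener does around every request; FakeResponse and the example.com URLs are hypothetical stand-ins for a real server.

```python
from email.message import Message
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, Request, build_opener


class FakeResponse:
    """Hypothetical stand-in for a server response; CookieJar only calls info()."""

    def __init__(self, set_cookie):
        self._headers = Message()
        self._headers["Set-Cookie"] = set_cookie

    def info(self):
        return self._headers


jar = CookieJar()
processor = HTTPCookieProcessor(jar)
opener = build_opener(processor)  # opener.open(url) would invoke both hooks itself

# Response hook: the processor calls jar.extract_cookies() on our behalf.
login = Request("http://www.example.com/dologin.htm")
processor.http_response(login, FakeResponse("SessionId=acf941a1fb4895ed; Path=/"))

# Request hook: the processor calls jar.add_cookie_header() on our behalf.
secure = Request("http://www.example.com/secure.htm")
processor.http_request(secure)

cookie_header = secure.get_header("Cookie")
print(cookie_header)
```

In a real script neither hook is called by hand: opener.open() on the login URL followed by opener.open() on the secure URL is enough, with no explicit add_cookie_header() call.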

HTH!


John
 
