ClientCookie bug

Mark Carter

I am using Windows 98, python 2.3, ClientCookie 0.4.3a.

When I do:
import ClientCookie
import os
c = ClientCookie.MSIECookieJar(delayload=1)
c.load_from_registry()

I get the response:
Traceback (most recent call last):
  File "C:\My Documents\mycookies.py", line 33, in ?
    c.load_from_registry()
  File "C:\PYTHON23\Lib\site-packages\ClientCookie\_MSIECookieJar.py", line 230, in load_from_registry
    self.load(filename, ignore_discard, ignore_expires)
  File "C:\PYTHON23\Lib\site-packages\ClientCookie\_MSIECookieJar.py", line 245, in load
    self._really_load(index, filename, ignore_discard, ignore_expires)
  File "C:\PYTHON23\Lib\site-packages\ClientCookie\_MSIECookieJar.py", line 258, in _really_load
    user_name = string.lower(os.environ['USERNAME'])
  File "C:\PYTHON23\Lib\os.py", line 417, in __getitem__
    return self.data[key.upper()]
KeyError: 'USERNAME'

Basically, it doesn't like USERNAME as an environment variable.
 
Syver Enstad

Mark Carter said:
I am using Windows 98, python 2.3, ClientCookie 0.4.3a.
[...]
KeyError: 'USERNAME'

Basically, it doesn't like USERNAME as an environment variable.

I suspect this only works on "real" win32 (the NT family) operating
systems, which define USERNAME, and not hybrid dos/win32 systems like
the win9x family.
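
A minimal sketch of the failure mode described above, assuming nothing about ClientCookie's internals beyond what the traceback shows: indexing os.environ raises KeyError when the variable is unset, whereas .get() lets the caller supply a fallback. The helper name lookup_user_name and its default parameter are hypothetical, not part of ClientCookie.

```python
import os


def lookup_user_name(environ=None, default=None):
    """Return the lower-cased USERNAME value, or `default` if it is unset.

    Hypothetical helper for illustration; not part of ClientCookie.
    """
    if environ is None:
        environ = os.environ
    name = environ.get("USERNAME")  # .get() returns None instead of raising
    if name is None:
        return default
    return name.lower()


# NT-family systems define USERNAME; win9x may not.
print(lookup_user_name({"USERNAME": "Mark"}))
print(lookup_user_name({}, default="mark carter"))
```

Passing a plain dict in place of os.environ keeps the sketch testable on any machine, whether or not USERNAME happens to be set there.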
 
John J. Lee

Syver Enstad said:
I am using Windows 98, python 2.3, ClientCookie 0.4.3a.

When I do:
import ClientCookie
import os
c = ClientCookie.MSIECookieJar(delayload=1)
c.load_from_registry()

I get the response: [...]
KeyError: 'USERNAME'

Basically, it doesn't like USERNAME as an environment variable.

I suspect this only works on "real" win32 (the NT family) operating
systems, which define USERNAME, and not hybrid dos/win32 systems like
the win9x family.

Anybody know the best way to get the username without win32all
installed?

Mark: I don't have a win9x box, but try replacing that
os.environ['USERNAME'] with getpass.getuser(). You'll need to stick
an import getpass in there too, of course. Let me know if that works.

The rest of the code should work OK on win9x.


John
 
Mark Carter

I am using Windows 98, python 2.3, ClientCookie 0.4.3a.
I suspect this only works on "real" win32 (the NT family) operating

I have a fix/workaround for this - and would like it included in ClientCookie.
It sorts my problems out real nice! Interested?

BTW, I can't seem to locate the project on Sourceforge.
 
Mark Carter

John J. Lee said:
[...]
Mark: I don't have a win9x box, but try replacing that
os.environ['USERNAME'] with getpass.getuser(). You'll need to stick
an import getpass in there too, of course. Let me know if that works.
[...]

My apologies - the bug exists in my own brain, not in the code. I had in mind
that a fix would be required in ClientCookie - but now I know better.

To get it to work, I simply called:
os.environ['USERNAME'] = 'mark carter' # or whatever your name is
before calling
c.load_from_registry() # c is a MSIECookieJar

It's an embarrassingly simple solution to the problem - but it took me a while
to figure it out. Although there is nothing wrong with the code, perhaps it would
help new users of the library to mention it. Maybe the solution is somewhat
obvious in retrospect - but it took me a while before the penny dropped.

Another approach - which I actually prefer - is to copy the cookie to the local
directory and load it using load_cookie_data(). At first, I was having no end
of problems with it - until I discovered that you should use the (binary)
cookie, not an ASCII cookie that you obtain from performing a cookie export
from MSIE. ... and don't forget to call ClientCookie.MSIECookieJar() WITHOUT
the delayload argument.

So, sorry for the false alarm - but hope my investigations will prove useful to
others.
 
John J. Lee

John J. Lee wrote: [...]
My apologies - the bug exists in my own brain, not in the code. I had in mind

Your brain is fine, Mark. It's a real bug.

that a fix would be required in ClientCookie - but now I know better.

To get it to work, I simply called:
os.environ['USERNAME'] = 'mark carter' # or whatever your name is
before calling
c.load_from_registry() # c is a MSIECookieJar

That's a workaround, but it shouldn't be necessary. Would you mind
trying the getpass fix I suggested? I don't know if getpass.getuser
works on w9x, but it's worth a shot.


[...]
Another approach - which I actually prefer - is to copy the cookie to the local
directory and load it using load_cookie_data(). At first, I was having no end

Should be no need for that.

of problems with it - until I discovered that you should use the (binary)
cookie, not an ASCII cookie that you obtain from performing a cookie export
from MSIE. ... and don't forget to call ClientCookie.MSIECookieJar() WITHOUT
the delayload argument.
[...]

Why without the delayload argument? There are features that don't yet
work with that, but that's documented. Is something else not working?


John
 
Mark Carter

I'll try it on a win 98 machine, and report back the results asap.

---
USERNAME

I promised that I would try getuser() on a win 98 machine. Alas,
typing:

import getpass
print getpass.getuser()

produces:

ImportError: No module named pwd

So, it's the same problem as occurs on win95 (unsurprisingly).
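
The ImportError above suggests getpass.getuser() cannot be relied on everywhere. A hedged sketch of one way to cope: try a getter (getpass.getuser by default) and fall back to a caller-supplied name if it raises. The names safe_get_user and broken_getter are hypothetical, and this is not what ClientCookie does.

```python
import getpass


def safe_get_user(fallback, getter=getpass.getuser):
    """Return getter() if it succeeds, else `fallback`.

    Hypothetical helper. The report above shows ImportError; other
    platforms or versions may raise a different exception type, so any
    Exception triggers the fallback here.
    """
    try:
        return getter()
    except Exception:
        return fallback


def broken_getter():
    # Simulates the win9x failure reported in this thread.
    raise ImportError("No module named pwd")


print(safe_get_user("mark carter", getter=broken_getter))
```

Injecting the getter makes the failure reproducible on machines where getpass.getuser() works fine.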

---
DELAYLOAD

John: Why without the delayload argument?

My response:

The following code:

import ClientCookie
c = ClientCookie.MSIECookieJar(delayload=1)
c.load_cookie_data("hemscott-cookie.bin")
import urllib2
url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
request = urllib2.Request(url)
response = urllib2.urlopen(request)
request2 = urllib2.Request(url)
c.add_cookie_header(request2)
response2 = urllib2.urlopen(request2)
print response2.geturl()
print response2.info() # headers
for line in response2.readlines(): # body
    print line


produces the error:

Traceback (most recent call last):
  File "C:\My Documents\markc\ClientCookie\clp04.py", line 9, in ?
    c.add_cookie_header(request2)
  File "C:\PYTHON23\Lib\site-packages\ClientCookie\_ClientCookie.py", line 1170, in add_cookie_header
    cookies.extend(self._get_cookies_for_domain(
  File "C:\PYTHON23\Lib\site-packages\ClientCookie\_ClientCookie.py", line 1050, in _get_cookies_for_domain
    return self._cookies_for_domain(domain, request, unverifiable)
  File "C:\PYTHON23\lib\site-packages\ClientCookie\_MSIECookieJar.py", line 112, in _cookies_for_domain
    if self.delayload and cookies["//+delayload"] is not None:
KeyError: '//+delayload'

whereas running it without delayload=1 causes it to run successfully.
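
The KeyError in that traceback comes from indexing a dict with a key that was never stored. A sketch of a guarded lookup follows; the function name has_pending_delayload and the sample dict contents are hypothetical stand-ins, not taken from ClientCookie's source, and this is not necessarily how the library was fixed.

```python
DELAYLOAD_KEY = "//+delayload"


def has_pending_delayload(cookies, delayload):
    """True only if delayload is on and a non-None marker entry exists.

    dict.get() returns None for a missing key instead of raising KeyError.
    """
    return bool(delayload) and cookies.get(DELAYLOAD_KEY) is not None


print(has_pending_delayload({}, delayload=1))
print(has_pending_delayload({DELAYLOAD_KEY: "index.dat"}, delayload=1))
```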
 
John J. Lee

John: Why without the delayload argument?

My response:

The following code:

import ClientCookie
c = ClientCookie.MSIECookieJar(delayload=1)
c.load_cookie_data("hemscott-cookie.bin")
import urllib2
url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
request = urllib2.Request(url)
response = urllib2.urlopen(request)
request2 = urllib2.Request(url)
c.add_cookie_header(request2)
response2 = urllib2.urlopen(request2)
print response2.geturl()
print response2.info() # headers
for line in response2.readlines(): # body
    print line

Take note of this comment from the web page:

| # Don't copy this blindly! You probably want to follow the examples
| # above, not this one.

No matter how many times and how many places I say this, everybody
seems to be driven to make the same mistakes. I guess people don't
believe it's as simple as calling urlopen, though I explicitly say it
*is* that simple in most cases in the second sentence of the
documentation.

DON'T USE *BOTH* add_cookie_header/extract_cookies *AND* urlopen.
urlopen is all you need, unless you're not using urllib2. Probably
won't do any actual harm, but it's completely pointless and
obfuscatory. You're also failing to use ClientCookie.urlopen (which
unlike urllib2.urlopen, knows about cookies), because you blindly
copied the third example from the web page. You're also using Request
objects for no apparent reason, and you're attempting to fetch the
same url twice, again because you're blindly copying. Sigh -- if
people have to blindly copy, why not copy from the code I actually
*tell* people to copy from, instead of the code I explicitly tell
people *not* to copy from? Ngghhh! :)

You want something like this:

import ClientCookie
c = ClientCookie.MSIECookieJar(delayload=1)
c.load_cookie_data("hemscott-cookie.bin")
url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
response = ClientCookie.urlopen(url)

print response.read()
response.close()

produces the error:

Traceback (most recent call last): [...]
whereas running it without delayload=1 causes it to run successfully.

Anyway, you have found another bug, so all is forgiven.

(the MSIE delayload feature is still a bit of a mess -- luckily, I can
pin the blame on the original author of that code, since it's pretty
directly ported from Perl ;)


John
 
Mark Carter

*tell* people to copy from, instead of the code I explicitly tell
people *not* to copy from? Ngghhh! :)

Anyway, you have found another bug, so all is forgiven.
(the MSIE delayload feature is still a bit of a mess -- luckily, I can
pin the blame on the original author of that code, since it's pretty
directly ported from Perl ;)


If we shadows have offended,
Think but this, and all is mended-
That you have but slumb'red here
While these visions did appear.
And this weak and idle theme,
No more yielding but a dream,
Gentles, do not reprehend.
If you pardon, we will mend.
And, as I am an honest Puck,
If we have unearned luck
Now to scape the serpent's tongue,
We will make amends ere long;
Else the Puck a liar call.
So, good night unto you all.
Give me your hands, if we be friends,
And Robin shall restore amends.


But seriously ... please standby and I will inform you of the outcome
of your changes.
 
Mark Carter

You want something like this:
import ClientCookie
c = ClientCookie.MSIECookieJar(delayload=1)
c.load_cookie_data("hemscott-cookie.bin")
url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
response = ClientCookie.urlopen(url)

print response.read()
response.close()


It doesn't work on my XP machine at least. This is probably why
people have been doing it all wrong, allegedly.

I'll investigate further. Apologies for the time lag.
 
John J. Lee

It doesn't work on my XP machine at least. This is probably why
people have been doing it all wrong, allegedly.

I don't see how -- the examples in question (the one everybody seems
driven to copy from, and the ones people should copy from) do not
include any reference to any CookieJar.

I'll investigate further. Apologies for the time lag.

Thanks


John
 
John J. Lee

Gary Feldman said:
Since the purpose of that example seems to be to show how things work under
the hood, may I suggest putting it on a separate page, and replacing it
here with something like "Here is a <a href...>lower level example</a> that
shows how this works, though you would rarely want to implement things at
this low level."

Hmm, good idea, but I really don't want to split the documentation up
-- one page is simpler -- and the example is instructive for people
who actually want to understand what the module does. And if people
manage to miss that comment (highlighted in orange, for heaven's sake),
well...


John
 
Mark Carter

I'll investigate further.

Here are the results from running tests in ClientCookie 0.4.4.a:

def go7():
    # ClientCookie 0.4.4.a:
    # works from win xp and win 98

    # I prefer this method as a better way than using load_from_registry()
    # use this method!!

    import ClientCookie
    c = ClientCookie.MSIECookieJar()  # do NOT set delayload
    #c.user_name = "mark carter"
    c.load_cookie_data("hemscott-cookie.bin")
    #c.load_from_registry()
    print c

    import urllib2
    url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    c.extract_cookies(response, request)
    # let's say this next request requires a cookie that was set in response
    request2 = urllib2.Request(url)
    c.add_cookie_header(request2)
    response2 = urllib2.urlopen(request2)

    print response2.geturl()
    print response2.info()  # headers
    print response2.read()
    response2.close()



def go8():
    # ClientCookie 0.4.4.a:
    # contains bug in Win98 when environ variable is commented out,
    # but works in win98 when environ variable is set
    # Works in win xp, regardless of the USERNAME line

    # this works - I can now import into Hemscott
    import ClientCookie
    c = ClientCookie.MSIECookieJar(delayload=1)
    #c.user_name = "mark carter"
    #os.environ['USERNAME'] = 'mcarter'  # needed by load_from_registry()
    c.load_from_registry()

    import urllib2
    url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    #c.extract_cookies(response, request)
    # let's say this next request requires a cookie that was set in response
    request2 = urllib2.Request(url)
    c.add_cookie_header(request2)
    response2 = urllib2.urlopen(request2)

    print response2.geturl()
    print response2.info()  # headers
    for line in response2.readlines():  # body
        print line


The upshot of this is that load_cookie_data() now works in win 98 and
xp.
load_from_registry() works from win xp; it works from win 98 if and
only if you set the USERNAME environment variable.

I appreciate that all the stuff about request2 and response2 may not
be to your liking - but at the moment I'm just trying to figure out
what works, and what doesn't. We can also worry about the delayload
business later.

What do you think about the idea of actually setting up an Apache web
page to test these things 'for real'?
 
Gary Feldman

Hmm, good idea, but I really don't want to split the documentation up
-- one page is simpler -- and the example is instructive for people

Then definitely blockquote it (or indent it some other way), and consider
putting it into a smaller font, or using a grey background, or something
else to indicate that it's a digression. Orange would draw attention to
it; you want the opposite.

Gary
 
John J. Lee

Gary Feldman said:
Then definitely blockquote it (or indent it some other way), and consider
putting it into a smaller font, or using a grey background, or something
else to indicate that it's a digression.

Again, that would be a good idea if it *were* a digression, but it's
necessary for understanding what the module gets up to. Without that
understanding in the reader's mind, it's hard to explain the code that
one uses in practice if it's any more complicated than urlopen. And
the very top of the page says:

| import ClientCookie
| response = ClientCookie.urlopen("http://foo.bar.com/")
|
|This function behaves identically to urllib2.urlopen, except that it
|deals with cookies automatically. That's probably all you need to
|know.

So you don't even have to read further than that for most purposes. I
can't see how to improve on that, but I'm happy to learn how!

Orange would draw attention to
it; you want the opposite.

Only the comment is in emacs-orange (well, my copy of python-mode uses
that kind of rust-orange for Python comments), so it doesn't
particularly draw attention to that block of code more than the rest.
And I *do* want to draw attention to the comment, so people read the
comment before the code.

Admittedly, it doesn't seem to work ;-) (on a sample of one
misinterpreter, Mark, so far -- I only just added that comment
recently, though there are several other warnings that cover the same
ground elsewhere).


John
 
Anand Pillai

I am working on a Cookie module which works *with* urllib2 rather
than on top of it like the existing ClientCookie module. It uses
the Cookie module which comes with python standard library.

This module is written as an extension of my Harvestman webcrawler.
The alpha code is ready. We are doing testing right now.

Details will be posted to my website at
http://members.lycos.co.uk/anandpillai within say 2 weeks or so.

-Anand


John J. Lee said:
I'll investigate further.

Here are the results from running tests in ClientCookie 0.4.4.a: [...]
The upshot of this is that load_cookie_data() now works in win 98 and
xp.
load_from_registry() works from win xp; it works from win 98 if and
only if you set the USERNAME environment variable.

You missed the username argument.

cookiejar.load_from_registry(username="mark")

(should only be required for win9x family)

I appreciate that all the stuff about request2 and response2 may not
be to your liking - but at the moment I'm just trying to figure out
what works, and what doesn't. We can also worry about the delayload
business later.

No really, I wasn't joking: you *never* need to use add_cookie_header
/ extract_cookies if you're using urllib2 (at least, I can't think of
any possible reason to do so). It can only break things.
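
For readers curious what the low-level call does, here is a sketch using http.cookiejar from the Python 3 standard library, whose CookieJar has an add_cookie_header method of the same name. This assumes the modern stdlib rather than ClientCookie; the cookie name, value, and the example.com host are made up, and no network access happens.

```python
import urllib.request
from http.cookiejar import Cookie, CookieJar

jar = CookieJar()
# A hand-built Netscape-style (version 0) cookie; all values are invented.
jar.set_cookie(Cookie(
    version=0, name="session", value="abc123",
    port=None, port_specified=False,
    domain="example.com", domain_specified=False, domain_initial_dot=False,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
))

request = urllib.request.Request("http://example.com/")
jar.add_cookie_header(request)  # the manual step: copies matching cookies in
cookie_header = request.get_header("Cookie")
print(cookie_header)
```

A cookie-aware opener performs this same step on every request it sends, which is why repeating it by hand alongside urlopen adds nothing.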

What do you think about the idea of actually setting up an Apache web
page to test these things 'for real'?

I've done limited testing on Windows with 'fake' cookies from a local
Apache server, and on wine on linux. As I said, though, I don't have
a networked Windows OS, so it's inconvenient to test these things in a
'real' situation. And my machine currently doesn't boot into Windows
without physically switching cables around (security & obscure
hardware issues, not software ones), which means I currently can't be
bothered to test it on Windows :p. So, your feedback is appreciated.


John
 
Mark Carter

No really, I wasn't joking: you *never* need to use add_cookie_header
/ extract_cookies if you're using urllib2 (at least, I can't think of
any possible reason to do so). It can only break things.

I must admit that I don't really know what I am doing. How would you
simplify the following code:

def go8():
    import ClientCookie
    c = ClientCookie.MSIECookieJar(delayload=1)
    c.load_from_registry(username='mcarter')  # only need username for win9x

    import urllib2
    url = 'http://businessplus.hemscott.net/corp/crp03733.htm'
    request = urllib2.Request(url)
    response = urllib2.urlopen(request)
    request2 = urllib2.Request(url)
    c.add_cookie_header(request2)
    response2 = urllib2.urlopen(request2)

    print response2.geturl()
    print response2.info()  # headers
    for line in response2.readlines():  # body
        print line
 
Anand Pillai

Hi John

I wanted to add cookies support to harvestman, your module
looked ideal for it.

'We', nothing royal about it. It is just me and my friend
& co-developer Nirmal Chidambaram. Apparently he has found
a way around some of the bugs in Clientcookie. He has written
a new module using the existing Cookie module of python &
urllib2. One of the problems 'we' had with Clientcookie is that
it uses its own 'urlopen' methods, which do not fit our
application's needs, so 'we' had to find a way around it.

Once the code is ready, I will post it on my webpage, and
of course it is not a module in itself, so I think an
announcement to c.l.py is out of place.

Regards

-Anand


John J. Lee said:
Interesting, though I don't know quite what you mean.

First, if there's a way to work more closely with urllib2 than I've
figured out (which is quite possible), this patch needs to know about
it, so please post a comment:

http://www.python.org/sf/759792

If I understand what you mean, ClientCookie only works 'on top of'
rather than 'with' urllib2 to the extent that it currently has to
cut-n-paste code to add cookie handling to urllib2. That patch is
designed to remove the need to cut-n-paste, which would mean you'd do
urllib2.urlopen (after building an OpenerDirector that has an
HTTPCookieProcessor from ClientCookie) instead of ClientCookie.urlopen
as is required at present.
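
In today's Python 3 standard library, an arrangement like the one described here is available as urllib.request.HTTPCookieProcessor together with http.cookiejar. A minimal sketch, assuming that stdlib rather than ClientCookie or the patch under discussion; build_cookie_opener is a hypothetical helper name, and no request is actually sent.

```python
import urllib.request
from http.cookiejar import CookieJar


def build_cookie_opener(jar=None):
    """Return (opener, jar): an OpenerDirector that stores and returns cookies."""
    if jar is None:
        jar = CookieJar()
    processor = urllib.request.HTTPCookieProcessor(jar)
    return urllib.request.build_opener(processor), jar


opener, jar = build_cookie_opener()
# opener.open(url) would now handle Set-Cookie and Cookie headers itself;
# urllib.request.install_opener(opener) makes plain urlopen() use it as well.
print(type(opener).__name__, len(jar))
```

Returning the jar alongside the opener lets the caller inspect or pre-load cookies without reaching into the handler.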

Second: is your module intended to do what ClientCookie does
(ie. figure out what cookies should be set and returned, and do so),
or is it just a more OO way of getting and returning Cookie headers?
I guess the latter?

This module is written as an extension of my Harvestman webcrawler.
The alpha code is ready. We are doing testing right now.

Is this the Royal We? ;-)

Details will be posted to my website at
http://members.lycos.co.uk/anandpillai within say 2 weeks or so.
[...]

Please do post an announcement to c.l.py.announce, or I'll forget.


John
 
John J. Lee

'We', nothing royal about it. It is just me and my friend
& co-developer Nirmal Chidambaram. Apparently he has found
a way around some of the bugs in Clientcookie. He has written

It'd be great if you made me aware what those bugs are!

(BTW, no intent to offend with my comment about your plurality, or
lack thereof -- it's just that the convention of using 'we' in source
code comments is common enough that I've sometimes found myself using
it even when writing code alone, which is funny.)

a new module using the existing Cookie module of python &
urllib2. One of the problems 'we' had with Clientcookie is that
it uses its own 'urlopen' methods, which do not fit our
application's needs, so 'we' had to find a way around it.

As I said before, if you know how to do that, please comment on the
RFE I referenced in my last post. Jeremy Hylton is planning to look
at the patch associated with that RFE in detail sometime, and you
could save him some time if you know a way to do this without patching
urllib2. And I'd like to know how to do it, too :)

Once the code is ready, I will post it on my webpage, and
of course it is not a module in itself, so I think an
announcement to c.l.py is out of place.
[...]

Would you mind sending me an email?

Thanks


John
 
