urllib2/cookies - surely there's a better way ?

Richard Shea

Hi - I'm writing a script which fetches a page from a web server and
takes note of any set-cookies which are served in the headers so that
when I next request a page I can send those cookies back to the
server. This is so that the usage analysis software on the server
(based on cookies) will take account of the script's activities.

Now the thing is I'm halfway through doing this but I'm thinking there
must be a more refined mechanism than the one I'm using (see below).

I'm not really asking whether there's a way to smarten up the rather
clunky splits (although that would be welcome); I'm asking whether
there's a more refined interface to the whole area of cookies.

Strangely enough the doco says that f.info() "return the
meta-information of the page, as a dictionary-like object" - well as
far as I can see it's a string and the f.info().headers is a list. I
don't usually find errors in the doco so this makes me wonder if
there's something I'm doing fundamentally wrong ?
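
On the f.info() question: as far as I know, what urllib2 returns there is a message object (httplib.HTTPMessage) rather than a plain dict. Printing it shows the raw header text, which would explain why it looks like a string, but it still supports lookup by header name, and in Python 2 a call such as f.info().getheaders('Set-Cookie') should return every value of a repeated header. The sketch below shows the same idea offline with the Python 3 standard library, where urllib2 became urllib.request and the header object is an email.message.Message subclass. The header text is invented for illustration.

```python
from email import message_from_string

# Hypothetical response headers, standing in for what f.info() would hold.
RAW_HEADERS = (
    "Content-Type: text/html\r\n"
    "Set-Cookie: session=abc123; Path=/\r\n"
    "Set-Cookie: theme=dark; Path=/\r\n"
    "\r\n"
)

headers = message_from_string(RAW_HEADERS)

# Dictionary-like lookup; header names are matched case-insensitively.
content_type = headers["content-type"]

# A header that occurs several times is fetched as a list with get_all().
set_cookies = headers.get_all("Set-Cookie")

print(content_type)
print(set_cookies)
```

So the documentation is not wrong: the object is "dictionary-like" rather than an actual dict, and the .headers list is only its raw storage.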

Anyway any ideas would be welcome. Here goes with the work in progress
....


import urllib2
from string import split
from string import upper


req_headers = {
    'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
    'Referer': ''
}

TARGETURL = 'http://www.somedomain.com/a/b/'

req = urllib2.Request(TARGETURL, None, req_headers)
f = urllib2.urlopen(req)
lstHeaders = f.info().headers

for h in lstHeaders:
    lstHeaderContents = split(h, ":", 1)
    if upper(lstHeaderContents[0]) == "SET-COOKIE":
        print lstHeaderContents[1]
        # get the keyword value pair to the left of the first ';'
        lstYetAnother = split(lstHeaderContents[1], ";", 1)
        # put the keyword value pair into a list
        lstOneMore = split(lstYetAnother[0], "=", 1)
        print "Keyword=" + lstOneMore[0] + ". Value = " + lstOneMore[1] + "."



That's the script - thanks for reading this far.

regards

richard.
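
On the parsing itself: the standard library already includes a cookie parser, so the manual splits on ":", ";" and "=" should not be necessary. A minimal sketch, assuming Python 3's http.cookies module (it was called Cookie in Python 2); parse_set_cookie is a hypothetical helper name and the header value is invented:

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Return {name: value} for the cookie(s) in one Set-Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    # Attributes such as Path are stored on each Morsel, not in this result.
    return {name: morsel.value for name, morsel in cookie.items()}


print(parse_set_cookie("session=abc123; Path=/"))
```

This covers only parsing. Deciding which cookies to send back to which URL is a separate job that a cookie jar handles.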
 
Richard Shea

Peter Hansen said:
Richard Shea wrote:

(about urllib2 and cookies)

Search for "ClientCookie"...

That's great ! I actually laughed when I read the doco - I was only
looking for something to parse cookies with but this does the whole
thing ! I haven't yet used it but I've had a few wriggles getting to
where the 'import ClientCookie' works so I thought I might tell the
newsgroup what I did to make it work (although that might be fairly
obvious to many).

First of all I was installing on a W98 machine. I tried using the
install procedure "python setup.py build" but I got the message
"error: package directory 'ClientCookie' does not exist". I do have a
slightly weird setup so I wasn't all that surprised.

Anyway I then took the fallback option of copying the 'ClientCookie'
directory from the .ZIP manually. In order to make this work you need
to ensure that sys.path contains a path to ClientCookie before it
finds the standard libraries. I chose to do that by editing the
registry at

HKLM/SOFTWARE/Python/PythonCore/2.3/PythonPath

and adding a new entry there with a value which pointed at the
ClientCookie directory (i.e. C:/a/b/ClientCookie-0.4.18/ClientCookie).
However, although I got an extra entry at the 'right' place in sys.path,
this didn't allow me to "import ClientCookie". Eventually I
modified the registry entry to read C:/a/b/ClientCookie-0.4.18 and now
everything seems to be fine.
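
The reason the second registry value works is that import looks inside each sys.path entry for a package directory with the requested name. The entry therefore has to be the directory that contains ClientCookie, not the ClientCookie directory itself. A small offline sketch of that rule, using a throwaway package called demo_pkg (a made-up name) in a temporary directory:

```python
import importlib
import os
import sys
import tempfile

# Layout: <root>/demo_pkg/__init__.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
    fh.write("NAME = 'demo'\n")


def can_import(path_entry, name="demo_pkg"):
    """Report whether `name` is importable with only `path_entry` added."""
    sys.path.insert(0, path_entry)
    try:
        importlib.invalidate_caches()
        sys.modules.pop(name, None)
        importlib.import_module(name)
        return True
    except ImportError:
        return False
    finally:
        # Leave sys.path and sys.modules as they were.
        sys.path.remove(path_entry)
        sys.modules.pop(name, None)


print(can_import(pkg_dir))  # entry is the package directory itself
print(can_import(root))     # entry is the directory containing the package
```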

This is probably pretty straightforward stuff for most people but I
still find some aspects of 'import' a dark art so I thought it was
worth sticking it into the archives.

Thanks again for the tip.

Regards

Richard.
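
A note for later readers: ClientCookie's cookie handling was subsequently folded into the standard library (as cookielib in Python 2.4, now http.cookiejar in Python 3), so the same "does the whole thing" behaviour is available without a separate install. A minimal sketch under that assumption; make_cookie_opener is a hypothetical helper name, and TARGETURL is the placeholder address from the original post:

```python
import urllib.request
from http.cookiejar import CookieJar

TARGETURL = "http://www.somedomain.com/a/b/"  # placeholder from the post


def make_cookie_opener():
    """Build an opener that stores Set-Cookie values and replays them."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


opener, jar = make_cookie_opener()

# Each opener.open() call would record cookies from the response in `jar`
# and send matching ones back on later requests (needs network access):
#
#     opener.open(TARGETURL).read()
#     opener.open(TARGETURL).read()   # this request carries the cookies
#     for cookie in jar:
#         print(cookie.name, cookie.value)
```

The jar applies the usual domain and path matching, so no hand-written header parsing is involved.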
 
Peter Hansen

Richard said:
That's great ! I actually laughed when I read the doco - I was only
looking for something to parse cookies with but this does the whole
thing ! I haven't yet used it but I've had a few wriggles getting to
where the 'import ClientCookie' works so I thought I might tell the
newsgroup what I did to make it work (although that might be fairly
obvious to many).

First of all I was installing on a W98 machine. I tried using the
install procedure "python setup.py build" but I got the message
"error: package directory 'ClientCookie' does not exist". I do have a
slightly weird setup so I wasn't all that surprised.

Hmm... I believe on Windows all you should have to do is "python
setup.py install", not "build". That will pretty much do whatever
you did to manually install it, but it's done for you... and won't
muck with your registry since it will just install it under
your site-packages folder where it's supposed to go.

-Peter
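
To see where "setup.py install" would normally put a pure-Python package like this one, you can ask the interpreter for its install location. A minimal sketch, assuming Python 3's sysconfig module; the exact path depends on the installation:

```python
import sysconfig

# "purelib" is the install location for pure-Python packages,
# normally the site-packages directory of this interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```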
 
