Use existing IE cookie

KB · Jul 30, 2009

Hi there,

Relevant versions: Python 2.5, Vista Home, IE7

I am trying to scrape a website that I have previously browsed manually,
selecting my options by hand, and I now want Python to use the existing
cookie from that manual session when downloading data.

Using http://code.activestate.com/recipes/80443/ I have found the
"name" of the relevant cookie, but after reading the urllib2 docs I
can't see how to "send" it, or otherwise have my Python instance use
"MY" existing cookie.

Using the following:

***
import re
import urllib2, cookielib

# set things up for cookies

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
urllib2.install_opener(opener)

reply = urllib2.urlopen('foo.html').read()

print reply

***

This does return data, just default data, not the data from the
options I set up when manually browsing.

My sense is that I need "something" in the () part of
HTTPCookieProcessor() but I have no idea as to what... the docs say
"cookiejar" but the only code examples I have found are to create a
cookiejar for the existing Python session, not to use the cookies from
my prior manual meanderings.

Any help greatly appreciated.

Diez B. Roggisch · Jul 30, 2009

KB said:
My sense is that I need "something" in the () part of
HTTPCookieProcessor() but I have no idea as to what... the docs say
"cookiejar" but the only code examples I have found are to create a
cookiejar for the existing Python session, not to use the cookies from
my prior manual meanderings.

Because this is a completely different beast. You need to find out if and
how to access IE-cookies from python - I guess some win32-road is to be
walked down for that.

Once you get a hold on them, you can build up whatever cookiejar urllib2
needs.

Diez
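
A rough illustration of "building up a cookiejar" by hand once a cookie's name and value are known. This is only a sketch and not something posted in the thread: the cookie name, value and domain are made-up placeholders, and the try/except import is there purely so the same snippet also runs on Python 3, where cookielib and urllib2 were renamed.

```python
try:                                    # Python 2, as used in this thread
    import cookielib
    import urllib2
except ImportError:                     # Python 3 names for the same modules
    import http.cookiejar as cookielib
    import urllib.request as urllib2


def make_cookie(name, value, domain, path="/"):
    """Build a session cookie by hand (version 0, i.e. Netscape style)."""
    dotted = domain.startswith(".")
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=dotted, domain_initial_dot=dotted,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


def build_opener_with_cookie(name, value, domain):
    """Return (opener, jar); the opener sends the cookie to matching hosts."""
    jar = cookielib.CookieJar()
    jar.set_cookie(make_cookie(name, value, domain))
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
    return opener, jar


# Hypothetical values; substitute whatever was read from the browser.
opener, jar = build_opener_with_cookie("options", "abc123", "www.example.com")
# reply = opener.open("http://www.example.com/").read()
```

The point is that HTTPCookieProcessor() wants a CookieJar object holding Cookie objects, not a bare string, so whatever is recovered from the browser has to be wrapped this way first.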

KB · Jul 30, 2009

Thanks for the prompt reply, Diez! Using the above I have found the
name of the cookie (I did google how to use IE cookies in python and
that was the best match) but it only tells me the name of the cookie,
not how to use it.

Any clues?

TIA!

Diez B. Roggisch · Jul 30, 2009

Ah, sorry, I should have read the recipe as well.

To me it looks as if findIECookie from that recipe is to be called with the
name. It should then return the value, or None.

What does your full example look like, including the
cookie-acquisition stuff?

Diez

KB · Jul 30, 2009

I ran them separately, hoping for a clue as to what my "cookiejar"
was.

The cookie-acquisition stuff returns "screener.ashx?v=151" when I
search for the domain I am interested in. I have tried
urllib2.HTTPCookieProcessor('screener.ashx?v=151') but that failed
with attr has no cookie header.

From the HTTPCookieProcessor doco, it appears that non-IE browsers
have a cookie file (and example code), but from what I can tell IE uses
a hidden folder (you can set its location in IE, but it appends a
folder "\Temporary Internet Files").

From: http://docs.python.org/dev/library/cookielib.html

***
This example illustrates how to open a URL using your Netscape,
Mozilla, or Lynx cookies (assumes Unix/Netscape convention for
location of the cookies file):

import os, cookielib, urllib2
cj = cookielib.MozillaCookieJar()
cj.load(os.path.join(os.environ["HOME"], ".netscape/cookies.txt"))
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
***

Not sure how to adapt this for IE.

Diez B. Roggisch · Jul 30, 2009

KB said:
Not sure how to adapt this for IE.

You could create a file that resembles the cookies.txt - no idea how that
looks, but I guess it's pretty simple.

Diez
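
For reference, a minimal sketch of what such a hand-made cookies.txt could look like and how it loads. It assumes the usual Netscape layout (a magic header line, then seven tab-separated fields per cookie); the cookie itself, the helper names and the temp-file location are hypothetical, and the import fallback is only there so the snippet also runs on Python 3.

```python
import os
import tempfile

try:
    import cookielib                    # Python 2
except ImportError:
    import http.cookiejar as cookielib  # Python 3

HEADER = "# Netscape HTTP Cookie File\n"


def write_cookies_txt(path, cookies):
    """Write (domain, path, name, value, expires) tuples as a cookies.txt.

    Each line holds seven tab-separated fields: domain, whether the domain
    starts with a dot (applies to subdomains), path, secure flag,
    expiry as Unix time, name, value.
    """
    with open(path, "w") as f:
        f.write(HEADER)
        for domain, cookie_path, name, value, expires in cookies:
            subdomains = "TRUE" if domain.startswith(".") else "FALSE"
            f.write("\t".join([domain, subdomains, cookie_path, "FALSE",
                               str(expires), name, value]) + "\n")


def load_cookies_txt(path):
    """Load a cookies.txt file into a MozillaCookieJar."""
    jar = cookielib.MozillaCookieJar()
    jar.load(path)
    return jar


# Hypothetical cookie, expiring on 2100-01-01 (Unix time 4102444800).
cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.txt")
write_cookies_txt(cookie_file,
                  [(".example.com", "/", "options", "abc123", 4102444800)])
jar = load_cookies_txt(cookie_file)
```

The resulting jar can be passed to urllib2.HTTPCookieProcessor(jar) exactly as in the documentation example quoted above. Note that load() skips expired cookies and session cookies by default, which is why the sketch gives the cookie an explicit far-future expiry.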

KB · Jul 30, 2009

Yeah, unfortunately I just tried Firefox and it uses cookies.sqlite
now... more dead ends.

KB · Jul 30, 2009

Winner, winner, chicken dinner... resolved.

http://wwwsearch.sourceforge.net/mechanize/doc.html

import mechanize
cj = mechanize.MSIECookieJar(delayload=True)
cj.load_from_registry() # finds cookie index file from registry

Thanks Mr Lee!

Gabriel Genellina · Jul 31, 2009

For IE you may use the pywin32 package:

py> import win32inet
py> win32inet.InternetGetCookie("http://sldm/", None)
'__ac_name="softlab"'

Or use ctypes to call the function of the same name in wininet.dll; see
http://msdn.microsoft.com/en-us/library/aa384710(VS.85).aspx
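
A possible way to feed that result into urllib2, sketched under the assumption that the string returned by InternetGetCookie is already in the "name=value" form a Cookie header expects (the sample output above suggests so). The helper names are made up for illustration, get_ie_cookie_string only works on Windows with pywin32 installed, and the import fallback is only there so the snippet also runs on Python 3.

```python
try:
    import urllib2                       # Python 2
except ImportError:
    import urllib.request as urllib2     # Python 3


def get_ie_cookie_string(url):
    """Windows only: ask WinINet (via pywin32) for the cookies held for url."""
    import win32inet                     # imported lazily; needs pywin32
    return win32inet.InternetGetCookie(url, None)


def request_with_cookies(url, cookie_string):
    """Build a request that sends cookie_string as its Cookie header."""
    request = urllib2.Request(url)
    request.add_header("Cookie", cookie_string)
    return request


# On Windows one would call get_ie_cookie_string(url); the literal below is
# the value shown in the post above, used here so the sketch runs anywhere.
request = request_with_cookies("http://sldm/", '__ac_name="softlab"')
# reply = urllib2.urlopen(request).read()
```

Setting the header directly skips the cookiejar entirely, which is fine for a one-off download but means any cookies the server sets in its reply are not tracked.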
