urllib2 multithreading error

viscanti · Jan 16, 2007

Hi,

I'm using urllib2 to retrieve some data using HTTP in a multithreaded
application.
Here's a piece of code:
req = urllib2.Request(url, txdata, txheaders)
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', user_agent)]
request = opener.open(req)
data = request.read(1024)

I'm trying to read only the first 1024 bytes to retrieve the HTTP headers
(if the content is HTML, I will then retrieve the entire page).
When I use it in a single thread everything works fine, but when I create
multiple threads the execution halts and the program terminates just
before the last line (when I execute request.read(1024)). Obviously I
tried to catch the exception, but that doesn't work: the interpreter exits
without any exception or message.
How can I solve this?

lv

Gabriel Genellina · Jan 17, 2007

viscanti said:
When I use it in a single thread everything works fine, but when I create
multiple threads the execution halts and the program terminates just
before the last line (when I execute request.read(1024)). Obviously I
tried to catch the exception, but that doesn't work: the interpreter exits
without any exception or message.

Ouch... Can you reduce your program to the minimum code that fails,
and post it?

--
Gabriel Genellina
Softlab SRL
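
A reduced test case of the kind requested here could look like the sketch below. It is not the original poster's program: `run_in_threads` and the `fetch` callable are hypothetical names chosen for illustration. The sketch joins every worker so the main thread cannot exit before they finish, and it records each traceback so a failure inside a thread is not lost silently. Whether either point explains the reported behaviour is not known from the thread.

```python
import threading
import traceback


def run_in_threads(fetch, urls):
    """Call fetch(url) once per URL, each in its own thread.

    `fetch` is a hypothetical callable supplied by the caller, e.g. a
    function wrapping the opener.open(...).read(1024) code from the question.
    Returns (results, errors), both dicts keyed by URL.
    """
    results = {}
    errors = {}
    lock = threading.Lock()

    def worker(url):
        try:
            data = fetch(url)
            with lock:
                results[url] = data
        except Exception:
            # Keep the full traceback so the failure is visible afterwards.
            with lock:
                errors[url] = traceback.format_exc()

    threads = [threading.Thread(target=worker, args=(u,)) for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        # Wait for every worker before the main thread continues.
        t.join()
    return results, errors
```

Printing the contents of `errors` after the call shows which URLs failed and why, which makes it easier to cut the program down to the smallest version that still misbehaves.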


Facundo Batista · Jan 17, 2007

viscanti said:
I'm using urllib2 to retrieve some data using HTTP in a multithreaded
application.
Here's a piece of code:
req = urllib2.Request(url, txdata, txheaders)
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', user_agent)]
request = opener.open(req)
data = request.read(1024)

I'm trying to read only the first 1024 bytes to retrieve the HTTP headers
(if the content is HTML, I will then retrieve the entire page).

Why so much bother? You can just create the Request, open it, and ask
for the headers:

>>> req = urllib2.Request("http://www.google.com.ar")
>>> u = urllib2.urlopen(req)
>>> u.headers["content-type"]
'text/html'

Note that you can add your headers where you put "txheaders";
it's not necessary to use "addheaders".

Also notice that I'm not reading the page at all: urllib2.urlopen just
retrieves the headers, and the body is not read until you call read().

Regards,
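
The advice above can be sketched as a small helper that checks the Content-Type header first and reads the body only for HTML. This is an illustration under assumptions, not code from the thread: `fetch_if_html` and its `opener` parameter are hypothetical names, and the import fallback is there because urllib2 was renamed to urllib.request in Python 3.

```python
try:
    import urllib2 as urlrequest          # Python 2, as used in this thread
except ImportError:
    import urllib.request as urlrequest   # Python 3 name for the same module


def fetch_if_html(url, headers=None, opener=urlrequest.urlopen):
    """Return the response body only if the server reports text/html.

    `headers` plays the role of "txheaders" from the question, so a
    User-agent entry can be passed here instead of using addheaders.
    `opener` is a hypothetical hook; by default it is urlopen.
    """
    req = urlrequest.Request(url, headers=headers or {})
    response = opener(req)
    try:
        # The headers are available as soon as the response is opened.
        content_type = response.headers.get("Content-Type", "")
        if not content_type.lower().startswith("text/html"):
            return None
        # The body is only read at this point.
        return response.read()
    finally:
        response.close()
```

A call such as `fetch_if_html(url, headers={"User-agent": user_agent})` would return None for non-HTML resources, so the 1024-byte partial read from the question is not needed.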
