urllib2 multithreading error

viscanti

Hi,

I'm using urllib2 to retrieve some data over HTTP in a multithreaded
application.
Here's a piece of code:
req = urllib2.Request(url, txdata, txheaders)
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', user_agent)]
request = opener.open(req)
data = request.read(1024)

I'm trying to read only the first 1024 bytes to retrieve the HTTP headers
(if the content is HTML, I will then retrieve the entire page).
With a single thread everything works. When I create multiple threads,
execution halts and the program terminates just before the last line
(when I execute the request.read(.) ). Of course I tried to catch the
exception, but that doesn't help: the interpreter exits without any
exception or message.
How can I solve this?

lv
 
Gabriel Genellina

viscanti said:
When I use it on a single thread everything goes ok, when I create
multiple threads the execution halts and the program terminates, just
before the last line (when I execute the request.read(.) ). Obviously I
tried to catch the exception but it doesn't work, the interpreter exits
without any exception or message.

Ouch... Can you reduce your program to the minimum code that fails,
and post it?


--
Gabriel Genellina
Softlab SRL
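
For illustration only, a reduced test program could look roughly like the sketch below. It is not the original poster's code: `fetch_head` and `run` are hypothetical names, the optional `opener` argument exists only so the sketch can be exercised without a network, and the `urllib.request` fallback is an assumption so that it also runs on Python 3. Each worker records its own traceback, so a failure inside a thread is not lost, and the main thread joins every worker before it exits.

```python
import threading
import traceback

try:
    import urllib2  # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback (assumption of this sketch)


def fetch_head(url, results, errors, opener=None):
    """Worker: read the first 1024 bytes of url and record the outcome."""
    try:
        if opener is None:
            opener = urllib2.build_opener()
        response = opener.open(url)
        results[url] = response.read(1024)
    except Exception:
        # Keep the traceback so an error raised inside a thread stays visible.
        errors[url] = traceback.format_exc()


def run(urls, opener=None):
    """Start one thread per URL and wait for all of them to finish."""
    results, errors = {}, {}
    threads = [
        threading.Thread(target=fetch_head, args=(url, results, errors, opener))
        for url in urls
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the main thread must not exit before the workers are done
    return results, errors
```

Calling `run([...])` with a few URLs and printing the returned `errors` dictionary shows whether any worker raised, and with which traceback.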






 
Facundo Batista

I'm using urllib2 to retrieve some data over HTTP in a multithreaded
application.
Here's a piece of code:
req = urllib2.Request(url, txdata, txheaders)
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', user_agent)]
request = opener.open(req)
data = request.read(1024)

I'm trying to read only the first 1024 bytes to retrieve the HTTP headers
(if the content is HTML, I will then retrieve the entire page).

Why go to so much trouble? You can just create the Request, open it, and
ask for the headers:
>>> import urllib2
>>> req = urllib2.Request("http://www.google.com.ar")
>>> u = urllib2.urlopen(req)
>>> u.headers["content-type"]
'text/html'

Note that you can pass your headers, User-agent included, where you put
"txheaders" in the Request call; it's not necessary to use "addheaders".

Also notice that I'm not reading the page at all: urllib2.urlopen sends
the request and gets the headers back, and the body is only read when
you call read().
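
A minimal sketch of that check-then-read flow, assuming you only want the body when the Content-Type is text/html. `is_html` and `fetch_if_html` are hypothetical helper names, the `opener` parameter is there only to make the function easy to test, and the `urllib.request` fallback is an assumption so the sketch also runs on Python 3:

```python
try:
    import urllib2  # Python 2, as used in this thread
except ImportError:
    import urllib.request as urllib2  # Python 3 fallback (assumption of this sketch)


def is_html(content_type):
    """True for values such as 'text/html' or 'text/html; charset=utf-8'."""
    if not content_type:
        return False
    # Drop parameters like "; charset=..." before comparing.
    return content_type.split(";")[0].strip().lower() == "text/html"


def fetch_if_html(url, opener=urllib2.urlopen):
    """Return the body only if the response is HTML, otherwise None."""
    response = opener(url)
    try:
        # The headers are available as soon as the response has been opened.
        if is_html(response.headers.get("content-type")):
            return response.read()
        return None
    finally:
        response.close()
```

`url` may also be a Request object built with your own headers, since urlopen accepts either a URL string or a Request.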

Regards,
 
