urllib2 spinning CPU on read

kdotsky

Hello All,
I've run into this problem on several sites where urllib2 will hang
(using all the CPU) trying to read a page. I was able to reproduce it
for one particular site. I'm using Python 2.4.

import urllib2
url = 'http://www.wautomas.info'
request = urllib2.Request(url)
opener = urllib2.build_opener()
result = opener.open(request)
data = result.read()

It never returns from this read call.

I did some profiling to try and see what was going on and make sure it
wasn't my code. There was a huge number of calls to (and amount of
time spent in) socket.py:315(readline) and to recv. A large amount of
time was also spent in httplib.py:482(_read_chunked). Here's the
significant part of the statistics:

32564841 function calls (32563582 primitive calls) in 545.250 CPU seconds

Ordered by: internal time
List reduced from 416 to 50 due to restriction <50>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 10844775  233.920    0.000  447.440    0.000 socket.py:315(readline)
 10846078  152.430    0.000  152.430    0.000 :0(recv)
        3   97.330   32.443  544.730  181.577 httplib.py:482(_read_chunked)
 10844812   61.090    0.000   61.090    0.000 :0(join)


Also, where should I go to see if something like this has already been
reported as a bug?

Thanks for any help you can give me.
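
For anyone who wants to collect the same kind of statistics, here is a minimal sketch of how a report "Ordered by: internal time" and limited to 50 entries can be produced. It assumes a modern Python 3 interpreter with cProfile (Python 2.4 shipped only the older profile module), and profile_call and busy_work are hypothetical names used for illustration; busy_work merely stands in for the read() call from the post.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" sorts by internal time (time spent in the function itself).
    # The argument 50 restricts the listing to at most 50 entries.
    stats.sort_stats("tottime").print_stats(50)
    return result, buffer.getvalue()


def busy_work(n):
    # Hypothetical stand-in for the call being investigated.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    value, report = profile_call(busy_work, 10000)
    print(report)
```

Sorting by internal time is what makes a tight loop stand out: the functions doing the spinning rise to the top even when each individual call is very cheap.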
 
John J. Lee

kdotsky said:
Hello All,
I've run into this problem on several sites where urllib2 will hang
(using all the CPU) trying to read a page. I was able to reproduce it
for one particular site. I'm using Python 2.4.

import urllib2
url = 'http://www.wautomas.info' [...]
Also, where should I go to see if something like this has already been
reported as a bug?

I didn't try looking at your example, but I think it's likely a bug
both in that site's HTTP server and in httplib. If it's the same one
I saw, it's already reported, but nobody fixed it yet.

http://python.org/sf/1411097


John
 
kdotsky

I didn't try looking at your example, but I think it's likely a bug
both in that site's HTTP server and in httplib. If it's the same one
I saw, it's already reported, but nobody fixed it yet.

http://python.org/sf/1411097


John

Thanks. I tried the example in the link you gave, and it appears to be
the same behavior.

Do you have any suggestions on how I could avoid this in the meantime?
 
John J. Lee

kdotsky said:
Thanks. I tried the example in the link you gave, and it appears to be
the same behavior.

Do you have any suggestions on how I could avoid this in the meantime?

Yes: read the recent messages on the tracker I linked to, and apply
the fix I suggest there.


John
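
For readers without access to the tracker, the general idea behind guarding a chunked-transfer reader can be sketched as follows. This is not the patch from the tracker and not httplib's code; it is a standalone illustration, written for Python 3, of one plausible reading of the profile above: a chunk-reading loop that keeps calling readline on a connection that has already stopped delivering data. The names read_chunked and IncompleteChunkedBody are hypothetical. The key point is that an empty readline result (end of stream) ends the loop with an error instead of being retried.

```python
import io


class IncompleteChunkedBody(Exception):
    """Stream ended before the terminating zero-size chunk was seen."""


def read_chunked(fp):
    """Decode a chunked HTTP body from a binary file-like object.

    Raises IncompleteChunkedBody (carrying the bytes read so far) if the
    stream ends early, and ValueError if a chunk size is not valid hex.
    """
    parts = []
    while True:
        line = fp.readline()
        if not line:
            # End of stream before the final "0" chunk: stop, do not loop.
            raise IncompleteChunkedBody(b"".join(parts))
        # Drop any chunk extension after ";" and parse the hex size.
        size_field = line.split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break
        data = fp.read(size)
        if len(data) < size:
            raise IncompleteChunkedBody(b"".join(parts) + data)
        parts.append(data)
        fp.read(2)  # consume the CRLF that follows the chunk data
    # Skip optional trailer lines up to the blank line (or end of stream).
    while True:
        line = fp.readline()
        if not line or line in (b"\r\n", b"\n"):
            break
    return b"".join(parts)


if __name__ == "__main__":
    body = io.BytesIO(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n")
    print(read_chunked(body))
```

A caller that prefers partial data over an exception can catch IncompleteChunkedBody and use its first argument, which holds whatever was decoded before the stream ended.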
 
