urllib timeout hole - long timeout if site doesn't send headers.

John Nagle

urllib has a "hole" in its timeout protection.

Using "socket.setdefaulttimeout" will make urllib time out if a
site doesn't open a TCP connection in the indicated time. But if the site
opens the TCP connection and never sends HTTP headers, it takes about
20 minutes for the read in urllib's "open" to time out.
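For anyone who wants to reproduce this without depending on a particular site, here is a rough sketch. It assumes Python 3's urllib.request rather than the older urllib module discussed above, so the behavior you see may well differ from what is described in this post. The helper names start_silent_server and timed_fetch are made up for the example. The server accepts TCP connections on localhost and never sends a byte; the client passes a per-request timeout (the per-call counterpart of socket.setdefaulttimeout) and reports how long the call actually blocked.

```python
import socket
import threading
import time
import urllib.request


def start_silent_server():
    """Listen on a free localhost port; accept connections but never send data.

    Hypothetical helper for this sketch. Returns (listening socket, port).
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    srv.listen(5)
    held = []  # keep accepted sockets referenced so the connections stay open

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            held.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, srv.getsockname()[1]


def timed_fetch(url, timeout):
    """Fetch url; return (seconds elapsed, exception class name or None).

    Hypothetical helper for this sketch. Proxies are disabled so the request
    goes straight to the local test server.
    """
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    try:
        opener.open(url, timeout=timeout).read()
        outcome = None
    except OSError as exc:  # URLError and socket timeouts are OSError subclasses
        outcome = type(exc).__name__
    return time.monotonic() - start, outcome


if __name__ == "__main__":
    server, port = start_silent_server()
    elapsed, outcome = timed_fetch("http://127.0.0.1:%d/" % port, timeout=2.0)
    print("gave up after %.1f s, outcome: %s" % (elapsed, outcome))
```

If the elapsed time comes out close to the requested timeout, the read is covered by it; if it runs far longer, you are looking at the hole described here.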

There are some web servers that produce this behavior, and
many seem to be associated with British universities and nonprofits.
With these, requesting "http://example.com" opens a TCP connection
on which nothing is ever sent, while "http://www.example.com"
yields a proper web page.

Even Firefox doesn't time this out properly. Try "http://soton.ac.uk"
in Firefox, and be prepared for a long wait.

There was some active work in the urllib timeout area last summer.
What happened to that?

John Nagle
 
