[python] How to detect whether a remote webpage is accessible (over HTTP)?

甜瓜

Howdy, all,
I want to use Python to check whether a website is accessible.
Currently I use urllib to fetch the remote page and see whether the
request fails. The problem is that the page may be very large, so this
takes too long. There is no need to download the entire page.
Could you suggest a good, fast solution?
Thank you.
 

Jarek Zgoda

Issue an HTTP HEAD request.
 

John Nagle

If "urlopen" returns successfully, you have already received the HTTP headers.
Just open the URL, then call "info()" on the returned file-like object to get
the header information. Don't read the content at all.
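
The open-then-info() approach above might look like this in Python 3, where urlopen lives in urllib.request. The helper name fetch_headers and the timeout value are assumptions for illustration. Because this is still a GET, the server may begin sending the body; the sketch simply never reads it and closes the connection.

```python
import urllib.request


def fetch_headers(url, timeout=5.0):
    """Open `url` and return (status, headers) without reading the body.

    Hypothetical helper; an HTTP error status (4xx/5xx) surfaces as
    urllib.error.HTTPError rather than as a return value.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # urlopen returns once the status line and headers have arrived.
        # info() gives the parsed headers; read() is never called, and
        # leaving the with-block closes the connection.
        return response.status, response.info()
```

The returned headers object supports dictionary-style lookups such as headers["Content-Type"].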

Setting the socket timeout will shorten the wait when the requested
domain won't respond at all. But if the remote host accepts the connection
and then sends nothing, the socket timeout may not help and you can wait
for a while. This is rare, but it happens.
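
A hedged sketch of setting that timeout, assuming Python 3, where urlopen accepts a timeout argument directly. The function name check_with_timeout and the three result strings are made up for this example. The timeout applies to each blocking socket operation rather than to the request as a whole, so it is not a hard overall deadline: a host that keeps trickling data can still hold the connection longer than the timeout.

```python
import socket
import urllib.error
import urllib.request


def check_with_timeout(url, timeout=3.0):
    """Return "ok", "timeout" or "error" for a headers-only check of `url`.

    Hypothetical helper; the labels are assumptions for this sketch.
    """
    try:
        # Only the headers are received; the body is never read.
        with urllib.request.urlopen(url, timeout=timeout):
            return "ok"
    except urllib.error.URLError as exc:
        # A timeout while connecting arrives wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return "error"
    except socket.timeout:
        # A timeout while waiting for the response can be raised directly.
        return "timeout"
    except OSError:
        return "error"
```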

John Nagle
 
