Thomas Lindgaard
Hello
Python is not my native language, so I'm kinda stuck right now.
I have a small web crawler that opens pages using urllib.urlopen (1). The
problem is that when a page doesn't exist, I would like to do this and
that. So I have done the following:
    import urllib
    import socket

    socket.setdefaulttimeout(10)

    def getPage(self):
        try:
            self.page = urllib.urlopen(self.link)
            self.body = self.page.read()
            self.page.close()
        except <this is the problem>:
            singSongAboutTimeout()
I have tried "except socket.timeout" but this is ignored. The call to
getPage is wrapped in its own try-except:
    try:
        self.page = getPage(self.link)
        self.body = self.page.read()
        self.page.close()
    except IOError, (errno, errmsg):
        # print error
This block catches the exception, but I would really like to catch it
earlier.
How do I do that? Is there a way to check which exception is being caught
- something along the lines of:
    try:
        # something throws an exception
    except:
        printHumanReadableDescriptionOfException()
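To make the idea concrete, here is a minimal sketch of what such a helper might look like. It assumes the standard library calls sys.exc_info() and traceback.format_exception_only(); the name describe_current_exception is made up for illustration and is not part of the crawler.

```python
import sys
import traceback

def describe_current_exception():
    """Return a one-line description of the exception being handled."""
    # sys.exc_info() returns (type, value, traceback) for the active exception.
    exc_type, exc_value = sys.exc_info()[:2]
    # format_exception_only() yields lines of the form "TypeName: message\n".
    lines = traceback.format_exception_only(exc_type, exc_value)
    return "".join(lines).strip()

try:
    int("not a number")  # raises ValueError
except:
    # A bare except catches everything; the helper reports what it was.
    message = describe_current_exception()
```

Here message starts with the exception's class name, which is the name that would go after except to catch that exception specifically.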
And finally, I have trouble finding out what to write after except...
How can I know what an exception carries (i.e. that IOError provides
errno and errmsg)?
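As a rough sketch of how one could inspect this at runtime: an exception instance keeps its constructor arguments in args, and IOError also exposes errno and strerror. The helper name current_exception_fields and the dictionary layout are assumptions made for this example only.

```python
import sys

def current_exception_fields():
    """Collect the commonly useful fields of the exception being handled."""
    exc_type, exc_value = sys.exc_info()[:2]
    return {
        "type": exc_type.__name__,
        # args holds whatever was passed to the exception's constructor.
        "args": getattr(exc_value, "args", ()),
        # errno and strerror exist on IOError; other exceptions give None.
        "errno": getattr(exc_value, "errno", None),
        "strerror": getattr(exc_value, "strerror", None),
    }

try:
    raise IOError(2, "No such file or directory")
except IOError:
    fields = current_exception_fields()
```

The two values in fields["args"] are the same pair that a tuple target such as (errno, errmsg) unpacks in the except clause above.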
Regards
/Thomas
(1) I have specified a new user-agent as described in
http://python.org/doc/2.3.4/lib/module-urllib.html under _urlopener