How to except the unexpected?

Rene Pijlman

One of the things I dislike about Java is the need to declare exceptions
as part of an interface or class definition. But perhaps Java got this
right...

I've written an application that uses urllib2, urlparse, robotparser and
some other modules in the battery pack. One day my app failed with a
urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so
I catch that too. The next day, it encounters a urllib2.HTTPError, then an
IOError, a socket.timeout, httplib.InvalidURL,...

How do you program robustly with these modules throwing all those
different (and sometimes undocumented) exceptions at you?

A catchall seems like a bad idea, since it also catches AttributeErrors
and other bugs in the program.
 
James Stroud

Rene said:
One of the things I dislike about Java is the need to declare exceptions
as part of an interface or class definition. But perhaps Java got this
right...

I've written an application that uses urllib2, urlparse, robotparser and
some other modules in the battery pack. One day my app failed with a
urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so
I catch that too. The next day, it encounters a urllib2.HTTPError, then an
IOError, a socket.timeout, httplib.InvalidURL,...

How do you program robustly with these modules throwing all those
different (and sometimes undocumented) exceptions at you?

A catchall seems like a bad idea, since it also catches AttributeErrors
and other bugs in the program.

The relevant lines of urllib2, for example, look like this:

class URLError(IOError):
class HTTPError(URLError, addinfourl):
class GopherError(URLError):

This suggests that catching URLError should have caught your HTTPError,
so you might have the chronology backwards above.

E.g.:

py> class BobError(Exception): pass
....
py> class CarolError(BobError): pass
....
py> try:
....     raise CarolError
.... except BobError:
....     print 'got it'
....
got it


Now,

% cat httplib.py | grep -e '^\s*class'

produces the following at one point in its output:

class HTTPException(Exception):
class NotConnected(HTTPException):
class InvalidURL(HTTPException):
class UnknownProtocol(HTTPException):
class UnknownTransferEncoding(HTTPException):
class UnimplementedFileMode(HTTPException):
class IncompleteRead(HTTPException):
class ImproperConnectionState(HTTPException):
class CannotSendRequest(ImproperConnectionState):
class CannotSendHeader(ImproperConnectionState):
class ResponseNotReady(ImproperConnectionState):
class BadStatusLine(HTTPException):

Which suggests that "try: except HTTPException:" will be specific enough
as a catchall for this module.
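
The same survey can be done from inside Python instead of with grep. What follows is only a minimal sketch, assuming Python 3, where the thread's httplib has been renamed http.client; the helper name exception_classes is made up for illustration.

```python
import inspect
import http.client  # Python 3 name for the thread's httplib


def exception_classes(module):
    """Return {name: class} for every exception class defined in `module`."""
    found = {}
    for name, obj in inspect.getmembers(module, inspect.isclass):
        # Keep only exception types that the module itself defines.
        if issubclass(obj, BaseException) and obj.__module__ == module.__name__:
            found[name] = obj
    return found


http_errors = exception_classes(http.client)
for name, cls in sorted(http_errors.items()):
    # Show each exception next to its first base class.
    print(name, "->", cls.__bases__[0].__name__)
```

Like the grep, this only reports what the installed version of the module defines today, so it says nothing about exceptions that leak through from other modules.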

The following, then, should catch everything you mentioned except the
socket timeout:

try:
    whatever()
except URLError, HTTPException:
    alternative()

But it seems to me that working with the internet as you are doing is
fraught with peril anyway.

James
 
Ben Caradoc-Davies

James said:
except URLError, HTTPException:

Aieee! This catches only URLError and binds the name HTTPException to
the detail of that error. You must write

except (URLError, HTTPException):

to catch both.
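
A small self-contained check of the tuple form, as a sketch only. It reuses the BobError name from James's example; AliceError and classify are hypothetical, and the `as` spelling assumes Python 2.6 or later (including Python 3), where it names the caught instance explicitly.

```python
class BobError(Exception):
    pass


class AliceError(Exception):
    pass


def classify(exc_type):
    """Raise exc_type and report which handler caught it."""
    try:
        raise exc_type("boom")
    except (BobError, AliceError) as err:
        # The parenthesised tuple matches either class; `err` is the instance.
        return "handled " + type(err).__name__
    except Exception as err:
        # Anything else that is an ordinary exception lands here.
        return "unhandled " + type(err).__name__


for exc_type in (BobError, AliceError, ValueError):
    print(classify(exc_type))
```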
 
James Stroud

Ben said:
Aieee! This catches only URLError and binds the name HTTPException to
the detail of that error. You must write

except (URLError, HTTPException):

to catch both.

Oops.
 
Roy Smith

Ben Caradoc-Davies said:
Aieee! This catches only URLError and binds the name HTTPException to
the detail of that error. You must write

except (URLError, HTTPException):

to catch both.

This exact issue came up just within the past week or so. I think that
qualifies it as a wart, but I think it's a double wart.

It's certainly a wart that the try statement syntax allows for such
ambiguity. But, I think it's also a wart in how the exceptions were
defined. I like to create a top-level exception class to encompass all the
possible errors in a given module, then subclass that. This way, if you
want to catch anything that goes wrong in a call, you can catch the top-level
exception class without having to enumerate them all.
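
A minimal sketch of that layout, written for Python 3. Every name here (FetchError, BadAddress, ServerUnavailable, fetch, try_fetch) is hypothetical and stands in for whatever a real module would define.

```python
class FetchError(Exception):
    """Base class for every error this hypothetical module raises on purpose."""


class BadAddress(FetchError):
    """The caller supplied an address the module cannot use."""


class ServerUnavailable(FetchError):
    """The remote side did not answer."""


def fetch(address):
    """Stand-in for real work; always raises one of the module's own errors."""
    if not address.startswith("http://"):
        raise BadAddress(address)
    raise ServerUnavailable(address)


def try_fetch(address):
    """Call fetch() and return the name of the error it raised."""
    try:
        fetch(address)
    except FetchError as err:
        # One clause covers the whole family, including future subclasses.
        return type(err).__name__
```

A caller that needs finer control can still catch BadAddress on its own ahead of the FetchError clause.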
 
Peter Hansen

Rene said:
One of the things I dislike about Java is the need to declare exceptions
as part of an interface or class definition. But perhaps Java got this
right...

I've written an application that uses urllib2, urlparse, robotparser and
some other modules in the battery pack. One day my app failed with a
urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so
I catch that too. The next day, it encounters a urllib2.HTTPError, then an
IOError, a socket.timeout, httplib.InvalidURL,...

How do you program robustly with these modules throwing all those
different (and sometimes undocumented) exceptions at you?

I do it by not micromanaging things. Presumably if you plan to catch an
exception, you have a specific procedure in mind for handling the
problem. Maybe a retry, maybe an alternate way of attempting the same
thing? Look to the code that you are putting in those except:
statements (or that you think you want to put in them) to decide what to
do about this situation. If each type of exception will be handled in a
different manner, then you definitely want to identify each type by
looking at the source or the docs, or doing it empirically.

Most of the time there isn't a whole lot of real "handling" going on in
an exception handler, but merely something like logging and/or reporting
it onscreen in a cleaner fashion than a traceback, then failing anyway.
This is one reason Java does get it wrong: 95% of exceptions don't
need and shouldn't have special handling anyway.

Good code should probably have a very small set of real exception
handling cases, and one or two catchalls at a higher level to avoid
barfing a traceback at the user.

Rene said:
A catchall seems like a bad idea, since it also catches AttributeErrors
and other bugs in the program.

Generally speaking this won't be a problem if you have your catchalls at
a fairly high level and have proper unit tests for the lower level code
which is getting called. You are doing unit testing, aren't you? ;-)

-Peter
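
A high-level catchall of the kind Peter describes might look roughly like this. It is a sketch for Python 3 under stated assumptions: run, broken_task and fine_task are hypothetical names, and the logger is left at its default configuration.

```python
import logging
import sys

log = logging.getLogger("app")


def run(task):
    """Run `task`; on any ordinary exception, log the traceback and return 1."""
    try:
        task()
    except Exception:
        # log.exception records the message plus the full traceback,
        # so the details survive even though the user sees a short notice.
        log.exception("unhandled error")
        print("Something went wrong; details were logged.", file=sys.stderr)
        return 1
    return 0


def broken_task():
    return [][42]  # IndexError, standing in for a bug in lower-level code


def fine_task():
    return None
```

The return value is meant to be passed to sys.exit() by the caller, so the program still fails, only more politely than with a raw traceback.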
 
Steven D'Aprano

I've written an application that uses urllib2, urlparse, robotparser and
some other modules in the battery pack. One day my app failed with a
urllib2.HTTPError. So I catch that. But then I get a urllib2.URLError, so
I catch that too. The next day, it encounters a urllib2.HTTPError, then an
IOError, a socket.timeout, httplib.InvalidURL,...

How do you program robustly with these modules throwing all those
different (and sometimes undocumented) exceptions at you?

How robust do you want to be? Do you want to take a leaf out of Firefox
and Windows XP by generating an error report and transmitting it back to
the program maintainer?

A catchall seems like a bad idea, since it also catches AttributeErrors
and other bugs in the program.

ExpectedErrors = (URLError, IOError)
ErrorsThatCantHappen = (LookupError, ArithmeticError, AssertionError)

try:
    process_things()
except ExpectedErrors:
    recover_from_error_gracefully()
except ErrorsThatCantHappen:
    print "Congratulations! You have found a program bug!"
    print "For a $327.68 reward, please send the following " \
          "traceback to Professor Donald Knuth."
    raise
except:
    print "An unexpected error occurred."
    print "This probably means the Internet is broken."
    print "If the bug still occurs after fixing the Internet, " \
          "it may be a program bug."
    log_error()
    sys.exit()
 
Paul Rubin

Steven D'Aprano said:
try:
    process_things()
except ExpectedErrors:
    recover_from_error_gracefully()
except ErrorsThatCantHappen:
    print "Congratulations! You have found a program bug!"
    print "For a $327.68 reward, please send the following " \
          "traceback to Professor Donald Knuth."
    raise
except:
    print "An unexpected error occurred."
    print "This probably means the Internet is broken."

But this isn't good, it catches asynchronous exceptions like the user
hitting ctrl-C, which you might want to handle elsewhere. What you
want is a way to catch only actual exceptions raised from inside the
try block.
 
Steven D'Aprano

But this isn't good, it catches asynchronous exceptions like the user
hitting ctrl-C, which you might want to handle elsewhere. What you
want is a way to catch only actual exceptions raised from inside the
try block.


It will only catch the KeyboardInterrupt exception if the user actually
hits ctrl-C during the time the code running inside the try block is
executing. It certainly won't catch random ctrl-Cs happening at other
times.

The way to deal with it is to add another except clause to deal with the
KeyboardInterrupt, or to have recover_from_error_gracefully() deal with
it. The design pattern still works. I don't know if it has a fancy name,
but it is easy to describe:-

catch specific known errors that you can recover from, and recover from
them whatever way you like (including, possibly, re-raising the exception
and letting higher-level code deal with it);

then catch errors that cannot possibly happen unless there is a bug,
and treat them as a bug;

and lastly catch unexpected errors that you don't know how to handle and
die gracefully.
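
The three tiers above can be sketched as follows. This is an illustration for Python 3, not production code: the tuples are deliberately short, guarded is a hypothetical name, and the last clause relies on KeyboardInterrupt and SystemExit deriving from BaseException rather than Exception, which holds for Python 2.5 and later.

```python
import sys

EXPECTED = (OSError,)  # recoverable: network and file trouble
CANT_HAPPEN = (LookupError, ArithmeticError, AssertionError)  # bugs


def guarded(process_things):
    """Run process_things under the three-tier pattern; return a status string."""
    try:
        process_things()
    except EXPECTED:
        # Tier 1: known errors we can recover from.
        return "recovered"
    except CANT_HAPPEN:
        # Tier 2: treat as a program bug and let it propagate.
        print("program bug, re-raising", file=sys.stderr)
        raise
    except Exception:
        # Tier 3: unexpected but ordinary errors. Exception excludes
        # KeyboardInterrupt and SystemExit, so Ctrl-C still propagates
        # to whoever handles it higher up.
        return "unexpected"
    return "ok"
```

In Python 3 the first tuple already covers a lot of ground, since IOError is an alias of OSError and urllib.error.URLError is a subclass of it.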

My code wasn't meant as production level code, nor was ExpectedErrors
meant as an exhaustive list. I thought that was too obvious to need
commenting on.

Oh, in case this also wasn't obvious, Donald Knuth won't really pay
$327.68 for bugs in your Python code. He only pays for bugs in his own
code. *wink*
 
Paul Rubin

Steven D'Aprano said:
The way to deal with it is to add another except clause to deal with the
KeyboardInterrupt, or to have recover_from_error_gracefully() deal with
it.

I think adding another except clause for KeyboardInterrupt isn't good
because maybe in Python 2.6 or whatever there will be some
additional exceptions like that and your code will break. For example,
proposals have floated for years of adding ways for threads to raise
exceptions in other threads.

I put up a proposal for adding an AsynchronousException class to
contain all of these types of exceptions, so you can check for that.

Steven D'Aprano said:
Oh, in case this also wasn't obvious, Donald Knuth won't really pay
$327.68 for bugs in your Python code. He only pays for bugs in his own
code. *wink*

The solution to that one is obvious. We have to get Knuth using Python.
Anyone want to write a PEP? ;-)
 
Rene Pijlman

Roy Smith:
I like to create a top-level exception class to encompass all the
possible errors in a given module, then subclass that. This way, if you
want to catch anything that goes wrong in a call, you can catch the top-level
exception class without having to enumerate them all.

What do you propose to do with exceptions from modules called by the given
module?
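
One possible answer, which nobody in the thread proposes, is for the module to translate the errors of the modules it calls into its own top-level class. The sketch below assumes Python 3, whose `raise ... from` keeps the original exception attached; RobotsError, low_level_fetch and fetch_robots are hypothetical names.

```python
class RobotsError(Exception):
    """Hypothetical top-level error for a module that fetches robots.txt."""


def low_level_fetch(url):
    """Stand-in for a dependency that raises its own exception types."""
    if "://" not in url:
        raise ValueError("invalid URL: " + url)
    raise OSError("connection refused")


def fetch_robots(url):
    """Public entry point: callers only ever see RobotsError."""
    try:
        return low_level_fetch(url)
    except (ValueError, OSError) as err:
        # `from err` keeps the original exception reachable as __cause__.
        raise RobotsError("could not fetch " + url) from err
```

This only moves the problem one level down: the module's author still has to know which exceptions the dependency can raise, so anything undocumented will leak through untranslated.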
 
Rene Pijlman

James Stroud:
Which suggests that "try: except HTTPException:" will be specific enough
as a catchall for this module.

The following, then, should catch everything you mentioned except the
socket timeout:

Your conclusion may be (almost) right in this case. I just don't like this
approach. Basically this is reverse engineering the interface from the
source at the time of writing the app. Even if you get it right, it may
fail next week when someone added an exception to a module.

James Stroud:
But it seems to me that working with the internet as you are doing is
fraught with peril anyway.

Why? It shouldn't be.
 
Rene Pijlman

Peter Hansen:
Good code should probably have a very small set of real exception
handling cases, and one or two catchalls at a higher level to avoid
barfing a traceback at the user.

Good point.

Peter Hansen:
Generally speaking this won't be a problem if you have your catchalls at
a fairly high level and have proper unit tests for the lower level code
which is getting called. You are doing unit testing, aren't you? ;-)

With low coverage, yes. But unit testing isn't the answer for this
particular problem. For example, yesterday my app was surprised by an
httplib.InvalidURL since I hadn't noticed this could be raised by
robotparser (this is undocumented). If that fact goes unnoticed when
writing the exception handling, it will also go unnoticed when designing
test cases. I probably wouldn't have thought of writing a test case with a
first url with some external domain (that triggers robots.txt-fetching)
that's deemed invalid by httplib.
 
Rene Pijlman

Steven D'Aprano:
ExpectedErrors = (URLError, IOError)
ErrorsThatCantHappen =

try:
    process_things()
except ExpectedErrors:
    recover_from_error_gracefully()
except ErrorsThatCantHappen:
    print "Congratulations! You have found a program bug!"
    print "For a $327.68 reward, please send the following " \
          "traceback to Professor Donald Knuth."
    raise
except:
    print "An unexpected error occurred."
    print "This probably means the Internet is broken."
    print "If the bug still occurs after fixing the Internet, " \
          "it may be a program bug."
    log_error()
    sys.exit()

Yes, I think I'll do something like this. Perhaps combined with Peter's
advice to not micromanage, like so:

Reraise = (LookupError, ArithmeticError, AssertionError) # And then some

try:
    process_things()
except Reraise:
    raise
except:
    log_error()
 
Jorge Godoy

Rene Pijlman said:
With low coverage, yes. But unit testing isn't the answer for this
particular problem. For example, yesterday my app was surprised by an
httplib.InvalidURL since I hadn't noticed this could be raised by
robotparser (this is undocumented). If that fact goes unnoticed when

It isn't undocumented in my module. From 'pydoc httplib':


CLASSES
    exceptions.Exception
        HTTPException
            BadStatusLine
            ImproperConnectionState
                CannotSendHeader
                CannotSendRequest
                ResponseNotReady
            IncompleteRead
            InvalidURL <------- HERE
            NotConnected
            UnimplementedFileMode
            UnknownProtocol
            UnknownTransferEncoding
        HTTPException
            BadStatusLine
            ImproperConnectionState
                CannotSendHeader
                CannotSendRequest
                ResponseNotReady
            IncompleteRead
            InvalidURL
            NotConnected
            UnimplementedFileMode
            UnknownProtocol
            UnknownTransferEncoding
    HTTP
    HTTPConnection
        HTTPSConnection
    HTTPResponse


(Yes, it appears twice, don't ask me why...)

--
Jorge Godoy

"Quidquid latine dictum sit, altum sonatur."
- Qualquer coisa dita em latim soa profundo.
- Anything said in Latin sounds smart.
 
Roy Smith

Rene Pijlman said:
A catchall seems like a bad idea, since it also catches AttributeErrors
and other bugs in the program.

All of the things like AttributeError are subclasses of StandardError. You
can catch those first, and then catch everything else. In theory, all
exceptions which represent problems with the external environment (rather
than programming mistakes) should derive from Exception, but not from
StandardError. In practice, some very old code may raise things which do
not derive from Exception, which complicates things somewhat.

--------------------------------------------------
#!/usr/bin/env python

import socket

try:
    x = []
    y = x[42]
except StandardError, foo:
    print "Caught a StandardError: ", foo
except Exception, foo:
    print "Caught something else: ", foo

try:
    socket.socket (9999)
except StandardError, foo:
    print "Caught a StandardError: ", foo
except Exception, foo:
    print "Caught something else: ", foo

try:
    raise "I'm a string pretending to be an exception"
except StandardError, foo:
    print "Caught a StandardError: ", foo
except Exception, foo:
    print "Caught something else: ", foo
--------------------------------------------------

Roy-Smiths-Computer:play$ ./ex.py
Caught a StandardError: list index out of range
Caught something else: (43, 'Protocol not supported')
Traceback (most recent call last):
  File "./ex.py", line 21, in ?
    raise "I'm a string pretending to be an exception"
I'm a string pretending to be an exception
 
Roy Smith

Rene Pijlman said:
Roy Smith:

Are you sure?

"""
The class hierarchy for built-in exceptions is:

Exception
+-- StandardError
| +-- KeyboardInterrupt
| +-- ImportError
| +-- EnvironmentError
| | +-- IOError
"""
http://www.python.org/doc/current/lib/module-exceptions.html

Hmmm, OK, I missed EnvironmentError. So, what you need to do is:

try:
    whatever()
except EnvironmentError:
    ...
except StandardError:
    ...
except Exception:
    ...

or something like that.

I do agree with you that there is some value in Java's "must catch or
re-export all exceptions" semantics, and this would be one of those places
where it would be useful. In general, however, I've always found it to be
a major pain in the butt, to the point where I sometimes just punt and
declare all my methods to "throw Exception" (or whatever the correct syntax
is). Not to mention that with a dynamic language like Python, it's
probably impossible to implement.

I think the real problem here is that the on-line docs are incomplete
because they don't list all the exceptions that this module can raise. The
solution to that is to open a bug on sourceforge against the docs.
 
