those darn exceptions

Chris Torek

Exceptions are great, but...

Sometimes when calling a function, you want to catch some or
even all the various exceptions it could raise. What exceptions
*are* those?

It can be pretty obvious. For instance, the os.* modules raise
OSError on errors. The examples here are slightly silly until
I reach the "real" code at the bottom, but perhaps one will get
the point:
   >>> import os
   >>> os.kill(os.getpid(), 0) # am I alive?
   >>> # yep, I am alive.
   ...

[I'm not sure why the interpreter wants more after my comment here.]
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   OSError: [Errno 3] No such process


So now I am ready to write my "is process <pid> running" function:

   import os, errno

   def is_running(pid):
       "Return True if the given pid is running, False if not."
       try:
           os.kill(pid, 0)
       except OSError as err:
           # We get an EPERM error if the pid is running
           # but we are not allowed to signal it (even with
           # signal 0). If we get any other error we'll assume
           # it's not running.
           if err.errno != errno.EPERM:
               return False
       return True

This function works great, and never raises an exception itself.
Or does it?
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "<stdin>", line 3, in is_running
   OverflowError: long int too large to convert to int

Oops! It turns out that os.kill() can raise OverflowError (at
least in this version of Python, not sure what Python 3.x does).

Now, I could add, to is_running, the clause:

   except OverflowError:
       return False

(which is what I did in the real code). But how can I know a priori
that os.kill() could raise OverflowError in the first place? This
is not documented, as far as I can tell. One might guess that
os.kill() would raise TypeError for things that are not integers
(this is the case) but presumably we do NOT want to catch that
here. For the same reason, I certainly do not want to put in a
full-blown:

   except Exception:
       return False

It would be better just to note somewhere that OverflowError is
one of the errors that os.kill() "normally" produces (and then,
presumably, document just when this happens; although once it is
noted that it can, one could make an educated guess).
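For readers on Python 3, a minimal sketch of the function with the extra clause folded in might look like the following. It assumes a POSIX system (on Windows, os.kill() with signal 0 terminates the target, as noted further down the thread), and the early return for non-positive pids is an assumption of this sketch, not something the original code does.

```python
import errno
import os


def is_running(pid):
    """Return True if the given pid is running, False if not."""
    if pid <= 0:
        # Assumption: 0 and negative values address process groups,
        # not a single process, so report them as "not running".
        return False
    try:
        os.kill(pid, 0)
    except OverflowError:
        # The value does not fit the C type behind os.kill().
        return False
    except OSError as err:
        # EPERM means the process exists but we may not signal it.
        return err.errno == errno.EPERM
    return True
```

A non-integer argument such as "asdf" still raises TypeError here, which is deliberately left to propagate.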

Functions have a number of special "__" attributes. I think it
might be reasonable to have all of the built-in functions, at least,
have one more, perhaps spelled __exceptions__, that gives you a
tuple of all the exceptions that the function might raise.
Imagine, then:
   >>> os.kill.__doc__
   'kill(pid, sig)\n\nKill a process with a signal.'

[this part exists]

   >>> os.kill.__exceptions__
   (<type 'exceptions.OSError'>, <type 'exceptions.TypeError'>, <type 'exceptions.OverflowError'>, <type 'exceptions.DeprecationWarning'>)

[this is my new proposed part]

With something like this, a pylint-like tool could compute the
transitive closure of all the exceptions that could occur in any
function, by using __exceptions__ (if provided) or recursively
finding exceptions for all functions called, and doing a set-union.
You could then ask which exceptions can occur at any particular
call site, and see if you have handled them, or at least, all the
ones you intend to handle. (The DeprecationWarning occurs if you
pass a float to os.kill() -- which I would not want to catch.
Presumably the pylint-like tool, which might very well *be* pylint,
would have a comment directive you would put in saying "I am
deliberately allowing these exceptions to pass on to my caller",
for the case where you are asking it to tell you which exceptions
you may have forgotten to catch.)

User functions could set __exceptions__ for documentation purposes
and/or speeding up this pylint-like tool. (Obviously, user-provided
functions might raise exception classes that are only defined in
user-provided code -- but to raise them, those functions have to
include whatever code defines them, so I think this all just works.)
The key thing needed to make this work, though, is the base cases
for system-provided code written in C, which pylint by definition
cannot inspect to find a set of exceptions that might be raised.
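The set-union idea can be sketched in a few lines. Everything here is hypothetical: __exceptions__ is the proposed attribute (it does not exist today), and the call graph is written by hand, whereas a real tool would have to derive it from the source.

```python
def possible_exceptions(func, calls, _seen=None):
    """Union of __exceptions__ over func and everything it calls.

    `calls` is a hand-written call graph mapping each function to
    the functions it calls directly.
    """
    seen = set() if _seen is None else _seen
    if func in seen:
        # Already visited: avoids looping forever on recursive calls.
        return set()
    seen.add(func)
    result = set(getattr(func, "__exceptions__", ()))
    for callee in calls.get(func, ()):
        result |= possible_exceptions(callee, calls, seen)
    return result


def parse(text):
    return int(text)
parse.__exceptions__ = (ValueError,)


def load(path):
    with open(path) as handle:
        return parse(handle.read())
load.__exceptions__ = (OSError,)


call_graph = {load: [parse]}
print(sorted(exc.__name__ for exc in possible_exceptions(load, call_graph)))
```

Attributes can be set on Python-level functions like this, but not on C built-ins such as os.kill, which is one reason the base cases would need support from the interpreter or an external table.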
 
Chris Angelico

Chris Torek said:
It can be pretty obvious.  For instance, the os.* modules raise
OSError on errors.  The examples here are slightly silly until
I reach the "real" code at the bottom, but perhaps one will get
the point:

   >>> import os
   >>> os.kill(os.getpid(), 0) # am I alive?
   >>> # yep, I am alive.
   ...

[I'm not sure why the interpreter wants more after my comment here.]

It's not wanting more. It responded to your statement "yep, I am
alive" by boggling at you. It said, and I quote, "...". What next?
Reading the obituaries column in search of your own PID?

Yep, slightly silly. And very amusing. But back to the serious:

os.kill(pid,0) doesn't work on Windows, but that just means this whole
function can't be used on Windows. (Actually, the kill call DOES work.
It just doesn't do what you want here... it kills the process.)

os.kill("asdf",0) --> TypeError: an integer is required
os.kill(-1,0) --> no error raised - not sure if you want to propagate
os.kill()'s behaviour on negative PIDs or not - see for instance
http://linux.die.net/man/2/kill

I'm not sure if it's possible to put Python into "secure computing"
mode (with prctl(PR_SET_SECCOMP) on Linux), but if you did, then
there'd be an additional possible result from this: No return at all,
because your process has just been killed for trying to kill someone
else. (Secure Computing: The death penalty for attempted murder.)

Interesting concept of pulling out all possible exceptions. Would be
theoretically possible to build a table that keeps track of them, but
automated tools may have problems:

   a=5; b=7; c=12
   d=1/(a+b-c) # This could throw ZeroDivisionError

   if a+b>c:
       d=1/(a+b-c) # This can't, because it's guarded.
   else:
       d=2

And don't tell me to rewrite that with try/except, because it's not the same :)
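One way to see why the two forms are not the same, as a small sketch with made-up function names: the guard also diverts the a+b<c case to the fallback, while try/except only intercepts the exact zero.

```python
def guarded(a, b, c):
    # The guard covers a + b < c as well as a + b == c.
    if a + b > c:
        return 1 / (a + b - c)
    return 2


def with_try(a, b, c):
    # try/except only intercepts the a + b == c case.
    try:
        return 1 / (a + b - c)
    except ZeroDivisionError:
        return 2


# Same answer when a + b == c, different answers when a + b < c.
print(guarded(5, 7, 12), with_try(5, 7, 12))
print(guarded(5, 6, 12), with_try(5, 6, 12))
```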

I'd be inclined to have comments about the exceptions that this can
itself produce, but if there's exceptions that come from illogical
arguments (like the TypeError above), then just ignore them and let
them propagate. If is_running("asdf") throws TypeError rather than
returning False, I would call that acceptable behaviour.

Chris Angelico
 
Chris Torek

Chris Angelico said:
Interesting concept of pulling out all possible exceptions. Would be
theoretically possible to build a table that keeps track of them, but
automated tools may have problems:

   a=5; b=7; c=12
   d=1/(a+b-c) # This could throw ZeroDivisionError

   if a+b>c:
       d=1/(a+b-c) # This can't, because it's guarded.
   else:
       d=2

And don't tell me to rewrite that with try/except, because it's not
the same :)

I don't know if pylint is currently (or eventually :) ) smart
enough to realize that the "if" test here guarantees that a+b-c >
0 (if indeed it does guarantee it -- this depends on the types of
a, b, and c and the operations invoked by the + and - operators
here! -- but pylint *does* track all the types, to the extent that
it can, so it has, in theory, enough information to figure this out
for integers, at least).

If not, though, you could simply tell pylint not to complain

Chris Angelico said:
I'd be inclined to have comments about the exceptions that this can
itself produce, but if there's exceptions that come from illogical
arguments (like the TypeError above), then just ignore them and let
them propagate. If is_running("asdf") throws TypeError rather than
returning False, I would call that acceptable behaviour.

Right, this is precisely what I want: the ability to determine
which exceptions something might raise, catch some subset of them,
and allow the remaining ones to propagate.

I can do the "catch subset, allow remainder to propagate" but the
first step -- "determine possible exceptions" -- is far too difficult
right now. I have not found any documentation that points out that
os.kill() can raise TypeError, OverflowError, and DeprecationWarning.
TypeError was not a *surprise*, but the other two were.

(And this is only os.kill(). What about, say, subprocess.Popen()?
Strictly speaking, type inference cannot help quite enough here,
because the subprocess module does this:

   data = self._read_no_intr(errpipe_read, 1048576)
   # Exceptions limited to 1 MB
   os.close(errpipe_read)
   if data != "":
       self._waitpid_no_intr(self.pid, 0)
       child_exception = pickle.loads(data)
       raise child_exception

and the pickle.loads() can create any exception sent to it from
the child, which can truly be any exception due to catching all
exceptions raised in preexec_fn, if there is one. Pylint can't do
type inference across the error-pipe between child and parent here.
However, it would suffice to set subprocess.__exceptions__ to some
reasonable tuple, and leave the preexec_fn exceptions to the text
documentation. [Of course, strictly speaking, the fact that the
read cuts off at 1 MB means that even the pickle.loads() call might
fail! But a megabyte of exception trace is probably plenty. :) ])
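The effect is easy to reproduce without forking. The following sketch only imitates the mechanism (it is not the subprocess module's code, and both helper names are made up): whatever exception object is pickled onto the pipe is what the reading side gets back, so nothing in the reader's source reveals its type.

```python
import os
import pickle


def send_exception(write_fd, exc):
    """Child side (simulated): serialize the exception onto the pipe."""
    with os.fdopen(write_fd, "wb") as pipe:
        pipe.write(pickle.dumps(exc))


def receive_exception(read_fd):
    """Parent side: rebuild whatever exception object arrived."""
    with os.fdopen(read_fd, "rb") as pipe:
        data = pipe.read()
    return pickle.loads(data) if data else None


read_fd, write_fd = os.pipe()
send_exception(write_fd, KeyError("raised in preexec_fn"))
child_exception = receive_exception(read_fd)
print(type(child_exception).__name__, child_exception.args)
```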
 
Steven D'Aprano

Chris Torek said:
Exceptions are great, but...

Sometimes when calling a function, you want to catch some or even all
the various exceptions it could raise. What exceptions *are* those?

[snip much, much interactive code]

TL;DR

*wink*


Shorter version: "Is there any definitive list of what exceptions a
function could raise?"

Shorter answer: "No."


[...]
But how can I know a priori
that os.kill() could raise OverflowError in the first place?

You can't. Even if you studied the source code, you couldn't be sure that
it won't change in the future. Or that somebody will monkey-patch
os.kill, or a dependency, introducing a new exception.

More importantly though, most functions are reliant on their argument.
You *cannot* tell what exceptions len(x) will raise, because that depends
on what type(x).__len__ does -- and that could be anything. So, in
principle, any function could raise any exception.
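A two-line class is enough to demonstrate the point. The class name is invented for this sketch.

```python
class Oddball:
    """Hypothetical class whose __len__ raises something unexpected."""

    def __len__(self):
        raise KeyError("len() just raised a KeyError")


try:
    len(Oddball())
except KeyError as err:
    caught = err

print(type(caught).__name__)
```

Nothing about len() itself hints at KeyError; it comes entirely from the argument's type.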


[...]
Functions have a number of special "__" attributes. I think it might be
reasonable to have all of the built-in functions, at least, have one
more, perhaps spelled __exceptions__, that gives you a tuple of all the
exceptions that the function might raise. Imagine, then:

Or the author of the function could document the exceptions that it
raises. Either way, nothing prevents this list from getting out of sync
with the reality of which exceptions could be raised.

Another question -- is the list of exceptions part of the function's
official API? *All* of the exceptions listed, or only some of them?

Apart from your pylint example below -- which I don't find convincing in
the least, see further comments later -- I don't see the point of this.
You shouldn't have the attitude that "If a function could raise an
exception, I'm going to catch it". You have to understand the
circumstances that a function might raise, and decide whether or not you
want it caught. Hint: almost always, the answer is you don't.

Either way, a mere list of exceptions doesn't give you much. This adds
additional burden on the developer of the function, while giving little
benefit to the user.

   >>> os.kill.__doc__
   'kill(pid, sig)\n\nKill a process with a signal.'

[this part exists]

   >>> os.kill.__exceptions__
   (<type 'exceptions.OSError'>, <type 'exceptions.TypeError'>, <type
   'exceptions.OverflowError'>, <type 'exceptions.DeprecationWarning'>)

[this is my new proposed part]

With something like this, a pylint-like tool could compute the
transitive closure of all the exceptions that could occur in any
function, by using __exceptions__ (if provided) or recursively finding
exceptions for all functions called, and doing a set-union.

In general, you can't do this at compile-time, only at runtime. There's
no point inspecting len.__exceptions__ at compile-time if len is a
different function at runtime.
 
Chris Torek

Steven D'Aprano said:
You can't. Even if you studied the source code, you couldn't be sure that
it won't change in the future. Or that somebody will monkey-patch
os.kill, or a dependency, introducing a new exception.

Indeed. However, if functions that "know" which exceptions they
themselves can raise "declare" this (through an __exceptions__
attribute for instance), then whoever changes the source or
monkey-patches os.kill can also make the appropriate change to
os.kill.__exceptions__.

Steven D'Aprano said:
More importantly though, most functions are reliant on their argument.
You *cannot* tell what exceptions len(x) will raise, because that depends
on what type(x).__len__ does -- and that could be anything. So, in
principle, any function could raise any exception.

Yes; this is exactly why you need a type-inference engine to make
this work. In this case, len() is more (though not quite exactly)
like the following user-defined function:

   def len2(x):
       try:
           fn = x.__len__
       except AttributeError:
           raise TypeError("object of type %r has no len()" % type(x))
       return fn()

eg:
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "<stdin>", line 5, in len2
   TypeError: object of type <type 'int'> has no len()

In this case, len would not have any __exceptions__ field (or if
it does, it would not be a one-element tuple, but I currently think
it makes more sense for many of the built-ins to resort to rules
in the inference engine). This is also the case for most operators,
e.g., ordinary "+" (or operator.add) is syntactic sugar for:

first_operand.__add__(second_operand)

or:

second_operand.__radd__(first_operand)

depending on both operands' types and the first operand's __add__.
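A short sketch of that dispatch, with invented class names: the exception that "+" ends up raising can come from either operand's method, or from the interpreter itself when both decline.

```python
class Left:
    def __add__(self, other):
        # Decline, so Python goes on to try other.__radd__.
        return NotImplemented


class Right:
    def __radd__(self, other):
        raise LookupError("raised from Right.__radd__")


def add_outcome(x, y):
    """Return the name of the exception that x + y raises, or None."""
    try:
        x + y
    except Exception as err:
        return type(err).__name__
    return None


print(add_outcome(Left(), Right()))  # exception comes from Right.__radd__
print(add_outcome(Left(), 1))        # both sides decline: TypeError
```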

The general case is clearly unsolvable (being isomorphic to the
halting problem), but this is not in itself an excuse for not attempting
to solve more-specific cases. A better excuse -- which may well
be "better enough" :) -- occurs when the specific cases that *can*
be solved are so relatively-rare that the approach degenerates into
uselessness.

It is worth noting that the approach I have in mind does not
survive pickling, which means a very large subset of Python code
is indigestible to a pylint-like exception-inference engine.

Steven D'Aprano said:
Another question -- is the list of exceptions part of the function's
official API? *All* of the exceptions listed, or only some of them?

All the ones directly-raised. What to do about "invisible"
dependencies (such as those in len2() if len2 is "invisible",
e.g., coded in C rather than Python) is ... less obvious. :)

Steven D'Aprano said:
In general, you can't do this at compile-time, only at runtime. There's
no point inspecting len.__exceptions__ at compile-time if len is a
different function at runtime.

Right. Which is why pylint is fallible ... yet pylint is still
valuable. At least, I find it so. It misses a lot of important
things -- it loses types across list operations, for instance --
but it catches enough to help. Here is a made-up example based on
actual errors I have found via pylint:

   "doc"
   class Frob(object):
       "doc"
       def __init__(self, arg1, arg2):
           self.arg1 = arg1
           self.arg2 = arg2

       def frob(self, nicate):
           "frobnicate the frob"
           self.arg1 += nicate

       def quux(self):
           "return the frobnicated value"
           example = self # demonstrate that pylint is not using the *name*
           return example.argl # typo, meant arg1
...

$ pylint frob.py
************* Module frob
E1101: 15:Frob.quux: Instance of 'Frob' has no 'argl' member

("Loses types across list operations" means that, e.g.:

       def quux(self):
           return [self][0].argl

hides the type, and hence the typo, from pylint. At some point I
intend to go in and modify it to track the element-types of list
elements: in "enough" cases, a list's elements all have the same
type, which means we can predict the type of list. If a list
contains mixed types, of course, we have to fall back to the
failure-to-infer case.)

(This also shows that much real code might raise IndexError: any
list subscript that is out of range does so. So a lot of real
functions *might* raise IndexError, etc., which is another argument
that "in real code, an exception inference engine will wind up
concluding that every line might raise every exception". Which
might be true, but I still believe, for the moment, that a tool
for inferring exceptions would have some value.)
 
Gregory Ewing

Chris Torek said:
Oops! It turns out that os.kill() can raise OverflowError (at
least in this version of Python, not sure what Python 3.x does).

Seems to me that if this happens it indicates a bug in
your code. It only makes sense to pass kill() something
that you know to be the pid of an existing process,
presumably one returned by some other system call.

So if kill() raises OverflowError, you *don't* want
to catch and ignore it. You want to find out about it,
just as much as you want to find out about a TypeError,
so you can track down the cause and fix it.

Generally I think some people worry far too much about
anticipating and catching exceptions. Don't do that,
just let them happen. If you come across a *specific*
exception that it makes sense to catch, then catch
just that particular one. Let *everything* else propagate.
 
Steven D'Aprano

Gregory Ewing said:
Generally I think some people worry far too much about
anticipating and catching exceptions. Don't do that,
just let them happen. If you come across a specific
exception that it makes sense to catch, then catch
just that particular one. Let everything else propagate.

Good advice.

+1
 
Chris Torek

Gregory Ewing said:
Seems to me that if this happens it indicates a bug in
your code. It only makes sense to pass kill() something
that you know to be the pid of an existing process,
presumably one returned by some other system call.

So if kill() raises OverflowError, you *don't* want
to catch and ignore it. You want to find out about it,
just as much as you want to find out about a TypeError,
so you can track down the cause and fix it.

A bunch of you are missing the point here, perhaps because my
original example was "not the best", as it were. (I wrote it
on the fly; the actual code was elsewhere at the time.)

I do, indeed, want to "find out about it". But in this case
what I want to find out is "the number I thought was a pid,
was not a pid", and I want to find that out early and catch
the OverflowError() in the function in question.

(The two applications here are a daemon and a daemon-status-checking
program. The daemon needs to see if another instance of itself is
already running [*]. The status-checking program needs to see if
the daemon is running [*]. Both open a pid file and read the contents.
The contents might be stale or trash. I can check for trash because
int(some_string) raises ValueError. I can then check the now-valid
pid via os.kill(). However, it turns out that one form of "trash"
is a pid that does not fit within sys.maxint. This was a surprise
that turned up only in testing, and even then, only because I
happened to try a ridiculously large value as one of my test cases.
It *should*, for some value of "should" :) , have turned up much
earlier, such as when running pylint.)

([*] The test does not have to be perfect, but it sure would be
nice if it did not result in a Python stack dump. :) )
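One way to catch that kind of trash before os.kill() ever sees it is to range-check the number read from the pid file. A sketch, under loud assumptions: the name parse_pid and the bound MAX_PID are invented here, and the real upper limit for a pid is platform-specific.

```python
MAX_PID = 2 ** 31 - 1   # assumption: pid_t is a 32-bit signed integer


def parse_pid(text):
    """Return the pid held in a pid file's text, or None if it is trash."""
    try:
        pid = int(text.strip())
    except ValueError:
        # Not a number at all (including an empty file).
        return None
    if not 0 < pid <= MAX_PID:
        # Zero, negative, or too large to be a real pid.
        return None
    return pid
```

With this in front, a value like forty nines in a row is rejected as trash at the parsing step, and only plausible pids ever reach os.kill().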
 
Gregory Ewing

Chris Torek said:
I can then check the now-valid
pid via os.kill(). However, it turns out that one form of "trash"
is a pid that does not fit within sys.maxint. This was a surprise
that turned up only in testing, and even then, only because I
happened to try a ridiculously large value as one of my test cases.

It appears that this situation is not unique to os.kill(),
for example,
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   OverflowError: Python int too large to convert to C long

In fact I'd expect it to happen any time you pass a
very large int to something that's wrapping a C function.

You can't really blame the wrappers for this -- it's not
reasonable to expect all of them to catch out-of-range ints
and do whatever the underlying function would have done if
it were given an invalid argument.

I think the lesson to take from this is that you should
probably add OverflowError to the list of things to catch
whenever you're calling a function with input that's not
fully validated.
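The same behaviour can be checked on Python 3 with a small helper (the helper name is made up). The sketch assumes CPython, where os.read() and os.close() convert the file descriptor to a C int before doing anything else.

```python
import os


def raises_overflow(func, *args):
    """Return True if func(*args) raises OverflowError."""
    try:
        func(*args)
    except OverflowError:
        return True
    except OSError:
        # An ordinary environment error, not an overflow.
        return False
    return False


huge = 2 ** 100
for name, func, args in [("os.read", os.read, (huge, 1)),
                         ("os.close", os.close, (huge,))]:
    print(name, raises_overflow(func, *args))
```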
 
Chris Torek

Gregory Ewing said:
It appears that this situation is not unique to os.kill(),
for example,

   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   OverflowError: Python int too large to convert to C long

In fact I'd expect it to happen any time you pass a
very large int to something that's wrapping a C function.

You can't really blame the wrappers for this -- it's not
reasonable to expect all of them to catch out-of-range ints
and do whatever the underlying function would have done if
it were given an invalid argument.

I think the lesson to take from this is that you should
probably add OverflowError to the list of things to catch
whenever you're calling a function with input that's not
fully validated.

Indeed. (Provided that your call is the point at which the validation
should occur -- otherwise, let the exception flow upwards as usual.)

But again, this is why I would like to have the ability to use some
sort of automated tool, where one can point at any given line of
code and ask: "what exceptions do you, my faithful tool, believe
can be raised as a consequence of this line of code?"

If you point it at the call to main():

   if __name__ == '__main__':
       main()

then you are likely to get a useless answer ("why, any exception
at all"); but if you point it at a call to os.read(), then you get
one that is useful -- and tells you (or me) about the OverflowError.
If you point it at a call to len(x), then the tool tells you what
it knows about type(x) and x.__len__. (This last may well be
"nothing": some tools have only limited application. However, if
the call to len(x) is preceded by an "assert isinstance(x,
(some,fixed,set,of,types))" for instance, or if all calls to the
function that in turn calls len(x) are visible and the type of x
can be proven, the tool might tell you something useful again.)

It is clear at this point that a simple list (or tuple) of "possible
exceptions" is insufficient -- the tool has to learn, somehow, that
len() raises TypeError itself, but also raises whatever x.__len__
raises (where x is the parameter to len()). If I ever get around
to attempting this in pylint (in my Copious Spare Time no doubt
:) ), I will have to start with an external mapping from "built
in function F" to "exceptions that F raises" and figure out an
appropriate format for the table's entries. That is about half
the point of this discussion (to provoke thought about how one
might express this); the other half is to note that the documentation
could probably be improved (as someone else already noted elsethread).
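As one possible shape for such a table, purely as a sketch: each entry lists what the built-in raises itself plus the special methods it delegates to. The table name, its keys, the helper and the Spool class are all invented here, and __exceptions__ is still the hypothetical attribute from earlier in the thread.

```python
# Hypothetical table: what each built-in raises itself, and which
# special methods of its argument it delegates to.
BUILTIN_EXCEPTIONS = {
    "len": {"raises": (TypeError,), "delegates": ("__len__",)},
    "abs": {"raises": (TypeError,), "delegates": ("__abs__",)},
}


def exceptions_for(builtin_name, arg_type):
    """Exceptions a call might raise, given the argument's inferred type.

    Returns None when a delegated method declares nothing, meaning
    "cannot tell" rather than "raises nothing".
    """
    entry = BUILTIN_EXCEPTIONS[builtin_name]
    result = set(entry["raises"])
    for method_name in entry["delegates"]:
        method = getattr(arg_type, method_name, None)
        if method is None:
            continue  # the built-in's own TypeError covers this case
        declared = getattr(method, "__exceptions__", None)
        if declared is None:
            return None
        result |= set(declared)
    return result


class Spool:
    def __len__(self):
        raise OSError("spool directory unreadable")
    __len__.__exceptions__ = (OSError,)
```

Here exceptions_for("len", Spool) combines TypeError with the declared OSError, while exceptions_for("len", list) gives None because list.__len__ declares nothing.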

Note that, if nothing else, the tool -- even in limited form,
without the kind of type inference that pylint attempts -- gives
you the ability to automate part of the documentation process.
 
Chris Angelico

No. The answer is *still* “why, any exception at all”. The name
‘os.read’ could be re-bound at run-time to any object at all, so a code
checker that you “point at any given line of code” can't know what the
name will be bound to when that line gets executed.

Sure it can. And KeyboardInterrupt could be raised at any time, too.
But this is a TOOL, not a deity. If Function X is known to call
Function Y and built-in method Z, and also raises FooException, then
X's list of "most likely exceptions" would be FooException +
Y.__exceptions__ + Z.__exceptions__. It won't be perfect, but it'd be
something that could go into an autodoc-style facility. Obviously you
can fiddle with things, but in _ordinary usage_ this is what it's
_most likely_ to produce.

Chris Angelico
 
steve+comp.lang.python

Chris Angelico said:
Sure it can. And KeyboardInterrupt could be raised at any time, too.
But this is a TOOL, not a deity. If Function X is known to call
Function Y and built-in method Z,

Known by whom? You? Me? The author of X, Y or Z? Everybody? The tool?

How is the tool supposed to know which functions are called? What if it
doesn't have access to the source code of Y? It might only be available via
a .pyc file, or it might be written in C, or Fortran, or Java (for Jython),
or C# (for IronPython).

Who is responsible for ensuring that every time the implementation of *any*
of X, Y and Z changes, the list of exceptions is updated? What do you think
the chances are that this list will remain accurate after a few years of
maintenance?

Is this list of exceptions part of the API of function X? Should X be held
responsible if Z stops raising (say) AttributeError and starts raising
NameError instead?

Should the *implementation* of X, namely the fact that it calls Y and Z, now
be considered part of the public interface?

These are serious questions, not nit-picks. Unless they can be answered
satisfactorily, this hypothetical tool *cannot exist*. It simply won't
work. I believe that you might as well be asking for a deity, because the
tool will need supernatural powers beyond the abilities of ordinary code.

And I haven't even raised the spectre of replacing functions (even builtins)
at runtime, or the use of eval/exec, or any number of other tricks.

and also raises FooException, then
X's list of "most likely exceptions" would be FooException +
Y.__exceptions__ + Z.__exceptions__.

Even if you somehow, magically, know that X calls Y and Z, you can't draw
that conclusion. Lists of exceptions don't add like that. Consider:

   def Y(a):
       if a is None: raise ValueError
       return a
   Y.__exceptions__ = (ValueError,)

   def X(a):
       if a is None: raise TypeError
       return Y(a)
   X.__exceptions__ = (TypeError,)


You claim that X's "most likely exceptions" are given by X.__exceptions__ +
Y.__exceptions__. Under what circumstances do you think X could raise
ValueError?

For bonus points, identify the lies in the above code. (Hint: there are at
least two.)
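The over-approximation can be made visible by comparing the naive union with what X actually raises over some sample inputs. This is only a sketch: the observed() helper and the sample values are invented here, and sampling proves nothing in general. In this case, though, the guard in X means Y never sees None.

```python
def Y(a):
    if a is None:
        raise ValueError
    return a
Y.__exceptions__ = (ValueError,)


def X(a):
    if a is None:
        raise TypeError
    return Y(a)
X.__exceptions__ = (TypeError,)


naive_union = set(X.__exceptions__) | set(Y.__exceptions__)


def observed(func, samples):
    """Exception types actually raised by func over the sample inputs."""
    seen = set()
    for value in samples:
        try:
            func(value)
        except Exception as err:
            seen.add(type(err))
    return seen


samples = [None, 0, "text", [], 3.5]
# Whatever is printed is claimed by the union but never raised here.
print(naive_union - observed(X, samples))
```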

It won't be perfect, but it'd be
something that could go into an autodoc-style facility. Obviously you
can fiddle with things, but in _ordinary usage_ this is what it's
_most likely_ to produce.

All this will do is lull people into a false sense of security as they come
to rely on incorrect and out-of-date information. They'll still be in as
ignorant a position re exceptions as they are now, only they won't know it.
 
Chris Angelico

steve+comp.lang.python said:
Known by whom? You? Me? The author of X, Y or Z? Everybody? The tool?

How is the tool supposed to know which functions are called? What if it
doesn't have access to the source code of Y? It might only be available via
a .pyc file, or it might be written in C, or Fortran, or Java (for Jython),
or C# (for IronPython).

The idea I was toying with was that it would have the source to X, so
it knows that it calls Y. Unfortunately duck typing makes that
difficult for anything where an object is passed in, but it's at least
possible with simpler calls.

steve+comp.lang.python said:
Who is responsible for ensuring that every time the implementation of *any*
of X, Y and Z changes, the list of exceptions is updated? What do you think
the chances are that this list will remain accurate after a few years of
maintenance?

The tool would be run on a snapshot of code. If you update the code,
you rerun it.

steve+comp.lang.python said:
Is this list of exceptions part of the API of function X? Should X be held
responsible if Z stops raising (say) AttributeError and starts raising
NameError instead?

Should the *implementation* of X, namely the fact that it calls Y and Z, now
be considered part of the public interface?

Again, not an issue if you don't expect it to be stable. You just look
at how the code functions _now_.

steve+comp.lang.python said:
These are serious questions, not nit-picks. Unless they can be answered
satisfactorily, this hypothetical tool *cannot exist*. It simply won't
work. I believe that you might as well be asking for a deity, because the
tool will need supernatural powers beyond the abilities of ordinary code.

Yep. And because of duck typing, the information isn't really there. I
think you're right that it's impossible.

steve+comp.lang.python said:
And I haven't even raised the spectre of replacing functions (even builtins)
at runtime, or the use of eval/exec, or any number of other tricks.

Right, but this tool would simply be useless then.

steve+comp.lang.python said:
Even if you somehow, magically, know that X calls Y and Z, you can't draw
that conclusion. Lists of exceptions don't add like that. Consider:

Again, that would be a limitation of the tool. I'd prefer that it
listed more exceptions than fewer.

steve+comp.lang.python said:
All this will do is lull people into a false sense of security as they come
to rely on incorrect and out-of-date information. They'll still be in as
ignorant a position re exceptions as they are now, only they won't know it.

Yep. Agreed (except for the out-of-date qualifier), and that probably
means this won't be of much use. But hey, it was an interesting
thought experiment.

ChrisA
 
John Nagle

If you passed an integer that was at some time a valid PID
to "os.kill()", and OverflowError was raised, I'd consider that
a bug in "os.kill()". Only OSError, or some subclass thereof,
should be raised for a possibly-valid PID.

If you passed some unreasonably large number, that would be
a legitimate reason for an OverflowError. That's for parameter
errors, though; it shouldn't happen for environment errors.

That's a strong distinction. If something can raise an
exception because the environment external to the process
has a problem, the exception should be an EnvironmentError
or a subclass thereof. This maintains a separation between
bugs (which usually should cause termination or fairly
drastic recovery action) and normal external events (which
have to be routinely handled.)

It's quite possible to get a OSError on "os.kill()" for
a number of legitimate reasons. The target process may have
exited since the PID was obtained, for example.

John Nagle
 
