How to pop the interpreter's stack?

kj · Dec 14, 2010

Consider this code:

def spam(*args, **kwargs):
    args, kwargs = __pre_spam(*args, **kwargs)

    # args & kwargs are OK: proceed
    # ...

def __pre_spam(*args, **kwargs):
    # validate args & kwargs;
    # return canonicalized versions of args & kwargs;
    # on failure, raise some *informative* exception
    # ...

    return canonicalized_args, canonicalized_kwargs

I write functions like __pre_spam for one reason only: to remove
clutter from a corresponding spam function that has a particularly
complex argument-validation/canonicalization stage. In effect,
spam "outsources" to __pre_spam the messy business of checking and
conditioning its arguments.

The one thing I don't like about this strategy is that the tracebacks
of exceptions raised during the execution of __pre_spam include one
unwanted stack level (namely, the one corresponding to __pre_spam
itself).

__pre_spam should be completely invisible and unobtrusive, as if
it had been textually "inlined" into spam prior to the code's
interpretation. And I want to achieve this without in any way
cluttering spam with try/except blocks, decorators, and whatnot. (After
all, the whole point of introducing __pre_spam is to declutter
spam.)

It occurs to me, in my innocence (since I don't know the first
thing about the Python internals), that one way to achieve this
would be to have __pre_spam trap any exceptions (with a try/except
around its entire body), and somehow pop its frame from the
interpreter stack before re-raising the exception. (Or some
clueful/non-oxymoronic version of this.) How feasible is this?
And, if it is quite unfeasible, is there some other way to achieve
the same overall design goals described above?

TIA!

~kj

Ethan Furman · Dec 15, 2010

kj said:
The one thing I don't like about this strategy is that the tracebacks
of exceptions raised during the execution of __pre_spam include one
unwanted stack level (namely, the one corresponding to __pre_spam
itself).

__pre_spam should be completely invisible and unobtrusive

I am unaware of any way to accomplish what you desire. I also think
this is one of those things that's not worth fighting -- how often are
you going to see such a traceback? When somebody makes a coding
mistake? I would say change the name (assuming yours was a real
example) to something more meaningful like _spam_arg_verifier and call
it good.

Alternatively, perhaps you could make a more general arg_verifier that
could be used for all such needs, and then your traceback would have:

caller

spam

arg_verifier

and that seems useful to me (it is, in fact, how I have mine set up).

Hope this helps!

~Ethan~

Steven D'Aprano · Dec 15, 2010

Consider this code:

def spam(*args, **kwargs):
    args, kwargs = __pre_spam(*args, **kwargs)

    # args & kwargs are OK: proceed
    # ...

def __pre_spam(*args, **kwargs):
    # validate args & kwargs;
    # return canonicalized versions of args & kwargs;
    # on failure, raise some *informative* exception
    # ...
    return canonicalized_args, canonicalized_kwargs

Double leading underscores don't have any special meaning in the global
scope. Save yourself an underscore and call it _pre_spam instead.

In fact, even if spam and __pre_spam are methods, it's probably a good
idea to avoid the double-underscore name mangling. It's usually more
trouble than it's worth.
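
One way the mangling can bite, as a minimal sketch: inside a class body, Python rewrites any identifier of the form `__name` to `_ClassName__name`, and that includes references to module-level functions. The names `__helper`, `_helper`, and `Spam` below are invented for this illustration.

```python
def __helper():
    return "helper result"

def _helper():
    return "helper result"

class Spam:
    def call_double(self):
        # Inside the class this name is compiled as _Spam__helper,
        # which does not exist at module level, so the call fails.
        return __helper()

    def call_single(self):
        # A single leading underscore is never mangled.
        return _helper()
```

Calling `Spam().call_single()` works, while `Spam().call_double()` raises a NameError that mentions `_Spam__helper`, even though `__helper` is defined a few lines above.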

I write functions like __pre_spam for one reason only: to remove clutter
from a corresponding spam function that has a particularly complex
argument-validation/canonicalization stage. In effect, spam
"outsources" to __pre_spam the messy business of checking and
conditioning its arguments.

A perfectly sensible arrangement.

The one thing I don't like about this strategy is that the tracebacks of
exceptions raised during the execution of __pre_spam include one
unwanted stack level (namely, the one corresponding to __pre_spam
itself).

But why is it unwanted? The traceback shows where the error occurs -- it
occurs in __pre_spam, not spam, or __post_spam, or spam_caller, or
anywhere else. Even if it's possible, having the traceback *lie* about
where it occurs is a bad idea which will cause confusion to anyone trying
to maintain the software in the future.

I can't think of any way to do it, but frankly I haven't thought too hard
about it. I'm glad I can't think of any way of doing it, because the
thought of having tracebacks lie about where they come from gives me the
shivers. Imagine debugging when you've edited the source but are still
running the old version, and now the reported line numbers don't match up
with the source file -- it would be like that, only worse.

Tim Arnold · Dec 15, 2010

Ethan Furman said:
I am unaware of any way to accomplish what you desire. I also think this
is one of those things that's not worth fighting -- how often are you
going to see such a traceback? When somebody makes a coding mistake? I
would say change the name (assuming yours was a real example) to something
more meaningful like _spam_arg_verifier and call it good.

Alternatively, perhaps you could make a more general arg_verifier that
could be used for all such needs, and then your traceback would have:

caller

spam

arg_verifier

and that seems useful to me (it is, in fact, how I have mine set up).

Hope this helps!

~Ethan~

I thought people would advise using a decorator for this one. Wouldn't that
work?
thanks,
--Tim

Ethan Furman · Dec 16, 2010

Tim said:
I thought people would advise using a decorator for this one. Wouldn't that
work?
thanks,
--Tim

A decorator was one of the items kj explicitly didn't want. Also, while
it would have a shallower traceback for exceptions raised during the
__pre_spam portion, any exceptions raised during spam itself would then
be one level deeper than desired... while that could be masked by
catching and (re-?)raising the exception in the decorator, Steven had a
very good point about why that is a bad idea -- namely, tracebacks
shouldn't lie about where the error is.

~Ethan~

Steven D'Aprano · Dec 16, 2010

[...]
A decorator was one of the items kj explicitly didn't want. Also, while
it would have a shallower traceback for exceptions raised during the
__pre_spam portion, any exceptions raised during spam itself would then
be one level deeper than desired... while that could be masked by
catching and (re-?)raising the exception in the decorator, Steven had a
very good point about why that is a bad idea -- namely, tracebacks
shouldn't lie about where the error is.

True, very true... but many hours later, it suddenly hit me that what KJ
was asking for wasn't *necessarily* such a bad idea. My thought is,
suppose you have a function spam(x) which raises an exception. If it's a
*bug*, then absolutely you need to see exactly where the error occurred,
without the traceback being mangled or changed in any way.

But what if the exception is deliberate, part of the function's
documented behaviour? Then you might want the exception to appear to come
from the function spam even if it was actually generated inside some
private sub-routine.

So, with qualifications, I have half changed my mind.

Robert Kern · Dec 16, 2010

Tim said:

kj wrote:
The one thing I don't like about this strategy is that the tracebacks
of exceptions raised during the execution of __pre_spam include one
unwanted stack level (namely, the one corresponding to __pre_spam
itself).

[...]
A decorator was one of the items kj explicitly didn't want. Also, while
it would have a shallower traceback for exceptions raised during the
__pre_spam portion, any exceptions raised during spam itself would then
be one level deeper than desired... while that could be masked by
catching and (re-?)raising the exception in the decorator, Steven had a
very good point about why that is a bad idea -- namely, tracebacks
shouldn't lie about where the error is.

True, very true... but many hours later, it suddenly hit me that what KJ
was asking for wasn't *necessarily* such a bad idea. My thought is,
suppose you have a function spam(x) which raises an exception. If it's a
*bug*, then absolutely you need to see exactly where the error occurred,
without the traceback being mangled or changed in any way.

But what if the exception is deliberate, part of the function's
documented behaviour? Then you might want the exception to appear to come
from the function spam even if it was actually generated inside some
private sub-routine.

Obfuscating the location that an exception gets raised prevents a lot of
debugging (by inspection or by pdb), even if the exception is deliberately
raised with an informative error message. Not least, the code that decides to
raise that exception may be buggy. But even if the actual error is outside of
the function (e.g. the caller is passing bad arguments), you want to at least
see what tests the __pre_spam function is doing in order to decide to raise that
exception.

Tracebacks are inherently over-verbose. This is necessarily true because no
algorithm (or clever programmer) can know all the pieces of information that the
person debugging may want to know a priori. Most customizations of tracebacks
*add* more verbosity rather than reduce it. Removing one stack level from the
traceback barely makes the traceback more readable and removes some of the most
relevant information.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

Steven D'Aprano · Dec 17, 2010

Tim Arnold wrote:
kj wrote:
The one thing I don't like about this strategy is that the
tracebacks of exceptions raised during the execution of __pre_spam
include one unwanted stack level (namely, the one corresponding to
__pre_spam itself). [...]
A decorator was one of the items kj explicitly didn't want. Also,
while it would have a shallower traceback for exceptions raised during
the __pre_spam portion, any exceptions raised during spam itself would
then be one level deeper than desired... while that could be masked by
catching and (re-?)raising the exception in the decorator, Steven had
a very good point about why that is a bad idea -- namely, tracebacks
shouldn't lie about where the error is.

True, very true... but many hours later, it suddenly hit me that what
KJ was asking for wasn't *necessarily* such a bad idea. My thought is,
suppose you have a function spam(x) which raises an exception. If it's
a *bug*, then absolutely you need to see exactly where the error
occurred, without the traceback being mangled or changed in any way.

But what if the exception is deliberate, part of the function's
documented behaviour? Then you might want the exception to appear to
come from the function spam even if it was actually generated inside
some private sub-routine.

Obfuscating the location that an exception gets raised prevents a lot of
debugging (by inspection or by pdb), even if the exception is
deliberately raised with an informative error message. Not least, the
code that decides to raise that exception may be buggy. But even if the
actual error is outside of the function (e.g. the caller is passing bad
arguments), you want to at least see what tests the __pre_spam function
is doing in order to decide to raise that exception.

And how do you think you see that from the traceback? The traceback
prints the line which actually raises the exception (and sometimes not
even that!), which is likely to be a raise statement:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "example.py", line 3, in func
    raise ValueError('bad value for x')
ValueError: bad value for x

The actual test is:

def func(x):
    if x > 10 and x%2 == 0:
        raise ValueError('bad value for x')

but you can't get that information from the traceback.

Python's exception system has to handle two different situations: buggy
code, and bad data. It's not even clear whether there is a general
distinction to be made between the two, but even if there's not a general
distinction, there's certainly a distinction which we can *sometimes*
make. If a function contains a bug, we need all the information we can
get, including the exact line that causes the fault. But if the function
deliberately raises an exception due to bad input, we don't need any
information regarding the internals of the function (assuming that the
exception is sufficiently detailed, a big assumption I grant you!). If I
re-wrote the above func() like this:

def func(x):
    if not (x <= 10):
        if x%2 != 0:
            pass
        else:
            raise ValueError('bad value for x')
    return

I would have got the same traceback, except the location of the exception
would have been different (line 6, in a nested if-block). To the caller,
whether I had written the first version of func() or the second is
irrelevant. If I had passed the input validation off to a second
function, that too would be irrelevant.

I don't expect Python to magically know whether an exception is a bug or
not, but there's something to be said for the ability to turn Python
functions into black boxes with their internals invisible, like C
functions already are. If (say) math.atan2(y, x) raises an exception, you
have no way of knowing whether atan2 is a single monolithic function, or
whether it is split into multiple pieces. The location of the exception
is invisible to the caller: all you can see is that atan2 raised an
exception.

Tracebacks are inherently over-verbose. This is necessarily true because
no algorithm (or clever programmer) can know all the pieces of
information that the person debugging may want to know a priori. Most
customizations of tracebacks *add* more verbosity rather than reduce it.
Removing one stack level from the traceback barely makes the traceback
more readable and removes some of the most relevant information.

Right. But I have thought of a clever trick to get the result KJ was
asking for, with the minimum of boilerplate code. Instead of this:

def _pre_spam(args):
    if condition(args):
        raise SomeException("message")
    if another_condition(args):
        raise AnotherException("message")
    if third_condition(args):
        raise ThirdException("message")

def spam(args):
    _pre_spam(args)
    do_useful_work()

you can return the exceptions instead of raising them (exceptions are
just objects, like everything else!), and then add one small piece of
boilerplate to the spam() function:

def _pre_spam(args):
    if condition(args):
        return SomeException("message")
    if another_condition(args):
        return AnotherException("message")
    if third_condition(args):
        return ThirdException("message")

def spam(args):
    exc = _pre_spam(args)
    if exc: raise exc
    do_useful_work()
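
A runnable sketch of the difference between the two variants, assuming Python 3. The validation rule and the helper `frame_names` are invented for illustration; the sketch simply records which function names appear in the traceback in each case.

```python
import traceback

def _pre_spam_raising(x):
    if x < 0:
        raise ValueError("x must be non-negative")

def spam_raising(x):
    _pre_spam_raising(x)

def _pre_spam_returning(x):
    if x < 0:
        return ValueError("x must be non-negative")
    return None

def spam_returning(x):
    exc = _pre_spam_returning(x)
    if exc is not None:
        raise exc

def frame_names(func, *args):
    """Call func and return the function names found in the traceback, if any."""
    try:
        func(*args)
    except Exception as e:
        return [f.name for f in traceback.extract_tb(e.__traceback__)]
    return []
```

With these definitions, `frame_names(spam_raising, -1)` ends with the frames for `spam_raising` and `_pre_spam_raising`, whereas `frame_names(spam_returning, -1)` ends at `spam_returning`: an exception only starts collecting frames at the point where it is raised, not where it is constructed.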

Carl Banks · Dec 17, 2010

Double leading underscores don't have any special meaning in the global
scope. Save yourself an underscore and call it _pre_spam instead

In fact, even if spam and __pre_spam are methods, it's probably a good
idea to avoid the double-underscore name mangling. It's usually more
trouble than it's worth.

A perfectly sensible arrangement.

But why is it unwanted? The traceback shows where the error occurs -- it
occurs in __pre_spam, not spam, or __post_spam, or spam_caller, or
anywhere else. Even if it's possible, having the traceback *lie* about
where it occurs is a bad idea which will cause confusion to anyone trying
to maintain the software in the future.

I don't agree with kj's usage, but I have customized the traceback to
remove items before. In my case it was to remove lines for endemic
wrapper functions.

The traceback lines showing the wrapper functions in the stack were
useless, and since pretty much every function was wrapped it meant
half the lines in that traceback were useless. (Really. I was scanning
the loaded modules and adding wrappers to every function found. Never
mind why.) I only printed the wrapper line if it was the very top of
the stack.

I can't think of any way to do it,

You override sys.excepthook to print lines from the traceback
selectively.

Carl Banks
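
A minimal sketch of what such a hook might look like, assuming Python 3. This is not Carl's actual code: the set `HIDDEN_FUNCTIONS` and the helper names are hypothetical, and it filters purely by function name rather than implementing his "only if it is the very top of the stack" rule.

```python
import sys
import traceback

# Hypothetical set of function names whose frames should not be printed.
HIDDEN_FUNCTIONS = {"_pre_spam"}

def visible_frames(tb):
    """Return the traceback's frame summaries, minus the hidden functions."""
    return [f for f in traceback.extract_tb(tb)
            if f.name not in HIDDEN_FUNCTIONS]

def filtered_excepthook(exc_type, exc, tb):
    """Print an uncaught exception, skipping frames of hidden functions."""
    lines = ["Traceback (most recent call last):\n"]
    lines += traceback.format_list(visible_frames(tb))
    lines += traceback.format_exception_only(exc_type, exc)
    sys.stderr.write("".join(lines))

def install():
    """Route uncaught exceptions through filtered_excepthook."""
    sys.excepthook = filtered_excepthook
```

Only the printed report changes. The exception object keeps its full traceback, so the hidden frames remain available to anything that inspects it directly.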

John Nagle · Dec 17, 2010

I am unaware of any way to accomplish what you desire. I also think this
is one of those things that's not worth fighting -- how often are you
going to see such a traceback? When somebody makes a coding mistake?

Right. If you are worried about what the user sees in a traceback,
you are doing it wrong.

Consider reporting detailed error information via the logging
module, for example.

John Nagle
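
As a rough sketch of that suggestion: the full traceback goes to a log while the user sees only a short message. The logger name, the validation rule, and the wrapper `call_and_report` are assumptions made up for this example.

```python
import logging

logger = logging.getLogger("niftymodule")  # hypothetical logger name

def _pre_spam(x):
    if not isinstance(x, int):
        raise TypeError("spam() expects an int, got %s" % type(x).__name__)
    return x

def spam(x):
    return _pre_spam(x) * 2

def call_and_report(func, *args):
    """Call func; on TypeError, log the full traceback and return a short message."""
    try:
        return func(*args)
    except TypeError as exc:
        # exc_info=True attaches the complete traceback to the log record.
        logger.error("call to %s failed", func.__name__, exc_info=True)
        return "error: %s" % exc
```

Here the traceback that reaches the log still includes the `_pre_spam` frame, so nothing is lost for debugging, while the returned string carries only the exception message.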

Robert Kern · Dec 17, 2010

On Thu, 16 Dec 2010 07:29:25 -0800, Ethan Furman wrote:

Tim Arnold wrote:
kj wrote:
The one thing I don't like about this strategy is that the
tracebacks of exceptions raised during the execution of __pre_spam
include one unwanted stack level (namely, the one corresponding to
__pre_spam itself).
[...]
A decorator was one of the items kj explicitly didn't want. Also,
while it would have a shallower traceback for exceptions raised during
the __pre_spam portion, any exceptions raised during spam itself would
then be one level deeper than desired... while that could be masked by
catching and (re-?)raising the exception in the decorator, Steven had
a very good point about why that is a bad idea -- namely, tracebacks
shouldn't lie about where the error is.

True, very true... but many hours later, it suddenly hit me that what
KJ was asking for wasn't *necessarily* such a bad idea. My thought is,
suppose you have a function spam(x) which raises an exception. If it's
a *bug*, then absolutely you need to see exactly where the error
occurred, without the traceback being mangled or changed in any way.

But what if the exception is deliberate, part of the function's
documented behaviour? Then you might want the exception to appear to
come from the function spam even if it was actually generated inside
some private sub-routine.

Obfuscating the location that an exception gets raised prevents a lot of
debugging (by inspection or by pdb), even if the exception is
deliberately raised with an informative error message. Not least, the
code that decides to raise that exception may be buggy. But even if the
actual error is outside of the function (e.g. the caller is passing bad
arguments), you want to at least see what tests the __pre_spam function
is doing in order to decide to raise that exception.

And how do you think you see that from the traceback? The traceback
prints the line which actually raises the exception (and sometimes not
even that!), which is likely to be a raise statement:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "example.py", line 3, in func
    raise ValueError('bad value for x')
ValueError: bad value for x

The actual test is:

def func(x):
    if x > 10 and x%2 == 0:
        raise ValueError('bad value for x')

but you can't get that information from the traceback.

But I can get the line number and trivially go look it up. If we elide that
stack frame, I have to go hunting and possibly make some guesses. Depending on
the organization of the code, I may have to make some guesses anyways, but if I
keep the decision to raise an exception close to the actual raising of the
exception, it makes things a lot easier.

Python's exception system has to handle two different situations: buggy
code, and bad data. It's not even clear whether there is a general
distinction to be made between the two, but even if there's not a general
distinction, there's certainly a distinction which we can *sometimes*
make. If a function contains a bug, we need all the information we can
get, including the exact line that causes the fault. But if the function
deliberately raises an exception due to bad input, we don't need any
information regarding the internals of the function (assuming that the
exception is sufficiently detailed, a big assumption I grant you!). If I
re-wrote the above func() like this:

def func(x):
    if not (x <= 10):
        if x%2 != 0:
            pass
        else:
            raise ValueError('bad value for x')
    return

I would have got the same traceback, except the location of the exception
would have been different (line 6, in a nested if-block). To the caller,
whether I had written the first version of func() or the second is
irrelevant. If I had passed the input validation off to a second
function, that too would be irrelevant.

The caller doesn't care about tracebacks one way or the other, either. Only
someone *viewing* the traceback cares, as do debuggers like pdb. Eliding the
stack frame neither helps nor harms the caller, but it does substantially harm
the developer viewing tracebacks or using a debugger.

I don't expect Python to magically know whether an exception is a bug or
not, but there's something to be said for the ability to turn Python
functions into black boxes with their internals invisible, like C
functions already are. If (say) math.atan2(y, x) raises an exception, you
have no way of knowing whether atan2 is a single monolithic function, or
whether it is split into multiple pieces. The location of the exception
is invisible to the caller: all you can see is that atan2 raised an
exception.

And that has frustrated my debugging efforts more often than I can count. I
would dearly love to have a debugger that can traverse both Python and C stack
frames. This is a deficiency, not a benefit to be extended to pure Python functions.

Right. But I have thought of a clever trick to get the result KJ was
asking for, with the minimum of boilerplate code. Instead of this:

def _pre_spam(args):
    if condition(args):
        raise SomeException("message")
    if another_condition(args):
        raise AnotherException("message")
    if third_condition(args):
        raise ThirdException("message")

def spam(args):
    _pre_spam(args)
    do_useful_work()

you can return the exceptions instead of raising them (exceptions are
just objects, like everything else!), and then add one small piece of
boilerplate to the spam() function:

def _pre_spam(args):
    if condition(args):
        return SomeException("message")
    if another_condition(args):
        return AnotherException("message")
    if third_condition(args):
        return ThirdException("message")

def spam(args):
    exc = _pre_spam(args)
    if exc: raise exc
    do_useful_work()

And that makes post-mortem pdb debugging into _pre_spam impossible. Like I said,
whether the bug is inside _pre_spam or is in the code that is passing the bad
argument, being able to navigate stack frames to where the code is deciding that
there is an exceptional condition is important.

Kern's First Maxim: Raise exceptions close to the code that decides to raise an
exception.
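
To make the point about navigating stack frames concrete, here is a small sketch, assuming Python 3, of roughly what a post-mortem debugger relies on: following the traceback to the innermost frame and looking at its local variables. The helper `innermost_frame` and the validation threshold are invented for this example.

```python
def innermost_frame(tb):
    """Follow tb_next links to the frame where the exception was raised."""
    while tb.tb_next is not None:
        tb = tb.tb_next
    return tb.tb_frame

def _pre_spam(x):
    limit = 10  # hypothetical validation threshold
    if x > limit:
        raise ValueError("x is too large")
    return x

def spam(x):
    return _pre_spam(x)
```

If `spam(42)` is called and the ValueError is caught as `e`, then `innermost_frame(e.__traceback__)` is the `_pre_spam` frame, and its `f_locals` holds both `x` and `limit`. When the helper returns the exception instead of raising it, that frame never enters the traceback and this information is no longer reachable.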

--
Robert Kern

kj · Dec 22, 2010

Robert Kern said:
Obfuscating the location that an exception gets raised prevents a lot of
debugging...

The Python interpreter does a lot of that "obfuscation" already, and I
find the resulting tracebacks more useful for it.

An error message is only useful to a given audience if that audience
can use the information in the message to modify what they are
doing to avoid the error. It is of no use (certainly no *immediate*
use) to this audience to see tracebacks that go deep into code that
they don't know anything about and cannot change.

For example, consider this:

#-----------------------------------------------------------------
def foo(x, **k): pass

def bar(*a, **k):
    if len(a) > 1: raise TypeError('too many args')

def baz(*a, **k): _pre_baz(*a, **k)

def _pre_baz(*a, **k):
    if len(a) > 1: raise TypeError('too many args')

if __name__ == '__main__':
    from traceback import print_exc
    try: foo(1, 2)
    except: print_exc()
    try: bar(1, 2)
    except: print_exc()
    try: baz(1, 2)
    except: print_exc()
#-----------------------------------------------------------------

(The code in the "if __name__ == '__main__'" section is meant to
simulate the general case in which the functions defined in this file
are called by third-party code.) When you run this code the output is
this (a few blank lines added for clarity):

Traceback (most recent call last):
  File "/tmp/ex2.py", line 5, in <module>
    try: foo(1, 2)
TypeError: foo() takes exactly 1 argument (2 given)

Traceback (most recent call last):
  File "/tmp/ex2.py", line 7, in <module>
    try: bar(1, 2)
  File "/tmp/example.py", line 4, in bar
    if len(a) > 1: raise TypeError('too many args')
TypeError: too many args

Traceback (most recent call last):
  File "/tmp/ex2.py", line 9, in <module>
    try: baz(1, 2)
  File "/tmp/example.py", line 6, in baz
    def baz(*a, **k): _pre_baz(*a, **k)
  File "/tmp/example.py", line 9, in _pre_baz
    if len(a) > 1: raise TypeError('too many args')
TypeError: too many args

In all cases, the *programming* errors are identical: functions called
with the wrong arguments. The traceback from foo(1, 2) tells me this
very clearly, and I'm glad that Python is not also giving me the
traceback down to where the underlying C code throws the exception: I
don't need to see all this machinery.

In contrast, the tracebacks from bar(1, 2) and baz(1, 2) obscure the
fundamental problem with useless detail. From the point of view of
the programmer that is using these functions, it is of no use to know
that the error resulted from some "raise TypeError" statement
somewhere, let alone that this happened in some obscure, private
function _pre_baz.

Perhaps I should have made it clearer in my original post that the
tracebacks I want to clean up are those from exceptions *explicitly*
raised by my argument-validating helper function, analogous to
_pre_baz above. I.e. I want that when my spam function is called
(by code written by someone else) with the wrong arguments, the
traceback looks more like this

Traceback (most recent call last):
  File "/some/python/code.py", line 123, in <module>
    spam(some, bad, args)
TypeError: the second argument is bad

than like this:

Traceback (most recent call last):
  File "/some/python/code.py", line 123, in <module>
    spam(some, bad, args)
  File "/my/niftymodule.py", line 456, in spam
    _pre_spam(*a, **k)
  File "/my/niftymodule.py", line 789, in _pre_spam
    raise TypeError('the second argument is bad')
TypeError: the second argument is bad

In my opinion, the idea that more is always better in a traceback
is flat out wrong. As the example above illustrates, the most
useful traceback is the one that stops at the deepest point where
the *intended audience* for the traceback can take action, and goes
no further. The intended audience for the errors generated by my
argument-checking functions should see no further than the point
where they called a function incorrectly.

~kj

Carl Banks · Dec 22, 2010

The Python interpreter does a lot of that "obfuscation" already, and I
find the resulting tracebacks more useful for it.

An error message is only useful to a given audience if that audience
can use the information in the message to modify what they are
doing to avoid the error.

So when the audience files a bug report it's not useful for them to
include the whole traceback?

It is of no use (certainly no *immediate*
use) to this audience to see tracebacks that go deep into code that
they don't know anything about and cannot change.

Seriously, quit trying to do the user favors. There's nothing that
pisses me off more than a self-important developer thinking he knows what
the user wants more than the user does.

Carl Banks

kj · Dec 22, 2010

So when the audience files a bug report it's not useful for them to
include the whole traceback?

Learn to read, buster. I wrote *immediate* use.

~kj

Steven D'Aprano · Dec 23, 2010

So when the audience files a bug report it's not useful for them to
include the whole traceback?

Well, given the type of error KJ has been discussing, no, it isn't useful.

Fault: function raises documented exception when passed input that
is documented as being invalid

What steps will reproduce the problem?
1. call the function with invalid input
2. read the exception that is raised
3. note that it is the same exception as documented

What is the expected output? What do you see instead?

Expected somebody to hit me on the back of the head and tell me
not to call the function with invalid input. Instead I just got
an exception.

You seem to have completely missed that there will be no bug report,
because this isn't a bug. (Or if it is a bug, the bug is elsewhere,
external to the function that raises the exception.) It is part of the
promised API. The fact that the exception is generated deep down some
chain of function calls is irrelevant.

The analogy is this: imagine a function that delegates processing of the
return result to different subroutines:

def func(arg):
    if arg > 0:
        return _inner1(arg)
    else:
        return _inner2(arg)

This is entirely irrelevant to the caller. When they receive the return
result from calling func(), they have no way of knowing where the result
came from, and wouldn't care even if they could. Return results hide
information about where the result was calculated, as they should. Why
shouldn't deliberate, explicit, documented exceptions be treated the same?

Tracebacks expose the implementation details of where the exception was
generated. This is the right behaviour if the exception is unexpected --
a bug internal to func -- since you need knowledge of the implementation
of func in order to fix the unexpected exception. So far so good -- we
accept that Python's behaviour under these circumstances is correct.

But this is not the right behaviour when the exception is expected, e.g.
an explicitly raised exception in response to an invalid argument. In
this case, the traceback exposes internal details of no possible use to
the caller. What does the caller care if func() delegates (e.g.) input
checking to a subroutine? The subroutine is an irrelevant implementation
detail. The exception is promised output of the function, just as much so
as if it were a return value.

Consider the principle that exceptions should be dealt with as close as
possible to the actual source of the problem:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
  File "<stdin>", line 2, in g
  File "<stdin>", line 2, in h
  File "<stdin>", line 2, in i
  File "<stdin>", line 2, in j
  File "<stdin>", line 2, in k <=== error occurs here, and shown here
ValueError

But now consider the scenario where the error is not internal to f, but
external. The deeper down the stack trace you go, the further away from
the source of the error you get. The stack trace now obscures the source
of the error, rather than illuminating it:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
  File "<stdin>", line 2, in g
  File "<stdin>", line 2, in h
  File "<stdin>", line 2, in i
  File "<stdin>", line 2, in j
  File "<stdin>", line 2, in k <=== far from the source of error
ValueError

There's no point in inspecting function k for a bug when the problem has
nothing to do with k. The problem is that the input fails to match the
pre-conditions for f. From the perspective of the caller, the error has
nothing to do with k, k is a meaningless implementation detail, and the
source of the error is the mismatch between the input and what f expects.
And so by the principle of dealing with exceptions as close as possible
to the source of the error, we should desire this traceback instead:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f <=== matches where the error occurs
ValueError

In the absence of any practical way for function f to know whether an
arbitrary exception in a subroutine is a bug or not, the least-worst
decision is Python's current behaviour: take the conservative, risk-averse
path and assume the worst, treat the exception as a bug in the
subroutine, and expose the entire stack trace.

But, I suggest, we can do better using the usual Python strategy of
implementing sensible default behaviour but allowing objects to customize
themselves. Objects can declare themselves to be instances of some other
class, or manipulate what names are reported by dir. Why shouldn't a
function deliberately and explicitly take ownership of an exception
raised by a subroutine?

There should be a mechanism for Python functions to distinguish between
unexpected exceptions (commonly known as "bugs"), which should be
reported as coming from wherever they come from, and documented, expected
exceptions, which should be reported as coming from the function
regardless of how deep the function call stack really is.
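
No such mechanism exists in the language, but a rough approximation can be written by hand in Python 3. The sketch below assumes that ValueError is f's documented, expected failure; _validate and frames_reported are hypothetical names used only for illustration. It relies on BaseException.with_traceback(None) to discard the frames recorded so far before re-raising from inside f.

```python
import traceback

def _validate(arg):
    # Hypothetical helper: raises the documented exception on bad input.
    if arg == 'bad input':
        raise ValueError('bad input')

def f(arg):
    try:
        _validate(arg)
    except ValueError as exc:
        # Assumption: ValueError is f's documented, expected failure.
        # Clearing the traceback discards the _validate frame; the
        # re-raise then starts a fresh traceback at this line of f.
        raise exc.with_traceback(None)
    return arg

def frames_reported(func, arg):
    """Call func(arg) and return the function names listed in the traceback."""
    try:
        func(arg)
    except ValueError as exc:
        return [fs.name for fs in traceback.extract_tb(exc.__traceback__)]
    return []

print(frames_reported(f, 'bad input'))  # ['frames_reported', 'f']
```

Any other exception type raised inside _validate is not caught here, so it would still propagate with its full traceback, which is the "unexpected exception" case. The cost is the try/except clutter in f, and the reported line is that of the re-raise rather than of the call to _validate.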

Carl Banks · Dec 23, 2010

There should be a mechanism for Python functions to distinguish between
unexpected exceptions (commonly known as "bugs"), which should be
reported as coming from wherever they come from, and documented, expected
exceptions, which should be reported as coming from the function
regardless of how deep the function call stack really is.

No, -100. The traceback isn't the place for this. I've never
disagreed with you more, and I've disagreed with you an awful lot.

Carl Banks

Steven D'Aprano · Dec 24, 2010

No, -100. The traceback isn't the place for this. I've never disagreed
with you more, and I've disagreed with you an awful lot.

Okay, it's your right to disagree, but I am trying to understand your
reasons for disagreeing, and I simply don't get it.

I'm quite frustrated that you don't give any reasons for why you think
it's not just unnecessary but actually *horrible* to hide implementation
details such as where data validation is performed. I hope you'll try to
explain *why* you think it's a bad idea, rather than just continue
throwing out dismissive statements about "self-important" programmers
(your earlier post to KJ) and "never disagreed more" (to me).

Do you accept that, as a general principle, unhandled errors should be
reported as close as possible to where the error occurs?

If your answer to that is No, then where do you think unhandled errors
should be reported?

Now, given the scenario I proposed earlier:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
  File "<stdin>", line 2, in g
  File "<stdin>", line 2, in h
  File "<stdin>", line 2, in i
  File "<stdin>", line 2, in j
  File "<stdin>", line 2, in k <=== far from the source of error
ValueError

do you concede that the actual error occurs at the time 'bad input' is
passed to f, and not further down the stack where k happens to raise an
exception? If not, where do you think the error occurs, and why?

Carl Banks · Dec 24, 2010

Okay, it's your right to disagree, but I am trying to understand your
reasons for disagreeing, and I simply don't get it.

I'm quite frustrated that you don't give any reasons for why you think
it's not just unnecessary but actually *horrible* to hide implementation
details such as where data validation is performed. I hope you'll try to
explain *why* you think it's a bad idea, rather than just continue
throwing out dismissive statements about "self-important" programmers
(your earlier post to KJ) and "never disagreed more" (to me).

Do you accept that, as a general principle, unhandled errors should be
reported as close as possible to where the error occurs?
If your answer to that is No, then where do you think unhandled errors
should be reported?

"No", and "where the error is detected". That is, what Python does
now. Trying to figure out where the error "occurred" is a fool's
errand. The former isn't even well-defined, let alone something a
compiler or user can be expected to reliably report. Sometimes the
error doesn't even "occur" in the same call stack.

There's a similarly fine line between a bug exception and a bad-input
exception, and it's foolish to expect to distinguish them reliably: in
particular, bugs can easily be mistaken for bad input.

OTOH, going the extra mile to hide useful information from a user is
asinine. As a user, I will decide for myself how I want to use
implementation-defined information, and I don't want the implementor
to decide this for me. It's bad enough if an implementor fails to
provide information out of laziness, but when they deliberately do
extra work to hide information, that's self-importance and arrogance.

The traceback IS NOT THE PLACE for these kinds of games.

Now, given the scenario I proposed earlier:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
  File "<stdin>", line 2, in g
  File "<stdin>", line 2, in h
  File "<stdin>", line 2, in i
  File "<stdin>", line 2, in j
  File "<stdin>", line 2, in k <=== far from the source of error
ValueError

do you concede that the actual error occurs at the time 'bad input' is
passed to f, and not further down the stack where k happens to raise an
exception? If not, where do you think the error occurs, and why?

This question is irrelevant. It doesn't matter where the mistake is
made.

Carl Banks

Steven D'Aprano · Dec 24, 2010

"No", and "where the error is detected". That is, what Python does now.
Trying to figure out where the error "occurred" is a fool's errand.

But isn't that what debugging is all about -- finding where the error
occurred and fixing it? Hardly a fool's errand.

The former isn't even well-defined, let alone something a compiler or user
can be expected to reliably report. Sometimes the error doesn't even
"occur" in the same call stack.

Thank you for taking the time to respond. I think your objection misses
the point I'm trying to make completely. But since this argument is
rather academic, and it's Christmas Eve here, I'll just make one last
comment and leave it at that:

OTOH, going the extra mile to hide useful information from a user is
asinine. As a user, I will decide for myself how I want to use
implementation-defined information, and I don't want the implementor to
decide this for me. It's bad enough if an implementor fails to provide
information out of laziness, but when they deliberately do extra work to
hide information, that's self-importance and arrogance.

But that of course is nonsense, because as the user you don't decide
anything of the sort. The developer responsible for writing the function
decides what information he provides you, starting with whether you get
an exception at all, where it comes from, the type of exception, and the
error message (if any). Once this information has been passed on to you,
you're free to do anything you like with it, but you never get to choose
what information you get -- I'm not suggesting any change there. All I'm
suggesting is that there should be a way of reducing the boilerplate
needed for this idiom:

def _validate_arg(x):
    if x == 'bad input': return False
    return True

def f(arg):
    if not _validate_arg(arg):
        raise ValueError
    process(arg)

to something more natural that doesn't needlessly expose implementation
details that are completely irrelevant to the caller.
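
To make the trade-off concrete, here is a small sketch (Python 3) comparing the boilerplate idiom with the "more natural" form in which the helper raises the exception itself. The names process, _check_arg, f2 and innermost_frame are hypothetical, introduced only for this comparison.

```python
import traceback

def process(arg):
    # Hypothetical stand-in for the real work done after validation.
    return arg.upper()

def _validate_arg(x):
    # Boilerplate idiom: the helper only reports, the caller raises.
    return x != 'bad input'

def f(arg):
    if not _validate_arg(arg):
        raise ValueError(arg)
    return process(arg)

def _check_arg(x):
    # "Natural" form: the helper raises directly.
    if x == 'bad input':
        raise ValueError(x)

def f2(arg):
    _check_arg(arg)
    return process(arg)

def innermost_frame(func, arg):
    """Return the name of the function the traceback ends in, or None."""
    try:
        func(arg)
    except ValueError as exc:
        return traceback.extract_tb(exc.__traceback__)[-1].name
    return None

print(innermost_frame(f, 'bad input'))   # f
print(innermost_frame(f2, 'bad input'))  # _check_arg
```

With the first form the traceback ends in f; with the second it ends in the private helper, which is the extra stack level under discussion.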

Carl Banks · Dec 24, 2010

But that of course is nonsense, because as the user you don't decide
anything of the sort.

As a user I can criticize the decision of the implementor to
needlessly filter information, and declare that it's born out of the
author's arrogance in thinking he knows what I want when I get a
traceback.

I can also opine that the Python language shouldn't make it easy for
library implementors to be arrogant like this.

The developer responsible for writing the function
decides what information he provides you, starting with whether you get
an exception at all, where it comes from, the type of exception, and the
error message (if any). Once this information has been passed on to you,
you're free to do anything you like with it, but you never get to choose
what information you get -- I'm not suggesting any change there. All I'm
suggesting is that there should be a way of reducing the boilerplate
needed for this idiom:

def _validate_arg(x):
    if x == 'bad input': return False
    return True

def f(arg):
    if not _validate_arg(arg):
        raise ValueError
    process(arg)

to something more natural that doesn't needlessly expose implementation
details that are completely irrelevant to the caller.

Arrogance. Who gave you the right to decide what is completely
irrelevant to the user? I, as the user, decide what's relevant. If I
want implementation-dependent information, it's my business.

I don't want the language to make it easy for arrogant people, who
think they know what information I want better than I do, to hide that
information from me.

Carl Banks
