painful debugging: techniques?

Humpty Dumpty

Hello, I've been using python for about 3 months now. Prior to that I did
C++ for over 5 years. I'm wondering if my debugging techniques are too C++
oriented.

E.g., it took me about 9 hrs of debugging to figure out that the second
parameter to weakref.ref() was being passed as None. This is odd because the
second parameter is optional, yet its default value is not documented, so I
assumed that passing None would be fine, and then I forgot that I was doing
that. It took that long mostly because the exception message was "Exception
TypeError: 'NoneType not a callable' in None being ignored", i.e. it was being
ignored and the function from which it was being raised was unknown to Python
(somehow), *and* it was only being raised upon exit of the test (cleanup).
So I had no way of knowing where this was being raised. It was nasty.

I'm wondering if there are tricks people use to track down problems like
this.

Thanks,

Oliver
 
Roger Binns

I'm wondering if there are tricks people use to track down problems like this.

I do two things. One is that I catch exceptions and print out the
variables for each frame, in addition to the traceback. The recipe
is at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52215
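A minimal sketch of that idea, not the recipe itself: it walks the traceback of a caught exception and lists each frame's local variables after the usual traceback text. The names format_exception_with_locals, divide and demo_report are made up for illustration, and Python 3 is assumed.

```python
import traceback

def format_exception_with_locals(exc):
    """Return the normal traceback text followed by the locals of each frame."""
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    tb = exc.__traceback__
    while tb is not None:
        frame = tb.tb_frame
        lines.append("Frame %s in %s at line %d\n"
                     % (frame.f_code.co_name, frame.f_code.co_filename, tb.tb_lineno))
        for name, value in frame.f_locals.items():
            try:
                shown = repr(value)
            except Exception:
                shown = "<repr failed>"  # a broken __repr__ must not hide the report
            lines.append("    %s = %s\n" % (name, shown))
        tb = tb.tb_next
    return "".join(lines)

def divide(numerator, denominator):
    return numerator / denominator

def demo_report():
    # Hypothetical failing call, only here to produce a traceback.
    try:
        divide(1, 0)
    except ZeroDivisionError as exc:
        return format_exception_with_locals(exc)

print(demo_report())
```

The report ends with the locals of divide(), which is usually where the surprising value shows up.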

The second thing is that I never start the program in the debugger.
If I actually need to debug somewhere, I add the following line:

import pdb ; pdb.set_trace()

You will then pop into the debugger at that point, and can step
through the code. The above statement can be inside an if
or any other condition, so it only triggers at the point that
is relevant.
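For instance, a rough sketch of a conditional breakpoint. The function total_positive and its break_on parameter are hypothetical, invented only to show the pattern:

```python
import pdb

def total_positive(values, break_on=None):
    """Sum the positive values; optionally stop in pdb at a suspicious one."""
    total = 0
    for value in values:
        if break_on is not None and value == break_on:
            pdb.set_trace()  # only reached for the value under investigation
        if value > 0:
            total += value
    return total

# break_on is left unset here, so the debugger never starts.
print(total_positive([3, -1, 4]))
```

Calling total_positive([3, -1, 4], break_on=-1) instead would stop inside the loop on the second element, with the running total available for inspection.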

Roger
 
Peter Otten

Humpty said:
Hello, I've been using python for about 3 months now. Prior to that I did
C++ for over 5 years. I'm wondering if my debugging techniques are too C++
oriented.

E.g., it took me about 9 hrs of debugging to figure out that the second
parameter to weakref.ref() was being passed as None. This is odd because
the second parameter is optional, yet its default value is not documented,
so I assumed that passing None would be fine, and then I forgot that I was
doing that. It took that long mostly because the exception message was
"Exception TypeError: 'NoneType not a callable' in None being ignored", i.e.
it was being ignored and the function from which it was being raised was
unknown to Python (somehow), *and* it was only being raised upon exit of
the test (cleanup). So I had no way of knowing where this was being
raised. It was nasty.

I'm wondering if there are tricks people use to track down problems like
this.

The best strategy would have been to avoid the error altogether :)
Python makes that easy; just test a chunk of code on the command line:
.... pass
....
Exception exceptions.TypeError: "'NoneType' object is not callable" in None ignored

The next best thing is to have a proper test suite and to make only small
changes to your program, so that you can easily attribute an error to a
newly added chunk of code. IMHO this is the most promising approach for
anything comprising more than one module. I admit that I don't know how to
devise a test that would discover your problem.

So what if all else fails? Let's stick to your example.
I would see not getting a traceback as a strong hint that something unusual
is going on. In this case you can leverage the fact that python is open
source and search for the place where the error message is generated.
Looking for "%r ignored" and "%s ignored" in *.py of Python's source
distribution yields no result. At this point I would extend the search to
"ignored" in the C source: 48 hits, but most of them in comments. The one
in errors.c seems promising. Unfortunately the comment in
PyErr_WriteUnraisable() gives the "wrong" __del__() example, so you need
one more iteration to hunt for PyErr_WriteUnraisable. As the variable names
are well chosen, the match in weakrefobject.c
(PyErr_WriteUnraisable(callback);) gives you a strong clue.
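If no grep is at hand, the same search can be sketched in Python. This is only an illustration: find_in_tree is a made-up helper, and the directory passed to it would be wherever the Python source distribution happens to be unpacked.

```python
import os

def find_in_tree(root, needle, suffixes=(".c", ".h")):
    """Return (path, line number, line) for every line containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            # errors="replace" keeps odd bytes in old sources from aborting the scan
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if needle in line:
                        hits.append((path, lineno, line.rstrip()))
    return hits

# Example call, with the source directory as a placeholder:
# for path, lineno, line in find_in_tree("path/to/source", "PyErr_WriteUnraisable"):
#     print("%s:%d: %s" % (path, lineno, line))
```

Searching first for the message text and then for the function that prints it mirrors the two iterations described above.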

I admit this is tedious and that I had a headstart as you already provided
the connection to weakref, but I reckon that this approach would still take
less than one hour. As a hidden benefit, you learn something about Python's
architecture. Also, in my experience your problem is a particularly hard
one to track down (as far as pure Python is concerned).

Finally, what I would call the Columbus approach: do you really need
weakrefs in your application, or is the error an indication that you are
introducing unnecessary complexity in your app? Few eggs actually need to
stand on their tip :)

Peter
 
Hung Jung Lu

Roger Binns said:
I do two things. One is that I catch exceptions and print out the
variables for each frame, in addition to the traceback. The recipe
is at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52215

I think this will not work for the original poster's case. His
exception happens at the moment of garbage collection, out of his
control. That is, the exception is not coming from one of his
statements, so he can't wrap a try...except block around it. Example:

import weakref

class A: pass

def f():
    a = A()
    w = weakref.ref(a, None)  # None is passed as the callback

try:
    f()
except:
    print('This will never be printed')

--------------------------

The original poster spent 9 hours to find the source of this bug. So
it is a tough problem, and it may well happen to other people. The
question now is: how can we stop this from happening to other people?

This bug does not seem to be catchable by sys.excepthook, which makes
it even tougher.

So far, I think Peter Otten's approach (search through the source
code) may be the most practical way. The fact that this bug cannot be
caught with sys.excepthook should probably be considered a bug in
Python?
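For readers on a much newer interpreter: Python 3.8 and later provide sys.unraisablehook, which is called for exactly this kind of "ignored" exception. The sketch below assumes CPython 3.8 or newer. Since newer versions accept None as a callback, it uses a deliberately broken callback to provoke the error, and the helper names (collect_unraisable, broken_callback, drop_object_with_broken_callback) are invented for the example.

```python
import gc
import sys
import weakref

class A:
    pass

def broken_callback(ref):
    # Stand-in for any callback that fails while the object is being finalized.
    raise TypeError("callback failed")

def drop_object_with_broken_callback():
    a = A()
    w = weakref.ref(a, broken_callback)
    del a         # on CPython the callback runs here and raises
    gc.collect()  # harmless extra nudge; w is still alive at this point

def collect_unraisable(action):
    """Run action() and return descriptions of the exceptions Python ignored."""
    seen = []

    def hook(unraisable):
        seen.append("%s: %s" % (unraisable.exc_type.__name__, unraisable.exc_value))

    previous = sys.unraisablehook
    sys.unraisablehook = hook
    try:
        action()
    finally:
        sys.unraisablehook = previous  # always restore the original hook
    return seen

print(collect_unraisable(drop_object_with_broken_callback))
```

A try...except around the call still sees nothing, but the hook does, so a test suite could install one and fail whenever it fires.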

I guess the only consolation is that this kind of system bug doesn't
seem to happen too often. :)

regards,

Hung Jung
 
Richie Hindle

[Hung Jung]
The question now is: how can we stop this from happening to other people?
[...] So far, I think Peter Otten's approach (search through the source
code) may be the most practical way.

Make weakref.ref() check that its second argument is a callable?

(I know, "patches welcome".)
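Until such a check exists in the library, it can be approximated in user code. A rough sketch follows; checked_ref is a hypothetical wrapper, not part of the weakref module:

```python
import weakref

def checked_ref(obj, callback=None):
    """Create a weak reference, rejecting a non-callable callback up front."""
    if callback is not None and not callable(callback):
        raise TypeError("weakref callback must be callable or None, got %r"
                        % (callback,))
    if callback is None:
        return weakref.ref(obj)  # never hand None to weakref.ref()
    return weakref.ref(obj, callback)

class A:
    pass

a = A()
w = checked_ref(a)  # fine: no callback at all
try:
    checked_ref(a, 42)  # fails here, with a traceback pointing at the caller
except TypeError as exc:
    print(exc)
```

The point of the wrapper is that the error surfaces at the call site, with a normal traceback, rather than at cleanup time.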
 
David Bolen

Humpty Dumpty said:
I'm wondering if there are tricks people use to track down problems like
this.

Perhaps not so much a trick as much as experience, but after working
with Python a bit, you'll find that any time you see something like
'NoneType not a callable' (or really any type) it pretty much says
that some object in your program should be a callable (function,
method, etc..) but is None (or the other type) instead. So in general
I'd immediately start looking for uses of None in my code (or ways in
which my code could generate None at runtime).

Now, how much new code you had written before the error started
happening could help bound your search, and depending on the
complexity of the code being tested you might be able to insert
judicious prints at various points to check the value of objects. As
someone else pointed out, if you were proceeding in a test first
fashion you'd be at an advantage here since you would have high
confidence in existing code and that the error was introduced (or at
least tickled) specifically by whatever small amount of new code you
wrote since the last test execution.

Also, since the error was occurring on cleanup, it's related to
objects still alive at that time. So that would help indicate that
you should focus your search on objects that live throughout the
duration of the test, as opposed to more transient code.
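One way to look at those long-lived objects, sketched under the assumption of CPython 3.4 or later (where weak references are visible to the gc module and expose __callback__). The helper name live_weakrefs_with_callbacks is invented for the example.

```python
import gc
import weakref

def live_weakrefs_with_callbacks():
    """List weak references that are still alive and carry a callback."""
    found = []
    for obj in gc.get_objects():
        # Check the type directly; isinstance() can trip over dead proxies.
        if issubclass(type(obj), weakref.ref) and obj.__callback__ is not None:
            found.append(obj)
    return found

class A:
    pass

def on_finalize(ref):
    pass

a = A()
w = weakref.ref(a, on_finalize)

# Print each surviving weak reference with its callback, e.g. just before exit.
for ref in live_weakrefs_with_callbacks():
    print(ref, ref.__callback__)
```

Run near the end of a test, this narrows the search to the handful of references whose callbacks can still fire during cleanup.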

Not sure how much that would have streamlined this case (it is fairly
tricky given the gap between source of error and eventual failure),
but it sounds like just focusing on your uses of None would have at
least selected the weakref.ref call as a potential issue fairly
quickly.

-- David
 
Michael Hudson

Richie Hindle said:
[Hung Jung]
The question now is: how can we stop this from happening to other people?
[...] So far, I think Peter Otten's approach (search through the source
code) may be the most practical way.

Make weakref.ref() check that its second argument is a callable?

(I know, "patches welcome".)

I *believe* that in current CVS python, a None second argument to
weakref.ref() is treated as if there was no argument (i.e. how the OP
expected).

Someone who's not behind a modem and several layers of terminal
emulation (none of which agree on the encoding being used) should
check :)
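A quick way to check on whatever interpreter is at hand, assuming Python 3.4 or later for the __callback__ attribute; the function name none_callback_is_ignored is made up:

```python
import gc
import weakref

class A:
    pass

def none_callback_is_ignored():
    """Return True if weakref.ref(obj, None) behaves like weakref.ref(obj)."""
    a = A()
    w = weakref.ref(a, None)
    no_callback = w.__callback__ is None  # None was not stored as a callback
    del a
    gc.collect()                          # for interpreters without refcounting
    return no_callback and w() is None    # referent gone, nothing was raised

print(none_callback_is_ignored())
```

If this prints True, passing None is treated the same as omitting the argument, which is what the original poster expected.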

Cheers,
mwh
 
