Exhaustive Unit Testing

Emanuele D'Arrigo · Nov 27, 2008

Hi everybody,

another question on unit testing, admittedly not necessarily a python-
specific one...

I have a class method about 60 lines long (*) and due to some 9 non-
trivial IFs statements (3 and 2 of which nested) the number of
possible paths the program flow can take is uncomfortably large, each
path characterized by an uncomfortably long list of careful turns to
take. Given that all paths have large portions of them overlapping
with each other, is there a strategy that would allow me to collapse
the many cases into only a few representative ones?

I can sense that exhaustively testing all paths is probably overkill.
On the other hand just going through all nodes and edges between the
nodes -at least once- is probably not enough. Is there any technique
to find the right(ish) middle?

Thanks for your help!

Manu

(*) for the 50/500 purists: it's my only method/function with more
than 50 lines so far... and it include comments!

Steven D'Aprano · Nov 27, 2008

Hi everybody,

another question on unit testing, admittedly not necessarily a python-
specific one...

I have a class method about 60 lines long (*) and due to some 9 non-
trivial IFs statements (3 and 2 of which nested) the number of possible
paths the program flow can take is uncomfortably large, each path
characterized by an uncomfortably long list of careful turns to take.

Nine non-trivial if statements, some of which are nested... uncomfortably
large... uncomfortably long... these are all warning signs that your
method is a monster.

I would say the obvious solution is to refactor that method into multiple
simpler methods, then call those. Not only does that make maintaining
that class method easier, but you can also unit-test each method
individually.

With nine if-statements, you have 512 program paths that need testing.
(In practice, there may be fewer because some of the statements are
nested, and presumably some paths will be mutually exclusive.) Refactor
the if...else blocks into methods, and you might have as many as 18 paths
to test. So in theory you should reduce the number of unit-tests
significantly.

Given that all paths have large portions of them overlapping with each
other, is there a strategy that would allow me to collapse the many
cases into only a few representative ones?

That would be impossible to answer without knowing what the method
specifically does and how the many cases interact with each other.

I can sense that exhaustively testing all paths is probably overkill.

I don't agree. If you don't test all the paths, then by definition you
have program paths which have never been tested. Unless those paths are
so trivially simple that you can see that they must be correct just by
looking at the code, then the chances are very high that they will hide
bugs.

On
the other hand just going through all nodes and edges between the nodes
-at least once- is probably not enough. Is there any technique to find
the right(ish) middle?

Refactor until your code is simple enough to unit-test effectively, then
unit-test effectively.

Emanuele D'Arrigo · Nov 27, 2008

Refactor until your code is simple enough to unit-test effectively, then
unit-test effectively.

<sigh> I suspect you are right...

Ok, thank you!

Manu

Stefan Behnel · Nov 27, 2008

Steven said:
If you don't test all the paths, then by definition you
have program paths which have never been tested. Unless those paths are
so trivially simple that you can see that they must be correct just by
looking at the code, then the chances are very high that they will hide
bugs.

Not to mention that you can sometimes look at awfully trivial code three
times and only see the obvious bug in that code the fourth time you put an
eye on it a good night's sleep later.

Expecting code paths in a nested if statement with nine conditions that are
not worth testing due to obvious triviality, is pretty hubristic.

Stefan

Roy Smith · Nov 27, 2008

Stefan Behnel said:
Not to mention that you can sometimes look at awfully trivial code three
times and only see the obvious bug in that code the fourth time you put an
eye on it a good night's sleep later.

Or never see it.

Lately, I've been using Coverity (static code analysis tool) on a large C++
project. I'm constantly amazed at the stuff it finds which is *so* obvious
after it's pointed out, that I just never saw before.

Expecting code paths in a nested if statement with nine conditions that are
not worth testing due to obvious triviality, is pretty hubristic.

There's a well known theory in studies of the human brain which says people
are capable of processing about 7 +/- 2 pieces of information at once.
With much handwaving, that could be applied to testing by saying most
people can think through 2 conditionals (4 paths), but it's pushing it when
you get to 3 conditionals (8 paths), and once you get to 4 or more, it's
probably hopeless to expect somebody to be able to fully understand all the
possible code paths.

Emanuele D'Arrigo · Nov 27, 2008

Refactor until your code is simple enough to unit-test effectively, then
unit-test effectively.

Ok, I've taken this wise suggestion on board and of course I found
immediately ways to improve the method. -However- this generates
another issue. I can fragment the code of the original method into one
public method and a few private support methods. But this doesn't
reduce the complexity of the testing because the number and complexity
of the possible path stays more or less the same. The solution to this
would be to test the individual methods separately, but is the only
way to test private methods in python to make them (temporarily) non
private? I guess ultimately this would only require the removal of the
appropriate double-underscores followed by method testing and then
adding the double-underscores back in place. There is no "cleaner"
way, is there?

Manu

Terry Reedy · Nov 27, 2008

Emanuele said:
Ok, I've taken this wise suggestion on board and of course I found
immediately ways to improve the method. -However- this generates
another issue. I can fragment the code of the original method into one
public method and a few private support methods. But this doesn't
reduce the complexity of the testing because the number and complexity
of the possible path stays more or less the same. The solution to this
would be to test the individual methods separately, but is the only
way to test private methods in python to make them (temporarily) non
private? I guess ultimately this would only require the removal of the
appropriate double-underscores followed by method testing and then
adding the double-underscores back in place. There is no "cleaner"
way, is there?

Use single underscore names to mark methods as private. The intent of
double-underscore mangling is to avoid name clashes in multiple
inheritance. If you insist on double underscores, use the mangled name
in your tests. 'Private' is not really private in Python.

IDLE 3.0rc31

Terry

bearophileHUGS · Nov 27, 2008

Emanuele D'Arrigo:

I can fragment the code of the original method into one public method and a few private support methods.<

Python also support nested functions, that you can put into your
method. The problem is that often unit test functions aren't able to
test nested functions.

A question for other people: Can Python change a little to allow
nested functions to be tested? I think this may solve some of my
problems.

Bye,
bearophile

Terry Reedy · Nov 27, 2008

Emanuele D'Arrigo:

Python also support nested functions, that you can put into your
method. The problem is that often unit test functions aren't able to
test nested functions.

A question for other people: Can Python change a little to allow
nested functions to be tested? I think this may solve some of my
problems.

The problem is that inner functions do not exist until the outer
function is called and the inner def is executed. And they cease to
exist when the outer function returns unless returned or associated with
a global name or collection.

A 'function' only needs to be nested if it is intended to be different
(different default or closure) for each execution of its def.

Benjamin · Nov 28, 2008

The problem is that inner functions do not exist until the outer
function is called and the inner def is executed. And they cease to
exist when the outer function returns unless returned or associated with
a global name or collection.

Of course, you could resort to terrible evil like this:

import types

def f():
def sub_function():
return 4

nested_function = types.FunctionType(f.func_code.co_consts[1], {})
print nested_function() # Prints 4

But don't do that!

Steven D'Aprano · Nov 28, 2008

A question for other people: Can Python change a little to allow nested
functions to be tested? I think this may solve some of my problems.

Remember that nested functions don't actually exist as functions until
the outer function is called, and when the outer function is called they
go out of scope and cease to exist. For this to change wouldn't be a
little change, it would be a large change.

I can see benefit to the idea. Unit testing, as you say. I also like the
idea of doing this:

def foo(x):
def bar(y):
return y+1
return x**2+bar(x)

a = foo.bar(7)

However you can get the same result (and arguably this is the Right Way
to do it) with a class:

class foo:
def bar(y):
return y+1
def __call__(self, x):
return x**2 + self.bar(x)
foo = foo()

Unfortunately, this second way can't take advantage of nested scopes in
the same way that functions can.

Steven D'Aprano · Nov 28, 2008

Ok, I've taken this wise suggestion on board and of course I found
immediately ways to improve the method. -However- this generates another
issue. I can fragment the code of the original method into one public
method and a few private support methods. But this doesn't reduce the
complexity of the testing because the number and complexity of the
possible path stays more or less the same. The solution to this would be
to test the individual methods separately,
Naturally.

but is the only way to test
private methods in python to make them (temporarily) non private? I
guess ultimately this would only require the removal of the appropriate
double-underscores followed by method testing and then adding the
double-underscores back in place. There is no "cleaner" way, is there?

"Private" methods in Python are only private by convention.

.... def _private(self): # private, don't touch
.... pass
.... def __reallyprivate(self): # name mangled
.... pass
....Traceback (most recent call last):
<unbound method Parrot.__reallyprivate>

In Python, the philosophy is "We're all adults here" and consequently if
people *really* want to access your private methods, they can. This is a
Feature, not a Bug

Consequently, I almost always use single-underscore "private by
convention" names, rather than double-underscore names. The name-mangling
is, in my experience, just a nuisance.

bearophileHUGS · Nov 28, 2008

Terry Reedy:

The problem is that inner functions do not exist until the outer function is called and the inner def is executed. And they cease to exist when the outer function returns unless returned or associated with a global name or collection.<
OK.

A 'function' only needs to be nested if it is intended to be different (different default or closure) for each execution of its def.<

Or maybe because you want to denote a logical nesting, or maybe
because you want to keep the outer namespace cleaner, etc etc.

-----------------------

Benjamin:

Of course, you could resort to terrible evil like this:<

My point was of course to ask about possible changes to CPython, so
you don't need evil hacks anymore.

-----------------------

Steven D'Aprano:

For this to change wouldn't be a little change, it would be a large change.<

I see, then my proposal has little hope, I presume. I'll have to keep
moving functions outside to test them and move them inside again when
I want to run them.

However you can get the same result (and arguably this is the Right Way to do it) with a class:<

Of course, that's the Right Way only for languages that support only a
strict form of the Object Oriented paradigm, like for example Java.

Thank you to all the people that have answered.

Bye,
bearophile

Steven D'Aprano · Nov 28, 2008

I see, then my proposal has little hope, I presume. I'll have to keep
moving functions outside to test them and move them inside again when I
want to run them.

Not so. You just need to prevent the nested functions from being garbage
collected.

def foo(x):
if not hasattr(foo, 'bar'):
def bar(y):
return x+y
def foobar(y):
return 3*y-y**2
foo.bar = bar
foo.foobar = foobar
return foo.bar(3)+foo.foobar(3)

However, note that there is a problem here: foo.bar will always use the
value of x as it were when it was called the first time, *not* the value
when you call it subsequently:
5

Because foo.foobar doesn't use x, it's safe to use; but foo.bar is a
closure and will always use the value of x as it was first used. This
could be a source of puzzling bugs.

There are many ways of dealing with this. Here is one:

def foo(x):
def bar(y):
return x+y
def foobar(y):
return 3*y-y**2
foo.bar = bar
foo.foobar = foobar
return bar(3)+foobar(3)

I leave it as an exercise for the reader to determine how the behaviour
of this is different to the first version.

Nigel Rantor · Nov 28, 2008

Roy said:
There's a well known theory in studies of the human brain which says people
are capable of processing about 7 +/- 2 pieces of information at once.

It's not about processing multiple taks, it's about the amount of things
that can be held in working memory.

n

Bruno Desthuilliers · Nov 28, 2008

Consequently, I almost always use single-underscore "private by
convention" names, rather than double-underscore names. The name-mangling
is, in my experience, just a nuisance.

s/just/most often than not/ and we'll agree on this !-)

Roy Smith · Nov 28, 2008

Nigel Rantor said:
It's not about processing multiple taks, it's about the amount of things
that can be held in working memory.

Yes, and that's what I'm talking about here. If there's N possible code
paths, can you think about all of them at the same time while you look at
your code and simultaneously understand them all? If I write:

if foo:
do_this()
else:
do_that()

it's easy to get a good mental picture of all the consequences of foo being
true or false. As the number of paths go up, it becomes harder to think of
them all as a coherent piece of code and you have to resort to examining
the paths sequentially.

Terry Reedy · Nov 28, 2008

Terry Reedy:

Or maybe because you want to denote a logical nesting, or maybe
because you want to keep the outer namespace cleaner, etc etc.

I was already aware of those *wants*, but they are not *needs*, in the
sense I meant. A single constant function does not *need* to be nested
and regenerated with each call.

A standard idiom, I think, is give the foo-associated function a private
foo-derived name such as _foo or _foo_bar. This keeps the public
namespace clean and denotes the logical nesting. I *really* would not
move things around after testing.

For the attribute approach, you could lobby for the so-far rejected

def foo(x):
return foo.bar(3*x)

def foo.bar(x):
return x*x

In the meanwhile...

def foo(x):
return foo.bar(3*x)

def _(x):
return x*x
foo.bar = _

Or write a decorator so you can write

@funcattr(foo. 'bar')
def _

(I don't see any way a function can delete a name in the namespace that
is going to become the class namespace, so that
@fucattr(foo)
def bar...
would work.)

Terry Jan Reedy

Fuzzyman · Nov 29, 2008

Ok, I've taken this wise suggestion on board and of course I found
immediately ways to improve the method. -However- this generates
another issue. I can fragment the code of the original method into one
public method and a few private support methods. But this doesn't
reduce the complexity of the testing because the number and complexity
of the possible path stays more or less the same. The solution to this
would be to test the individual methods separately, but is the only
way to test private methods in python to make them (temporarily) non
private? I guess ultimately this would only require the removal of the
appropriate double-underscores followed by method testing and then
adding the double-underscores back in place. There is no "cleaner"
way, is there?

Manu

Your experiences are one of the reasons that writing the tests *first*
can be so helpful. You think about the *behaviour* you want from your
units and you test for that behaviour - *then* you write the code
until the tests pass.

This means that your code is testable, but which usually means simpler
and better.

By the way, to reduce the number of independent code paths you need to
test you can use mocking. You only need to test the logic inside the
methods you create (testing behaviour), and not every possible
combination of paths.

Michael Foord

Emanuele D'Arrigo · Nov 29, 2008

Thank you to everybody who has replied about the original problem. I
eventually refactored the whole (monster) method over various smaller
and simpler ones and I'm now testing each individually. Things have
gotten much more tractable. =)

Thank you for nudging me in the right direction! =)

Manu

Code correctness, and testing strategies	20	May 24, 2008
word_set = set() def should_preceed_with_an(phrase): first_word =	1	Jan 26, 2013
[QUIZ] Testing DiGraph (#73)	18	Mar 31, 2006
[ANN] Multidimensional Array - MDArray 0.5.5	0	Nov 19, 2013
Incremental build systems, infamake	20	Jul 25, 2011
python-dev Summary for 2006-09-01 through 2006-09-15	0	Nov 3, 2006
python-dev Summary for 2005-06-16 through 2005-06-30	2	Jul 9, 2005
Difference between to_s and to_str	6	Apr 5, 2004

Exhaustive Unit Testing

Emanuele D'Arrigo

Steven D'Aprano

Emanuele D'Arrigo

Stefan Behnel

Roy Smith

Emanuele D'Arrigo

Terry Reedy

bearophileHUGS

Terry Reedy

Benjamin

Steven D'Aprano

Steven D'Aprano

bearophileHUGS

Steven D'Aprano

Nigel Rantor

Bruno Desthuilliers

Roy Smith

Terry Reedy

Fuzzyman

Emanuele D'Arrigo

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads