Failing unittest Test cases


Scott David Daniels

There has been a bit of discussion about a way of providing test cases
in a test suite that _should_ work but don't. One of the rules has been
the test suite should be runnable and silent at every checkin. Recently
there was a checkin of a test that _should_ work but doesn't. The
discussion got around to means of indicating such tests (because the
effort of creating a test should be captured) without disturbing the
development flow.

The following code demonstrates a decorator that might be used to
aid this process. Any comments, additions, deletions?

from unittest import TestCase


class BrokenTest(TestCase.failureException):
    def __repr__(self):
        return '%s: %s: %s works now' % (
            (self.__class__.__name__,) + self.args)

def broken_test_XXX(reason, *exceptions):
    '''Indicates unsuccessful test cases that should succeed.
    If an exception kills the test, add exception type(s) in args.'''
    def wrapper(test_method):
        def replacement(*args, **kwargs):
            try:
                test_method(*args, **kwargs)
            except exceptions + (TestCase.failureException,):
                pass
            else:
                raise BrokenTest(test_method.__name__, reason)
        replacement.__doc__ = test_method.__doc__
        replacement.__name__ = 'XXX_' + test_method.__name__
        replacement.todo = reason
        return replacement
    return wrapper


You'd use it like:

class MyTestCase(unittest.TestCase):
    def test_one(self): ...
    def test_two(self): ...
    @broken_test_XXX("The thrumble doesn't yet gsnort")
    def test_three(self): ...
    @broken_test_XXX("Using list as dictionary", TypeError)
    def test_four(self): ...

It would also point out when the test started succeeding.

--Scott David Daniels
(e-mail address removed)
 

Eric

There has been a bit of discussion about a way of providing test cases
in a test suite that _should_ work but don't. One of the rules has been
the test suite should be runnable and silent at every checkin. Recently
there was a checkin of a test that _should_ work but doesn't. The
discussion got around to means of indicating such tests (because the
effort of creating a test should be captured) without disturbing the
development flow.

The following code demonstrates a decorator that might be used to
aid this process. Any comments, additions, deletions?

Interesting idea. I have been prepending 'f' to my test functions that
don't yet work, so they simply don't run at all. Then when I have time
to add new functionality, I grep for 'ftest' in the test suite.

- Eric
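
(For illustration, a minimal sketch of the convention Eric describes; the
class, method, and file names below are made up.)

import unittest

class WidgetTest(unittest.TestCase):
    def test_spin(self):                # collected: name starts with 'test'
        self.assertEqual(2 + 2, 4)
    def ftest_gsnort(self):             # parked: the default prefix is 'test',
        self.assertEqual(2 + 2, 5)      # so unittest never collects this one

# later, when there is time to add the functionality:
#   grep -n ftest test_widgets.py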
 

Frank Niessink

Scott said:
There has been a bit of discussion about a way of providing test cases
in a test suite that _should_ work but don't. One of the rules has been
the test suite should be runnable and silent at every checkin. Recently
there was a checkin of a test that _should_ work but doesn't. The
discussion got around to means of indicating such tests (because the
effort of creating a test should be captured) without disturbing the
development flow.

There is just one situation that I can think of where I would use this,
and that is the case where some underlying library has a bug. I would
add a test that succeeds when the bug is present and fails when the bug
is not present, i.e. it is repaired. That way you get a notification
automatically when a new version of the library no longer contains the
bug, so you know you can remove your workarounds for that bug. However,
I've never used a decorator or anything special for that because I never
felt the need for it, a regular testcase like this also works for me:

class SomeThirdPartyLibraryTest(unittest.TestCase):
    def testThirdPartyLibraryCannotComputeSquareOfZero(self):
        self.assertEqual(-1, tplibrary.square(0),
                         'They finally fixed that bug in tplibrary.square')

Doesn't it defeat the purpose of unit tests to give them an easy switch so
that programmers can turn them off whenever they want to?

Cheers, Frank
 

Paul Rubin

Scott David Daniels said:
Recently there was a checkin of a test that _should_ work but
doesn't. The discussion got around to means of indicating such
tests (because the effort of creating a test should be captured)
without disturbing the development flow.

Do you mean "shouldn't work but does"? Anyway I don't understand
the question. What's wrong with using assertRaises if you want to
check that a test raises a particular exception?
 

Peter Otten

Scott said:
There has been a bit of discussion about a way of providing test cases
in a test suite that should work but don't.  One of the rules has been
the test suite should be runnable and silent at every checkin.  Recently
there was a checkin of a test that should work but doesn't.  The
discussion got around to means of indicating such tests (because the
effort of creating a test should be captured) without disturbing the
development flow.

The following code demonstrates a decorator that might be used to
aid this process.  Any comments, additions, deletions?

Marking a unittest as "should fail" in the test suite seems just wrong to
me, whatever the implementation details may be. If at all, I would apply an
"I know these tests fail, don't bother me with the messages for now"
filter further down the chain, in the TestRunner maybe. Perhaps the code
for platform-specific failures could be generalized?

Peter
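
(A rough sketch of the kind of "further down the chain" filter Peter suggests;
the KNOWN_BROKEN list, the test ids, and the class name are all invented, and
a real version would probably be wired into a custom TestRunner.)

import unittest

# Tests known to fail, kept out of the test source itself.
KNOWN_BROKEN = set([
    'mytests.MyTestCase.test_three',    # hypothetical test ids
    'mytests.MyTestCase.test_four',
])

class FilteringTestResult(unittest.TestResult):
    '''Swallows failures from tests that are known to be broken.'''
    def addFailure(self, test, err):
        if test.id() in KNOWN_BROKEN:
            return                      # known breakage: keep the run silent
        unittest.TestResult.addFailure(self, test, err)

# usage: suite.run(FilteringTestResult())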
 

Fredrik Lundh

Paul said:
Do you mean "shouldn't work but does"?

no, he means exactly what he said: support for "expected failures"
makes it possible to add test cases for open bugs to the test suite,
without 1) new bugs getting lost in the noise, and 2) having to rewrite
the test once you've gotten around to fixing the bug.
Anyway I don't understand the question.

it's a process thing. tests for confirmed bugs should live in the test
suite, not in the bug tracker. as scott wrote, "the effort of creating
a test should be captured".

(it's also one of those things where people who have used this in
real life find it hard to believe that others don't even want to
understand why it's a good thing; similar to indentation-based structure,
static typing, not treating characters as bytes, etc).

</F>
 

Duncan Booth

Scott said:
There has been a bit of discussion about a way of providing test cases
in a test suite that _should_ work but don't. One of the rules has been
the test suite should be runnable and silent at every checkin. Recently
there was a checkin of a test that _should_ work but doesn't. The
discussion got around to means of indicating such tests (because the
effort of creating a test should be captured) without disturbing the
development flow.

I like the concept. It would be useful when someone raises an issue which
can be tested for easily but for which the fix is non-trivial (or has side
effects) so the issue gets shelved. With this decorator you can add the
failing unit test and then 6 months later when an apparently unrelated bug
fix actually also fixes the original one you get told 'The thrumble doesn't
yet gsnort (see issue 1234)' and know you should now go and update that
issue.

It also means you have scope in an open source project to accept an issue
and incorporate a failing unit test for it before there is an acceptable
patch. This shifts the act of accepting a bug from putting it onto some
nebulous list across to actually recognising in the code that there is a
problem. Having a record of the failing issues actually in the code would
also help to tie together bug fixes across different development branches.

Possible enhancements:

add another argument for the associated issue tracker ID (I know you could
put it in the string, but a separate argument would encourage the programmer
to realise that every broken test should have an associated tracker entry),
although, since some unbroken tests will also have associated issues, I
suppose this might just be a separate decorator.

add some easyish way to generate a report of broken tests.
 

Paul Rubin

Fredrik Lundh said:
no, he means exactly what he said: support for "expected failures"
makes it possible to add test cases for open bugs to the test suite,
without 1) new bugs getting lost in the noise, and 2) having to rewrite
the test once you've gotten around to fixing the bug.

Oh I see, good idea. But in that case maybe the decorator shouldn't
be attached to the test like that. Rather, the test failures should
be filtered in the test runner as someone suggested, or the filtering
could even be integrated with the bug database somehow.
 

Duncan Booth

Peter said:
Marking a unittest as "should fail" in the test suite seems just wrong
to me, whatever the implementation details may be. If at all, I would
apply a "I know these tests to fail, don't bother me with the messages
for now" filter further down the chain, in the TestRunner maybe.
Perhaps the code for platform-specific failures could be generalized?

It isn't marking the test as "should fail" it is marking it as "should
pass, but currently doesn't" which is a very different thing.
 

Fredrik Lundh

Paul said:
Oh I see, good idea. But in that case maybe the decorator shouldn't
be attached to the test like that. Rather, the test failures should
be filtered in the test runner as someone suggested, or the filtering
could even be integrated with the bug database somehow.

separate filter lists or connections between the bug database and the
code base introduce unnecessary coupling, and complicate things
for the developers (increased risk of checkin conflicts, mismatches
between the code in a developer's sandbox and the "official" bug status,
etc).

this is Python; annotations belong in the annotated code, not in some
external resource.

</F>
 

Peter Otten

Duncan said:
It isn't marking the test as "should fail" it is marking it as "should
pass, but currently doesn't" which is a very different thing.

You're right of course. I still think the "currently doesn't pass" marker
doesn't belong in the test source.

Peter
 

Michele Simionato

Scott David Daniels about marking expected failures:

<snip>

I am +1, I have wanted this feature for a long time. FWIW,
I am also +1 to run the tests in the code order.

Michele Simionato
 

Roy Smith

Peter Otten said:
You're right of course. I still think the "currently doesn't pass" marker
doesn't belong in the test source.

The agile people would say that if a test doesn't pass, you make fixing it
your top priority. In an environment like that, there's no such thing as a
test that "currently doesn't pass". But, real life is not so kind.

These days, I'm working on a largish system (I honestly don't know how many
lines of code, but full builds take about 8 hours). A fairly common
scenario is some unit test fails in a high level part of the system, and we
track it down to a problem in one of the lower levels. It's a different
group that maintains that bit of code. We understand the problem, and know
we're going to fix it before the next release, but that's not going to
happen today, or tomorrow, or maybe even next week.

So, what do you do? The world can't come to a screeching halt for the next
couple of weeks while we're waiting for the other group to fix the problem.
What we typically do is just comment out the offending unit test. If the
developer who does that is on the ball, a PR (problem report) gets opened
too, to track the need to re-instate the test, but sometimes that doesn't
happen. A better solution would be a way to mark the test "known to fail
because of xyz". That way it continues to show up on every build report
(so it's not forgotten about), but doesn't break the build.
 

skip

Michele> I am also +1 to run the tests in the code order.

Got any ideas how that is to be accomplished short of jiggering the names so
they sort in the order you want them to run?

Skip
 

Paul Rubin

Got any ideas how that is to be accomplished short of jiggering the
names so they sort in the order you want them to run?

How about with a decorator instead of the testFuncName convention,
i.e. instead of

def testJiggle():    # "test" in the func name means it's a test case
    ...

use:

@test
def jiggletest():    # nothing special about the name "jiggletest"
    ...

The hack of searching the module for functions with special names was
always a big kludge and now that Python has decorators, that seems
like a cleaner way to do it.

In the above example, the 'test' decorator would register the
decorated function with the test framework, say by appending it to a
list. That would make it trivial to run them in code order.
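
(A bare-bones sketch of that registration idea; the TESTS registry and
run_all helper are invented for illustration, not part of any framework.)

TESTS = []                      # filled in definition order, i.e. code order

def test(func):
    '''Register func as a test case; its name no longer matters.'''
    TESTS.append(func)
    return func

@test
def jiggletest():
    assert [1, 2, 3][::-1] == [3, 2, 1]

def run_all():
    for func in TESTS:          # iterate in code order, no name sorting
        func()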
 

Michele Simionato

Michele> I am also +1 to run the tests in the code order.

Got any ideas how that is to be accomplished short of jiggering the names so
they sort in the order you want them to run?

Skip

Well, it could be done with a decorator, but unittest is already
cumbersome as it is; I would not touch it. Instead, I would vote for
py.test in the standard library.

Michele Simionato
 

Scott David Daniels

Duncan said:
> ... Possible enhancements:
> add another argument for associated issue tracker id ... some unbroken tests
> will also have associated issues this might just be a separate decorator.

This is probably easier to do as a separate decoration which would have to
precede the "failing test" decoration:
def tracker(identifier):
    def markup(function):
        function.tracker = identifier
        return function
    return markup
> add some easyish way to generate a report of broken tests.

Here's a generator for all the "marked broken" tests in a module:

import types, unittest

def marked_broken(module):
    for class_name in dir(module):
        class_ = getattr(module, class_name)
        if (isinstance(class_, (type, types.ClassType)) and
                issubclass(class_, unittest.TestCase)):
            for test_name in dir(class_):
                if test_name.startswith('test'):
                    test = getattr(class_, test_name)
                    if (hasattr(test, '__name__') and
                            test.__name__.startswith('XXX_')):
                        yield class_name, test_name, test.todo


You could even use it like this:

import sys
import mytests

for module_name, module in sys.modules.iteritems():
    last_class = ''
    for class_name, test_name, reason in marked_broken(module):
        if module_name:
            print 'In module %s:' % module_name
            module_name = ''
        if last_class != class_name:
            print 'class', class_name
            last_class = class_name
        print '    %s\t %s' % (test_name, reason)


Thanks for the thoughtful feedback.

--Scott David Daniels
(e-mail address removed)
 

Bengt Richter

You're right of course. I still think the "currently doesn't pass" marker
doesn't belong in the test source.
Perhaps in a config file that can specify special conditions for running
identified tests? E.g., don't run; or run and report a changed result
(warn/fail/info, e.g. compared to a cached result); or run and report if it
passes, etc.

Then if a code change unexpectedly makes a test work, the config file can just
be updated, not the test.

Regards,
Bengt Richter
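
(One possible shape for such a config file, purely as a sketch; the file name,
format, and dispositions are invented.)

# broken_tests.cfg -- one test id and disposition per line, e.g.:
#   mytests.MyTestCase.test_three  skip
#   mytests.MyTestCase.test_four   warn

def load_test_config(path='broken_tests.cfg'):
    '''Map test id -> disposition ('skip', 'warn', ...).'''
    config = {}
    for line in open(path):
        line = line.strip()
        if line and not line.startswith('#'):
            test_id, disposition = line.split()
            config[test_id] = disposition
    return config

# A test runner could consult config.get(test.id()) to decide whether to run
# a test at all, or merely to note when it unexpectedly passes.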
 

Scott David Daniels

OK I took the code I offered here (tweaked in reaction to some
comments) and put up a recipe on the Python Cookbook. I'll allow
a week or so for more comment, and then possibly pursue adding this
to unittest.

Here is where the recipe is, for those who want to comment further (in
either that forum or this one):
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/466288

--Scott David Daniels
(e-mail address removed)
 
