Iterating over test data in unit tests

Discussion in 'Python' started by Ben Finney, Dec 6, 2005.

  1. Ben Finney

    Ben Finney Guest

    Howdy all,

    Summary: I'm looking for idioms in unit tests for factoring out
    repetitive iteration over test data. I explain my current practice,
    and why it's unsatisfactory.


    When following test-driven development, writing tests and then coding
    to satisfy them, I'll start with some of the simple tests for a class.

    import unittest

    import bowling # Module to be tested

    class Test_Frame(unittest.TestCase):

        def test_instantiate(self):
            """ Frame instance should be created """
            instance = bowling.Frame()
            self.failUnless(instance)

    class Test_Game(unittest.TestCase):

        def test_instantiate(self):
            """ Game instance should be created """
            instance = bowling.Game()
            self.failUnless(instance)

    As I add tests for more interesting functionality, they become more
    data dependent.

    class Test_Game(unittest.TestCase):

        # ...

        def test_one_throw(self):
            """ Single throw should result in expected score """
            game = bowling.Game()
            throw = 5
            game.add_throw(throw)
            self.failUnlessEqual(throw, game.get_score())

        def test_three_throws(self):
            """ Three throws should result in expected score """
            game = bowling.Game()
            throws = (5, 7, 4)
            game.add_throw(throws[0])
            game.add_throw(throws[1])
            game.add_throw(throws[2])
            self.failUnlessEqual(sum(throws), game.get_score())

    This cries out, of course, for a test fixture to set up instances.

    class Test_Game(unittest.TestCase):

        def setUp(self):
            """ Set up test fixtures """
            self.game = bowling.Game()

        def test_one_throw(self):
            """ Single throw should result in expected score """
            throw = 5
            score = 5
            self.game.add_throw(throw)
            self.failUnlessEqual(score, self.game.get_score())

        def test_three_throws(self):
            """ Three throws should result in expected score """
            throws = [5, 7, 4]
            score = sum(throws)
            for throw in throws:
                self.game.add_throw(throw)
            self.failUnlessEqual(score, self.game.get_score())

        def test_strike(self):
            """ Strike should add the following two throws """
            throws = [10, 7, 4, 7]
            score = 39
            for throw in throws:
                self.game.add_throw(throw)
            self.failUnlessEqual(score, self.game.get_score())

    So far, this is just following what I see to be common practice for
    setting up *instances* to test.
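
    The expected score of 39 in test_strike follows from the strike rule: the strike frame counts 10 plus the next two throws (10 + 7 + 4 = 21), the second frame counts 7 + 4 = 11, and the lone 7 opens a third frame, giving 21 + 11 + 7 = 39. The thread never shows the bowling module, so the following is only a sketch assuming standard ten-pin scoring; score_throws is a hypothetical helper, not the author's implementation.

```python
def score_throws(throws):
    """Score a (possibly unfinished) ten-pin game from a flat list of throws.

    Hypothetical helper for illustration only; the thread does not show
    how bowling.Game computes its score.
    """
    total = 0
    i = 0
    frame = 1
    while i < len(throws) and frame <= 10:
        if throws[i] == 10:
            # Strike: 10 plus the next two throws, as far as they exist.
            total += sum(throws[i:i + 3])
            i += 1
        elif i + 1 < len(throws) and throws[i] + throws[i + 1] == 10:
            # Spare: 10 plus the next throw, if it exists.
            total += sum(throws[i:i + 3])
            i += 2
        else:
            # Open or unfinished frame: just the pins knocked down.
            total += sum(throws[i:i + 2])
            i += 2
        frame += 1
    return total
```

    With this sketch, score_throws([10, 7, 4, 7]) gives 39 and score_throws([5, 7, 4]) gives 16, matching the expectations in the tests above.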

    But the repetition of the test *inputs* also cries out to me to be
    refactored. I see less commonality in doing this.

    My initial instinct is just to put it in the fixtures.

    class Test_Game(unittest.TestCase):

        def setUp(self):
            """ Set up test fixtures """
            self.game = bowling.Game()

            self.game_data = {
                'one': dict(score=5, throws=[5]),
                'three': dict(score=17, throws=[5, 7, 5]),
                'strike': dict(score=39, throws=[10, 7, 5, 7]),
            }

        def test_one_throw(self):
            """ Single throw should result in expected score """
            throws = self.game_data['one']['throws']
            score = self.game_data['one']['score']
            for throw in throws:
                self.game.add_throw(throw)
            self.failUnlessEqual(score, self.game.get_score())

        def test_three_throws(self):
            """ Three throws should result in expected score """
            throws = self.game_data['three']['throws']
            score = self.game_data['three']['score']
            for throw in throws:
                self.game.add_throw(throw)
            self.failUnlessEqual(score, self.game.get_score())

        def test_strike(self):
            """ Strike should add the following two throws """
            throws = self.game_data['strike']['throws']
            score = self.game_data['strike']['score']
            for throw in throws:
                self.game.add_throw(throw)
            self.failUnlessEqual(score, self.game.get_score())

    But this now means that the test functions are almost identical,
    except for choosing one data set or another. Maybe that means I need
    to have a single test:

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        for dataset in self.game_data.values():
            score = dataset['score']
            for throw in dataset['throws']:
                self.game.add_throw(throw)
            self.failUnlessEqual(score, self.game.get_score())

    Whoops, now I'm re-using a fixture instance. Maybe I need an instance
    of the class for each test case.

    def setUp(self):
        """ Set up test fixtures """
        self.game_data = {
            'one': dict(score=5, throws=[5]),
            'three': dict(score=17, throws=[5, 7, 5]),
            'strike': dict(score=39, throws=[10, 7, 5, 7]),
        }

        self.game_params = {}
        for key, dataset in self.game_data.items():
            params = {}
            instance = bowling.Game()
            params['instance'] = instance
            params['dataset'] = dataset
            self.game_params[key] = params

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        for params in self.game_params.values():
            score = params['dataset']['score']
            instance = params['instance']
            for throw in params['dataset']['throws']:
                instance.add_throw(throw)
            self.failUnlessEqual(score, instance.get_score())

    Good, now the tests for different sets of throws are in a dictionary
    that's easy to add to. Of course, now I need to actually know which
    one is failing.

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        for key, params in self.game_params.items():
            score = params['dataset']['score']
            instance = params['instance']
            for throw in params['dataset']['throws']:
                instance.add_throw(throw)
            self.failUnlessEqual(score, instance.get_score(),
                msg="Score mismatch for set '%s'" % key
            )

    It works. It's rather confusing though, since the actual test --
    iterate over the throws and check the score -- is in the midst of the
    iteration over data sets.
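
    One way to keep the check readable while still learning which data set failed is unittest's subTest context manager. It was added in Python 3.4, so it was not an option when this thread was written. The sketch below is only an illustration: Game is a hypothetical stand-in for bowling.Game that simply sums the throws (no strike or spare bonuses), and TestGameScore and its data are made up for the example.

```python
import unittest


class Game:
    """Hypothetical stand-in for bowling.Game: sums throws, no bonuses."""

    def __init__(self):
        self.throws = []

    def add_throw(self, pins):
        self.throws.append(pins)

    def get_score(self):
        return sum(self.throws)


class TestGameScore(unittest.TestCase):

    game_data = {
        'none': dict(score=0, throws=[]),
        'one': dict(score=5, throws=[5]),
        'three': dict(score=14, throws=[5, 4, 5]),
    }

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        for key, dataset in self.game_data.items():
            # Each data set is reported separately, labelled with its key,
            # and a failure in one does not stop the remaining ones.
            with self.subTest(dataset=key):
                game = Game()  # fresh instance per data set
                for throw in dataset['throws']:
                    game.add_throw(throw)
                self.assertEqual(dataset['score'], game.get_score())
```

    Run it with python -m unittest; a failing data set shows up in the report with its dataset key, and the loop carries on with the remaining sets.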

    Also, that's just *one* type of test I might need to do. Must I then
    repeat all that iteration code for other tests I want to do on the
    same data?

    Maybe I need to factor out the iteration into a generic iteration
    function, taking the actual test as a function object. That way, the
    dataset iterator doesn't need to know about the test function, and
    vice versa.

    def iterate_test(self, test_func, test_params=None):
        """ Iterate a test function for all the sets """
        if not test_params:
            test_params = self.game_params
        for key, params in test_params.items():
            dataset = params['dataset']
            instance = params['instance']
            test_func(key, dataset, instance)

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        def test_func(key, dataset, instance):
            score = dataset['score']
            for throw in dataset['throws']:
                instance.add_throw(throw)
            self.failUnlessEqual(score, instance.get_score())

        self.iterate_test(test_func)

    That's somewhat clearer; the test function actually focuses on what
    it's testing. Those layers of indirection are annoying, but they allow
    the data sets to grow without writing more code to handle them.


    Testing a rules-based system involves lots of data sets, and each data
    set represents a separate test case; but the code for each of those
    test cases is mindlessly repetitive. Factoring them out seems like it
    needs a lot of indirection, and seems to make each test harder to
    read. Different *types* of tests would need multiple iterators, more
    complex test parameter dicts, or some more indirection. Those all
    sound ugly, but so does repetitively coding every test function
    whenever some new data needs to be tested.

    How should this be resolved?

    --
    \ "I never forget a face, but in your case I'll be glad to make |
    `\ an exception." -- Groucho Marx |
    _o__) |
    Ben Finney
    Ben Finney, Dec 6, 2005
    #1

  2. Ben Finney wrote:
    > Summary: I'm looking for idioms in unit tests for factoring out
    > repetitive iteration over test data....


    How about something like:

    > import unittest, bowling
    > class Test_Game(unittest.TestCase):
    >     def setUp(self):
    >         """ Set up test fixtures """
    >         self.game = bowling.Game()
    >

          def runs(self, throws):
              """Run a series of scores and return the result"""
              for throw in throws:
                  self.game.add_throw(throw)
              return self.game.get_score()

    >     def test_one_throw(self):
    >         """ Single throw should result in expected score """

              self.assertEqual(5, self.runs([5]))

    >     def test_three_throws(self):
    >         """ Three throws should result in expected score """

              self.assertEqual(5 + 7 + 4, self.runs([5, 7, 4]))

    >     def test_strike(self):
    >         """ Strike should add the following two throws """

              self.assertEqual(39, self.runs([10, 7, 4, 7]))


    There is no reason you cannot write support functions.

    --
    -Scott David Daniels
    Scott David Daniels, Dec 6, 2005
    #2

  3. Ben Finney

    Ben Finney Guest

    Scott David Daniels wrote:
    > Ben Finney wrote:
    > > Summary: I'm looking for idioms in unit tests for factoring out
    > > repetitive iteration over test data....

    >
    > How about something like:
    >
    > > class Test_Game(unittest.TestCase):

    [...]
    > def runs(self, throws):
    > """Run a series of scores and return the result"""

    [...]
    > > def test_one_throw(self):
    > > """ Single throw should result in expected score """

    > self.assertEqual(5, self.runs([5]))
    >
    > > def test_three_throws(self):
    > > """ Three throws should result in expected score """

    > self.assertEqual(5 + 7 + 4, self.runs([5, 7, 4]))
    >
    > > def test_strike(self):
    > > """ Strike should add the following two throws """

    > self.assertEqual(39, self.runs([10, 7, 4, 7]))


    Yes, I'm quite happy that I can factor out iteration *within* a single
    data set. That leaves a whole lot of test cases identical except for
    the data they use.

    The question remains: how can I factor out iteration of *separate test
    cases*, where the test cases are differentiated only by the data they
    use? I know at least one way: I wrote about it in my (long) original
    post. How else can I do it, with less ugliness?

    --
    \ "I went to a garage sale. 'How much for the garage?' 'It's not |
    `\ for sale.'" -- Steven Wright |
    _o__) |
    Ben Finney
    Ben Finney, Dec 6, 2005
    #3
  4. Bengt Richter, Dec 6, 2005
    #4
  5. Ben Finney wrote:
    > Maybe I need to factor out the iteration into a generic iteration
    > function, taking the actual test as a function object. That way, the
    > dataset iterator doesn't need to know about the test function, and
    > vice versa.
    >
    > def iterate_test(self, test_func, test_params=None):
    >     """ Iterate a test function for all the sets """
    >     if not test_params:
    >         test_params = self.game_params
    >     for key, params in test_params.items():
    >         dataset = params['dataset']
    >         instance = params['instance']
    >         test_func(key, dataset, instance)
    >
    > def test_score_throws(self):
    >     """ Game score should be calculated from throws """
    >     def test_func(key, dataset, instance):
    >         score = dataset['score']
    >         for throw in dataset['throws']:
    >             instance.add_throw(throw)
    >         self.failUnlessEqual(score, instance.get_score())
    >
    >     self.iterate_test(test_func)
    >
    > That's somewhat clearer; the test function actually focuses on what
    > it's testing. Those layers of indirection are annoying, but they allow
    > the data sets to grow without writing more code to handle them.


    Don't know if this helps, but I'd be more likely to write this as
    something like (untested)::

    def get_tests(self, test_params=None):
        """ Yield the test parameters for all the sets """
        if not test_params:
            test_params = self.game_params
        for key, params in test_params.items():
            dataset = params['dataset']
            instance = params['instance']
            yield key, dataset, instance

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        for key, dataset, instance in self.get_tests():
            score = dataset['score']
            for throw in dataset['throws']:
                instance.add_throw(throw)
            self.failUnlessEqual(score, instance.get_score())

    That is, make an iterator over the various test information, and just put
    your "test_func" code inside a for-loop.
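
    A variation on the same idea, offered as a sketch rather than as what either poster wrote: the generator can also create the instance and apply the throws, so each test body shrinks to its assertion. Game is a hypothetical stand-in for bowling.Game that only sums throws, and played_games, test_throws_recorded and the throws attribute are names invented for this example.

```python
import unittest


class Game:
    """Hypothetical stand-in for bowling.Game: sums throws, no bonuses."""

    def __init__(self):
        self.throws = []

    def add_throw(self, pins):
        self.throws.append(pins)

    def get_score(self):
        return sum(self.throws)


class TestGame(unittest.TestCase):

    game_data = {
        'none': dict(score=0, throws=[]),
        'one': dict(score=5, throws=[5]),
        'three': dict(score=14, throws=[5, 4, 5]),
    }

    def played_games(self):
        """ Yield (key, dataset, game) with the throws already applied """
        for key, dataset in self.game_data.items():
            game = Game()  # new instance per data set, nothing is shared
            for throw in dataset['throws']:
                game.add_throw(throw)
            yield key, dataset, game

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        for key, dataset, game in self.played_games():
            self.assertEqual(dataset['score'], game.get_score(),
                             msg="Score mismatch for set '%s'" % key)

    def test_throws_recorded(self):
        """ Game should remember every throw in order """
        for key, dataset, game in self.played_games():
            self.assertEqual(dataset['throws'], game.throws,
                             msg="Throws mismatch for set '%s'" % key)
```

    Because the generator builds a new Game on every iteration, no dictionary of pre-built instances is needed in setUp.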

    STeVe
    Steven Bethard, Dec 6, 2005
    #5
  6. Ben Finney

    Ben Finney Guest

    Ben Finney wrote:
    > Summary: I'm looking for idioms in unit tests for factoring out
    > repetitive iteration over test data.


    Thanks to those who've offered suggestions, especially those who
    suggested I look at generator functions. This leads to::

    import unittest

    import bowling # Module to be tested

    class Test_Game(unittest.TestCase):
        """ Test case for the Game class """

        def setUp(self):
            """ Set up test fixtures """
            self.game_data = {
                'none': dict(score=0, throws=[], frame=1),
                'one': dict(score=5, throws=[5], frame=1),
                'two': dict(score=9, throws=[5, 4], frame=2),
                'three': dict(score=14, throws=[5, 4, 5], frame=2),
                'strike': dict(score=26, throws=[10, 4, 5, 7], frame=3),
            }

            self.game_params = {}
            for key, dataset in self.game_data.items():
                params = {}
                instance = bowling.Game()
                params['instance'] = instance
                params['dataset'] = dataset
                self.game_params[key] = params

        def iterate_params(self, test_params=None):
            """ Yield the test parameters """
            if not test_params:
                test_params = self.game_params
            for key, params in test_params.items():
                dataset = params['dataset']
                instance = params['instance']
                yield key, dataset, instance

        def test_score_throws(self):
            """ Game score should be calculated from throws """
            for key, dataset, instance in self.iterate_params():
                score = dataset['score']
                for throw in dataset['throws']:
                    instance.add_throw(throw)
                self.failUnlessEqual(score, instance.get_score())

        def test_current_frame(self):
            """ Current frame should be as expected """
            for key, dataset, instance in self.iterate_params():
                frame = dataset['frame']
                for throw in dataset['throws']:
                    instance.add_throw(throw)
                self.failUnlessEqual(frame, instance.current_frame)

    That's much better. Each test is now clearly about looping through the
    datasets, but the infrastructure to do so is factored out. Adding a
    test case modelled on the existing cases just means adding a new entry
    to the game_data dictionary. Setting up a different kind of test --
    e.g. for invalid game data -- just means setting up a new params
    dictionary and feeding that to the same generator function.
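
    The last point, reusing the generator for a different kind of test, might look roughly like this. Everything here is assumed for illustration: the thread does not say how bowling.Game treats bad input, so the stand-in Game below raises ValueError for pin counts outside 0..10, and iterate_params is simplified to a module-level function that builds a fresh instance per data set.

```python
import unittest


class Game:
    """Hypothetical stand-in for bowling.Game (not the thread's module)."""

    def __init__(self):
        self.throws = []

    def add_throw(self, pins):
        # Assumed behaviour: reject anything that is not an int in 0..10.
        if not isinstance(pins, int) or not 0 <= pins <= 10:
            raise ValueError("invalid pin count: %r" % (pins,))
        self.throws.append(pins)

    def get_score(self):
        return sum(self.throws)


def iterate_params(test_params):
    """ Yield (key, dataset, instance) with a fresh Game per data set """
    for key, dataset in test_params.items():
        yield key, dataset, Game()


class TestGameInvalid(unittest.TestCase):
    """ Test case for invalid game data """

    def setUp(self):
        """ Set up test fixtures """
        self.invalid_data = {
            'negative': dict(throws=[-1]),
            'too_many_pins': dict(throws=[5, 11]),
            'not_a_number': dict(throws=['x']),
        }

    def test_invalid_throw_rejected(self):
        """ Invalid throws should raise ValueError """
        for key, dataset, instance in iterate_params(self.invalid_data):
            with self.assertRaises(ValueError, msg="set '%s'" % key):
                for throw in dataset['throws']:
                    instance.add_throw(throw)
```

    The test body only states what should happen for each data set; adding another invalid case means adding one more entry to invalid_data.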

    I like it. Can it be improved? Are there readability problems that can
    be fixed? Is the test fixture setup too complex? Should the iterator
    become even more general, and be refactored out to a test framework
    for the project?

    --
    \ "Those who can make you believe absurdities can make you commit |
    `\ atrocities." -- Voltaire |
    _o__) |
    Ben Finney
    Ben Finney, Dec 6, 2005
    #6
  7. Ben Finney wrote:
    > Ben Finney wrote:
    >> Summary: I'm looking for idioms in unit tests for factoring out
    >> repetitive iteration over test data.

    >
    > Thanks to those who've offered suggestions, especially those who
    > suggested I look at generator functions. This leads to::


    Here's another way (each test should independently test one feature):

    class Test_Game(unittest.TestCase):
        """ Test case for the Game class """
        score = 0
        throws = []
        frame = 1

        def setUp(self):
            """ Set up test fixtures """

            self.game = bowling.Game()

        def test_score_throws(self):
            """ Game score should be calculated from throws """
            for throw in self.throws:
                self.game.add_throw(throw)
            self.assertEqual(self.score, self.game.get_score())

        def test_current_frame(self):
            """ Current frame should be as expected """
            for throw in self.throws:
                self.game.add_throw(throw)
            self.assertEqual(self.frame, self.game.current_frame)

    class Test_one(Test_Game):
        score = 5
        throws = [5]
        frame = 1

    class Test_two(Test_Game):
        score = 9
        throws = [5, 4]
        frame = 2

    class Test_three(Test_Game):
        score = 14
        throws = [5, 4, 5]
        frame = 2

    class Test_strike(Test_Game):
        score = 26
        throws = [10, 4, 5, 7]
        frame = 3
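
    If writing one subclass per data set by hand becomes tedious, the subclasses could in principle be generated from a dictionary with the built-in type(). This is a sketch of that idea, not something proposed in the thread: Game is a hypothetical stand-in for bowling.Game that only sums throws, and the game_data dictionary and the generated class names are assumptions for the example.

```python
import unittest


class Game:
    """Hypothetical stand-in for bowling.Game: sums throws, no bonuses."""

    def __init__(self):
        self.throws = []

    def add_throw(self, pins):
        self.throws.append(pins)

    def get_score(self):
        return sum(self.throws)


class Test_Game(unittest.TestCase):
    """ Base case: an empty game; subclasses override the data """
    score = 0
    throws = []

    def setUp(self):
        """ Set up test fixtures """
        self.game = Game()

    def test_score_throws(self):
        """ Game score should be calculated from throws """
        for throw in self.throws:
            self.game.add_throw(throw)
        self.assertEqual(self.score, self.game.get_score())


game_data = {
    'one': dict(score=5, throws=[5]),
    'two': dict(score=9, throws=[5, 4]),
    'three': dict(score=14, throws=[5, 4, 5]),
}

# Create Test_one, Test_two, ... and bind them at module level so that
# unittest's default loader finds them like hand-written classes.
for key, attrs in game_data.items():
    name = 'Test_%s' % key
    globals()[name] = type(name, (Test_Game,), dict(attrs))
```

    Each generated class inherits every test method from Test_Game, so a new test method is applied to all data sets automatically, and a failure is reported under the class name that identifies the data set.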

    --Scott David Daniels
    Scott David Daniels, Dec 6, 2005
    #7
