Test-driven development of random algorithms

Discussion in 'Python' started by Steven D'Aprano, Nov 14, 2006.

  1. I'm working on some functions that, essentially, return randomly generated
    strings. Here's a basic example:

    import random

    def rstr():
        """Return a random string based on a pseudo
        normally-distributed random number.
        """
        x = 0.0
        for i in range(12):
            x += random.random()
        return str(int(x) - 6)

    I want to do test-driven development. What should I do? Generally, any
    test I do of the form

    assert rstr() == '1'

    will fail more often than not (about 85% of the time, by my estimate). An
    easy workaround would be to do this:

    assert rstr() in [str(n) for n in range(-6, 6)]

    but (1) that doesn't scale very well (what if rstr() could return one of
    a billion different strings?) and (2) there could be bugs which only show
    up probabilistically, e.g. if I've got the algorithm wrong, rstr() might
    return '6' once in a while.

    Does anyone have generic advice for the testing and development of this
    sort of function?


    --
    Steven D'Aprano
     
    Steven D'Aprano, Nov 14, 2006
    #1

  2. Robert Kern

    Steven D'Aprano wrote:
    > I'm working on some functions that, essentially, return randomly generated
    > strings. Here's a basic example:
    >
    > import random
    >
    > def rstr():
    >     """Return a random string based on a pseudo
    >     normally-distributed random number.
    >     """
    >     x = 0.0
    >     for i in range(12):
    >         x += random.random()
    >     return str(int(x) - 6)
    >
    > I want to do test-driven development. What should I do? Generally, any
    > test I do of the form
    >
    > assert rstr() == '1'
    >
    > will fail more often than not (about 85% of the time, by my estimate). An
    > easy workaround would be to do this:
    >
    > assert rstr() in [str(n) for n in range(-6, 6)]
    >
    > but (1) that doesn't scale very well (what if rstr() could return one of
    > a billion different strings?) and (2) there could be bugs which only show
    > up probabilistically, e.g. if I've got the algorithm wrong, rstr() might
    > return '6' once in a while.
    >
    > Does anyone have generic advice for the testing and development of this
    > sort of function?


    "Design for Testability". In library code, never call the functions in the
    random module. Always take as an argument a random.Random instance. When
    testing, you can seed your own Random instance and all of your numbers will be
    the same for every test run.

    This kind of design is A Good Thing(TM) outside of unit tests, too. They aren't
    the only places where one might want to have full control over the sequence of
    random numbers.
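    A minimal sketch of that idea, applied to the rstr() from the original
    post (the parameter name and the seed value 42 are illustrative choices,
    not from this thread):

```python
# Sketch: pass a random.Random instance instead of using the module-level
# functions, so tests can control the sequence by seeding their own instance.
import random

def rstr(rng):
    """Return a random string based on a pseudo
    normally-distributed random number drawn from rng.
    """
    x = 0.0
    for i in range(12):
        x += rng.random()
    return str(int(x) - 6)

# In a test, seed your own Random so every run sees the same sequence:
first = rstr(random.Random(42))
# Re-seeding reproduces exactly the same result:
assert rstr(random.Random(42)) == first
```

    In production code you simply pass `random.Random()` (or a shared
    instance); the function itself never needs to know the difference.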

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Nov 14, 2006
    #2

  3. Ben Finney

    Robert Kern <> writes:

    > Steven D'Aprano wrote:
    > > Does anyone have generic advice for the testing and development of
    > > this sort of function?

    >
    > "Design for Testability". In library code, never call the functions
    > in the random module. Always take as an argument a random.Random
    > instance. When testing, you can seed your own Random instance and
    > all of your numbers will be the same for every test run.


    Even better, you can pass a stub Random instance (that fakes it) or a
    mock Random instance (that fakes it, and allows subsequent assertion
    that the client code used it as expected). This way, you can ensure
    that your fake Random actually gives a sequence of numbers designed to
    quickly cover the extremes and corner cases, as well as some normal
    cases.

    This applies to any externality (databases, file systems, input
    devices): in your unit tests, don't pass the real externality. Pass a
    convincing fake that will behave entirely predictably, but will
    nevertheless exercise the functionality needed for the unit tests.
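    A hedged sketch of the stub approach: a fake Random whose random()
    replays a canned sequence, so a test can steer rstr() straight to its
    extremes (StubRandom and the chosen values are illustrative, not from
    this thread):

```python
# Stub standing in for random.Random: replays a fixed sequence of values.
class StubRandom:
    def __init__(self, values):
        self._values = iter(values)

    def random(self):
        # Return the next canned value instead of real randomness.
        return next(self._values)

# The function under test, written to accept any object with a .random():
def rstr(rng):
    x = 0.0
    for i in range(12):
        x += rng.random()
    return str(int(x) - 6)

# Twelve zeros sum to 0.0, forcing the lower extreme:
assert rstr(StubRandom([0.0] * 12)) == '-6'
# Twelve values just under 1.0 force the upper extreme:
assert rstr(StubRandom([0.999999] * 12)) == '5'
```

    A mock would go one step further and also record how many times
    random() was called, so the test can assert the client drew exactly
    twelve numbers.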

    This is one of the main differences between unit tests and other
    kinds. With unit tests, you want each test to exercise as narrow a set
    of the client behaviour as feasible. This means eliminating anything
    else as a possible source of problems. With other tests -- stress
    tests, acceptance tests, and so on -- you want to exercise the entire
    application stack, or some significant chunk of it.

    --
    \ "It is the mark of an educated mind to be able to entertain a |
    `\ thought without accepting it." -- Aristotle |
    _o__) |
    Ben Finney
     
    Ben Finney, Nov 14, 2006
    #3
