Probabilistic BDD?

Discussion in 'Ruby' started by Robert Feldt, Nov 1, 2006.

  1. Robert Feldt

    Robert Feldt Guest

    Hi,

    I'm playing around with BDD à la test/spec and found that I need to
    specify properties probabilistically, i.e. saying that they are
    likely/unlikely. Has there been any previous work along these lines?

    Should we add something like this to test/spec (Christian, are you
    listening? :))

    diff -rN old-testspec/lib/test/spec.rb new-testspec/lib/test/spec.rb
    286a287,300
    > def unlikely(specname, probability = 0.01, preName = "unlikely", &block)
    >   specify(preName + " " + specname) do
    >     count = 0
    >     num_repetitions = 100
    >     num_repetitions.times {count += 1 if (block.call == true)}
    >     actual_probability = count.to_f / num_repetitions
    >     assert actual_probability <= probability, "Expected probability of #{probability} but was #{actual_probability} (#{count} in #{num_repetitions} repetitions)"
    >   end
    > end
    >
    > def likely(specname, probability = 0.99, &block)
    >   unlikely(specname, 1.0 - probability, "likely") {!block.call}
    > end
    >

    so that one can write specs like

    > # Example of probabilistic specifications
    >
    > context "random generation" do
    >   unlikely "that two consecutive calls to rand gives the same value" do
    >     rand(1000) == rand(1000)
    >   end
    >
    >   likely "that two consecutive calls to rand gives different values" do
    >     rand(1000) != rand(1000)
    >   end
    > end


    ?

    This should probably be generalized so that the number of repetitions
    to run depends on the probability of the event, but I think you get the
    idea.
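
    Something along these lines, perhaps (just a sketch; the scaling rule,
    floor and ceiling below are picked arbitrarily):

    # Sketch only: run enough repetitions to expect a handful of "hits" at
    # the given probability, with an arbitrary floor and ceiling.
    def repetitions_for(probability, expected_hits = 10)
      n = (expected_hits / probability).ceil
      [[n, 100].max, 100_000].min
    end

    repetitions_for(0.01)    #=> 1000
    repetitions_for(0.001)   #=> 10000
    repetitions_for(0.5)     #=> 100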

    Comments?

    /Robert Feldt
    Robert Feldt, Nov 1, 2006
    #1

  2. Robert Feldt

    Robert Feldt Guest

    > > # Example of probabilistic specifications
    > >
    > > context "random generation" do
    > >   unlikely "that two consecutive calls to rand gives the same value" do
    > >     rand(1000) == rand(1000)
    > >   end
    > >
    > >   likely "that two consecutive calls to rand gives different values" do
    > >     rand(1000) != rand(1000)
    > >   end
    > > end

    >

    Forgot one thing:

    Some decision has to be made about whether setup code should run before
    each evaluation of the block or only once before the repetitions start.
    To keep this in line with the rest of test/spec, maybe setup code should
    run before every evaluation of the unlikely/likely blocks?
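
    A rough sketch of the per-evaluation variant (this assumes the context's
    setup is reachable as a plain setup method inside the specify block; I
    have not checked test/spec's internals for that):

    def unlikely(specname, probability = 0.01, preName = "unlikely", &block)
      specify(preName + " " + specname) do
        count = 0
        num_repetitions = 100
        num_repetitions.times do
          setup  # assumption: re-run the context's setup before every evaluation
          count += 1 if block.call == true
        end
        actual_probability = count.to_f / num_repetitions
        assert actual_probability <= probability,
               "Expected probability of #{probability} but was #{actual_probability}"
      end
    end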

    /Robert Feldt
    Robert Feldt, Nov 1, 2006
    #2

  3. Ben Nagy

    Ben Nagy Guest

    > -----Original Message-----
    > From: Robert Feldt [mailto:]
    > Sent: Wednesday, November 01, 2006 12:51 PM
    > To: ruby-talk ML
    > Subject: Probabilistic BDD?
    >
    > Hi,
    >
    > I'm playing around with BDD à la test/spec and found that I need to
    > specify properties probabilistically, i.e. saying that they are
    > likely/unlikely. Has there been any previous work along these lines?
    >
    > Should we add something like this to test/spec (Christian, are
    > you listening? :))

    [...]
    > so that one can write specs like
    >
    > > # Example of probabilistic specifications
    > >
    > > context "random generation" do
    > >   unlikely "that two consecutive calls to rand gives the same value" do
    > >     rand(1000) == rand(1000)
    > >   end
    > >
    > >   likely "that two consecutive calls to rand gives different values" do
    > >     rand(1000) != rand(1000)
    > >   end
    > > end
    >
    > ?
    >
    > This should probably be generalized so that the number of repetitions
    > to run depends on the probability of the event, but I think you get the
    > idea.
    >
    > Comments?


    I think that the example you give is not appropriate for testing rand(), and
    pretty much any code where the result is expected to conform to a set of
    statistical properties. If you take a look at randomness test suites like
    Diehard there are a battery of different tests that should be applied before
    data can be called 'random' with any confidence.

    http://en.wikipedia.org/wiki/Diehard_tests

    The tests as you have written them would be satisfied by any number of
    broken PRNGs, or even NRAAGs (Not Random At All Generators) (eg alternating
    '1' and '2' ;). In particular, unlikely events must occur sometimes and
    likely events must fail to occur sometimes, so some form of === seems better
    than <=.
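
    For example, something in this direction (rough sketch; the band is an
    ad hoc three-sigma window, not a proper binomial confidence interval):

    # Accept the observed hit count only if it sits inside a band around the
    # expectation, instead of merely below a ceiling.
    def within_expected_band?(hits, trials, probability, slack = 3.0)
      expected = trials * probability
      spread   = slack * Math.sqrt(trials * probability * (1 - probability))
      hits >= expected - spread && hits <= expected + spread
    end

    within_expected_band?(12, 10_000, 0.001)  #=> true  (near the expected 10)
    within_expected_band?(0,  10_000, 0.001)  #=> false (the event never fires at all)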

    If you wanted to test RNGs then you need to run a whole series of tests -
    either like the Diehard tests or just basic stuff like chi square, binomial,
    monte-carlo calculation of pi etc.
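
    The monte-carlo pi check, for instance, is only a few lines (sketch;
    sample size and how far off counts as "too far" are arbitrary here):

    # Estimate pi from the RNG under test; a badly skewed generator will land
    # noticeably far from Math::PI.
    def monte_carlo_pi(samples = 100_000)
      inside = 0
      samples.times do
        x, y = rand, rand
        inside += 1 if x * x + y * y <= 1.0
      end
      4.0 * inside / samples
    end

    estimate = monte_carlo_pi
    puts "pi estimate = #{estimate}, off by #{(estimate - Math::PI).abs}"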

    More generally, I think that 'likely' and 'unlikely' are going to be so
    context dependant that the user would be better off writing their own test
    code, surely? I can see a place for should_be_random, but likely and
    unlikely strike me as a bad idea. In any case, when running test code I
    expect that it will give me the same result every time, so any tests should
    at least have that property.

    Sorry to sound negative. :(

    ben
    Ben Nagy, Nov 1, 2006
    #3
  4. Robert Feldt

    Robert Feldt Guest

    > I think that the example you give is not appropriate for testing rand(), and
    > pretty much any code where the result is expected to conform to a set of
    > statistical properties. If you take a look at randomness test suites like
    > Diehard there are a battery of different tests that should be applied before
    > data can be called 'random' with any confidence.
    >
    > http://en.wikipedia.org/wiki/Diehard_tests
    >
    > The tests as you have written them would be satisfied by any number of
    > broken PRNGs, or even NRAAGs (Not Random At All Generators) (eg alternating
    > '1' and '2' ;). In particular, unlikely events must occur sometimes and
    > likely events must fail to occur sometimes, so some form of === seems better
    > than <=.
    >
    > If you wanted to test RNGs then you need to run a whole series of tests -
    > either like the Diehard tests or just basic stuff like chi square, binomial,
    > monte-carlo calculation of pi etc.
    >

    I don't want to test RNGs; that was just the smallest possible
    example use of the likely/unlikely methods I could think of. I'm
    fairly well versed in RNG testing, thank you.

    > More generally, I think that 'likely' and 'unlikely' are going to be so
    > context dependant that the user would be better off writing their own test
    > code, surely? I can see a place for should_be_random, but likely and
    > unlikely strike me as a bad idea. In any case, when running test code I
    > expect that it will give me the same result every time, so any tests should
    > at least have that property.
    >
    > Sorry to sound negative. :(
    >

    It is ok to be negative, but I have run into test situations many times
    where there is an element of varying behavior involved and specifying
    exactly what is to be expected can only be done by running multiple
    tests and making claims about overall properties of the results.

    But it may be the case that people should write their own test code for it, yes.

    Still, I think this is an important discussion in general, since for
    complex algorithms where it is costly to calculate the exact expected
    results, ways to write partial specs are important.

    Regards,

    Robert
    Robert Feldt, Nov 1, 2006
    #4
  5. On 11/1/06, Robert Feldt <> wrote:
    > > I think that the example you give is not appropriate for testing rand(), and
    > > pretty much any code where the result is expected to conform to a set of
    > > statistical properties. If you take a look at randomness test suites like
    > > Diehard there are a battery of different tests that should be applied before
    > > data can be called 'random' with any confidence.
    > >
    > > http://en.wikipedia.org/wiki/Diehard_tests
    > >
    > > The tests as you have written them would be satisfied by any number of
    > > broken PRNGs, or even NRAAGs (Not Random At All Generators) (eg alternating
    > > '1' and '2' ;). In particular, unlikely events must occur sometimes and
    > > likely events must fail to occur sometimes, so some form of === seems better
    > > than <=.
    > >
    > > If you wanted to test RNGs then you need to run a whole series of tests -
    > > either like the Diehard tests or just basic stuff like chi square, binomial,
    > > monte-carlo calculation of pi etc.
    > >

    > I don't want to test RNGs; that was just the smallest possible
    > example use of the likely/unlikely methods I could think of. I'm
    > fairly well versed in RNG testing, thank you.
    >
    > > More generally, I think that 'likely' and 'unlikely' are going to be so
    > > context dependant that the user would be better off writing their own test
    > > code, surely? I can see a place for should_be_random, but likely and
    > > unlikely strike me as a bad idea. In any case, when running test code I
    > > expect that it will give me the same result every time, so any tests should
    > > at least have that property.
    > >
    > > Sorry to sound negative. :(
    > >

    > It is ok to be negative, but I have run into test situations many times
    > where there is an element of varying behavior involved and specifying
    > exactly what is to be expected can only be done by running multiple
    > tests and making claims about overall properties of the results.
    >
    > But it may be the case that people should write their own test code for it, yes.
    >
    > Still, I think this is an important discussion in general, since for
    > complex algorithms where it is costly to calculate the exact expected
    > results, ways to write partial specs are important.
    >



    Personally, I would do something like:

    def something_run_a_bunch_of_times
      results = []
      100_000.times {results << whatever_is_being_verified}
      results.matches_statistical_requirements_of_domain?
    end

    specify "should be totally awesome" do
      something_run_a_bunch_of_times.should.be true
    end
    Wilson Bilkovich, Nov 1, 2006
    #5
  6. Robert Feldt wrote:
    >> I think that the example you give is not appropriate for testing
    >> rand(), and
    >> pretty much any code where the result is expected to conform to a set of
    >> statistical properties. If you take a look at randomness test suites like
    >> Diehard there are a battery of different tests that should be applied
    >> before
    >> data can be called 'random' with any confidence.
    >>
    >> http://en.wikipedia.org/wiki/Diehard_tests


    Speaking of numerical test suites, do you happen to know of any test
    suites online for elementary functions? I used to know of one, but
    haven't been able to find it. Nor have I been able to find my copy of
    "Software Manual for the Elementary Functions".

    And I'm *not* talking about "paranoia" ... that just tests arithmetic,
    and I found it.
    M. Edward (Ed) Borasky, Nov 1, 2006
    #6
  7. Robert Feldt

    Robert Feldt Guest

    > Personally, I would do something like:
    >
    > def something_run_a_bunch_of_times
    >   results = []
    >   100_000.times {results << whatever_is_being_verified}
    >   results.matches_statistical_requirements_of_domain?
    > end
    >
    > specify "should be totally awesome" do
    >   something_run_a_bunch_of_times.should.be true
    > end
    >

    Yes, I'll probably keep my own set of test/spec extensions for now.

    lambda {some bool test}.should.be.unlikely

    is kind of tempting though... ;)
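
    Something like this could make that syntax work (very rough sketch: it
    assumes test/spec's should.be.xyz ends up calling an xyz? predicate on
    the receiver, which I have not verified, and the repetition count and
    thresholds are hard-coded):

    class Proc
      def unlikely?(probability = 0.01, repetitions = 100)
        hits = 0
        repetitions.times { hits += 1 if call == true }
        hits.to_f / repetitions <= probability
      end

      def likely?(probability = 0.99, repetitions = 100)
        hits = 0
        repetitions.times { hits += 1 if call == true }
        hits.to_f / repetitions >= probability
      end
    end

    # then, roughly:
    #   lambda { rand(1000) == rand(1000) }.should.be.unlikely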

    Thanks for your input,

    /Robert
    Robert Feldt, Nov 1, 2006
    #7
