Re: RNGs: A double KISS

Discussion in 'C++' started by Dann Corbit, Apr 14, 2010.

  1. Dann Corbit

    Dann Corbit Guest

    For reference and comparison, here is a "down and dirty" translation
    into a C class.

    I guess that some C++ guru will take a few minutes to shine the apple a
    bit so that it can be genuinely useful.

    /*
    From: geo <>
    Newsgroups: sci.math,comp.lang.c
    Subject: RNGs: A double KISS
    Date: Wed, 14 Apr 2010 03:41:05 -0700 (PDT)
    Xref: eternal-september.org sci.math:94533 comp.lang.c:57850

    The KISS (Keep-It Simple-Stupid) principle
    appears to be one of the better ways to get
    fast and reliable random numbers in a computer,
    by combining several simple methods.

    Most RNGs produce random integers---usually 32-bit---
    that must be converted to floating point numbers
    for applications. I offer here a posting
    describing my latest version of a KISS RNG that
    produces double-precision uniform random variables
    directly, without need for the usual integer-to-float
    conversion. It is an extension of a method I provided
    for MATLAB, with details described in
    G.Marsaglia and W.W.Tsang, The 64-bit universal RNG,
    Statistics & Probability Letters, V66, Issue 2,2004.

    Rather than combining simple lagged Fibonacci and Weyl
    sequences, as in those versions, this latest version combines
    a lag-1220 CSWB (Complimentary-Subtract-With-Borrow) sequence,
    x[n]=x[n-1220-x[n-1190]-borrow \mod b=2^53, period >10^19462
    with a lag-2 SWB (Subtract-With-Borrow) sequence,
    z[n]=z[n-1]-z[n-2]-borrow mod b=2^53. period>10^30.
    Details on how to maintain exact integer arithmetic
    with only float operations are in the article above.

    This double-precision RNG, called dUNI, has an immense period,
    around 10^19492 versus a previous 10^61, requires just a few
    lines of code, is very fast, some 70 million per second, and
    ---a key feature---behaves very well on tests of randomness.
    The extensive tests of The Diehard Battery, designed for 32-bit
    random integers, are applied to bits 1 to 32, then 2 to 33,
    then 3 to 34,...,then finally bits 21 to 53 for each part of
    the string of 53 bits in each double-precision float dUNI().

    A C version is listed below, but as only tests of magnitude
    and floating point addition or subtraction are required,
    implementations in other languages could be made available.
    I invite such from interested readers.

    If you cut, paste, compile and run the following C code,
    it should take around 14 seconds to generate 1000 million
    dUNI's, followed by the output 0.6203646342357479.
    ------------------------------------------------------------
    */
    /*
    dUNI(), a double-precision uniform RNG based
    on the KISS (Keep-It-Simple-Stupid) principle,
    combining, by subtraction mod 1, a CSWB
    (Complimentary-Subtract-With-Borrow) integer sequence
    x[n]=x[n-1220]-x[n-1190]-borrow mod b=2^53,
    with a SWB (Subtract-With-Borrow) sequence
    z[n]=z[n-2]-z[n-1]-borrow mod b=2^53.

    Both implemented as numerators of double precision
    rationals, t for CSWB, zy for SWB, each formed as
    (integer mod 2^53)/2^53, leading to a returned KISS
    value t-zw mod 1. All denominators floats of b=2^53.

    The CSWB sequence is based on prime p=b^1220-b^1190+1,
    for which the order of b mod p is (p-1)/72, so the CSWB
    sequence has period >2^64653 or 10^19462.

    The SWB-lag2 sequence z[n]=z[n-2]-z[n-1]-zc mod b=2^53,
    is based on b^2-b-1 = 11*299419*24632443746239056514780519
    with period 10*299418*(24632443746239056514780518/2) > 2^101.
    cc is the CSWB and SWB borrow or carry value: cc=1.0/b;
    */

    #include <cstdio>
    #include <cstdlib>
    #include <cassert>

    class kiss2 {

    private:
    double Q[1220];
    int indx;
    double cc;
    double c; /* current CSWB */
    double zc; /* current SWB `borrow` */
    double zx; /* SWB seed1 */
    double zy; /* SWB seed2 */
    size_t qlen;/* length of Q array */

    public:
    kiss2()
    {
    assert(sizeof (double) >= 8);
    cc = 1.0 / 9007199254740992.0; // inverse of 2^53rd power
    int i;
    size_t qlen = indx = sizeof Q / sizeof Q[0];
    for (i = 0; i < qlen; i++)
    Q = 0;
    double c = 0.0, zc = 0.0, /* current CSWB and SWB `borrow` */
    zx = 5212886298506819.0 / 9007199254740992.0, /* SWB seed1 */
    zy = 2020898595989513.0 / 9007199254740992.0; /* SWB seed2 */
    int j;
    double s, t; /* Choose 32 bits for x, 32 for y */
    unsigned long x = 123456789, y = 362436069; /* default seeds * /
    /* Next, seed each Q, one bit at a time, */
    for (i = 0; i < qlen; i++)
    { /* using 9th bit from Cong+Xorshift */
    s = 0.0;
    t = 1.0;
    for (j = 0; j < 52; j++)
    {
    t = 0.5 * t; /* make t=.5/2^j */
    x = 69069 * x + 123;
    y ^= (y << 13);
    y ^= (y >> 17);
    y ^= (y << 5);
    if (((x + y) >> 23) & 1)
    s = s + t; /* change bit of s, maybe */
    } /* end j loop */
    Q = s;
    } /* end i seed loop, Now generate 10^9 dUNI's: */
    }


    double dUNI()
    { /* Takes 14 nanosecs, Intel Q6600,2.40GHz */
    int i, j;
    double t; /* t: first temp, then next CSWB value */
    /* First get zy as next lag-2 SWB */
    t = zx - zy - zc;
    zx = zy;
    if (t < 0)
    {
    zy = t + 1.0;
    zc = cc;
    }
    else
    {
    zy = t;
    zc = 0.0;
    }

    /* Then get t as the next lag-1220 CSWB value */
    if (indx < 1220)
    t = Q[indx++];
    else
    { /* refill Q[n] via Q[n-1220]-Q[n-1190]-c, */
    for (i = 0; i < 1220; i++)
    {
    j = (i < 30) ? i + 1190 : i - 30;
    t = Q[j] - Q + c; /* Get next CSWB element */
    if (t > 0)
    {
    t = t - cc;
    c = cc;
    }
    else
    {
    t = t - cc + 1.0;
    c = 0.0;
    }
    Q = t;
    } /* end i loop */
    indx = 1;
    t = Q[0]; /* set indx, exit 'else' with t=Q[0] */
    } /* end else segment; return t-zy mod 1 */
    return ((t < zy) ? 1.0 + (t - zy) : t - zy);
    } /* end dUNI() */
    };

    int main(void)
    { /* You must provide at least two 32-bit seeds */
    kiss2 krng;
    double t;
    int i;
    for (i = 0; i < 1000000000; i++)
    t = krng.dUNI ();
    printf ("%.16f\n", krng.dUNI ());
    }

    /*
    Output, after 10^9 random uniforms:
    0.6203646342357479
    Should take about 14 seconds,
    e.g. with gcc -O3 opimizing compiler
    --------------------------------------------------------
    */


    /*
    Fully seeding the Q[1220] array requires 64660
    seed bits, plus another 106 for the initial zx
    and zy of the lag-2 SWB sequence.
    As in the above listing, using just two 32-bit
    seeds, x and y in main(), to initialize the Q
    array, one bit at a time, by means of the
    resulting Congruential+Xorshift sequence
    may be enough for many applications.

    Applications such as in Law or Gaming may
    require enough seeds in the Q[1220] array to
    guarantee that each one of a huge set of
    possible outcomes can appear. For example,
    choosing a jury venire of 80 from a
    list of 2000 eligibles would require at least
    ten 53-bit seeds; choosing 180 from 4000 would
    require twenty 53-bit seeds.
    To get certification, a casino machine that could
    play forty simultaneous games of poker must be
    able to produce forty successive straight-flushes,
    with a resulting minimal seed set.

    Users can choose their 32-bit x,y for the
    above seeding process, or develop their own
    for more exacting requirements when a mere
    set of 64 seed bits may not be enough.

    Properties:
    1. Simple and very fast, some 70 million double-precision
    uniform(0,1] random variables/second, in IEEE 754 standard
    form: 1 sign bit, 11 exponent bits, 52 fraction bits
    plus the implied 1 leading the fraction part, making a
    total of 53 bits for each uniform floating-point variate.

    2. Period some 10^19492 (vs. the 10^61 of an earlier version),
    G. Marsaglia and W.W. Tsang, "The 64-bit universal RNG",(2004)
    Statistics and Probability Letters}, v66, no. 2, 183-187,
    or an earlier version provided for Matlab.)

    3. Following the KISS, (Keep-It-Simple-Stupid) principle,
    combines, via subtraction, a CSWB sequence
    x[n]=x[n-1220]-x[n-1190]-borrow mod 2^53
    based on the prime p=b^1220-b^1190+1, b=2^53,
    with a SWB lag-2 sequence z[n]=w[n-2]-z[n-1] -c mod 2^53.
    All modular arithmetic on numerators of rationals
    with denominators 2.^53.

    4. Easily implemented in various programming languages,
    as only tests on magnitude and double-precision floating-point
    subtraction or addition required.

    5. Passes all Diehard tests,
    http://i.cs.hku.hk/~diehard/
    As Diehard is designed for 32-bit integer input, all of its
    234 tests are applied to bits 1..32, then 2..33, then 3..34,..,
    then 22..53 of the 53 bits in each dUNI().
    (To form 32-bit integer j from bits i..(i+31):
    t=dUNI()*(1<<(i-1)); i=t; j=(t-i)*4294967296.0; )
    All twenty-two choices for the 32-bit segments performed
    very well on the 234 p-values from Diehard tests, using
    the default 32-bit seeds x and y. You might try your own,
    with possibly more extensive seeding of the Q array.

    -- George Marsaglia

    */
     
    Dann Corbit, Apr 14, 2010
    #1
    1. Advertising

  2. Dann Corbit

    Jonathan Lee Guest

    On Apr 14, 2:06 pm, Dann Corbit <> wrote:

    Thanks for sharing.

    Isn't this dangerous for y:

    >             for (j = 0; j < 52; j++)
    >             {
    >                 t = 0.5 * t; /* make t=.5/2^j  */
    >                 x = 69069 * x + 123;
    >                 y ^= (y << 13);
    >                 y ^= (y >> 17);
    >                 y ^= (y << 5);
    >                 if (((x + y) >> 23) & 1)
    >                     s = s + t; /* change bit of s, maybe */
    >             }    /* end j loop */


    Specifically (y >> 17) could mix in high bits if unsigned
    long were 64 bit instead of (presumably) 32-bit. Just at
    a quick glance, I would think the operations on y
    would need to be masked to be portable.

    --Jonathan
     
    Jonathan Lee, Apr 14, 2010
    #2
    1. Advertising

  3. Dann Corbit

    Jonathan Lee Guest

    On Apr 14, 2:06 pm, Dann Corbit <> wrote:
    > For reference and comparison, here is a "down and dirty" translation
    > into a C class.
    >
    > I guess that some C++ guru will take a few minutes to shine the apple a
    > bit so that it can be genuinely useful.


    Another thought: you could probably remove that assertion by building
    the double from LSb to MSb. You'd essentially be bitreversing the
    number which probably won't affect the randomness (can't prove it
    off hand). If double didn't have enough bits, the machine would simply
    round the value as you went along. Not ideal under some RNG scenarios,
    but the result should be the "same" as the actual number up to machine
    precision.

    --Jonathan
     
    Jonathan Lee, Apr 14, 2010
    #3
  4. Dann Corbit

    Jonathan Lee Guest

    On Apr 14, 2:53 pm, Jonathan Lee <> wrote:
    > On Apr 14, 2:06 pm, Dann Corbit <> wrote:
    >
    > Thanks for sharing.
    >
    > Isn't this dangerous for y:
    >


    Just confirmed: if y isn't masked to 32-bits the output
    is wrong.

    Anyway, here's my take on the code:

    // kiss2.hpp
    -------------------------------------------------------------------
    #include <cstddef>

    class kiss2 {
    double Q[1220]; // meh.. I guess this could be a
    std::vector
    const std::size_t qlen; // length of Q array
    std::size_t indx;
    double c; // current CSWB
    double zc; // current SWB `borrow`
    double zx; // SWB seed1
    double zy; // SWB seed2
    public:
    kiss2(unsigned long = 123456789UL, unsigned long = 362436069UL);
    double operator()(); // For getting a random number
    };

    // kiss2.cpp
    -------------------------------------------------------------------
    //#include "kiss2.hpp"
    #include <cfloat>
    using std::size_t;
    /**
    \todo? Replace constant initialization of indx and qlen w/ sizeof Q/
    sizeof Q[0]
    \todo Magic numbers make me uneasy (re: zx, zy)
    */
    kiss2::kiss2(
    unsigned long x, // seed1 value
    unsigned long y // seed2 value
    ): qlen(1220), indx(1220), c(0.0), zc(0.0),
    zx(5212886298506819.0 / 9007199254740992.0),
    zy(2020898595989513.0 / 9007199254740992.0)
    {
    #if (FLT_RADIX != 2 || DBL_MIN_EXP > -53)
    #error "Machine double not supported"
    #endif

    // fill in Q "using 9th bit from Cong+Xorshift"
    for (size_t i = 0; i < qlen; i++) {
    double s = 0.0, t = 1.0;

    // Build Q one bit at a time
    for (int j = 0; j < 52; j++) {
    t *= 0.5; // make t=.5/2^j
    x = (69069 * x + 123);
    y = (y ^ (y << 13)) & 0xFFFFFFFFUL;
    y = (y ^ (y >> 17)) & 0xFFFFFFFFUL;
    y = (y ^ (y << 5)) & 0xFFFFFFFFUL;

    if (((x + y) >> 23) & 1UL) // conditionally set bit of s
    s += t;
    }
    Q = s;
    }
    }

    /**
    \todo Separate refill function?
    \todo I guess we sorta need a guarantee that qlen >= 30
    */
    double kiss2::eek:perator()() { // Takes 14 nanosecs, Intel Q6600,2.40GHz
    static const double cc = 1.0 / 9007199254740992.0; // 2^-53

    double t; // t: first temp, then next CSWB value
    // First get zy as next lag-2 SWB
    t = zx - zy - zc;
    zx = zy;
    if (t < 0) {
    zy = t + 1.0, zc = cc;
    } else zy = t, zc = 0.0;

    // Then get t as the next lag-1220 CSWB value
    if (indx >= qlen) { // refill Q[n] via Q[n-1220]-Q[n-1190]-c
    for (size_t i = 0; i < qlen; i++) {
    size_t j = (i < 30) ? i + (qlen - 30) : i - 30;
    t = Q[j] - Q + c; // Get next CSWB element
    if (t > 0) {
    t = t - cc, c = cc;
    } else t = t - cc + 1.0, c = 0.0;
    Q = t;
    }
    indx = 0; // reset indx
    }

    // return Q[indx] - zy (mod 1)
    t = Q[indx++] - zy;
    return (t < 0.0 ? 1.0 + t : t);
    }

    // main.cpp
    --------------------------------------------------------------------
    #include <cstdio>
    #include <cstdlib>
    // #include "kiss2.hpp"
    int main(void)
    { /* You must provide at least two 32-bit seeds */
    kiss2 krng; // Use the default seeds
    double t;
    int i;
    for (i = 0; i < 1000000000; i++)
    t = krng();
    printf ("%.16f\n", krng());
    }

    --Jonathan
     
    Jonathan Lee, Apr 14, 2010
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. geo

    64-bit KISS RNGs

    geo, Feb 28, 2009, in forum: C Programming
    Replies:
    23
    Views:
    2,977
  2. geo

    RNGs: A Super KISS

    geo, Nov 3, 2009, in forum: C Programming
    Replies:
    8
    Views:
    568
    user923005
    Nov 4, 2009
  3. user923005

    Re: RNGs: A Super KISS

    user923005, Nov 3, 2009, in forum: C++
    Replies:
    1
    Views:
    1,402
    user923005
    Nov 3, 2009
  4. user923005

    Re: RNGs: A Super KISS

    user923005, Nov 4, 2009, in forum: C++
    Replies:
    2
    Views:
    547
    Michael Doubez
    Nov 5, 2009
  5. geo

    RNGs: A double KISS

    geo, Apr 14, 2010, in forum: C Programming
    Replies:
    10
    Views:
    999
    steve
    Apr 17, 2010
Loading...

Share This Page