G

#### geo

appears to be one of the better ways to get

fast and reliable random numbers in a computer,

by combining several simple methods.

Most RNGs produce random integers---usually 32-bit---

that must be converted to floating point numbers

for applications. I offer here a posting

describing my latest version of a KISS RNG that

produces double-precision uniform random variables

directly, without need for the usual integer-to-float

conversion. It is an extension of a method I provided

for MATLAB, with details described in

G.Marsaglia and W.W.Tsang, The 64-bit universal RNG,

Statistics & Probability Letters, V66, Issue 2,2004.

Rather than combining simple lagged Fibonacci and Weyl

sequences, as in those versions, this latest version combines

a lag-1220 CSWB (Complimentary-Subtract-With-Borrow) sequence,

x[n]=x[n-1220-x[n-1190]-borrow \mod b=2^53, period >10^19462

with a lag-2 SWB (Subtract-With-Borrow) sequence,

z[n]=z[n-1]-z[n-2]-borrow mod b=2^53. period>10^30.

Details on how to maintain exact integer arithmetic

with only float operations are in the article above.

This double-precision RNG, called dUNI, has an immense period,

around 10^19492 versus a previous 10^61, requires just a few

lines of code, is very fast, some 70 million per second, and

---a key feature---behaves very well on tests of randomness.

The extensive tests of The Diehard Battery, designed for 32-bit

random integers, are applied to bits 1 to 32, then 2 to 33,

then 3 to 34,...,then finally bits 21 to 53 for each part of

the string of 53 bits in each double-precision float dUNI().

A C version is listed below, but as only tests of magnitude

and floating point addition or subtraction are required,

implementations in other languages could be made available.

I invite such from interested readers.

If you cut, paste, compile and run the following C code,

it should take around 14 seconds to generate 1000 million

dUNI's, followed by the output 0.6203646342357479.

------------------------------------------------------------

/*

dUNI(), a double-precision uniform RNG based

on the KISS (Keep-It-Simple-Stupid) principle,

combining, by subtraction mod 1, a CSWB

(Complimentary-Subtract-With-Borrow) integer sequence

x[n]=x[n-1220]-x[n-1190]-borrow mod b=2^53,

with a SWB (Subtract-With-Borrow) sequence

z[n]=z[n-2]-z[n-1]-borrow mod b=2^53.

Both implemented as numerators of double precision

rationals, t for CSWB, zy for SWB, each formed as

(integer mod 2^53)/2^53, leading to a returned KISS

value t-zw mod 1. All denominators floats of b=2^53.

The CSWB sequence is based on prime p=b^1220-b^1190+1,

for which the order of b mod p is (p-1)/72, so the CSWB

sequence has period >2^64653 or 10^19462.

The SWB-lag2 sequence z[n]=z[n-2]-z[n-1]-zc mod b=2^53,

is based on b^2-b-1 = 11*299419*24632443746239056514780519

with period 10*299418*(24632443746239056514780518/2) > 2^101.

cc is the CSWB and SWB borrow or carry value: cc=1.0/b;

*/

#include <stdio.h>

static double Q[1220];

static int indx=1220;

double dUNI()

{ /* Takes 14 nanosecs, Intel Q6600,2.40GHz */

const double cc=1.0/9007199254740992.0;

static double c=0.0,zc=0.0, /*current CSWB and SWB `borrow`*/

zx=5212886298506819.0/9007199254740992.0, /*SWB seed1*/

zy=2020898595989513.0/9007199254740992.0; /*SWB seed2*/

int i,j; double t; /*t: first temp, then next CSWB value*/

/* First get zy as next lag-2 SWB */

t=zx-zy-zc; zx=zy; if(t<0){zy=t+1.0; zc=cc;} else{zy=t; zc=0.0;}

/*Then get t as the next lag-1220 CSWB value */

if(indx<1220) t=Q[indx++];

else /*refill Q[n] via Q[n-1220]-Q[n-1190]-c, */

{ for(i=0;i<1220;i++)

{j=(i<30) ? i+1190 : i-30;

t=Q[j]-Q

*+c; /*Get next CSWB element*/*

if(t>0) {t=t-cc; c=cc;} else {t=t-cc+1.0; c=0.0;}

Q

if(t>0) {t=t-cc; c=cc;} else {t=t-cc+1.0; c=0.0;}

Q

*=t;*

} /* end i loop */

indx=1; t=Q[0]; /*set indx, exit 'else' with t=Q[0] */

} /* end else segment; return t-zy mod 1 */

return( (t<zy)? 1.0+(t-zy) : t-zy ) ;

} /* end dUNI() */

int main() /*You must provide at least two 32-bit seeds*/

{ int i,j; double s,t; /*Choose 32 bits for x, 32 for y */

unsigned long x=123456789, y=362436069; /*default seeds*/

/* Next, seed each Q} /* end i loop */

indx=1; t=Q[0]; /*set indx, exit 'else' with t=Q[0] */

} /* end else segment; return t-zy mod 1 */

return( (t<zy)? 1.0+(t-zy) : t-zy ) ;

} /* end dUNI() */

int main() /*You must provide at least two 32-bit seeds*/

{ int i,j; double s,t; /*Choose 32 bits for x, 32 for y */

unsigned long x=123456789, y=362436069; /*default seeds*/

/* Next, seed each Q

*, one bit at a time, */*

for(i=0;i<1220;i++) /*using 9th bit from Cong+Xorshift*/

{s=0.0; t=1.0;

for(j=0;j<52;j++)

{t=0.5*t; /* make t=.5/2^j */

x=69069*x+123; y^=(y<<13);y^=(y>>17);y^=(y<<5);

if(((x+y)>>23)&1) s=s+t; /*change bit of s, maybe */

} /* end j loop*/

Qfor(i=0;i<1220;i++) /*using 9th bit from Cong+Xorshift*/

{s=0.0; t=1.0;

for(j=0;j<52;j++)

{t=0.5*t; /* make t=.5/2^j */

x=69069*x+123; y^=(y<<13);y^=(y>>17);y^=(y<<5);

if(((x+y)>>23)&1) s=s+t; /*change bit of s, maybe */

} /* end j loop*/

Q

*=s;*

} /*end i seed loop, Now generate 10^9 dUNI's: */

for(i=0;i<1000000000;i++) t=dUNI();

printf("%.16f\n",dUNI());

}

/*

Output, after 10^9 random uniforms:

0.6203646342357479

Should take about 14 seconds,

e.g. with gcc -O3 opimizing compiler

--------------------------------------------------------

*/

/*

Fully seeding the Q[1220] array requires 64660

seed bits, plus another 106 for the initial zx

and zy of the lag-2 SWB sequence.

As in the above listing, using just two 32-bit

seeds, x and y in main(), to initialize the Q

array, one bit at a time, by means of the

resulting Congruential+Xorshift sequence

may be enough for many applications.

Applications such as in Law or Gaming may

require enough seeds in the Q[1220] array to

guarantee that each one of a huge set of

possible outcomes can appear. For example,

choosing a jury venire of 80 from a

list of 2000 eligibles would require at least

ten 53-bit seeds; choosing 180 from 4000 would

require twenty 53-bit seeds.

To get certification, a casino machine that could

play forty simultaneous games of poker must be

able to produce forty successive straight-flushes,

with a resulting minimal seed set.

Users can choose their 32-bit x,y for the

above seeding process, or develop their own

for more exacting requirements when a mere

set of 64 seed bits may not be enough.

Properties:

1. Simple and very fast, some 70 million double-precision

uniform(0,1] random variables/second, in IEEE 754 standard

form: 1 sign bit, 11 exponent bits, 52 fraction bits

plus the implied 1 leading the fraction part, making a

total of 53 bits for each uniform floating-point variate.

2. Period some 10^19492 (vs. the 10^61 of an earlier version),

G. Marsaglia and W.W. Tsang, "The 64-bit universal RNG",(2004)

Statistics and Probability Letters}, v66, no. 2, 183-187,

or an earlier version provided for Matlab.)

3. Following the KISS, (Keep-It-Simple-Stupid) principle,

combines, via subtraction, a CSWB sequence

x[n]=x[n-1220]-x[n-1190]-borrow mod 2^53

based on the prime p=b^1220-b^1190+1, b=2^53,

with a SWB lag-2 sequence z[n]=w[n-2]-z[n-1] -c mod 2^53.

All modular arithmetic on numerators of rationals

with denominators 2.^53.

4. Easily implemented in various programming languages,

as only tests on magnitude and double-precision floating-point

subtraction or addition required.

5. Passes all Diehard tests,

http://i.cs.hku.hk/~diehard/

As Diehard is designed for 32-bit integer input, all of its

234 tests are applied to bits 1..32, then 2..33, then 3..34,..,

then 22..53 of the 53 bits in each dUNI().

(To form 32-bit integer j from bits i..(i+31):

t=dUNI()*(1<<(i-1)); i=t; j=(t-i)*4294967296.0; )

All twenty-two choices for the 32-bit segments performed

very well on the 234 p-values from Diehard tests, using

the default 32-bit seeds x and y. You might try your own,

with possibly more extensive seeding of the Q array.

*/

George Marsaglia} /*end i seed loop, Now generate 10^9 dUNI's: */

for(i=0;i<1000000000;i++) t=dUNI();

printf("%.16f\n",dUNI());

}

/*

Output, after 10^9 random uniforms:

0.6203646342357479

Should take about 14 seconds,

e.g. with gcc -O3 opimizing compiler

--------------------------------------------------------

*/

/*

Fully seeding the Q[1220] array requires 64660

seed bits, plus another 106 for the initial zx

and zy of the lag-2 SWB sequence.

As in the above listing, using just two 32-bit

seeds, x and y in main(), to initialize the Q

array, one bit at a time, by means of the

resulting Congruential+Xorshift sequence

may be enough for many applications.

Applications such as in Law or Gaming may

require enough seeds in the Q[1220] array to

guarantee that each one of a huge set of

possible outcomes can appear. For example,

choosing a jury venire of 80 from a

list of 2000 eligibles would require at least

ten 53-bit seeds; choosing 180 from 4000 would

require twenty 53-bit seeds.

To get certification, a casino machine that could

play forty simultaneous games of poker must be

able to produce forty successive straight-flushes,

with a resulting minimal seed set.

Users can choose their 32-bit x,y for the

above seeding process, or develop their own

for more exacting requirements when a mere

set of 64 seed bits may not be enough.

Properties:

1. Simple and very fast, some 70 million double-precision

uniform(0,1] random variables/second, in IEEE 754 standard

form: 1 sign bit, 11 exponent bits, 52 fraction bits

plus the implied 1 leading the fraction part, making a

total of 53 bits for each uniform floating-point variate.

2. Period some 10^19492 (vs. the 10^61 of an earlier version),

G. Marsaglia and W.W. Tsang, "The 64-bit universal RNG",(2004)

Statistics and Probability Letters}, v66, no. 2, 183-187,

or an earlier version provided for Matlab.)

3. Following the KISS, (Keep-It-Simple-Stupid) principle,

combines, via subtraction, a CSWB sequence

x[n]=x[n-1220]-x[n-1190]-borrow mod 2^53

based on the prime p=b^1220-b^1190+1, b=2^53,

with a SWB lag-2 sequence z[n]=w[n-2]-z[n-1] -c mod 2^53.

All modular arithmetic on numerators of rationals

with denominators 2.^53.

4. Easily implemented in various programming languages,

as only tests on magnitude and double-precision floating-point

subtraction or addition required.

5. Passes all Diehard tests,

http://i.cs.hku.hk/~diehard/

As Diehard is designed for 32-bit integer input, all of its

234 tests are applied to bits 1..32, then 2..33, then 3..34,..,

then 22..53 of the 53 bits in each dUNI().

(To form 32-bit integer j from bits i..(i+31):

t=dUNI()*(1<<(i-1)); i=t; j=(t-i)*4294967296.0; )

All twenty-two choices for the 32-bit segments performed

very well on the 234 p-values from Diehard tests, using

the default 32-bit seeds x and y. You might try your own,

with possibly more extensive seeding of the Q array.

*/

George Marsaglia