# SuperKISS for 32- and 64-bit RNGs in both C and Fortran.

G

#### geo

On Nov 3 I posted

RNGs: A Super KISS sci.math, comp.lang.c, sci.crypt

a KISS (Keep-It-Simple-Stupid) RNG combining,
by addition mod 2^32, three simple RNGs:
CMWC(Complementary-Multiply-With-Carry)+CNG(Congruential)
+XS(Xorshift)
with resulting period greater than 10^402575.

The extreme period comes from finding a prime p=a*b^r+1 for
which the order of b has magnitude near p-1, then use
the CMWC method, the mathematics of which I outline here:

Let Z be the set of all "vectors" of the form
[x_1,...,x_r;c] with 0<=x_i<b and 0<=c<a.
Then Z has ab^r elements, and if the function f() on Z is
f([x_1,x2,...,x_r;c])=
[x_2,...,x_r,b-1-(t mod b);trunc((t/b)], t=a*x_1+c
then f() has an inverse on Z: for each z in Z there is exactly
one w in Z for which f(w)=z:
f^{-1}([x_1,x2,...,x_r;c])=
[trunc((v/a),x_1,...,x_{r-1}; v mod a], v=cb+(b-1)-x_r

Thus a directed graph based on z->f(z) will consist only
of disjoint loops of size s=order(b,p) and there will be
L=ab^r/s such loops.

A random uniform choice of z from Z is equally likely to
fall in any one of the L loops, and then each "vector" in
the sequence obtained by iterating f:
f(z), f(f(z)), f(f(f(z))),...
will have a uniform distribution over that loop, and the sequence
determined by taking the r'th element from each "vector",
(the output of the CMWC RNG) will be periodic with period
the order of b for the prime p=ab^r+1, and that sequence
will produce, in reverse order, the base-b expansion of a
rational k/p for some k determined by choice of the seed z.

For that Nov 3 post, I had found that the order of b=2^32
for the prime p=7010176*b^41790+1 is 54767*2^1337279, about
10^402566, thus providing an easily-implemented
KISS=CMWC+CNG+XS RNG with immense period and
excellent performance on tests of randomness.

That easy implementation required carrying out the essential
parts of the CMWC function f(): form t=7010176*x+c in 64 bits,
extract the top 32 bits for the new c, the bottom 32 for
the new x---easy to do in C, not easy in Fortran.
And if we want 64-bit random numbers, with B=2^64, our prime
becomes 7010176*B^20985+1, for which the period of B is
54767*2^1337278, still immense, but in C, with 64-bit x,c
there seems no easy way to form t=7010176*x+c in 128 bits,
then extract the top and bottom 64 bit halves.

So base b=2^32 works for C but not Fortran,
and base B=2^64 works for neither C nor Fortran.

I offer here is a prime that provides CMWC RNGs for both
32- and 64-bits, and for both C and Fortran, and with
equally massive periods, again greater than 2^(1.3million):

p=640*b^41265+1 = 2748779069440*B^20632+1 = 5*2^1320487+1.

That prime came from the many who have dedicated their
efforts and computer time to prime searches. After some
three weeks of dedicated computer time using pfgw with
scrypt, I found the orders of b and B:
5*2^1320481 for b=2^32, 5*2^1320480 for B=2^64.

It is the choice of the "a" that makes it feasible to get
the top and bottom valves of t=a*x+c, yet stay within the
integer sizes the C or Fortran compilers are set for.
In the above prime: a=640=2^9+2^7 for b=2^32 and
a=2748779069440=2^41+2^39 for B=2^64.
Thus, for example with b=2^32 and using only 32-bit C code,
with a supposed 128-bit t=(2^9+2^7)*x+c, the top and bottom
32-bits of t may be obtained by setting, say,
h=(c&1); z=(x<<9)>>1 + (x<<7)>>1 + c>>1;
then the top half of that t would be
c=(x>>23)+(x>>25)+(z>>31);
and the bottom half, before being complemented, would be
x=(z<<1)+h;

When B=2^64 we need only change to
h=(c&1); z=(x<<41)>>1 + (x<<39)>>1 + c>>1;
c=(x>>23)+(x>>25)+(z>>63);

These C operations all have Fortran equivalents, and will
produce the required bit patterns, whether integers are
considered signed or unsigned. (In C, one must make sure
that the >> operation performs a logical right shift,
perhaps best done via "unsigned" declarations.)

The CMWC z "vector" elements [x_1,x_2,...,x_r] are kept in
an array, Q[] in C, Q() in Fortran, with a separate current
"carry". This is all spelled out in the following examples:
code for 32- and 64-bit SuperKiss RNGs for C and Fortran.

Note that in these sample listings, the Q array is seeded
by CNG+XS, based on the seed values specified in the
initial declarations. For many simulation studies, the
73 bits needed to seed the initial xcng, xs and carry<a
for the 32-bit version, or 169 bits needed for the 64-bit
But more demanding applications may require a significant
portion of the >1.3 million seed bits that Q requires.
See text and comments from the Nov 3 posting.

I am indebted to an anonymous mecej4 for providing the basic
form and KIND declarations of the Fortran versions.

Please let me and other readers know if the results are not
as specified when run with your compilers, or if you can
provide equivalent versions in other programming languages.

George Marsaglia

--------------------------------------------------------
Here is SUPRKISS64.c, the immense-period 64-bit RNG. I
invite you to cut, paste, compile and run to see if you
get the result I do. It should take around 20 seconds.
--------------------------------------------------------
/* SUPRKISS64.c, period 5*2^1320480*(2^64-1) */
#include <stdio.h>
static unsigned long long Q,carry=36243678541LL,
xcng=12367890123456LL,xs=521288629546311LL,indx=20632;

#define CNG ( xcng=6906969069LL*xcng+123 )
#define XS ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
#define SUPR ( indx<20632 ? Q[indx++] : refill() )
#define KISS SUPR+CNG+XS

unsigned long long refill( )
{int i; unsigned long long z,h;
for(i=0;i<20632;i++){ h=(carry&1);
z=((Q<<41)>>1)+((Q<<39)>>1)+(carry>>1);
carry=(Q>>23)+(Q>>25)+(z>>63);
Q=~((z<<1)+h); }
indx=1; return (Q);
}

int main()
{int i; unsigned long long x;
for(i=0;i<20632;i++) Q=CNG+XS;
for(i=0;i<1000000000;i++) x=KISS;
printf("Does x=4013566000157423768\n x=%LLd.\n",x);
}
---------------------------------------------------------

Here is SUPRKISS32.c, the immense-period 32-bit RNG. I
invite you to cut, paste, compile and run to see if you
get the result I do. It should take around 10 seconds.
---------------------------------------------------------
/*suprkiss64.c
b=2^64; x[n]=(b-1)-[(2^41+2^39)*x[n-20632]+carry mod b]
period 5*2^1320480>10^397505
This version of SUPRKISS doesn't use t=a*x+c in 128 bits,
but uses only 64-bit stuff, takes 20 nanos versus 7.5 for
the 32-bit unsigned long long t=a*x+c version.
*/

/* SUPRKISS64.c, period 5*2^1320480*(2^64-1) */
#include <stdio.h>
static unsigned long long Q,carry=36243678541LL,
xcng=12367890123456LL,xs=521288629546311LL,indx=20632;

#define CNG ( xcng=6906969069LL*xcng+123 )
#define XS ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
#define SUPR ( indx<20632 ? Q[indx++] : refill() )
#define KISS SUPR+CNG+XS

unsigned long long refill( )
{int i; unsigned long long z,h;
for(i=0;i<20632;i++){ h=(carry&1);
z=((Q<<41)>>1)+((Q<<39)>>1)+(carry>>1);
carry=(Q>>23)+(Q>>25)+(z>>63);
Q=~((z<<1)+h); }
indx=1; return (Q);
}

int main()
{int i; unsigned long long x;
for(i=0;i<20632;i++) Q=CNG+XS;
for(i=0;i<1000000000;i++) x=KISS;
printf("Does x=4013566000157423768\n x=%LLd.\n",x);
}

-----------------------------------------------------------

And here are equivalent Fortran versions, which, absent
C's inline features, seem to need ~10% more run time.

-----------------------------------------------------------
module suprkiss64_M ! period 5*2^1320480*(2^64-1)
integer,parameter :: I8=selected_int_kind(18)
integer(kind=I8) :: Q(20632),carry=36243678541_I8, &
xcng=12367890123456_I8,xs=521288629546311_I8,indx=20633_I8
contains
function KISS64() result(x)
integer(kind=I8) :: x
if(indx <= 20632)then; x=Q(indx); indx=indx+1
else; x=refill(); endif
xcng=xcng*6906969069_I8+123
xs=ieor(xs,ishft(xs,13))
xs=ieor(xs,ishft(xs,-17))
xs=ieor(xs,ishft(xs,43))
x=x+xcng+xs
return; end function KISS64

function refill() result(s)
integer(kind=I8) :: i,s,z,h
do i=1,20632
h=iand(carry,1_I8)
z = ishft(ishft(Q(i),41),-1)+ &
ishft(ishft(Q(i),39),-1)+ &
ishft(carry,-1)
carry=ishft(Q(i),-23)+ishft(Q(i),-25)+ishft(z,-63)
Q(i)=not(ishft(z,1)+h)
end do
indx=2; s=Q(1)
return; end function refill

end module suprkiss64_M

program testKISS64
use suprkiss64_M
integer(kind=I8) :: i,x
do i=1,20632 !fill Q with Congruential+Xorshift
xcng=xcng*6906969069_I8+123
xs=ieor(xs,ishft(xs,13))
xs=ieor(xs,ishft(xs,-17))
xs=ieor(xs,ishft(xs,43))
Q(i)=xcng+xs
end do
do i=1,1000000000_I8; x=KISS64(); end do
write(*,10) x
10 format(' Does x = 4013566000157423768 ?',/,6x,'x = ',I20)
end program testKISS64
-------------------------------------------------------------

module suprkiss32_M ! period 5*2^1320481*(2^32-1)
integer,parameter :: I4=selected_int_kind(9)
integer(kind=I4) :: Q(41265),carry=362_I4, &
xcng=1236789_I4,xs=521288629_I4,indx=41266_I4
contains
function KISS32() result(x)
integer(kind=I4):: x
if(indx <= 41265)then;x=Q(indx); indx=indx+1
else; x=refill(); endif
xcng=xcng*69069_I4+123
xs=ieor(xs,ishft(xs,13))
xs=ieor(xs,ishft(xs,-17));
xs=ieor(xs,ishft(xs,5))
x=x+xcng+xs
return; end function KISS32

function refill() result(s)
integer(kind=I4) :: i,s,z,h
do i = 1,41265
h = iand(carry,1_I4)
z = ishft(ishft(Q(i),9),-1)+ &
ishft(ishft(Q(i),7),-1)+ &
ishft(carry,-1)
carry=ishft(Q(i),-23)+ishft(Q(i),-25)+ishft(z,-31)
Q(i)=not(ishft(z,1)+h)
end do
indx=2; s=Q(1)
return; end function refill

end module suprkiss32_M

program testKISS32
use suprkiss32_M
integer(kind=I4) :: i,x
do i=1,41265 !fill Q with Congruential+Xorshift
xcng=xcng*69069_I4+123
xs=ieor(xs,ishft(xs,13))
xs=ieor(xs,ishft(xs,-17))
xs=ieor(xs,ishft(xs,5))
Q(i)=xcng+xs
end do
do i=1,1000000000_I4; x=KISS32(); end do
write(*,10) x
10 format(' Does x = 1809478889 ?',/,6x,'x =',I11)
end program testKISS32

---------------------------------------------------------------

R

#### robin

| On Nov 3 I posted
|
| RNGs: A Super KISS sci.math, comp.lang.c, sci.crypt
|
| a KISS (Keep-It-Simple-Stupid) RNG combining,
| by addition mod 2^32, three simple RNGs:
| CMWC(Complementary-Multiply-With-Carry)+CNG(Congruential)
| +XS(Xorshift)
| with resulting period greater than 10^402575.
|
| The extreme period comes from finding a prime p=a*b^r+1 for
| which the order of b has magnitude near p-1, then use
| the CMWC method, the mathematics of which I outline here:
|
| Let Z be the set of all "vectors" of the form
| [x_1,...,x_r;c] with 0<=x_i<b and 0<=c<a.
| Then Z has ab^r elements, and if the function f() on Z is
| f([x_1,x2,...,x_r;c])=
| [x_2,...,x_r,b-1-(t mod b);trunc((t/b)], t=a*x_1+c
| then f() has an inverse on Z: for each z in Z there is exactly
| one w in Z for which f(w)=z:
| f^{-1}([x_1,x2,...,x_r;c])=
| [trunc((v/a),x_1,...,x_{r-1}; v mod a], v=cb+(b-1)-x_r
|
| Thus a directed graph based on z->f(z) will consist only
| of disjoint loops of size s=order(b,p) and there will be
| L=ab^r/s such loops.
|
| A random uniform choice of z from Z is equally likely to
| fall in any one of the L loops, and then each "vector" in
| the sequence obtained by iterating f:
| f(z), f(f(z)), f(f(f(z))),...
| will have a uniform distribution over that loop, and the sequence
| determined by taking the r'th element from each "vector",
| (the output of the CMWC RNG) will be periodic with period
| the order of b for the prime p=ab^r+1, and that sequence
| will produce, in reverse order, the base-b expansion of a
| rational k/p for some k determined by choice of the seed z.
|
| For that Nov 3 post, I had found that the order of b=2^32
| for the prime p=7010176*b^41790+1 is 54767*2^1337279, about
| 10^402566, thus providing an easily-implemented
| KISS=CMWC+CNG+XS RNG with immense period and
| excellent performance on tests of randomness.
|
| That easy implementation required carrying out the essential
| parts of the CMWC function f(): form t=7010176*x+c in 64 bits,
| extract the top 32 bits for the new c, the bottom 32 for
| the new x---easy to do in C, not easy in Fortran.
| And if we want 64-bit random numbers, with B=2^64, our prime
| becomes 7010176*B^20985+1, for which the period of B is
| 54767*2^1337278, still immense, but in C, with 64-bit x,c
| there seems no easy way to form t=7010176*x+c in 128 bits,
| then extract the top and bottom 64 bit halves.
|
| So base b=2^32 works for C but not Fortran,
| and base B=2^64 works for neither C nor Fortran.
|
| I offer here is a prime that provides CMWC RNGs for both
| 32- and 64-bits, and for both C and Fortran, and with
| equally massive periods, again greater than 2^(1.3million):
|
| p=640*b^41265+1 = 2748779069440*B^20632+1 = 5*2^1320487+1.
|
| That prime came from the many who have dedicated their
| efforts and computer time to prime searches. After some
| three weeks of dedicated computer time using pfgw with
| scrypt, I found the orders of b and B:
| 5*2^1320481 for b=2^32, 5*2^1320480 for B=2^64.
|
| It is the choice of the "a" that makes it feasible to get
| the top and bottom valves of t=a*x+c, yet stay within the
| integer sizes the C or Fortran compilers are set for.
| In the above prime: a=640=2^9+2^7 for b=2^32 and
| a=2748779069440=2^41+2^39 for B=2^64.
| Thus, for example with b=2^32 and using only 32-bit C code,
| with a supposed 128-bit t=(2^9+2^7)*x+c, the top and bottom
| 32-bits of t may be obtained by setting, say,
| h=(c&1); z=(x<<9)>>1 + (x<<7)>>1 + c>>1;
| then the top half of that t would be
| c=(x>>23)+(x>>25)+(z>>31);
| and the bottom half, before being complemented, would be
| x=(z<<1)+h;
|
| When B=2^64 we need only change to
| h=(c&1); z=(x<<41)>>1 + (x<<39)>>1 + c>>1;
| c=(x>>23)+(x>>25)+(z>>63);
|
| These C operations all have Fortran equivalents, and will
| produce the required bit patterns, whether integers are
| considered signed or unsigned. (In C, one must make sure
| that the >> operation performs a logical right shift,
| perhaps best done via "unsigned" declarations.)
|
| The CMWC z "vector" elements [x_1,x_2,...,x_r] are kept in
| an array, Q[] in C, Q() in Fortran, with a separate current
| "carry". This is all spelled out in the following examples:
| code for 32- and 64-bit SuperKiss RNGs for C and Fortran.
|
| Note that in these sample listings, the Q array is seeded
| by CNG+XS, based on the seed values specified in the
| initial declarations. For many simulation studies, the
| 73 bits needed to seed the initial xcng, xs and carry<a
| for the 32-bit version, or 169 bits needed for the 64-bit
| But more demanding applications may require a significant
| portion of the >1.3 million seed bits that Q requires.
| See text and comments from the Nov 3 posting.
|
| I am indebted to an anonymous mecej4 for providing the basic
| form and KIND declarations of the Fortran versions.
|
| Please let me and other readers know if the results are not
| as specified when run with your compilers, or if you can
| provide equivalent versions in other programming languages.
|
| George Marsaglia
|
| --------------------------------------------------------
| Here is SUPRKISS64.c, the immense-period 64-bit RNG. I
| invite you to cut, paste, compile and run to see if you
| get the result I do. It should take around 20 seconds.
| --------------------------------------------------------
| /* SUPRKISS64.c, period 5*2^1320480*(2^64-1) */
| #include <stdio.h>
| static unsigned long long Q,carry=36243678541LL,
| xcng=12367890123456LL,xs=521288629546311LL,indx=20632;
|
| #define CNG ( xcng=6906969069LL*xcng+123 )
| #define XS ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
| #define SUPR ( indx<20632 ? Q[indx++] : refill() )
| #define KISS SUPR+CNG+XS
|
| unsigned long long refill( )
| {int i; unsigned long long z,h;
| for(i=0;i<20632;i++){ h=(carry&1);
| z=((Q<<41)>>1)+((Q<<39)>>1)+(carry>>1);
| carry=(Q>>23)+(Q>>25)+(z>>63);
| Q=~((z<<1)+h); }
| indx=1; return (Q);
| }
|
| int main()
| {int i; unsigned long long x;
| for(i=0;i<20632;i++) Q=CNG+XS;
| for(i=0;i<1000000000;i++) x=KISS;
| printf("Does x=4013566000157423768\n x=%LLd.\n",x);
| }
| ---------------------------------------------------------
|
| Here is SUPRKISS32.c, the immense-period 32-bit RNG. I
| invite you to cut, paste, compile and run to see if you
| get the result I do. It should take around 10 seconds.
| ---------------------------------------------------------
| /*suprkiss64.c
| b=2^64; x[n]=(b-1)-[(2^41+2^39)*x[n-20632]+carry mod b]
| period 5*2^1320480>10^397505
| This version of SUPRKISS doesn't use t=a*x+c in 128 bits,
| but uses only 64-bit stuff, takes 20 nanos versus 7.5 for
| the 32-bit unsigned long long t=a*x+c version.
| */
|
| /* SUPRKISS64.c, period 5*2^1320480*(2^64-1) */
| #include <stdio.h>
| static unsigned long long Q,carry=36243678541LL,
| xcng=12367890123456LL,xs=521288629546311LL,indx=20632;
|
| #define CNG ( xcng=6906969069LL*xcng+123 )
| #define XS ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
| #define SUPR ( indx<20632 ? Q[indx++] : refill() )
| #define KISS SUPR+CNG+XS
|
| unsigned long long refill( )
| {int i; unsigned long long z,h;
| for(i=0;i<20632;i++){ h=(carry&1);
| z=((Q<<41)>>1)+((Q<<39)>>1)+(carry>>1);
| carry=(Q>>23)+(Q>>25)+(z>>63);
| Q=~((z<<1)+h); }
| indx=1; return (Q);
| }
|
| int main()
| {int i; unsigned long long x;
| for(i=0;i<20632;i++) Q=CNG+XS;
| for(i=0;i<1000000000;i++) x=KISS;
| printf("Does x=4013566000157423768\n x=%LLd.\n",x);
| }
|
| -----------------------------------------------------------
|
| And here are equivalent Fortran versions, which, absent
| C's inline features, seem to need ~10% more run time.
|
| -----------------------------------------------------------
| module suprkiss64_M ! period 5*2^1320480*(2^64-1)
| integer,parameter :: I8=selected_int_kind(18)
| integer(kind=I8) :: Q(20632),carry=36243678541_I8, &
| xcng=12367890123456_I8,xs=521288629546311_I8,indx=20633_I8
| contains
| function KISS64() result(x)
| integer(kind=I8) :: x
| if(indx <= 20632)then; x=Q(indx); indx=indx+1
| else; x=refill(); endif
| xcng=xcng*6906969069_I8+123
| xs=ieor(xs,ishft(xs,13))
| xs=ieor(xs,ishft(xs,-17))
| xs=ieor(xs,ishft(xs,43))
| x=x+xcng+xs
| return; end function KISS64
|
| function refill() result(s)
| integer(kind=I8) :: i,s,z,h
| do i=1,20632
| h=iand(carry,1_I8)
| z = ishft(ishft(Q(i),41),-1)+ &
| ishft(ishft(Q(i),39),-1)+ &
| ishft(carry,-1)
| carry=ishft(Q(i),-23)+ishft(Q(i),-25)+ishft(z,-63)
| Q(i)=not(ishft(z,1)+h)
| end do
| indx=2; s=Q(1)
| return; end function refill
|
| end module suprkiss64_M
|
| program testKISS64
| use suprkiss64_M
| integer(kind=I8) :: i,x
| do i=1,20632 !fill Q with Congruential+Xorshift
| xcng=xcng*6906969069_I8+123
| xs=ieor(xs,ishft(xs,13))
| xs=ieor(xs,ishft(xs,-17))
| xs=ieor(xs,ishft(xs,43))
| Q(i)=xcng+xs
| end do
| do i=1,1000000000_I8; x=KISS64(); end do
| write(*,10) x
| 10 format(' Does x = 4013566000157423768 ?',/,6x,'x = ',I20)
| end program testKISS64
| -------------------------------------------------------------
|
| module suprkiss32_M ! period 5*2^1320481*(2^32-1)
| integer,parameter :: I4=selected_int_kind(9)
| integer(kind=I4) :: Q(41265),carry=362_I4, &
| xcng=1236789_I4,xs=521288629_I4,indx=41266_I4
| contains
| function KISS32() result(x)
| integer(kind=I4):: x
| if(indx <= 41265)then;x=Q(indx); indx=indx+1
| else; x=refill(); endif
| xcng=xcng*69069_I4+123
| xs=ieor(xs,ishft(xs,13))
| xs=ieor(xs,ishft(xs,-17));
| xs=ieor(xs,ishft(xs,5))
| x=x+xcng+xs
| return; end function KISS32
|
| function refill() result(s)
| integer(kind=I4) :: i,s,z,h
| do i = 1,41265
| h = iand(carry,1_I4)
| z = ishft(ishft(Q(i),9),-1)+ &
| ishft(ishft(Q(i),7),-1)+ &
| ishft(carry,-1)
| carry=ishft(Q(i),-23)+ishft(Q(i),-25)+ishft(z,-31)
| Q(i)=not(ishft(z,1)+h)
| end do
| indx=2; s=Q(1)
| return; end function refill
|
| end module suprkiss32_M
|
| program testKISS32
| use suprkiss32_M
| integer(kind=I4) :: i,x
| do i=1,41265 !fill Q with Congruential+Xorshift
| xcng=xcng*69069_I4+123
| xs=ieor(xs,ishft(xs,13))
| xs=ieor(xs,ishft(xs,-17))
| xs=ieor(xs,ishft(xs,5))
| Q(i)=xcng+xs
| end do
| do i=1,1000000000_I4; x=KISS32(); end do
| write(*,10) x
| 10 format(' Does x = 1809478889 ?',/,6x,'x =',I11)
| end program testKISS32
|
| ---------------------------------------------------------------

Here is the 32-bit version which I translated:

testKISS32: procedure options (main); /* period 5*2^1320481*(2^32-1) */
declare ( Q(41265), carry initial (362),
xcng initial (1236789), xs initial (521288629),
indx initial (41266)) fixed binary (31);

KISS32: procedure() returns(fixed binary (31)) options (reorder);
declare x fixed binary (31);

if indx <= 41265 then do; x=Q(indx); indx=indx+1; end;
else x=refill();
xcng=xcng*69069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,5));
return (x+xcng+xs);
end KISS32;

refill: procedure() returns(fixed binary(31)) options (reorder);
declare (i,s,z,h) fixed binary (31);

do i = 1 to 41265;
h = iand(carry,1);
z = isrl(isll(Q(i),9),1)+
isrl(isll(Q(i),7),1)+
isrl(carry,1);
carry=isrl(Q(i),23)+isrl(Q(i),25)+isrl(z,31);
Q(i)=inot(isll(z,1)+h);
end;
indx=2;
return (Q(1));
end refill;

declare (i,x) fixed binary (31);
do i=1 to 41265; /* fill Q with Congruential+Xorshift */
xcng=xcng*69069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,5));
Q(i)=xcng+xs;
end;
do i=1 to 1000000000; x=KISS32(); end;
put skip edit ( 'Does x = 1809478889 ?', 'x =', x)
(a, skip, x(5), a, f(11));
end testKISS32;

/*--------------------------------------------------------------- */

R

#### robin

| On Nov 3 I posted
|
| RNGs: A Super KISS sci.math, comp.lang.c, sci.crypt
|
| a KISS (Keep-It-Simple-Stupid) RNG combining,
| by addition mod 2^32, three simple RNGs:
| CMWC(Complementary-Multiply-With-Carry)+CNG(Congruential)
| +XS(Xorshift)
| with resulting period greater than 10^402575.

Fantastic!

R

#### robin

| On Nov 3 I posted
|
| RNGs: A Super KISS sci.math, comp.lang.c, sci.crypt
|
| a KISS (Keep-It-Simple-Stupid) RNG combining,
| by addition mod 2^32, three simple RNGs:
| CMWC(Complementary-Multiply-With-Carry)+CNG(Congruential)
| +XS(Xorshift)
| with resulting period greater than 10^402575.

Here is the 64-bit PL/I version which I translated:

testKISS64: procedure options (main); /* period 5*2^1320480*(2^64-1) */
declare (i,x) fixed binary (63);
declare (Q(20632), carry initial (36243678541),
xcng initial (12367890123456), xs initial (521288629546311),
indx initial (20633)) fixed binary (63);

KISS64: procedure() returns (fixed binary (63)) options (reorder);
declare x fixed binary (63);

if indx <= 20632 then do; x=Q(indx); indx=indx+1; end;
else x=refill();
xcng=xcng*6906969069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,43));
return (x+xcng+xs);

end KISS64;

refill: procedure () returns (fixed binary (63)) options (reorder);
declare (i,s,z,h) fixed binary (63);
do i=1 to 20632;
h=iand(carry,1);
z = isrl(isll(Q(i),41),1)+
isrl(isll(Q(i),39),1)+
isrl(carry,1);
carry=isrl(Q(i),23)+isrl(Q(i),25)+isrl(z,63);
Q(i)=inot(isll(z,1)+h);
end;
indx=2;
return(Q(1));
end refill;

do i=1 to 20632; /* fill Q with Congruential+Xorshift */
xcng=xcng*6906969069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,43));
Q(i)=xcng+xs;
end;
do i=1 to 1000000000; x=KISS64(); end;
put skip edit ( 'Does x = 4013566000157423768 ?', 'x = ', x)
(a, skip, x(5), a, f(20));

end testKISS64;
/*-------------------------------------------------------------*/

A

#### amzoti

On Nov 3 I posted

RNGs: A Super KISS   sci.math, comp.lang.c, sci.crypt

a KISS (Keep-It-Simple-Stupid) RNG combining,
by addition mod 2^32, three simple RNGs:
CMWC(Complementary-Multiply-With-Carry)+CNG(Congruential)
+XS(Xorshift)
with resulting period greater than 10^402575.

The extreme period comes from finding a prime p=a*b^r+1 for
which the order of b has magnitude near p-1, then use
the CMWC method, the mathematics of which I outline here:

Let Z be the set of all "vectors" of the form
[x_1,...,x_r;c] with 0<=x_i<b and 0<=c<a.
Then Z has ab^r elements, and if the function f() on Z is
f([x_1,x2,...,x_r;c])=
[x_2,...,x_r,b-1-(t mod b);trunc((t/b)],  t=a*x_1+c
then f() has an inverse on Z: for each z in Z there is exactly
one w in Z for which f(w)=z:
f^{-1}([x_1,x2,...,x_r;c])=
[trunc((v/a),x_1,...,x_{r-1}; v mod a],  v=cb+(b-1)-x_r

Thus a directed graph based on z->f(z) will consist only
of disjoint loops of size s=order(b,p) and there will be
L=ab^r/s such loops.

A random uniform choice of z from Z is equally likely to
fall in any one of the L loops, and then each "vector" in
the sequence obtained by iterating f:
f(z), f(f(z)), f(f(f(z))),...
will have a uniform distribution over that loop, and the sequence
determined by taking the r'th element from each "vector",
(the output of the CMWC RNG) will be periodic with period
the order of b for the prime p=ab^r+1, and that sequence
will produce, in reverse order, the base-b expansion of a
rational k/p for some k determined by choice of the seed z.

For that Nov 3 post, I had found that the order of b=2^32
for the prime p=7010176*b^41790+1 is 54767*2^1337279, about
10^402566, thus providing an easily-implemented
KISS=CMWC+CNG+XS  RNG with immense period and
excellent performance on tests of randomness.

That easy implementation required carrying out the essential
parts of the CMWC function f(): form t=7010176*x+c in 64 bits,
extract the top 32 bits for the new c, the bottom 32 for
the new x---easy to do in C, not easy in Fortran.
And if we want 64-bit random numbers, with B=2^64, our prime
becomes 7010176*B^20985+1, for which the period of B is
54767*2^1337278, still immense, but in C, with 64-bit x,c
there seems no easy way to form t=7010176*x+c in 128 bits,
then extract the top and bottom 64 bit halves.

So base b=2^32 works for C but not Fortran,
and base B=2^64 works for neither C nor Fortran.

I offer here is a prime that provides CMWC RNGs for both
32- and 64-bits, and for both C and Fortran, and with
equally massive periods, again greater than 2^(1.3million):

p=640*b^41265+1 = 2748779069440*B^20632+1 = 5*2^1320487+1.

That prime came from the many who have dedicated their
efforts and computer time to prime searches. After some
three weeks of dedicated computer time using pfgw with
scrypt, I found the orders of b and B:
5*2^1320481 for b=2^32, 5*2^1320480 for B=2^64.

It is the choice of the "a" that makes it feasible to get
the top and bottom valves of t=a*x+c, yet stay within the
integer sizes the C or Fortran compilers are set for.
In the above prime: a=640=2^9+2^7 for b=2^32 and
a=2748779069440=2^41+2^39 for B=2^64.
Thus, for example with b=2^32 and using only 32-bit C code,
with a supposed 128-bit t=(2^9+2^7)*x+c, the top and bottom
32-bits of t may be obtained by setting, say,
h=(c&1); z=(x<<9)>>1 + (x<<7)>>1 + c>>1;
then the top half of that t would be
c=(x>>23)+(x>>25)+(z>>31);
and the bottom half, before being complemented, would be
x=(z<<1)+h;

When B=2^64 we need only change to
h=(c&1); z=(x<<41)>>1 + (x<<39)>>1 + c>>1;
c=(x>>23)+(x>>25)+(z>>63);

These C operations all have Fortran equivalents, and will
produce the required bit patterns, whether integers are
considered signed or unsigned. (In C, one must make sure
that the >> operation performs a logical right shift,
perhaps best done via "unsigned" declarations.)

The CMWC z "vector" elements [x_1,x_2,...,x_r] are kept in
an array, Q[] in C, Q() in Fortran, with a separate current
"carry". This is all spelled out in the following examples:
code for 32- and 64-bit SuperKiss RNGs for C and Fortran.

Note that in these sample listings, the Q array is seeded
by CNG+XS, based on the seed values specified in the
initial declarations.  For many simulation studies, the
73 bits needed to seed the initial xcng, xs and carry<a
for the 32-bit version, or  169 bits needed for the 64-bit
But more demanding applications may require a significant
portion of the >1.3 million seed bits that Q requires.
See text and comments from the Nov 3 posting.

I am indebted to an anonymous mecej4 for providing the basic
form and KIND declarations of the Fortran versions.

Please let me and other readers know if the results are not
as specified when run with your compilers, or if you can
provide equivalent versions in other programming languages.

George Marsaglia

--------------------------------------------------------
Here is SUPRKISS64.c, the immense-period 64-bit RNG. I
invite you to cut, paste, compile and run to see if you
get the result I do.  It should take around 20 seconds.
--------------------------------------------------------
/*   SUPRKISS64.c, period 5*2^1320480*(2^64-1)   */
#include <stdio.h>
static unsigned long long Q,carry=36243678541LL,
xcng=12367890123456LL,xs=521288629546311LL,indx=20632;

#define CNG ( xcng=6906969069LL*xcng+123 )
#define XS  ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
#define SUPR ( indx<20632 ? Q[indx++] : refill() )
#define KISS SUPR+CNG+XS

unsigned long long refill( )
{int i; unsigned long long z,h;
for(i=0;i<20632;i++){  h=(carry&1);
z=((Q<<41)>>1)+((Q<<39)>>1)+(carry>>1);
carry=(Q>>23)+(Q>>25)+(z>>63);
Q=~((z<<1)+h);   }
indx=1; return (Q);
}

int main()
{int i; unsigned long long x;
for(i=0;i<20632;i++) Q=CNG+XS;
for(i=0;i<1000000000;i++) x=KISS;
printf("Does x=4013566000157423768\n     x=%LLd.\n",x);}

---------------------------------------------------------

Here is SUPRKISS32.c, the immense-period 32-bit RNG.  I
invite you to cut, paste, compile and run to see if you
get the result I do.  It should take around 10 seconds.
---------------------------------------------------------
/*suprkiss64.c
b=2^64; x[n]=(b-1)-[(2^41+2^39)*x[n-20632]+carry mod b]
period 5*2^1320480>10^397505
This version of SUPRKISS doesn't use t=a*x+c in 128 bits,
but uses only 64-bit stuff, takes 20 nanos versus 7.5 for
the 32-bit unsigned long long t=a*x+c version.
*/

/*   SUPRKISS64.c, period 5*2^1320480*(2^64-1)   */
#include <stdio.h>
static unsigned long long Q,carry=36243678541LL,
xcng=12367890123456LL,xs=521288629546311LL,indx=20632;

#define CNG ( xcng=6906969069LL*xcng+123 )
#define XS  ( xs^=xs<<13,xs^=xs>>17,xs^=xs<<43 )
#define SUPR ( indx<20632 ? Q[indx++] : refill() )
#define KISS SUPR+CNG+XS

unsigned long long refill( )
{int i; unsigned long long z,h;
for(i=0;i<20632;i++){  h=(carry&1);
z=((Q<<41)>>1)+((Q<<39)>>1)+(carry>>1);
carry=(Q>>23)+(Q>>25)+(z>>63);
Q=~((z<<1)+h);   }
indx=1; return (Q);
}

int main()
{int i; unsigned long long x;
for(i=0;i<20632;i++) Q=CNG+XS;
for(i=0;i<1000000000;i++) x=KISS;
printf("Does x=4013566000157423768\n     x=%LLd.\n",x);

}

-----------------------------------------------------------

And here are equivalent Fortran versions, which, absent
C's inline features, seem to need ~10% more run time.

-----------------------------------------------------------
module suprkiss64_M   ! period 5*2^1320480*(2^64-1)
integer,parameter :: I8=selected_int_kind(18)
integer(kind=I8) :: Q(20632),carry=36243678541_I8, &
xcng=12367890123456_I8,xs=521288629546311_I8,indx=20633_I8
contains
function KISS64() result(x)
integer(kind=I8) :: x
if(indx <= 20632)then; x=Q(indx); indx=indx+1
else; x=refill(); endif
xcng=xcng*6906969069_I8+123
xs=ieor(xs,ishft(xs,13))
xs=ieor(xs,ishft(xs,-17))
xs=ieor(xs,ishft(xs,43))
x=x+xcng+xs
return; end function KISS64

function refill() result(s)
integer(kind=I8) :: i,s,z,h
do i=1,20632
h=iand(carry,1_I8)
z = ishft(ishft(Q(i),41),-1)+ &
ishft(ishft(Q(i),39),-1)+ &
ishft(carry,-1)
carry=ishft(Q(i),-23)+ishft(Q(i),-25)+ishft(z,-63)
Q(i)=not(ishft(z,1)+h)
end do
indx=2; s=Q(1)
return; end function refill

end module suprkiss64_M

program testKISS64
use suprkiss64_M
integer(kind=I8) :: i,x
do i=1,20632      !fill Q with Congruential+Xorshift
xcng=xcng*6906969069_I8+123
xs=ieor(xs,ishft(xs,13))
xs=ieor(xs,ishft(xs,-17))
xs=ieor(xs,ishft(xs,43))
Q(i)=xcng+xs
end do
do i=1,1000000000_I8; x=KISS64(); end do
write(*,10) x
10 format(' Does x =  4013566000157423768 ?',/,6x,'x = ',I20)
end program testKISS64
-------------------------------------------------------------

module suprkiss32_M  ! period 5*2^1320481*(2^32-1)
integer,parameter :: I4=selected_int_kind(9)
integer(kind=I4) :: Q(41265),carry=362_I4, &
xcng=1236789_I4,xs=521288629_I4,indx=41266_I4
contains
function KISS32() result(x)
integer(kind=I4):: x
if(indx <= 41265)then;x=Q(indx); indx=indx+1
else; x=refill(); endif
xcng=xcng*69069_I4+123
xs=ieor(xs,ishft(xs,13))
xs=ieor(xs,ishft(xs,-17));
xs=ieor(xs,ishft(xs,5))
x=x+xcng+xs
return; end function KISS32

function refill() result(s)
integer(kind=I4) :: i,s,z,h
do i = 1,41265
h = iand(carry,1_I4)
z = ishft(ishft(Q(i),9),-1)+ &
ishft(ishft(Q(i),7),-1)+ &
ishft(carry,-1)
carry=ishft(Q(i),-23)+ishft(Q(i),-25)+ishft(z,-31)
Q(i)=not(ishft(z,1)+h)
end do
indx=2;   s=Q(1)
return; end function refill

end module suprkiss32_M

program testKISS32
use suprkiss32_M
integer(kind=I4) :: i,x
do i=1,41265   !fill Q with Congruential+Xorshift
xcng=xcng*69069_I4+123
xs=ieor(xs,ishft(xs,13))
...

I am a big fan of your work.

One question, has the crypto usage of this been statistically verified
with http://www.iro.umontreal.ca/~simardr/testu01/tu01.html?

I supposed you already tested it against KISS.

Thanks ~A

P

#### Phil Carmody

Richard Outerbridge said:
Here are two somewhat "whiter" and more C99 compliant versions of
SuperKISS32 and SuperKISS64.

On my Macintosh PPC PowerBook5,8 running Leopard v10.5.8 (9L30) PPC,
and using for a compiler powerpc-apple-darwin9-gcc-4.0.1, I get UNIX
time speeds of ~11.0 and ~34.9 seconds, respectively. I've assumed
the standard C type restrictions: an int is only at least 16-bits,
and long is only at least 32-bits; and so on.

Any reason you're avoiding <stdint.h>?

Phil

R

#### robin

This 32-bit version uses the unsigned arithmetic facilities of PL/I:

/*-------------------------------------------------------------*/

testKISS32: procedure options (main);

/* George Marsaglia's Pseudo-Random Number Generator having a period of */
/* 5*2^1320481*(2^32-1). */
/* 32-bit PRNGs are generated directly. */
/* The procedure FILL must be called first before invoking KISS32. */

KISS32: procedure() returns(fixed binary (32) unsigned) options (reorder);
declare x fixed binary (32) unsigned;
declare Q(41265) fixed binary (32) unsigned external,
indx fixed binary (31) static initial (41266);
declare (carry initial (362),
xcng initial (1236789),
xs initial (521288629)) fixed binary (32) unsigned external;

if indx <= 41265 then do; x=Q(indx); indx=indx+1; end;
else do; x=refill(); indx = 2; end;
xcng=xcng*69069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,5));
return (x+xcng+xs);
end KISS32;

refill: procedure() returns(fixed binary(32) unsigned) options (reorder);
declare (i,s,z,h) fixed binary (32) unsigned;
declare Q(41265) fixed binary (32) unsigned external;
declare (carry initial (362),
xcng initial (1236789),
xs initial (521288629)) fixed binary (32) unsigned external;

do i = 1 to 41265;
h = iand(carry,1);
z = isrl(isll(Q(i),9),1) + isrl(isll(Q(i),7),1) + isrl(carry,1);
carry=isrl(Q(i),23)+isrl(Q(i),25)+isrl(z,31);
Q(i)=inot(isll(z,1)+h);
end;
return (Q(1));
end refill;

fill: procedure options (reorder);
declare Q(41265) fixed binary (32) unsigned external;
declare (carry initial (362),
xcng initial (1236789),
xs initial (521288629)) fixed binary (32) unsigned external;
declare i fixed binary (31);

do i=1 to 41265; /* fill Q with Congruential+Xorshift */
xcng=xcng*69069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,5));
Q(i)=xcng+xs;
end;
end fill;

declare i fixed binary (31), x fixed binary (32) unsigned;

call fill;
do i=1 to 1000000000; x=KISS32(); end;
put skip edit ( 'Does x = 1809478889 ?', 'x =', x)
(a, skip, x(5), a, f(11));
end testKISS32;

/*--------------------------------------------------------------- */

R

#### robin

This is the 64-bit PL/I version, again using unsigned arithmetic:

testKISS64: procedure options (main);
declare (i,x) fixed binary (64) unsigned;

/* George Marsaglia's Pseudo-Random Number Generator having a period of */
/* 5*2^1320480*(2^64-1). */
/* 64-bit PRNGs are generated directly. */
/* The procedure FILL must be called first before invoking KISS64. */

KISS64: procedure() returns (fixed binary (64) unsigned) options (reorder);
declare x fixed binary (64) unsigned,
Q(20632) fixed binary (64) unsigned external,
indx fixed binary static initial (20633);
declare (carry initial (36243678541),
xcng initial (12367890123456),
xs initial (521288629546311)) fixed binary (64) unsigned external;
if indx <= 20632 then do; x=Q(indx); indx=indx+1; end;
else do; x=refill(); indx = 2; end;
(nofixedoverflow):
xcng=xcng*6906969069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,43));
return (x+xcng+xs);
end KISS64;

refill: procedure () returns (fixed binary (64) unsigned) options (reorder);
declare Q(20632) fixed binary (64) unsigned external;
declare (carry initial (36243678541),
xcng initial (12367890123456),
xs initial (521288629546311)) fixed binary (64) unsigned external;
declare i fixed binary, (z,h) fixed binary (64) unsigned;
do i=1 to 20632;
h=iand(carry,1);
z = isrl(isll(Q(i),41),1) + isrl(isll(Q(i),39),1) + isrl(carry,1);
carry=isrl(Q(i),23)+isrl(Q(i),25)+isrl(z,63);
Q(i)=inot(isll(z,1)+h);
end;
return (Q(1));
end refill;

fill: procedure options (reorder);
declare Q(20632) fixed binary (64) unsigned external;
declare (carry initial (36243678541),
xcng initial (12367890123456),
xs initial (521288629546311)) fixed binary (64) unsigned external;
declare i fixed binary;

do i=1 to 20632; /* fill Q with Congruential+Xorshift */
xcng=xcng*6906969069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,43));
Q(i)=xcng+xs;
end;
end fill;

call fill;
do i=1 to 1000000000; x=KISS64(); end;
put skip edit ( 'Does x = 4013566000157423768 ?', 'x = ', x)
(a, skip, x(5), a, f(20));

end testKISS64;
/*-------------------------------------------------------------*/

R

#### robin

Here is the 64-bit version as a package:

/*-----------------------------------------------------------*/
RNG64: package exports (KISS64, FILL);

declare Q(20632) fixed binary (64) unsigned static;
declare (carry initial (36243678541),
xcng initial (12367890123456),
xs initial (521288629546311)) fixed binary (64) unsigned static;

/* George Marsaglia's Pseudo-Random Number Generator having a period of */
/* 5*2^1320480*(2^64-1). */
/* 64-bit PRNGs are generated directly. */
/* The procedure FILL must be called first before invoking KISS64. */
/* Translated from Fortran to PL/I by R. Vowels. 29 November 2009. */
KISS64: procedure() returns (fixed binary (64) unsigned) options (reorder);
declare x fixed binary (64) unsigned,
indx fixed binary static initial (20633);

if indx <= 20632 then do; x=Q(indx); indx=indx+1; end;
else do; x=refill(); indx = 2; end;
(nofixedoverflow):
xcng=xcng*6906969069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,43));
return (x+xcng+xs);
end KISS64;

refill: procedure () returns (fixed binary (64) unsigned) options (reorder);
declare i fixed binary, (z,h) fixed binary (64) unsigned;

do i=1 to 20632;
h=iand(carry,1);
z = isrl(isll(Q(i),41),1) + isrl(isll(Q(i),39),1) + isrl(carry,1);
carry=isrl(Q(i),23)+isrl(Q(i),25)+isrl(z,63);
Q(i)=inot(isll(z,1)+h);
end;
return (Q(1));
end refill;

fill: procedure options (reorder);
declare i fixed binary;

do i=1 to 20632; /* fill Q with Congruential+Xorshift */
xcng=xcng*6906969069+123;
xs=ieor(xs,isll(xs,13));
xs=ieor(xs,isrl(xs,17));
xs=ieor(xs,isll(xs,43));
Q(i)=xcng+xs;
end;
end fill;

end RNG64;
/*-------------------------------------------------------------*/

G

#### g.resta

#define KISS SUPR+CNG+XS

The line above should be written as

#define KISS (SUPR+CNG+XS)

otherwise it is very dangerous (that is, gives the wrong result)
if one uses KISS in operations, like:

z = KISS % Something; // or
z = KISS / SomethingElse;

or things like that.

giovanni

P

#### Phil Carmody

Richard Outerbridge said:
Scope creep? That's probably the better way to go.

I view the probability of that to be less than a half. Your opinion
may differ. There's nothing big or clever about insisting on being a
dinosaur a decade after the standard incorporated the feature that

Phil

P

#### Phil Carmody

Richard Outerbridge said:
That's really what I was trying to say: using <stdint.h> is probably
the better way to go.

Ah, in retrospect you did. In the absense of a verb, I extrapolated
an 'avoiding' as a prefix to your noun phrase and by so doing changed
its parity.
I'll admit to being a small, ignorant dinosaur.

There may be some who delight in dinosaurism, but many are just
too shy to break their prior mould, lest they appear clumsy in
their first few uses.
I will, however, suggest that avoiding scope creep is almost always a
good idea.

I think any feature which can be implemented as a prior-standard
compliant header file with sufficient insider knowledge of the
implmentation barely counts as that. (So if a C90 implementation
didn't support it, you could 'upgrade' it by a simple #include.)
stdint.h is one of those feature sets, IMHO.

Phil