An interesting new RNG.

G

geo

I have found an interesting new
Random Number Generator (RNG) based on the prime

p = 2^137210-2^133114+1.

One of its most interesting features is that establishing
the period requires finding all of the prime factors of p-1,
and each of these famous giants of prime number theory:
Euler, Cunningham, Morrison, Brillhart,
Brent, Pollard, Lenstra, Selfridge
has discovered one or more of the prime factors of p-1:
p-1 = 2^133114*F_2*F_3*F_4*F_5*F_6*F_7*F_8*F_9*F_10*F_11,
with F_n=2^(2^n)+1 the nth Fermat number. See, for example,
http://www.prothsearch.net/fermat.html

If j is a random integer in 0<j<p then the binary expansion
of j/p will have period (p-1)/2 and its bits can well serve
as a source of random bits.

The bits in such an expansion may be generated---in reverse
order, either 32-bits at a time, or 64-bits at a time---by a
CSWB (Complimentary-Subtract-With-Borrow) procedure:

Given x_0,x_1,...,x_{4287},(32-bit seeds) and previously generated
x_{4288},x_{4289},...,x_{n-1} for n>4287, and a current `boro` of
0 or 1, these simple operations may be used to get
the next boro and the next random number x_n:
t=x_{n-4288}; h=x_{n-4160}+boro;
if t<h then boro=1 else boro=0;
return x_n=(h-t-1 mod b=2^32);
and keep boro for the next x_n.

The 32-bit version is based on the representation
p=b^4288-b^4160+1 with b=2^32, but setting B=2^64 and
p=B^2144-B^2080+1 leads to 64-bit integers by means of
the same instructions, except that, with 2144 64-bit seeds,
t=x_{n-2144}; h=x_{n-2080}+boro;

Note that x_n=h-t-1 will be automatically mod b=2^32
for most 32-bit CPUs, or mod B=2^64 for 64-bit CPUs,
whether for signed or unsigned integers.

Determining the new boro is particularly simple
for signed integers. In C: boro=(t<h);
For languages that permit only signed integers, it
is not as simple. Let s(x) be the sign bit of x,
easily obtained, for example, by a right shift:
if s(t)=s(h) then boro=s(t-h); else boro=s(h);

Interested readers are invited to provide C,C++,Fortran or
other implementations. All you need is an array, say X[4288]
for 32-bit or X[2144] for 64-bit CPUs, and return the next
unused element of the array, except that if none are left,
refill the array with the rules above, reset the array index
and return the first element. Suitable for signed or
unsigned, 32- or 64-bit integers, but signed may require the
more complicated method to determine the 'boro'.

The arrays X[4288] or X[2144] must be initially seeded with
32-bit or 64-bit integers and an initial boro seed of 0 or 1,
Having so many seed bits is desirable in applications such
as in Law or Gaming, where a minimal, often extremely large,
number of possible results are required, but a nuisance in most
applications where only a few dozen seed bits may be enough.
Most implementations would ask for a few seeds from which the
X[4288] or X[2144] arrays are filled by combining simple RNGs.

The CSWB sequence x_n=x_{n-4288}-x_{n-4160+boro mod 2^32
might do well on the Diehard tests, but I tend to favor
a KISS (Keep It Simple, Stupid) approach, combining it, via
addition mod b=2^32 or B=2^64, Congruential and Xorshift
sequences. Unlike most combinations, such a one would not
increase the period, as 2^32, 2^32-1, 2^64 and 2^64-1 all
divide p-1. But a period exceeding 10^3104 should serve.

George Marsaglia
 
T

Tim Rentsch

geo said:
I have found an interesting new
Random Number Generator (RNG) based on the prime

p = 2^137210-2^133114+1.

One of its most interesting features is that establishing
the period requires finding all of the prime factors of p-1,
and each of these famous giants of prime number theory:
Euler, Cunningham, Morrison, Brillhart,
Brent, Pollard, Lenstra, Selfridge
has discovered one or more of the prime factors of p-1:
p-1 = 2^133114*F_2*F_3*F_4*F_5*F_6*F_7*F_8*F_9*F_10*F_11,
with F_n=2^(2^n)+1 the nth Fermat number. See, for example,
http://www.prothsearch.net/fermat.html

If j is a random integer in 0<j<p then the binary expansion
of j/p will have period (p-1)/2 and its bits can well serve
as a source of random bits.

The bits in such an expansion may be generated---in reverse
order, either 32-bits at a time, or 64-bits at a time---by a
CSWB (Complimentary-Subtract-With-Borrow) procedure:

Given x_0,x_1,...,x_{4287},(32-bit seeds) and previously generated
x_{4288},x_{4289},...,x_{n-1} for n>4287, and a current `boro` of
0 or 1, these simple operations may be used to get
the next boro and the next random number x_n:
t=x_{n-4288}; h=x_{n-4160}+boro;
if t<h then boro=1 else boro=0;
return x_n=(h-t-1 mod b=2^32);
and keep boro for the next x_n.

The 32-bit version is based on the representation
p=b^4288-b^4160+1 with b=2^32, but setting B=2^64 and
p=B^2144-B^2080+1 leads to 64-bit integers by means of
the same instructions, except that, with 2144 64-bit seeds,
t=x_{n-2144}; h=x_{n-2080}+boro;

Note that x_n=h-t-1 will be automatically mod b=2^32
for most 32-bit CPUs, or mod B=2^64 for 64-bit CPUs,
whether for signed or unsigned integers.

Determining the new boro is particularly simple
for signed integers.
^^^^^^

I assume you actually meant unsigned integers here,
not signed integers.
In C: boro=(t<h);
For languages that permit only signed integers, it
is not as simple. Let s(x) be the sign bit of x,
easily obtained, for example, by a right shift:
if s(t)=s(h) then boro=s(t-h); else boro=s(h);

Interested readers are invited to provide C,C++,Fortran or
other implementations. All you need is an array, say X[4288]
for 32-bit or X[2144] for 64-bit CPUs, and return the next
unused element of the array, except that if none are left,
refill the array with the rules above, reset the array index
and return the first element. Suitable for signed or
unsigned, 32- or 64-bit integers, but signed may require the
more complicated method to determine the 'boro'.

The arrays X[4288] or X[2144] must be initially seeded with
32-bit or 64-bit integers and an initial boro seed of 0 or 1,
Having so many seed bits is desirable in applications such
as in Law or Gaming, where a minimal, often extremely large,
number of possible results are required, but a nuisance in most
applications where only a few dozen seed bits may be enough.
Most implementations would ask for a few seeds from which the
X[4288] or X[2144] arrays are filled by combining simple RNGs.

The CSWB sequence x_n=x_{n-4288}-x_{n-4160+boro mod 2^32
might do well on the Diehard tests, but I tend to favor
a KISS (Keep It Simple, Stupid) approach, combining it, via
addition mod b=2^32 or B=2^64, Congruential and Xorshift
sequences. Unlike most combinations, such a one would not
increase the period, as 2^32, 2^32-1, 2^64 and 2^64-1 all
divide p-1. But a period exceeding 10^3104 should serve.

A nice idea for an RNG. Have you run it through the Diehard
suite? I'm curious to know how it does. I quickly coded
up a C version (assuming I understood the explanation
correctly), it produces about 200 million 32-bit RN's
per second (on a box that's roughly 2.5 billion IPS).
 
B

Ben Bacarisse

A nice idea for an RNG. Have you run it through the Diehard
suite? I'm curious to know how it does. I quickly coded
up a C version (assuming I understood the explanation
correctly),

Which p did you use? I got puzzled when I saw

p = 2^137210-2^133114+1

as the basis for the post but

p=b^4288-b^4160+1 with b=2^32

as the representation to use.
 
G

geo

Which p did you use?  I got puzzled when I saw

  p = 2^137210-2^133114+1

as the basis for the post but

  p=b^4288-b^4160+1 with b=2^32

as the representation to use.

oops; that 137210 should be 137216 so that

p = 2^137216-2^133120+1= b^4288-b^4160+1 with b=2^32,
or p = B^2144-B^2080+1 with B=2^64.

I should be more careful with number entries,
as spelling checks are of little use.

Fortunately, I had the key point correct:
p-1= 2^133114*(2^4096-1) and, thanks to the contributions of
those greats over the past 250 years, was able to verify that
2^k = 1 mod p when k = (p-1)/2,
and for each of the 26 prime factors q_i of p-1,
2^(k/q_i) mod p is not 1.

George Marsaglia
 
T

Tim Rentsch

Ben Bacarisse said:
Which p did you use? I got puzzled when I saw

p = 2^137210-2^133114+1

as the basis for the post but

p=b^4288-b^4160+1 with b=2^32

as the representation to use.

I expect that would have puzzled me too if I had paid any
attention to it at all. But I didn't. All I did was code
something up based on the one paragraph with several lines of
indented pseudo-code.
 
T

Tim Rentsch

geo said:
oops; that 137210 should be 137216 so that

p = 2^137216-2^133120+1= b^4288-b^4160+1 with b=2^32,
or p = B^2144-B^2080+1 with B=2^64.

I should be more careful with number entries,
as spelling checks are of little use.

Fortunately, I had the key point correct:
p-1= 2^133114*(2^4096-1) and, thanks to the contributions of

Do you mean 2^133120 here?
those greats over the past 250 years, was able to verify that
2^k = 1 mod p when k = (p-1)/2,
and for each of the 26 prime factors q_i of p-1,
2^(k/q_i) mod p is not 1.

The equation

p-1= 2^133114*(2^4096-1)

coupled with

p-1 = 2^133114*F_2*F_3*F_4*F_5*F_6*F_7*F_8*F_9*F_10*F_11

would imply the identity

2^4096-1 = F_2*F_3*F_4*F_5*F_6*F_7*F_8*F_9*F_10*F_11

It seems like something is wrong. In particular this last identity
looks like it's missing factors of F_0 and and F_1.
 
G

geo

geo said:
I have found an interesting new
Random Number Generator (RNG) based on the prime
     p = 2^137210-2^133114+1.
One of its most interesting features is that establishing
the period requires finding all of the prime factors of p-1,
and each of these famous giants of prime number theory:
   Euler, Cunningham, Morrison, Brillhart,
   Brent, Pollard, Lenstra, Selfridge
has discovered one or more of the prime factors of p-1:
   p-1 = 2^133114*F_2*F_3*F_4*F_5*F_6*F_7*F_8*F_9*F_10*F_11,
with F_n=2^(2^n)+1 the nth Fermat number. See, for example,
http://www.prothsearch.net/fermat.html
If j is a random integer in 0<j<p then the binary expansion
of j/p will have period (p-1)/2 and its bits can well serve
as a source of random bits.
The bits in such an expansion may be generated---in reverse
order, either 32-bits at a time, or 64-bits at a time---by a
CSWB (Complimentary-Subtract-With-Borrow) procedure:
Given x_0,x_1,...,x_{4287},(32-bit seeds) and previously generated
x_{4288},x_{4289},...,x_{n-1} for n>4287, and a current `boro` of
0 or 1, these simple operations may be used to get
the next boro and the next random number x_n:
      t=x_{n-4288};  h=x_{n-4160}+boro;
      if t<h then boro=1 else boro=0;
      return x_n=(h-t-1 mod b=2^32);
      and keep boro for the next x_n.
The 32-bit version is based on the representation
p=b^4288-b^4160+1 with b=2^32, but setting B=2^64 and
p=B^2144-B^2080+1 leads to 64-bit integers by means of
the same instructions, except that, with 2144 64-bit seeds,
      t=x_{n-2144}; h=x_{n-2080}+boro;
Note that x_n=h-t-1 will be automatically mod b=2^32
for most 32-bit CPUs, or mod B=2^64 for 64-bit CPUs,
whether for signed or unsigned integers.
Determining the new boro is particularly simple
for signed integers.

      ^^^^^^

I assume you actually meant unsigned integers here,
not signed integers.




In C: boro=(t<h);
For languages that permit only signed integers, it
is not as simple.  Let s(x) be the sign bit of x,
easily obtained, for example, by a right shift:
    if s(t)=s(h) then boro=s(t-h); else boro=s(h);
Interested readers are invited to provide C,C++,Fortran or
other implementations.   All you need is an array, say X[4288]
for 32-bit or X[2144] for 64-bit CPUs, and return the next
unused element of the array, except that if none are left,
refill the  array with the rules above, reset the array index
and return the first element.  Suitable for signed or
unsigned, 32- or 64-bit integers, but signed may require the
more complicated method to determine the 'boro'.
The arrays X[4288] or X[2144] must be initially seeded with
32-bit or 64-bit integers and an initial boro seed of 0 or 1,
Having so many seed bits is desirable in applications such
as in Law or Gaming, where a minimal, often extremely large,
number of possible results are required, but a nuisance in most
applications where only a few dozen seed bits may be enough.
Most implementations would ask for a few seeds from which the
X[4288] or X[2144] arrays are filled by combining simple RNGs.
The CSWB sequence x_n=x_{n-4288}-x_{n-4160+boro mod 2^32
might do well on the Diehard tests, but I tend to favor
a KISS (Keep It Simple, Stupid) approach, combining it, via
addition mod b=2^32 or B=2^64, Congruential and Xorshift
sequences. Unlike most combinations, such a one would not
increase the period, as 2^32, 2^32-1, 2^64 and 2^64-1 all
divide p-1. But a period exceeding 10^3104 should serve.

A nice idea for an RNG.  Have you run it through the Diehard
suite?  I'm curious to know how it does.  I quickly coded
up a C version (assuming I understood the explanation
correctly), it produces about 200 million 32-bit RN's
per second (on a box that's roughly 2.5 billion IPS).

Yes, running just CSWB() from the C code below yields no significant
extremes among the 230 p-values returned from the Diehard Battery of
Tests.
But for extra safety, I prefer a Keep-It-Simple-Stupid approach, via
macros KISS = CSWB() + CNG + XS.

Here is a complete C listing that will print out the expected
and actual results from calls for 10^9 KISSes.
======================================================================

#include <stdio.h>

/* CSWB4288 x[n]=x[n-4288]-x[n-4160]-boro mod b=2^32 */

static unsigned long boro=0,indx=4287,
xs=532456711,xcng=262436069,Q[4288];

unsigned long CSWB(void)
{ unsigned long t,h,i;
if(indx<4288) return Q[indx++];
for(i=0 ;i<4160;i++)
{t=Q; h=Q[i+ 128]+boro; boro=(t<h); Q=h-t-1; }
for(i=4160;i<4288;i++)
{t=Q; h=Q[i-4160]+boro; boro=(t<h); Q=h-t-1; }
indx=1; return Q[0];
}

/* XS: Xorshift macro, CNG: Congruential macro */
#define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) )
#define CNG ( xcng=69069*xcng+123 )
#define KISS ( CSWB()+CNG+XS )

int main() /*should produce ~140 million KISS's per second*/
{int i; unsigned long x;
for(i=0;i<4288;i++) Q=CNG+XS; /* seed the Q[] array with CNG+XS*/
for(i=0;i<1000000000;i++) x=KISS;
printf("Expected:\n x,unsigned:2180808443,x,signed:-2114158853\n");
printf("Observed:\n x,unsigned:%u,x,signed:%d\n",x,x);
}

===========================================================================

Running gcc -O3 took about 7 seconds. Interested readers might
want to cut, paste, compile and try runs of there own,
both to get time comparisons and confirm the arithmetic.

George Marsaglia
 
G

geo

      ^^^^^^
I assume you actually meant unsigned integers here,
not signed integers.
In C: boro=(t<h);
For languages that permit only signed integers, it
is not as simple.  Let s(x) be the sign bit of x,
easily obtained, for example, by a right shift:
    if s(t)=s(h) then boro=s(t-h); else boro=s(h);
Interested readers are invited to provide C,C++,Fortran or
other implementations.   All you need is an array, say X[4288]
for 32-bit or X[2144] for 64-bit CPUs, and return the next
unused element of the array, except that if none are left,
refill the  array with the rules above, reset the array index
and return the first element.  Suitable for signed or
unsigned, 32- or 64-bit integers, but signed may require the
more complicated method to determine the 'boro'.
The arrays X[4288] or X[2144] must be initially seeded with
32-bit or 64-bit integers and an initial boro seed of 0 or 1,
Having so many seed bits is desirable in applications such
as in Law or Gaming, where a minimal, often extremely large,
number of possible results are required, but a nuisance in most
applications where only a few dozen seed bits may be enough.
Most implementations would ask for a few seeds from which the
X[4288] or X[2144] arrays are filled by combining simple RNGs.
The CSWB sequence x_n=x_{n-4288}-x_{n-4160+boro mod 2^32
might do well on the Diehard tests, but I tend to favor
a KISS (Keep It Simple, Stupid) approach, combining it, via
addition mod b=2^32 or B=2^64, Congruential and Xorshift
sequences. Unlike most combinations, such a one would not
increase the period, as 2^32, 2^32-1, 2^64 and 2^64-1 all
divide p-1. But a period exceeding 10^3104 should serve.
A nice idea for an RNG.  Have you run it through the Diehard
suite?  I'm curious to know how it does.  I quickly coded
up a C version (assuming I understood the explanation
correctly), it produces about 200 million 32-bit RN's
per second (on a box that's roughly 2.5 billion IPS).

Yes, running just CSWB() from the C code below yields no significant
extremes among the 230 p-values returned from the Diehard Battery of
Tests.
But for extra safety, I prefer a Keep-It-Simple-Stupid approach, via
macros KISS = CSWB() + CNG + XS.

Here is a complete C listing that will print out the expected
and actual results from  calls for 10^9 KISSes.
======================================================================

#include <stdio.h>

/* CSWB4288 x[n]=x[n-4288]-x[n-4160]-boro mod b=2^32  */

static unsigned long boro=0,indx=4287,
     xs=532456711,xcng=262436069,Q[4288];

unsigned long CSWB(void)
{ unsigned long t,h,i;
  if(indx<4288) return Q[indx++];
   for(i=0   ;i<4160;i++)
   {t=Q; h=Q[i+ 128]+boro; boro=(t<h); Q=h-t-1; }
   for(i=4160;i<4288;i++)
   {t=Q; h=Q[i-4160]+boro; boro=(t<h); Q=h-t-1; }
  indx=1; return Q[0];

}

/* XS: Xorshift macro, CNG: Congruential macro */
#define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) )
#define CNG ( xcng=69069*xcng+123 )
#define KISS ( CSWB()+CNG+XS )

int main() /*should produce ~140 million KISS's per second*/
{int i; unsigned long x;
for(i=0;i<4288;i++) Q=CNG+XS; /* seed the Q[] array with CNG+XS*/
for(i=0;i<1000000000;i++) x=KISS;
printf("Expected:\n x,unsigned:2180808443,x,signed:-2114158853\n");
printf("Observed:\n x,unsigned:%u,x,signed:%d\n",x,x);

}

===========================================================================

Running gcc -O3 took  about 7 seconds.  Interested readers might
want to cut, paste, compile and try runs of there own,
both to get time comparisons and confirm the arithmetic.

George Marsaglia- Hide quoted text -

- Show quoted text -


"... runs of their own... Spelling checkers cannot
fix failure to read after typing.
 
B

Ben Bacarisse

geo said:
Here is a complete C listing that will print out the expected
and actual results from calls for 10^9 KISSes.
======================================================================

#include <stdio.h>

/* CSWB4288 x[n]=x[n-4288]-x[n-4160]-boro mod b=2^32 */

static unsigned long boro=0,indx=4287,
xs=532456711,xcng=262436069,Q[4288];

unsigned long CSWB(void)
{ unsigned long t,h,i;
if(indx<4288) return Q[indx++];
for(i=0 ;i<4160;i++)
{t=Q; h=Q[i+ 128]+boro; boro=(t<h); Q=h-t-1; }
for(i=4160;i<4288;i++)
{t=Q; h=Q[i-4160]+boro; boro=(t<h); Q=h-t-1; }
indx=1; return Q[0];
}

/* XS: Xorshift macro, CNG: Congruential macro */
#define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) )
#define CNG ( xcng=69069*xcng+123 )
#define KISS ( CSWB()+CNG+XS )

int main() /*should produce ~140 million KISS's per second*/
{int i; unsigned long x;
for(i=0;i<4288;i++) Q=CNG+XS; /* seed the Q[] array with CNG+XS*/
for(i=0;i<1000000000;i++) x=KISS;
printf("Expected:\n x,unsigned:2180808443,x,signed:-2114158853\n");
printf("Observed:\n x,unsigned:%u,x,signed:%d\n",x,x);


You need %lx and %ld here.
}

===========================================================================

Running gcc -O3 took about 7 seconds. Interested readers might
want to cut, paste, compile and try runs of there own,
both to get time comparisons and confirm the arithmetic.

Given the change above, I get the expected answer.

Time is 5.5s. Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz. gcc -O3.
gcc version 4.4.3.
 
D

David Resnick

<snip>


Here is a complete C listing that will print out the expected
and actual results from  calls for 10^9 KISSes.
======================================================================
#include <stdio.h>
/* CSWB4288 x[n]=x[n-4288]-x[n-4160]-boro mod b=2^32  */
static unsigned long boro=0,indx=4287,
     xs=532456711,xcng=262436069,Q[4288];
unsigned long CSWB(void)
{ unsigned long t,h,i;
  if(indx<4288) return Q[indx++];
   for(i=0   ;i<4160;i++)
   {t=Q; h=Q[i+ 128]+boro; boro=(t<h); Q=h-t-1; }
   for(i=4160;i<4288;i++)
   {t=Q; h=Q[i-4160]+boro; boro=(t<h); Q=h-t-1; }
  indx=1; return Q[0];
}

/* XS: Xorshift macro, CNG: Congruential macro */
#define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) )
#define CNG ( xcng=69069*xcng+123 )
#define KISS ( CSWB()+CNG+XS )
int main() /*should produce ~140 million KISS's per second*/
{int i; unsigned long x;
for(i=0;i<4288;i++) Q=CNG+XS; /* seed the Q[] array with CNG+XS*/
for(i=0;i<1000000000;i++) x=KISS;
printf("Expected:\n x,unsigned:2180808443,x,signed:-2114158853\n");
printf("Observed:\n x,unsigned:%u,x,signed:%d\n",x,x);


You need %lx and %ld here.
}

Running gcc -O3 took  about 7 seconds.  Interested readers might
want to cut, paste, compile and try runs of there own,
both to get time comparisons and confirm the arithmetic.

Given the change above, I get the expected answer.

Time is 5.5s.  Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz.  gcc -O3.
gcc version 4.4.3.


If you'd like more data:
Time is 12.486 s. Intel(R) Xeon(TM) CPU 2.40GHz (4 cores). gcc -O3.
gcc version 4.1.2.

-David
 
G

Gene

Here is a complete C listing that will print out the expected
and actual results from  calls for 10^9 KISSes.
======================================================================
#include <stdio.h>
/* CSWB4288 x[n]=x[n-4288]-x[n-4160]-boro mod b=2^32  */
static unsigned long boro=0,indx=4287,
     xs=532456711,xcng=262436069,Q[4288];
unsigned long CSWB(void)
{ unsigned long t,h,i;
  if(indx<4288) return Q[indx++];
   for(i=0   ;i<4160;i++)
   {t=Q; h=Q[i+ 128]+boro; boro=(t<h); Q=h-t-1; }
   for(i=4160;i<4288;i++)
   {t=Q; h=Q[i-4160]+boro; boro=(t<h); Q=h-t-1; }
  indx=1; return Q[0];
}
/* XS: Xorshift macro, CNG: Congruential macro */
#define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) )
#define CNG ( xcng=69069*xcng+123 )
#define KISS ( CSWB()+CNG+XS )
int main() /*should produce ~140 million KISS's per second*/
{int i; unsigned long x;
for(i=0;i<4288;i++) Q=CNG+XS; /* seed the Q[] array with CNG+XS*/
for(i=0;i<1000000000;i++) x=KISS;
printf("Expected:\n x,unsigned:2180808443,x,signed:-2114158853\n");
printf("Observed:\n x,unsigned:%u,x,signed:%d\n",x,x);

You need %lx and %ld here.
Given the change above, I get the expected answer.
Time is 5.5s.  Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz.  gcc -O3..
gcc version 4.4.3.

If you'd like more data:
Time is 12.486 s. Intel(R) Xeon(TM) CPU 2.40GHz (4 cores).  gcc -O3.
gcc version 4.1.2.

-David


Macbook Intel Core Duo 2.8 GHz
gcc -m32 -O3
version 4.2.1 (Apple Inc. build 5646)

real 0m7.201s
user 0m7.066s
sys 0m0.134s
 
O

orz

      ^^^^^^
I assume you actually meant unsigned integers here,
not signed integers.
In C: boro=(t<h);
For languages that permit only signed integers, it
is not as simple.  Let s(x) be the sign bit of x,
easily obtained, for example, by a right shift:
    if s(t)=s(h) then boro=s(t-h); else boro=s(h);
Interested readers are invited to provide C,C++,Fortran or
other implementations.   All you need is an array, say X[4288]
for 32-bit or X[2144] for 64-bit CPUs, and return the next
unused element of the array, except that if none are left,
refill the  array with the rules above, reset the array index
and return the first element.  Suitable for signed or
unsigned, 32- or 64-bit integers, but signed may require the
more complicated method to determine the 'boro'.
The arrays X[4288] or X[2144] must be initially seeded with
32-bit or 64-bit integers and an initial boro seed of 0 or 1,
Having so many seed bits is desirable in applications such
as in Law or Gaming, where a minimal, often extremely large,
number of possible results are required, but a nuisance in most
applications where only a few dozen seed bits may be enough.
Most implementations would ask for a few seeds from which the
X[4288] or X[2144] arrays are filled by combining simple RNGs.
The CSWB sequence x_n=x_{n-4288}-x_{n-4160+boro mod 2^32
might do well on the Diehard tests, but I tend to favor
a KISS (Keep It Simple, Stupid) approach, combining it, via
addition mod b=2^32 or B=2^64, Congruential and Xorshift
sequences. Unlike most combinations, such a one would not
increase the period, as 2^32, 2^32-1, 2^64 and 2^64-1 all
divide p-1. But a period exceeding 10^3104 should serve.
A nice idea for anRNG.  Have you run it through the Diehard
suite?  I'm curious to know how it does.  I quickly coded
up a C version (assuming I understood the explanation
correctly), it produces about 200 million 32-bit RN's
per second (on a box that's roughly 2.5 billion IPS).

Yes, running just CSWB() from the C code below yields no significant
extremes among the 230 p-values returned from the Diehard Battery of
Tests.
But for extra safety, I prefer a Keep-It-Simple-Stupid approach, via
macros KISS = CSWB() + CNG + XS.

Here is a complete C listing that will print out the expected
and actual results from  calls for 10^9 KISSes.
======================================================================

#include <stdio.h>

/* CSWB4288 x[n]=x[n-4288]-x[n-4160]-boro mod b=2^32  */

static unsigned long boro=0,indx=4287,
     xs=532456711,xcng=262436069,Q[4288];

unsigned long CSWB(void)
{ unsigned long t,h,i;
  if(indx<4288) return Q[indx++];
   for(i=0   ;i<4160;i++)
   {t=Q; h=Q[i+ 128]+boro; boro=(t<h); Q=h-t-1; }
   for(i=4160;i<4288;i++)
   {t=Q; h=Q[i-4160]+boro; boro=(t<h); Q=h-t-1; }
  indx=1; return Q[0];

}

/* XS: Xorshift macro, CNG: Congruential macro */
#define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) )
#define CNG ( xcng=69069*xcng+123 )
#define KISS ( CSWB()+CNG+XS )

int main() /*should produce ~140 million KISS's per second*/
{int i; unsigned long x;
for(i=0;i<4288;i++) Q=CNG+XS; /* seed the Q[] array with CNG+XS*/
for(i=0;i<1000000000;i++) x=KISS;
printf("Expected:\n x,unsigned:2180808443,x,signed:-2114158853\n");
printf("Observed:\n x,unsigned:%u,x,signed:%d\n",x,x);

}

===========================================================================

Running gcc -O3 took  about 7 seconds.  Interested readers might
want to cut, paste, compile and try runs of there own,
both to get time comparisons and confirm the arithmetic.

George Marsaglia


That version of KISS takes 9.230 seconds for 10**9 results here.
CSWB4288 takes 5.576 seconds for 10**9 results here.
That's on a 3 GHrz Core 2 Duo, MSVC, 32 bit instruction set.

I ran the output of CSWB4288 through some statistical tests. It
passed all the standard tests I tried, but fails tests if I transform
the data a little. The full KISS (or rather that version of it) would
probably pass every test I can throw at it, so I haven't tried testing
yet.

I used a slightly different seeding method on it than your code
(initializing the array from a small ultra-fast RNG that passes all
tests), but that should not have any impact I think, especially not
any impact that increased with test size.

The transformation I used on CSWB4228 is:
I use only the first 4 bits out of every 512 bytes that the RNG
outputs
i.e. transformed_bitstream[x] = original_bitstream[(x mod 4) + (x/4
(rounded down)) * 512 * 8]
The rationale I used for picking that transform is based upon the CSWB
lags having large powers of 2 as common factors, and the relatively
slow rate of entropy spread between different bit positions in the
CSWB algorithm. Which particular bits are used is not expected to be
of any importance, just that blocks of adjacent bits be selected at
regular intervals from a spacing that is a significant power of 2 bits
apart. I applied only one test to the transformed data, one I wrote
for long range linear correlation detection (also capable of detecting
short range linear correlation), as that test seems more sensitive to
bias produced by Fibonacci style RNGs than my other tests. I expect
other tests would likely work as well, though it might or might not
require tests sensitive to long range correlation.

It took 64 GB of untransformed data (64 MB of transformed data) for
the bias in CSWB4288 to become suspicious, 128 GB (128 MB transformed)
to reach the failure threshold. The time required for the 128 GB test
was 205 seconds for the CSWB calls and 5 seconds for transforming and
testing the data. The results are consistent (sometimes it's under
the failure threshold at 128 GB, but it's always been well over the
threshold at 256 GB). I do not know the distributions of my tests
well enough to give exact p-values, but I'd guestimate the p-value @
64 GB to be on the order of .95, the p-value @ 128 GB to be on the
order of .9999, and the p-value @ 256 GB to be not readily
distinguishable from 1. As a sanity check on the transformation I
also ran a another RNG through the same transform and test - the other
RNG passed with the expected results.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,066
Latest member
VytoKetoReviews

Latest Threads

Top