The CMWC4827 RNG, an improvement on MWC4691.

Discussion in 'C Programming' started by geo, Oct 22, 2010.

  1. geo

    geo Guest

    In August 2010 I posted descriptions of a MWC
    (Multiply-With-Carry) RNG based on, with b=2^32,
    the prime p=8193b^4691-1. With an apparently
    very long period and multiplier a=(2^13+1), the
    basic MWC operation could be carried out using only
    32-bit operations: given 32-bit c and x, imagine
    t=a*x+c in 64 bits, determine the new c and new x
    as the top and bottom 32-bits of t.

    There were two drawbacks: rare nuisance cases that
    had to be accommodated, and no proof of the order of b
    for the prime p, the period of the RNG. We can be
    almost---but not quite---certain that the order of
    b mod p is k=(p-1)/2, since b^k mod p = 1 and for each
    attainable prime divisor q of k---that is, with q one
    of 431,18413, 15799501,1505986643549,
    2883504568596254032007909, b^(k/q) mod p is not 1.
    (My thanks to Darío Alejandro Alpern of Buenos Aires for
    providing the larger, and estimates on future, factors.)

    But there are at least two more, presently unattainable,
    prime divisors of k, each with at least 30 digits.
    With z a primitive root of p and b=z^j, the chances of j
    being a multiple of one of those unknown prime divisors
    of k seem less than 1/10^30. So we can be reasonably---
    indeed, very---confident that the Industrial Grade Order.
    (IGO) of b is (p-1)/2 for the prime p=8193b^4691-1.

    So, with nuisance cases and uncertainties in mind, I put
    three computers to work searching for a prime of the
    form p=a*b^r+1, with a=2^i-1, to avoid the nuisance
    cases arising from a=2^13+1, and p=(2^i-1)*b^r+1 so
    that factoring p-1 would be feasible when r>4000.

    After about 5 days, one of them found p=4095*b^4827+1.
    And it turned out that the order of b mod p was the
    maximum possible, k=(p-1)/2^6, (since we "lose" one 2
    because 2 cannot be primitive, and must lose at least
    five more because b=2^5 and k is 4095*2^154458).
    Thus b^k mod p = 1, and for each prime divisor q of k,
    q=2,3,5,7,13: b^(k/q) mod p is not 1.

    Unlike p=ab^r-1, a prime of the form p=ab^r+1 leads
    to CMWC (Complimentary-Multiply-With-Carry) RNGs, in
    which we again imagine forming t=a*x+c in 64 bits and
    again seek c as the top 32 bits, but rather than
    x=(t mod b) for MWC, the new x is x=(b-1)-(t mod b),
    that is x=~(t mod b), using C's ~ op.

    Here is a C version of the resulting CMWC method for
    p=4095*b^4827+1, using only 32-bit arithmetic, with
    verified period 4095*2^154458, faster and simpler than
    the previously posted MWC4691() and readily adapted
    for either signed or unsigned integers and for Fortran
    or other programming languages:
    _________________________________________________________

    #include <stdio.h>

    static unsigned long Q[4827],carry=1271;

    unsigned long CMWC4827(void)
    {unsigned long t,x; static int j=4827;
    j=(j<4826)? j+1:0;
    x=Q[j]; t=(x<<12)+carry;
    carry=(x>>20)-(t<x);
    return (Q[j]=~(t-x));
    }

    #define CNG ( cng=69069*cng+13579 )
    #define XS ( xs^=(xs<<13), xs^=(xs>>17), xs^=(xs<<5) )
    #define KISS4827 ( CMWC4827()+CNG+XS )

    int main(void)
    {unsigned long int i,x,cng=123456789,xs=362436069;
    /* First seed Q[] with CNG+XS: */
    for(i=0;i<4827;i++) Q=CNG+XS;
    /* Then generate 10^9 CMWC4827()s */
    for(i=0;i<1000000000;i++) x=CMWC4827();
    printf("Does x=1346668762?\n x=%lu\n",x);
    /* followed by 10^9 KISS4827s: */
    for(i=0;i<1000000000;i++) x=KISS4827;
    printf("Does x=4041198809?\n x=%lu\n",x);

    return 0;
    }
    ______________________________________________________

    Getting 10^9 CMWC4827s should take less than four seconds,
    while 10^9 KISS4827s should take less than seven seconds.

    For signed integers, replace that single (t<x) by (t'<x'),
    where t' means flip the sign bit: t^(1<<31) for C
    or ieor(t,ishft(1,-31)) for Fortran,
    (or use hex 10000000 to specify the bit to be flipped).

    CMWC4827() used alone will pass all tests in
    the Diehard Battery of Tests of Randomness
    www.cs.hku.hk/~diehard/
    But as any RNG based on a single mathematical structure
    is likely to have potential flaws---such as, but
    perhaps not as striking as congruential RNGs "falling
    mainly in the planes"---I advocate the KISS
    (Keep-It-Simple-Stupid) approach: combine with
    Congruential and Xorshift sequences invoked inline.

    Why not splurge at a cost of 3 nanoseconds?

    George Marsaglia
    geo, Oct 22, 2010
    #1
    1. Advertising

  2. geo

    Dann Corbit Guest

    In article <df5387d8-95fe-43b0-9b56-
    >, says...
    >
    > In August 2010 I posted descriptions of a MWC
    > (Multiply-With-Carry) RNG based on, with b=2^32,
    > the prime p=8193b^4691-1. With an apparently
    > very long period and multiplier a=(2^13+1), the
    > basic MWC operation could be carried out using only
    > 32-bit operations: given 32-bit c and x, imagine
    > t=a*x+c in 64 bits, determine the new c and new x
    > as the top and bottom 32-bits of t.
    >
    > There were two drawbacks: rare nuisance cases that
    > had to be accommodated, and no proof of the order of b
    > for the prime p, the period of the RNG. We can be
    > almost---but not quite---certain that the order of
    > b mod p is k=(p-1)/2, since b^k mod p = 1 and for each
    > attainable prime divisor q of k---that is, with q one
    > of 431,18413, 15799501,1505986643549,
    > 2883504568596254032007909, b^(k/q) mod p is not 1.
    > (My thanks to Darío Alejandro Alpern of Buenos Aires for
    > providing the larger, and estimates on future, factors.)
    >
    > But there are at least two more, presently unattainable,
    > prime divisors of k, each with at least 30 digits.
    > With z a primitive root of p and b=z^j, the chances of j
    > being a multiple of one of those unknown prime divisors
    > of k seem less than 1/10^30. So we can be reasonably---
    > indeed, very---confident that the Industrial Grade Order.
    > (IGO) of b is (p-1)/2 for the prime p=8193b^4691-1.
    >
    > So, with nuisance cases and uncertainties in mind, I put
    > three computers to work searching for a prime of the
    > form p=a*b^r+1, with a=2^i-1, to avoid the nuisance
    > cases arising from a=2^13+1, and p=(2^i-1)*b^r+1 so
    > that factoring p-1 would be feasible when r>4000.
    >
    > After about 5 days, one of them found p=4095*b^4827+1.
    > And it turned out that the order of b mod p was the
    > maximum possible, k=(p-1)/2^6, (since we "lose" one 2
    > because 2 cannot be primitive, and must lose at least
    > five more because b=2^5 and k is 4095*2^154458).
    > Thus b^k mod p = 1, and for each prime divisor q of k,
    > q=2,3,5,7,13: b^(k/q) mod p is not 1.
    >
    > Unlike p=ab^r-1, a prime of the form p=ab^r+1 leads
    > to CMWC (Complimentary-Multiply-With-Carry) RNGs, in
    > which we again imagine forming t=a*x+c in 64 bits and
    > again seek c as the top 32 bits, but rather than
    > x=(t mod b) for MWC, the new x is x=(b-1)-(t mod b),
    > that is x=~(t mod b), using C's ~ op.
    >


    I encapsulated things a bit, so that the code to call the prng is not
    interspersed among the main program.

    static unsigned long Q[4827],
    carry = 1271,
    cng = 123456789,
    xs = 362436069;

    unsigned long scramble()
    {
    return (cng = 69069 * cng + 13579) + (xs ^= (xs << 13), xs ^= (xs >>
    17), xs ^= (xs << 5));
    }

    void initCMWC4827(void)
    {
    unsigned long int i;

    for (i = 0; i < 4827; i++)
    Q = scramble();
    }

    unsigned long CMWC4827(void)
    {
    unsigned long t,
    x;
    static int j = 4827;
    j = (j < 4826) ? j + 1 : 0;
    x = Q[j];
    t = (x << 12) + carry;
    carry = (x >> 20) - (t < x);
    return (Q[j] = ~(t - x));
    }

    #ifdef UNIT_TEST
    #include <stdio.h>
    int main(void)
    {
    unsigned long int i,
    x =0;

    initCMWC4827();

    for (i = 0; i < 1000000000; i++)
    x = CMWC4827();
    printf("Does x=1346668762?\n x=%lu\n", x);

    for (i = 0; i < 1000000000; i++)
    x = (CMWC4827() + scramble());
    printf("Does x=4041198809?\n x=%lu\n", x);

    return 0;
    }
    #endif

    /*
    C:\tmp>cl /DUNIT_TEST /W4 /Ox k8.c
    Microsoft (R) C/C++ Optimizing Compiler Version 16.00.30319.01 for x64
    Copyright (C) Microsoft Corporation. All rights reserved.

    k8.c
    Microsoft (R) Incremental Linker Version 10.00.30319.01
    Copyright (C) Microsoft Corporation. All rights reserved.

    /out:k8.exe
    k8.obj

    C:\tmp>k8
    Does x=1346668762?
    x=1346668762
    Does x=4041198809?
    x=4041198809

    C:\tmp>
    */
    Dann Corbit, Oct 22, 2010
    #2
    1. Advertising

  3. In comp.lang.c Dann Corbit <> wrote:
    > In article <df5387d8-95fe-43b0-9b56-
    > >, says...


    <George Marsglia's description snipped>

    > I encapsulated things a bit, so that the code to call the prng is not
    > interspersed among the main program.


    > static unsigned long Q[4827],
    > carry = 1271,
    > cng = 123456789,
    > xs = 362436069;


    > unsigned long scramble()
    > {
    > return (cng = 69069 * cng + 13579) + (xs ^= (xs << 13), xs ^= (xs >>
    > 17), xs ^= (xs << 5));
    > }


    > void initCMWC4827(void)
    > {
    > unsigned long int i;


    > for (i = 0; i < 4827; i++)
    > Q = scramble();
    > }


    > unsigned long CMWC4827(void)
    > {
    > unsigned long t,
    > x;
    > static int j = 4827;
    > j = (j < 4826) ? j + 1 : 0;
    > x = Q[j];
    > t = (x << 12) + carry;
    > carry = (x >> 20) - (t < x);
    > return (Q[j] = ~(t - x));
    > }


    > #ifdef UNIT_TEST
    > #include <stdio.h>
    > int main(void)
    > {
    > unsigned long int i,
    > x =0;


    > initCMWC4827();
    >
    > for (i = 0; i < 1000000000; i++)
    > x = CMWC4827();
    > printf("Does x=1346668762?\n x=%lu\n", x);


    > for (i = 0; i < 1000000000; i++)
    > x = (CMWC4827() + scramble());
    > printf("Does x=4041198809?\n x=%lu\n", x);


    > return 0;
    > }
    > #endif


    Based on Dann Corbit's version here's a C (C89) and C++ version
    that also (seems to?) work on 64-bit machines (that's what all
    the ANDing with 0xFFFFFFFFLU is about):

    ========= CMWC.c =====================================

    static unsigned long int Q[ 4827 ];

    unsigned long
    scramble( )
    {
    static unsigned long int cng = 123456789LU;
    static unsigned long int xs = 362436069LU;

    cng = ( 69069 * cng + 13579 ) & 0xFFFFFFFFLU;
    return ( cng + ( xs ^= ( xs << 13 ) & 0xFFFFFFFFLU,
    xs ^= ( xs >> 17 ),
    xs ^= ( xs << 5 ) & 0xFFFFFFFFLU ) ) & 0xFFFFFFFFLU;
    }

    void
    initCMWC4827( void )
    {
    unsigned int i;

    for ( i = 0; i < 4827; i++ )
    Q[ i ] = scramble( );
    }

    unsigned long int
    CMWC4827( void )
    {
    static int j = 4827;
    static unsigned long int carry = 1271LU;

    unsigned long int x = Q[ j = j < 4826 ? j + 1 : 0 ],
    t = ( ( x << 12 ) + carry ) & 0xFFFFFFFFLU;

    carry = ( x >> 20 ) - ( ( t < x ) & 0xFFFFFFFFLU );
    return Q[ j ] = ~ ( t - x ) & 0xFFFFFFFFLU;
    }


    #ifdef UNIT_TEST
    #include <stdio.h>
    int
    main( void )
    {
    unsigned long int i,

    initCMWC4827( );

    for ( i = 0; i < 1000000000; i++ )
    x = CMWC4827( );
    printf( "Does x=1346668762?\n x=%lu\n", x );

    for ( i = 0; i < 1000000000; i++ )
    x = ( CMWC4827( ) + scramble( ) ) & 0xFFFFFFFFLU;
    printf( "Does x=4041198809?\n x=%lu\n", x );

    return 0;
    }

    #endif

    ========= CMWC.c =====================================

    ========= CMWC.cpp ===================================

    class CMWC4827
    {
    public:

    CMWC4827( ) :
    cng( 123456789LU ),
    xs( 362436069LU ),
    carry( 1271LU ),
    j( 4827 )
    {
    for ( int i = 0; i < 4827; i++ )
    Q[ i ] = scramble( );
    }

    unsigned long int
    next( )
    {
    unsigned long int x = Q[ j = j < 4826 ? j + 1 : 0 ],
    t = ( ( x << 12 ) + carry ) & 0xFFFFFFFFLU;

    carry = ( x >> 20 ) - ( ( t < x ) & 0xFFFFFFFFLU );
    return Q[ j ] = ~ ( t - x ) & 0xFFFFFFFFLU;
    }

    unsigned long int
    nextWithScramble( )
    {
    return ( next( ) + scramble( ) ) & 0xFFFFFFFFLU;
    }

    private:

    unsigned long int
    scramble( )
    {
    cng = ( 69069 * cng + 13579 ) & 0xFFFFFFFFLU;
    return ( cng + ( xs ^= ( xs << 13 ) & 0xFFFFFFFFLU,
    xs ^= ( xs >> 17 ),
    xs ^= ( xs << 5 ) & 0xFFFFFFFFLU ) ) & 0xFFFFFFFFLU;
    }

    unsigned long int Q[ 4827 ];
    unsigned long int cng;
    unsigned long int xs;
    unsigned long int carry;
    int j;
    };


    #ifdef UNIT_TEST
    #include <iostream>

    int
    main( )
    {
    CMWC4827 rgen;
    unsigned long int x;

    for ( unsigned long int i = 0; i < 1000000000; i++ )
    x = rgen.next( );
    std::cout << "Does x = 1346668762?\n x = " << x << '\n';

    for ( unsigned long int i = 0; i < 1000000000; i++ )
    x = rgen.nextWithScramble( );
    std::cout << "Does x = 4041198809?\n x = " << x << '\n';

    return 0;
    }

    #endif

    ========= CMWC.cpp ===================================

    Does anyone has a better idea how to get rid/deal more effectively
    with 64-issues? Of course if one could assume the existence of a
    uint32_t things would be simple, but that requires C99 or C++0x...

    Regards, Jens
    --
    \ Jens Thoms Toerring ___
    \__________________________ http://toerring.de
    Jens Thoms Toerring, Oct 22, 2010
    #3
  4. geo

    Ian Collins Guest

    On 10/23/10 12:18 AM, geo wrote:
    >
    > Here is a C version of the resulting CMWC method for
    > p=4095*b^4827+1, using only 32-bit arithmetic, with
    > verified period 4095*2^154458, faster and simpler than
    > the previously posted MWC4691() and readily adapted
    > for either signed or unsigned integers and for Fortran
    > or other programming languages:


    I strongly recommend you use the standard fixed width types if your code
    is designed for 32 bit arithmetic. Your code will then port to 64 bit
    systems without modification.

    --
    Ian Collins
    Ian Collins, Oct 22, 2010
    #4
  5. geo

    Ian Collins Guest

    On 10/23/10 11:00 AM, Jens Thoms Toerring wrote:
    >
    > Based on Dann Corbit's version here's a C (C89) and C++ version
    > that also (seems to?) work on 64-bit machines (that's what all
    > the ANDing with 0xFFFFFFFFLU is about):
    >

    It's much simpler just to replace "unsigned long" with "uint32_t". Then
    you don't have to mess with masks, the same code works "as is" in either
    32 or 64 bit compiles.

    > Does anyone has a better idea how to get rid/deal more effectively
    > with 64-issues? Of course if one could assume the existence of a
    > uint32_t things would be simple, but that requires C99 or C++0x...


    Ah, I didn't see this until I'd written the above! Most if not all
    common platforms have uint32_t defined in their system headers (either
    in C99's <stdint.h> or elsewhere). If a platform doesn't have them,
    they are trivial to define in your own header. Much better than messing
    with masks! More so on annoying systems that have 32 bit longs even in
    64 bit mode, where the masks only waste cycles, assuming they aren't
    optimised away.

    --
    Ian Collins
    Ian Collins, Oct 22, 2010
    #5
  6. In comp.lang.c Ian Collins <> wrote:
    > Ah, I didn't see this until I'd written the above! Most if not all
    > common platforms have uint32_t defined in their system headers (either
    > in C99's <stdint.h> or elsewhere). If a platform doesn't have them,
    > they are trivial to define in your own header.


    Mmm, how do you do that without knowing anything about the
    platform? And how to distinguish between uint32_t already
    being defined or not? I would love to see some clean way to
    do that. Of course, if you can use some kind of configure
    script it's not too complicated, but I'm curious if there
    is a way to get it right without that (even on machines that
    may just have a C89 (or C++98) conforming compiler.

    > Much better than messing with masks!


    Yes, definitely if possible;-)

    Best regards, Jens
    --
    \ Jens Thoms Toerring ___
    \__________________________ http://toerring.de
    Jens Thoms Toerring, Oct 22, 2010
    #6
  7. geo

    Ian Collins Guest

    On 10/23/10 11:41 AM, Jens Thoms Toerring wrote:
    > In comp.lang.c Ian Collins<> wrote:
    >> Ah, I didn't see this until I'd written the above! Most if not all
    >> common platforms have uint32_t defined in their system headers (either
    >> in C99's<stdint.h> or elsewhere). If a platform doesn't have them,
    >> they are trivial to define in your own header.

    >
    > Mmm, how do you do that without knowing anything about the
    > platform? And how to distinguish between uint32_t already
    > being defined or not? I would love to see some clean way to
    > do that. Of course, if you can use some kind of configure
    > script it's not too complicated, but I'm curious if there
    > is a way to get it right without that (even on machines that
    > may just have a C89 (or C++98) conforming compiler.


    A configure script is the usual way to determine the presence of headers
    or types within them (by attempting to compile a snippet that uses them).

    Fixed width type definitions are totally platform and compile mode
    specific unfortunately..

    --
    Ian Collins
    Ian Collins, Oct 22, 2010
    #7
  8. (Jens Thoms Toerring) writes:
    > In comp.lang.c Ian Collins <> wrote:
    >> Ah, I didn't see this until I'd written the above! Most if not all
    >> common platforms have uint32_t defined in their system headers (either
    >> in C99's <stdint.h> or elsewhere). If a platform doesn't have them,
    >> they are trivial to define in your own header.

    >
    > Mmm, how do you do that without knowing anything about the
    > platform? And how to distinguish between uint32_t already
    > being defined or not? I would love to see some clean way to
    > do that. Of course, if you can use some kind of configure
    > script it's not too complicated, but I'm curious if there
    > is a way to get it right without that (even on machines that
    > may just have a C89 (or C++98) conforming compiler.


    Some years ago, Doug Gwyn published a set of files that implement
    some C99 (then C9x) headers in C90.

    http://www.lysator.liu.se/c/q8/

    A simplified example of how you might define uint32_t (not tested):

    #include <limits.h>
    #if ULONG_MAX == 4294967295
    typedef unsigned long uint32_t;
    #elif UINT_MAX == 4294967295
    typedef unsigned int uint32_t;
    #elif USHRT_MAX == 4294967295
    typedef unsigned short uint32_t
    #elif UCHAR_MAX == 4294967295 /* unlikely */
    typedef unsigned char uint32_t
    #else
    #error "Unable to determine type for uint32_t
    #endif

    This ignores padding bits and probably a few other possibilities.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Oct 23, 2010
    #8
  9. On 2010-10-23, christian.bau <> wrote:
    > into shifts if that is faster. I found that using "long" for "j" tends
    > to produce faster code on 64 bit implementations and doesn't change
    > anything on 32 bit implementations.


    Since you're using C99 types anyway (and I agree that that's the
    sensible thing to do), you might try using (u)int_fast16_t for j.

    --
    Ilmari Karonen
    To reply by e-mail, please replace ".invalid" with ".net" in address.
    Ilmari Karonen, Oct 23, 2010
    #9
  10. In comp.lang.c christian.bau <> wrote:
    > On Oct 22, 11:00 pm, (Jens Thoms Toerring) wrote:


    > > unsigned long int
    > > CMWC4827( void )
    > > {
    > >     static int j = 4827;
    > >     static unsigned long int carry = 1271LU;
    > >
    > >     unsigned long int x = Q[ j = j < 4826 ? j + 1 : 0 ],
    > >                       t = ( ( x << 12 ) + carry ) & 0xFFFFFFFFLU;
    > >
    > >     carry = ( x >> 20 ) - ( ( t < x ) & 0xFFFFFFFFLU );
    > >     return Q[ j ] = ~ ( t - x ) & 0xFFFFFFFFLU;
    > >
    > > }


    > It seems the original source code emulated 64 bit arithmetic using 32
    > bit arithmetic, and your code now uses 64 bit arithmetic to emulate 32
    > bit arithmetic emulating 64 bit arithmetic. I'd write, assuming that Q
    > is defined as uint32_t:


    > uint32_t CMWC4827 (void)
    > {
    > static long j = 4827;
    > static uint64_t carry = 1271;


    > if (++j >= 4827) j = 0;
    > uint64_t x = Q [j] * (uint64_t) 4095 + carry;
    > carry = x >> 32;
    > return (Q [j] = x);
    > }


    I see. There's just an omission in the last line, it must be

    return Q[ j ] = ~ x;

    Now that's quite a bit faster;-)

    Best regards, Jens
    --
    \ Jens Thoms Toerring ___
    \__________________________ http://toerring.de
    Jens Thoms Toerring, Oct 23, 2010
    #10
  11. In comp.lang.c Keith Thompson <> wrote:
    > (Jens Thoms Toerring) writes:
    > > In comp.lang.c Ian Collins <> wrote:
    > >> Ah, I didn't see this until I'd written the above! Most if not all
    > >> common platforms have uint32_t defined in their system headers (either
    > >> in C99's <stdint.h> or elsewhere). If a platform doesn't have them,
    > >> they are trivial to define in your own header.

    > >
    > > Mmm, how do you do that without knowing anything about the
    > > platform? And how to distinguish between uint32_t already
    > > being defined or not? I would love to see some clean way to
    > > do that. Of course, if you can use some kind of configure
    > > script it's not too complicated, but I'm curious if there
    > > is a way to get it right without that (even on machines that
    > > may just have a C89 (or C++98) conforming compiler.


    > Some years ago, Doug Gwyn published a set of files that implement
    > some C99 (then C9x) headers in C90.


    > http://www.lysator.liu.se/c/q8/


    > A simplified example of how you might define uint32_t (not tested):


    > #include <limits.h>
    > #if ULONG_MAX == 4294967295
    > typedef unsigned long uint32_t;
    > #elif UINT_MAX == 4294967295
    > typedef unsigned int uint32_t;
    > #elif USHRT_MAX == 4294967295
    > typedef unsigned short uint32_t
    > #elif UCHAR_MAX == 4294967295 /* unlikely */
    > typedef unsigned char uint32_t
    > #else
    > #error "Unable to determine type for uint32_t
    > #endif


    > This ignores padding bits and probably a few other possibilities.


    Thanks, I will play around with these;-)

    Best regards, Jens
    --
    \ Jens Thoms Toerring ___
    \__________________________ http://toerring.de
    Jens Thoms Toerring, Oct 23, 2010
    #11
  12. In comp.lang.c christian.bau <> wrote:
    > On Oct 23, 4:56 pm, Ilmari Karonen <> wrote:


    > > Since you're using C99 types anyway (and I agree that that's the
    > > sensible thing to do), you might try using (u)int_fast16_t for j.


    > Good idea. gcc 4.2 contains the following in its headers:


    > typedef int8_t int_fast8_t;
    > typedef int16_t int_fast16_t;
    > typedef int32_t int_fast32_t;
    > typedef int64_t int_fast64_t;
    > typedef uint8_t uint_fast8_t;
    > typedef uint16_t uint_fast16_t;
    > typedef uint32_t uint_fast32_t;
    > typedef uint64_t uint_fast64_t;


    > I can assure you that using int_fast16_t or uint_fast16_t for j with
    > these definitions will run slower than using "long" when using the gcc
    > compiler on an x86 32 or 64 bit processor. Which is sad. An
    > alternative approach would be not to use an integer j, but a pointer
    > which points to an element of the array Q.


    Seems to be already corrected in gcc 4.4.3, there I have

    /* Unsigned. */
    typedef unsigned char uint_fast8_t;
    #if __WORDSIZE == 64
    typedef unsigned long int uint_fast16_t;
    typedef unsigned long int uint_fast32_t;
    typedef unsigned long int uint_fast64_t;
    #else
    typedef unsigned int uint_fast16_t;
    typedef unsigned int uint_fast32_t;
    __extension__
    typedef unsigned long long int uint_fast64_t;
    #endif

    and I don't see any noticable difference between using long or
    uint_fast32_t here.
    Regards, Jens
    --
    \ Jens Thoms Toerring ___
    \__________________________ http://toerring.de
    Jens Thoms Toerring, Oct 23, 2010
    #12
  13. geo

    geo Guest

    On Oct 25, 7:21 pm, "christian.bau" <>
    wrote:
    > On Oct 22, 12:18 pm, geo <> wrote:
    >
    > > CMWC4827 RNG snipped
    > > George Marsaglia

    >
    > I tried to figure out the maths behind this, and I ended up with a
    > just slightly different algorithm:
    >
    > The important number is p = 4095 * (2^32)^4827 + 1, which is a prime.
    > Let M = (2^32)^4827, then p = 4095 * M + 1. We can start with any
    > number 1 <= x < p, then replace x with (4095 x) modulo p, and again
    > and again. Each value x produces 4827 random integers of 32 bits. The
    > result x will never be 0.
    >
    > A slight change: Instead of storing x, we store x-1. So instead of
    > replacing x with (4095 x) modulo p, we add 1 to x, calculate (4095 x)
    > modulo p, and subtract 1. So we calculate ((4095 (x+1)) modulo p) - 1
    > = (4095 x + 4094) modulo p. Really the same thing, but we now have 0
    > <= x < 4095 * M.
    >
    > Now assume x = X + c * M, where 0 <= X < M, and 0 <= c < 4095. Then
    >
    >     (4095 x + 4094) modulo p =
    >     (4095 * (X + c * M) + 4094) modulo p =
    >     (4095 * X + 4094 + c * (4095 M)) modulo p =
    >     (4095 * X + 4094 + c * (p - 1)) modulo p =
    >     (4095 * X + 4094 - c) modulo p =
    >     4095 * X + (4094 - c).
    >
    > We store X in an array of 4827 32-bit integers and keep c separate.
    > Initialiase X to random integers and c to a random value from 0 to
    > 4094. The random number calculation then is
    >
    > unsigned long   CMWC4827(void)
    > {
    >     unsigned long   t,
    >                     x;
    >     static int      j = 4827;
    >     if (j < 4826) ++j; else { j = 0; c = 4094 - c; }
    >     x = Q[j];
    >     t = (x << 12) + carry;
    >     carry = (x >> 20) - (t < x);
    >     return (Q[j] = (t - x));
    >
    > }
    >
    > So the change is that we replace c with 4094-c when we restart at the
    > beginning of the array, and don't do the "not" operation in the
    > calculation of the next component of x. You could change this to use
    > 64 bit arithmetic and write
    >
    > uint32_t CMWC4827 (void)
    > {
    >     static long j = 4827;
    >     static uint64_t carry = 1271;
    >     if (++j >= 4827) { j = 0; carry = 4094 - carry; }
    >     uint64_t x = Q [j] * (uint64_t) 4095 + carry;
    >     carry = x >> 32;
    >     return (Q [j] = x);
    >
    >
    >
    > }- Hide quoted text -
    >
    > - Show quoted text -


    With b=2^32 and p=4095*b^4827+1, your suggestion
    will generate a sequence of 4827 "digits" forming,
    in reverse order, the base-b expansion of j1/(p-2)
    for some j1, then jump to another set of 4827 in
    the expansion of j2/(p-2) for some other j2, and so on.
    Such "digits" may well serve as satisfactory 32-bit
    random integers, but we cannot be sure of the period,
    or that we will avoid partial overlapping with string
    segments already generated.

    If you change your suggested method so as to keep
    the old carry c---rather than reset c before
    refilling the 4827 array---and go ahead with the
    MWC rather than CMWC method, you will generate,
    in reverse order, the full base-b MWC expansion of
    j/m, with m=4095*b^4827-1 and period the order of b
    for the composite m.

    The CMWC method produces, in reverse order,
    the full base-b=2^32 digits in the expansion of
    some j/p, with p=4095*b^4827+1 and j determined by
    the 4827 seed values and initial c in 0<=c<4095.

    The order of b for the MWC method applied to m
    probably differs from the CMWC order, 4095*2^154458,
    by at most a few dozen powers of 10, and, with 10^46500,
    we have 46,500 in hand.

    But finding that IGO, (Industrial-Grade-Order), which
    amounts to finding all, or all the up-to-30-digit
    prime factors of m=4095*b^4827-1, say p1,p2,...,pn,
    then
    IGO = lcm(order(b,p1),order(b,p2)...,order(b,pn))
    is difficult.

    I have considered that approach, but using
    m=a*b^4096-1, or even m=a*b^8192-1 with a=2^i-1,
    so that one can more easily effect incrementing
    the index of the array element.
    With a couple of my PCs working on finding
    up-to-30-digit factors of such composites,
    I hope to report some interesting IGOs soon.

    Can any of you, with time and factoring
    programs available, find all---or all the
    up-to-30-digit---prime factors of a*b^4096-1
    or a*b^8192-1 with b=2^32 and a=2^i-1, for MWC?

    Or the same for CMWC primes or composites,
    a*b^4096+1 or a*b^8192+1?


    George Marsaglia
    geo, Oct 26, 2010
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Mikolaj Machowski

    Querying Schema/RNG files

    Mikolaj Machowski, Feb 14, 2006, in forum: XML
    Replies:
    3
    Views:
    627
    George Bina
    Feb 16, 2006
  2. Jerome

    RNG for FPGA

    Jerome, Jul 25, 2006, in forum: VHDL
    Replies:
    0
    Views:
    570
    Jerome
    Jul 25, 2006
  3. m g william
    Replies:
    2
    Views:
    256
    Paul McGuire
    Oct 4, 2006
  4. ilias
    Replies:
    5
    Views:
    378
    Pete Becker
    Jul 27, 2006
  5. Francois Grieu
    Replies:
    7
    Views:
    412
    Ben Bacarisse
    Apr 8, 2009
Loading...

Share This Page