Re: Random 64 bit number in stdc

Discussion in 'C Programming' started by jadill33@gmail.com, Aug 7, 2013.

  1. Guest

    On Wednesday, August 7, 2013 5:07:44 AM UTC-4, Guillaume Dargaud wrote:
    > Hello all,
    >
    > any idea on writing a macro to generate a pseudo-random number using all 64
    >
    > bits of a unsigned long long ?
    >
    >
    >
    > The problem is that RAND_MAX varies greatly between compilers: I have
    >
    > examples where it's 0x7FFF, 0x7FFFFFFF and 0xFFFFFFFF but no 64 bit version.
    >
    >
    >
    > I'd like to stay with the old rand() of standard C.
    >
    >
    >
    > For instance one solution when RAND_MAX is 0x7FFF is the following:
    >
    > #define RAND64 ( (rand()<<60ULL) | (rand()<<45ULL) | (rand()<<30ULL) |
    >
    > (rand()<<15ULL) | (rand()<<0ULL))
    >
    >
    >
    > But it gives me
    >
    > Warning: Conversion from 'unsigned __int64' to 'int' might lose data.
    >
    > Which kind of surprises me but maybe I'm using it wrong.
    >
    >
    >
    > Anybody can think of a more generic solution independent of RAND_MAX ?


    Here's an excerpt of what I use for type 'int'. It essentially fills
    in each byte with a random number, since 'RAND_MAX' is likely to be
    larger than UCHAR_MAX. There sample has some extra constraint
    checking ('c_return_value_if_fail' and 'c_unexpected'), and 'C_ABS'
    is a macro for absolute value. You should be able to apply the same
    principle to an 'unsigned long long'. If sequential random samples
    from 'rand()' were truly independent, I think it should be a viable
    way to construct a wider random integer value.

    \code
    /*!
    * \brief The maximum number of \c rand samples \c c_random may use
    * before exceeding the retry count.
    *
    * One can more easily test the behavior of the retry count feature
    * by reducing the number of samples to 1.
    */
    #define GC_MAXIMUM_RAND_SAMPLES 10

    unsigned int c_time_seed( void )
    {
    time_t now;
    unsigned char* p = (unsigned char*)&now;
    unsigned int seed = 0;
    size_t byte_itr;

    now = time( NULL );

    for ( byte_itr = 0; byte_itr < sizeof (time_t); ++byte_itr ) {
    seed = seed * ( UCHAR_MAX + 2U ) + p[byte_itr];
    }

    return seed;
    }

    int c_random( int n )
    {
    int limit;
    int r;
    int bias_retry_count;

    c_return_value_if_fail( 0 <= n && n <= RAND_MAX, rand() );

    limit = RAND_MAX - RAND_MAX % n;

    bias_retry_count = GC_MAXIMUM_RAND_SAMPLES;
    do {
    r = rand();
    } while ( bias_retry_count-- > 0 && r >= limit );

    c_unexpected( bias_retry_count < 0 );

    return r % n;
    }

    int c_rand_int( void )
    {
    int rnd = 0;
    size_t n;
    unsigned char* ucp;

    n = sizeof (int);
    ucp = (unsigned char*)&rnd;

    /* Assemble a random integer of width 'int' by filling each byte
    with a random byte from rand(). */
    while ( n-- != 0 ) {
    *ucp++ = (unsigned char)c_random( UCHAR_MAX + 1 );
    }

    /*
    * If the 'int' representation is two's complement, assign INT_MIN
    * to zero to prevent integer overflow when taking the absolute
    * value. The assignment also happens to make the distribution
    * more uniform.
    */
    #if ( INT_MIN < -INT_MAX )
    if ( rnd == INT_MIN ) {
    rnd = 0;
    }
    #endif

    return C_ABS( rnd );
    }
    \endcode

    Whether this is any better or (likely) worse than other random number
    generators is unknown to me.

    Best regards,
    John D.

    (First time posting under new google groups, I guess I'll see how bad
    it is when I post it).
    , Aug 7, 2013
    #1
    1. Advertising

  2. Tim Rentsch Guest

    writes:

    > On Wednesday, August 7, 2013 5:07:44 AM UTC-4, Guillaume Dargaud wrote:
    >> Hello all,
    >> any idea on writing a macro to generate a pseudo-random number using all 64
    >> bits of a unsigned long long ? [snip]
    >> Anybody can think of a more generic solution independent of RAND_MAX ?

    >
    > Here's an excerpt of what I use for type 'int'. It essentially
    > fills in each byte with a random number, since 'RAND_MAX' is
    > likely to be larger than UCHAR_MAX. [and takes the absolute
    > value to produce a non-negative number] [snip elaboration]


    There are two things wrong with this approach:

    1. It can generate trap representations. Note that trap
    representations may exist even in implementations that use
    twos complement and have no padding bits.

    2. The possible existence of multiple representations for a
    single value make it very difficult to produce a truly
    uniform distribution.

    A better approach is to generate an appropriate value directly in
    unsigned char units, eg

    #include <limits.h>

    #define INT_BIT /* .. number of bits in INT_MAX .. */
    /* This can be done directly from INT_MAX using
    the well known IMAX_BITS macro */

    int
    uniform_nonnegative_int(){
    unsigned mask_shift = (CHAR_BIT-1) - (INT_BIT-1) % CHAR_BIT;
    unsigned mask = (unsigned char) -1 >> mask_shift;
    int r = mask & (unsigned char) rand();
    int i = (INT_BIT-1) / CHAR_BIT;

    while( i-- > 0 ) r = r << CHAR_BIT + (unsigned char) rand();
    return r;
    }

    Disclaimer: typed in, not compiled.
    Tim Rentsch, Aug 8, 2013
    #2
    1. Advertising

  3. Phil Carmody Guest

    Tim Rentsch <> writes:
    > writes:

    ....
    > A better approach is to generate an appropriate value directly in
    > unsigned char units, eg
    >
    > #include <limits.h>
    >
    > #define INT_BIT /* .. number of bits in INT_MAX .. */
    > /* This can be done directly from INT_MAX using
    > the well known IMAX_BITS macro */
    >
    > int
    > uniform_nonnegative_int(){
    > unsigned mask_shift = (CHAR_BIT-1) - (INT_BIT-1) % CHAR_BIT;
    > unsigned mask = (unsigned char) -1 >> mask_shift;
    > int r = mask & (unsigned char) rand();
    > int i = (INT_BIT-1) / CHAR_BIT;
    >
    > while( i-- > 0 ) r = r << CHAR_BIT + (unsigned char) rand();


    Conventional wisdom, from decades of experience, says that the lower
    bits tend to be at least as crappy as the upper bits, so shifting them
    out is better than masking them in. I would also presume the "planes"
    property (relations between subseqent terms) is less likely to kick in
    too, but don't have the mathematical teeth to prove such a claim.

    Phil
    --
    If "law-abiding citizens have nothing to fear" from privacy-invading
    technologies and policies, then law-abiding governments should have
    nothing to fear from whistleblowers.
    Phil Carmody, Aug 13, 2013
    #3
  4. Jorgen Grahn Guest

    On Tue, 2013-08-13, Phil Carmody wrote:
    ....
    > Conventional wisdom, from decades of experience, says that the lower
    > bits tend to be at least as crappy as the upper bits, so shifting them
    > out is better than masking them in.


    Decades of experience, or decades-old experience?

    If you're saying you have a rand() with bad randomness in the low bits
    /today/, I'm interested to learn which implementation that is. I've
    asked a few times, but as far as I can recall noone has claimed to
    have seen one since the 1990s or so.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Aug 13, 2013
    #4
  5. Öö Tiib Guest

    On Tuesday, 13 August 2013 22:34:05 UTC+3, Jorgen Grahn wrote:
    > On Tue, 2013-08-13, Phil Carmody wrote:
    > ...
    > > Conventional wisdom, from decades of experience, says that the lower
    > > bits tend to be at least as crappy as the upper bits, so shifting them
    > > out is better than masking them in.

    >
    > Decades of experience, or decades-old experience?
    >
    > If you're saying you have a rand() with bad randomness in the low bits
    > /today/, I'm interested to learn which implementation that is. I've
    > asked a few times, but as far as I can recall noone has claimed to
    > have seen one since the 1990s or so.


    What implementation has good rand()? Usually it is not good for anything
    serious. Even portable games can not use it because its badness is not
    portable (you get different artifacts of rand() weaknesses on different
    platforms). C++11 standard library <random> has useful generators.
    Öö Tiib, Aug 14, 2013
    #5
  6. Nobody Guest

    On Tue, 13 Aug 2013 19:34:05 +0000, Jorgen Grahn wrote:

    > If you're saying you have a rand() with bad randomness in the low bits
    > /today/, I'm interested to learn which implementation that is. I've asked
    > a few times, but as far as I can recall noone has claimed to have seen one
    > since the 1990s or so.


    The MSVCRT rand() in Visual Studio 10 uses an LCG with a 32-bit state, and
    uses bits 16-31 of the state as the result, so the LSB will have a period
    of at most 2^17.

    Whether or not that's "bad" is subjective, but the lower bits will have
    shorter periods than the higher bits.

    If by "bad" you mean "returns the low bits of an LCG instead of the high
    bits", then I haven't seen that since around 1993-94 (and the compiler
    wasn't necessarily the latest version).
    Nobody, Aug 14, 2013
    #6
  7. Geoff Guest

    On Wed, 14 Aug 2013 07:30:46 +0100, Nobody <> wrote:

    >On Tue, 13 Aug 2013 19:34:05 +0000, Jorgen Grahn wrote:
    >
    >> If you're saying you have a rand() with bad randomness in the low bits
    >> /today/, I'm interested to learn which implementation that is. I've asked
    >> a few times, but as far as I can recall noone has claimed to have seen one
    >> since the 1990s or so.

    >
    >The MSVCRT rand() in Visual Studio 10 uses an LCG with a 32-bit state, and
    >uses bits 16-31 of the state as the result, so the LSB will have a period
    >of at most 2^17.
    >
    >Whether or not that's "bad" is subjective, but the lower bits will have
    >shorter periods than the higher bits.
    >
    >If by "bad" you mean "returns the low bits of an LCG instead of the high
    >bits", then I haven't seen that since around 1993-94 (and the compiler
    >wasn't necessarily the latest version).


    In Visual Studio versions 2005 and later, rand_s() calls
    RtlGenRandom() which is available on Windows XP and later systems. It
    is a cryptographically secure (whatever that means these days) random
    number generator that generates a random number between 0 and UINT_MAX
    and doesn't depend on srand().
    Geoff, Aug 14, 2013
    #7
  8. Nobody Guest

    On Wed, 14 Aug 2013 14:34:38 -0700, Geoff wrote:

    > In Visual Studio versions 2005 and later, rand_s() calls
    > RtlGenRandom() which is available on Windows XP and later systems. It
    > is a cryptographically secure (whatever that means these days) random
    > number generator that generates a random number between 0 and UINT_MAX
    > and doesn't depend on srand().


    In fact, it doesn't seem to be able to be seeded at all (and it isn't
    clear whether it's actually pseudo-random or random), which makes it
    effectively useless for many applications requiring pseudo-random numbers.
    Nobody, Aug 15, 2013
    #8
  9. Phil Carmody Guest

    Jorgen Grahn <> writes:

    > On Tue, 2013-08-13, Phil Carmody wrote:
    > ...
    > > Conventional wisdom, from decades of experience, says that the lower
    > > bits tend to be at least as crappy as the upper bits, so shifting them
    > > out is better than masking them in.

    >
    > Decades of experience, or decades-old experience?


    The former, and fortunately enough of a memory for the latter too.

    > If you're saying you have a rand() with bad randomness in the low bits
    > /today/, I'm interested to learn which implementation that is. I've
    > asked a few times, but as far as I can recall noone has claimed to
    > have seen one since the 1990s or so.


    Well, lets try the system I'm typing on right now:
    gcc 4.6/eglibc6 2.15/linux 3.5/x86_64
    but I'm 100% sure I could reproduce these on systems which differ
    from that configuration in every possible way.


    dieharder on 8*268435456 bits of the 1u<<0 bit, packed into u8s:

    #=============================================================================#
    test_name |ntup| tsamples |psamples| p-value |Assessment
    #=============================================================================#
    diehard_operm5| 0| 1000000| 100|0.00000000| FAILED
    diehard_rank_6x8| 0| 100000| 100|0.00000000| FAILED
    diehard_oqso| 0| 2097152| 100|0.00000000| FAILED
    diehard_count_1s_byt| 0| 256000| 100|0.00000000| FAILED
    diehard_parking_lot| 0| 12000| 100|0.00000000| FAILED
    diehard_2dsphere| 2| 8000| 100|0.00000000| FAILED
    diehard_3dsphere| 3| 4000| 100|0.00000000| FAILED
    diehard_squeeze| 0| 100000| 100|0.00000000| FAILED
    marsaglia_tsang_gcd| 0| 10000000| 100|0.00000000| FAILED
    rgb_bitdist| 3| 100000| 100|0.00001119| WEAK
    rgb_bitdist| 4| 100000| 100|0.00000000| FAILED
    rgb_bitdist| 5| 100000| 100|0.00000000| FAILED
    rgb_bitdist| 6| 100000| 100|0.00000000| FAILED
    rgb_bitdist| 7| 100000| 100|0.00000000| FAILED
    rgb_bitdist| 8| 100000| 100|0.00007518| WEAK
    ....

    dieharder on 8*268435456 bits of the 1u<<14 bit, packed into u8s:

    #=============================================================================#
    test_name |ntup| tsamples |psamples| p-value |Assessment
    #=============================================================================#
    diehard_operm5| 0| 1000000| 100|0.54507141| PASSED
    diehard_rank_6x8| 0| 100000| 100|0.82359872| PASSED
    diehard_oqso| 0| 2097152| 100|0.19046187| PASSED
    diehard_count_1s_byt| 0| 256000| 100|0.67168735| PASSED
    diehard_parking_lot| 0| 12000| 100|0.41559726| PASSED
    diehard_2dsphere| 2| 8000| 100|0.18176928| PASSED
    diehard_3dsphere| 3| 4000| 100|0.00056702| WEAK
    diehard_squeeze| 0| 100000| 100|0.40881629| PASSED
    marsaglia_tsang_gcd| 0| 10000000| 100|0.00004519| WEAK
    rgb_bitdist| 3| 100000| 100|0.91896269| PASSED
    rgb_bitdist| 4| 100000| 100|0.19499995| PASSED
    rgb_bitdist| 5| 100000| 100|0.08335413| PASSED
    rgb_bitdist| 6| 100000| 100|0.56964832| PASSED
    rgb_bitdist| 7| 100000| 100|0.98227556| PASSED
    rgb_bitdist| 8| 100000| 100|0.26112239| PASSED
    ....


    People I know claim to see weak default PRNGs all the time, I have no
    idea where your 1990s claim comes from. You seriously need to reevaluate
    your presumptions in this direction.

    Phil
    --
    If "law-abiding citizens have nothing to fear" from privacy-invading
    technologies and policies, then law-abiding governments should have
    nothing to fear from whistleblowers.
    Phil Carmody, Aug 15, 2013
    #9
  10. Jorgen Grahn Guest

    On Wed, 2013-08-14, Nobody wrote:
    > On Tue, 13 Aug 2013 19:34:05 +0000, Jorgen Grahn wrote:
    >
    >> If you're saying you have a rand() with bad randomness in the low bits
    >> /today/, I'm interested to learn which implementation that is. I've asked
    >> a few times, but as far as I can recall noone has claimed to have seen one
    >> since the 1990s or so.

    >
    > The MSVCRT rand() in Visual Studio 10 uses an LCG with a 32-bit state, and
    > uses bits 16-31 of the state as the result, so the LSB will have a period
    > of at most 2^17.
    >
    > Whether or not that's "bad" is subjective, but the lower bits will have
    > shorter periods than the higher bits.
    >
    > If by "bad" you mean "returns the low bits of an LCG instead of the high
    > bits", then I haven't seen that since around 1993-94 (and the compiler
    > wasn't necessarily the latest version).


    I was looking more generally for a rand() where 'rand() % N' (adjusted
    for the case when RAND_MAX % N != 0) is not the best way to use it to
    generate a random number between 0 and N-1.

    So yes, your VS10 rand() qualifies.

    /Jorgen
    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Aug 17, 2013
    #10
  11. Jorgen Grahn Guest

    On Thu, 2013-08-15, Phil Carmody wrote:
    > Jorgen Grahn <> writes:
    >
    >> On Tue, 2013-08-13, Phil Carmody wrote:
    >> ...
    >> > Conventional wisdom, from decades of experience, says that the lower
    >> > bits tend to be at least as crappy as the upper bits, so shifting them
    >> > out is better than masking them in.

    >>
    >> Decades of experience, or decades-old experience?

    >
    > The former, and fortunately enough of a memory for the latter too.
    >
    >> If you're saying you have a rand() with bad randomness in the low bits
    >> /today/, I'm interested to learn which implementation that is. I've
    >> asked a few times, but as far as I can recall noone has claimed to
    >> have seen one since the 1990s or so.

    >
    > Well, lets try the system I'm typing on right now:
    > gcc 4.6/eglibc6 2.15/linux 3.5/x86_64
    > but I'm 100% sure I could reproduce these on systems which differ
    > from that configuration in every possible way.
    >
    >
    > dieharder on 8*268435456 bits of the 1u<<0 bit, packed into u8s:
    >
    > #==========================================================
    > test_name |ntup| tsamples |psamples| p-value |Assessment
    > #==========================================================
    > diehard_operm5| 0| 1000000| 100|0.00000000| FAILED
    > diehard_rank_6x8| 0| 100000| 100|0.00000000| FAILED

    [etc]

    This tells me nothing, I'm afraid. My problem starts at the Linux libc
    rand(3) manual, which says:

    The versions of rand() and srand() in the Linux C Library use the
    same random number generator as random(3) and srandom(3), so the
    lower-order bits should be as random as the higher-order bits.
    However, on older rand() implementations, and on current
    implementations on different systems, the lower-order bits are
    much less random than the higher- order bits. Do not use this
    function in applications intended to be portable when good
    randomness is needed. (Use random(3) instead.)

    (random(3) came from BSD Unix and is part of POSIX, although I'm not
    sure POSIX specifies the algorithm, size of the state and so on.)

    So the quoted text makes me wonder which these unnamed "current
    implementations on different systems" are. For example, lots of Unix
    clones with SYSV-based standard libraries have faded away into
    irrelevance (for most people) in the last few decades.

    > People I know claim to see weak default PRNGs all the time,


    Depends on what they mean by weak. I'm not expecting rand() to be
    suitable for cryptography, for example. I'm only after the "low-order
    bits are less random" claim.

    > I have no
    > idea where your 1990s claim comes from. You seriously need to reevaluate
    > your presumptions in this direction.


    The "but as far as I can recall noone has claimed to have seen one" is
    based on several earlier discussions like this one in comp.lang.c,
    comp.lang.c++ and comp.lang.c++.moderated, since 2003 or so.

    Looking through my archives, I see that
    - Keith Thompson reported that the Solaris legacy compiler /usr/ucb/cc
    links with a bad rand() [presumably the classic Unix one] but that
    he was unsure if you could even compile ANSI C with it.
    - James Kanze wrote (in 2004) that "quite some time ago" he used to
    see bad implementations "but the trend seemed to be to replacing
    them".
    - "Nobody" mentioned (in this thread) Visual Studio 10 where low bits
    have a shorter period, but IIUC it's not nearly as bad as the
    classic rand().

    That's the only concrete examples I've seen.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Aug 17, 2013
    #11
  12. James Kuyper Guest

    On 08/17/2013 06:38 AM, Jorgen Grahn wrote:
    ....
    > (random(3) came from BSD Unix and is part of POSIX, although I'm not
    > sure POSIX specifies the algorithm, size of the state and so on.)


    "The random() function shall use a non-linear additive feedback
    random-number generator employing a default state array size of 31 long
    integers to return successive pseudo-random numbers in the range from 0
    to 231-1. The period of this random-number generator is approximately 16
    x (231-1). The size of the state array determines the period of the
    random-number generator. Increasing the state array size shall increase
    the period.
    With 256 bytes of state information, the period of the random-number
    generator shall be greater than 269.

    Like rand(), random() shall produce by default a sequence of numbers
    that can be duplicated by calling srandom() with 1 as the seed.

    The srandom() function shall initialize the current state array using
    the value of seed.

    The initstate() and setstate() functions handle restarting and changing
    random-number generators. The initstate() function allows a state array,
    pointed to by the state argument, to be initialized for future use. The
    size argument, which specifies the size in bytes of the state array,
    shall be used by initstate() to decide what type of random-number
    generator to use; the larger the state array, the more random the
    numbers. Values for the amount of state information are 8, 32, 64, 128,
    and 256 bytes. Other values greater than 8 bytes are rounded down to the
    nearest one of these values. If initstate() is called with 8<=size<32,
    then random() shall use a simple linear congruential random number
    generator. The seed argument specifies a starting point for the
    random-number sequence and provides for restarting at the same point.
    The initstate() function shall return a pointer to the previous state
    information array.

    If initstate() has not been called, then random() shall behave as though
    initstate() had been called with seed=1 and size=128."
    --
    James Kuyper
    James Kuyper, Aug 17, 2013
    #12
  13. James Kuyper Guest

    On 08/17/2013 07:51 AM, James Kuyper wrote:
    ....
    > integers to return successive pseudo-random numbers in the range from 0
    > to 231-1. The period of this random-number generator is approximately 16
    > x (231-1). The size of the state array determines the period of the
    > random-number generator. Increasing the state array size shall increase
    > the period.
    > With 256 bytes of state information, the period of the random-number
    > generator shall be greater than 269. ...


    That was a cut-and-paste, and I didn't notice some formatting that
    didn't paste properly 231 => 2^31, 269=>2^69
    James Kuyper, Aug 19, 2013
    #13
  14. Nobody Guest

    On Sat, 17 Aug 2013 10:38:18 +0000, Jorgen Grahn wrote:

    > - "Nobody" mentioned (in this thread) Visual Studio 10 where low bits
    > have a shorter period, but IIUC it's not nearly as bad as the
    > classic rand().


    That depends upon which version you consider "the classic rand()".

    If you're referring to RANDU, then almost nothing's that bad. But that
    hasn't been used for decades.

    Many early C implementations (and some current implementations) used a
    32-bit LCG. Many (most?) returned bits 16-30, although at least one
    returned bits 0-15.

    rand() has a return type of "int" and is defined to return non-negative
    values, hence the 15-bit limit on 16-bit systems. On early 32-bit systems,
    some implementations returned all bits and left the choice of which bits
    were good enough to the programmer (who would often take the lower bits by
    mistake).

    All LCGs inevitably have the issue that the lower bits have shorter
    periods than the higher bits, as carries propagate from low to high. If
    enough of the low bits are ignored, the period of the lowest returned bit
    may be high enough, but it will still only be half the period of the next
    bit.

    Similarly, all LCGs have the "hyperplane" property, i.e. creating an
    N-tuple from N consecutive samples will result in tuples which lie on at
    most (2^K)^(1/N) hyperplanes, where K is the number of state bits (not
    returned bits; e.g. for a 48-bit LCG which returns 32 bits, 4-tuples will
    lie in 2^12 hyperplanes, not 2^8.

    This much seems to be widely understood now, at least by the compiler
    vendors. rand() is often still implemented using an LCG for simplicity,
    but it's unlikely to have defects which make it behave worse than the
    theoretical limits for an LCG of its size.
    Nobody, Aug 19, 2013
    #14
  15. Tim Rentsch Guest

    Phil Carmody <> writes:

    > Tim Rentsch <> writes:
    >> writes:

    > ...
    >> A better approach is to generate an appropriate value directly in
    >> unsigned char units, eg
    >>
    >> #include <limits.h>
    >>
    >> #define INT_BIT /* .. number of bits in INT_MAX .. */
    >> /* This can be done directly from INT_MAX using
    >> the well known IMAX_BITS macro */
    >>
    >> int
    >> uniform_nonnegative_int(){
    >> unsigned mask_shift = (CHAR_BIT-1) - (INT_BIT-1) % CHAR_BIT;
    >> unsigned mask = (unsigned char) -1 >> mask_shift;
    >> int r = mask & (unsigned char) rand();
    >> int i = (INT_BIT-1) / CHAR_BIT;
    >>
    >> while( i-- > 0 ) r = r << CHAR_BIT + (unsigned char) rand();

    >
    > Conventional wisdom, from decades of experience, says that the lower
    > bits tend to be at least as crappy as the upper bits, so shifting
    > them out is better than masking them in. I would also presume the
    > "planes" property (relations between subseqent terms) is less likely
    > to kick in too, but don't have the mathematical teeth to prove such
    > a claim.


    Implementations of rand() may be of arbitrarily low quality, and I
    grant you that some are pretty bad. Even so, I consider it a
    disservice to promulgate folklore concerning deficiencies in rand(),
    even if that folklore is known to have been (or even still be) correct
    for some implementations. If one doesn't mind getting low quality
    random numbers, the code might just as well use rand() and treat it
    as if all its outputs bits are equally good. If one does mind getting
    low quality random numbers, the only sensible path is not to use
    rand() at all, and use something else that does have high quality
    results across all bit positions.
    Tim Rentsch, Aug 21, 2013
    #15
  16. Phil Carmody Guest

    Tim Rentsch <> writes:
    > Phil Carmody <> writes:
    > > Tim Rentsch <> writes:
    > >> writes:

    > > ...
    > >> A better approach is to generate an appropriate value directly in
    > >> unsigned char units, eg
    > >>
    > >> #include <limits.h>
    > >>
    > >> #define INT_BIT /* .. number of bits in INT_MAX .. */
    > >> /* This can be done directly from INT_MAX using
    > >> the well known IMAX_BITS macro */
    > >>
    > >> int
    > >> uniform_nonnegative_int(){
    > >> unsigned mask_shift = (CHAR_BIT-1) - (INT_BIT-1) % CHAR_BIT;
    > >> unsigned mask = (unsigned char) -1 >> mask_shift;
    > >> int r = mask & (unsigned char) rand();
    > >> int i = (INT_BIT-1) / CHAR_BIT;
    > >>
    > >> while( i-- > 0 ) r = r << CHAR_BIT + (unsigned char) rand();

    > >
    > > Conventional wisdom, from decades of experience, says that the lower
    > > bits tend to be at least as crappy as the upper bits, so shifting
    > > them out is better than masking them in. I would also presume the
    > > "planes" property (relations between subseqent terms) is less likely
    > > to kick in too, but don't have the mathematical teeth to prove such
    > > a claim.

    >
    > Implementations of rand() may be of arbitrarily low quality, and I
    > grant you that some are pretty bad. Even so, I consider it a
    > disservice to promulgate folklore concerning deficiencies in rand(),
    > even if that folklore is known to have been (or even still be) correct
    > for some implementations.


    I consider it a public service. My advice is free and worth at least
    twice what you paid for it.

    > If one doesn't mind getting low quality
    > random numbers, the code might just as well use rand() and treat it
    > as if all its outputs bits are equally good. If one does mind getting
    > low quality random numbers, the only sensible path is not to use
    > rand() at all, and use something else that does have high quality
    > results across all bit positions.


    I agree that those who care about their random should use a generator that
    is suitable to their specific needs, but should also understand what those
    needs are, and why the generator is suitable. I treat "use of the Mersenne
    Twister" as a 99% reliable indicator that the programmer doesn't have a
    clue about random numbers, for example. (OK, it's about 10/10 reliable,
    but someone did inform me that he encountered one project that did actually
    use it correctly once.)

    Phil
    --
    If "law-abiding citizens have nothing to fear" from privacy-invading
    technologies and policies, then law-abiding governments should have
    nothing to fear from whistleblowers.
    Phil Carmody, Aug 23, 2013
    #16
  17. James Kuyper Guest

    On 08/23/2013 04:42 AM, Phil Carmody wrote:
    ....
    > needs are, and why the generator is suitable. I treat "use of the Mersenne
    > Twister" as a 99% reliable indicator that the programmer doesn't have a
    > clue about random numbers, for example.


    If the programmer doesn't have a clue about random numbers, where did he
    hear about Mersenne Twister? A person completely clueless about random
    numbers wouldn't even know about rand(), much less srand(), because that
    counts as at least one clue.
    --
    James Kuyper
    James Kuyper, Aug 23, 2013
    #17
  18. Phil Carmody <> writes:
    [...]
    > I agree that those who care about their random should use a generator that
    > is suitable to their specific needs, but should also understand what those
    > needs are, and why the generator is suitable. I treat "use of the Mersenne
    > Twister" as a 99% reliable indicator that the programmer doesn't have a
    > clue about random numbers, for example. (OK, it's about 10/10 reliable,
    > but someone did inform me that he encountered one project that did actually
    > use it correctly once.)


    Can you briefly summarize (or provide a citation for) the problems with
    the Mersenne Twister? The Wikipedia article has a small "Disadvantages"
    section, but I don't see anything there that suggests it's as
    problematic as you say.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Aug 23, 2013
    #18
  19. James Kuyper Guest

    On 08/23/2013 10:55 AM, Phil Carmody wrote:
    > James Kuyper <> writes:
    >
    >> On 08/23/2013 04:42 AM, Phil Carmody wrote:
    >> ...
    >>> needs are, and why the generator is suitable. I treat "use of the Mersenne
    >>> Twister" as a 99% reliable indicator that the programmer doesn't have a
    >>> clue about random numbers, for example.

    >>
    >> If the programmer doesn't have a clue about random numbers, where did he
    >> hear about Mersenne Twister? A person completely clueless about random
    >> numbers wouldn't even know about rand(), much less srand(), because that
    >> counts as at least one clue.

    >
    > So you've never heard the phrase "A little knowledge is a dangerous thing"?


    Yes, of course, but "clueless" is zero knowledge, not "a little knowledge".
    James Kuyper, Aug 23, 2013
    #19
  20. Öö Tiib Guest

    On Friday, 23 August 2013 18:32:16 UTC+3, Keith Thompson wrote:
    > Phil Carmody <> writes:
    > [...]
    > > I agree that those who care about their random should use a generator that
    > > is suitable to their specific needs, but should also understand what those
    > > needs are, and why the generator is suitable. I treat "use of the Mersenne
    > > Twister" as a 99% reliable indicator that the programmer doesn't have a
    > > clue about random numbers, for example. (OK, it's about 10/10 reliable,
    > > but someone did inform me that he encountered one project that did actually
    > > use it correctly once.)

    >
    > Can you briefly summarize (or provide a citation for) the problems with
    > the Mersenne Twister? The Wikipedia article has a small "Disadvantages"
    > section, but I don't see anything there that suggests it's as
    > problematic as you say.


    It is often (I would agree with Phil that almost idiomatically) seeded
    with wrong things like 'time(0)' that Wikipedia's article mentions
    several times: "The Mersenne Twister is sensitive to poor initialization
    and can take a long time to recover from a zero-excess initial state."
    Öö Tiib, Aug 23, 2013
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Avner
    Replies:
    1
    Views:
    558
    Larry I Smith
    Oct 15, 2005
  2. Jeff.M
    Replies:
    6
    Views:
    171
    Lasse Reichstein Nielsen
    May 4, 2009
  3. Ben Bacarisse

    Re: Random 64 bit number in stdc

    Ben Bacarisse, Aug 7, 2013, in forum: C Programming
    Replies:
    2
    Views:
    194
    Eric Sosman
    Aug 7, 2013
  4. James Dow Allen

    Re: Random 64 bit number in stdc

    James Dow Allen, Aug 19, 2013, in forum: C Programming
    Replies:
    18
    Views:
    281
    James Dow Allen
    Aug 21, 2013
  5. Chris DH

    Re: Random 64 bit number in stdc

    Chris DH, Aug 25, 2013, in forum: C Programming
    Replies:
    29
    Views:
    370
    Keith Thompson
    Sep 7, 2013
Loading...

Share This Page