Problem with rand() % range+1

  • Thread starter Rafael Cunha de Almeida
  • Start date
R

Rafael Cunha de Almeida

Hi,

I've found several sites on google telling me that I shouldn't use

rand() % range+1

and I should, instead, use something like:

lowest+int(range*rand()/(RAND_MAX + 1.0))

They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?
 
P

peter koch

Hi,

I've found several sites on google telling me that I shouldn't use

        rand() % range+1

and I should, instead, use something like:

        lowest+int(range*rand()/(RAND_MAX + 1.0))

They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?

Take an example of a random generator delivering numbers in the range
0..99. If you want numbers in the range 0..79, your solution would
give an expectancy of numbers 0..19 twice as high as the range 20..79.
So your method is bad if you require an even distribution (but might
be ok if you just want somewhat random distribution as eg for a game).
The other method is better, but not perfect.

/Peter
 
P

peter koch

Well, the other method is better in that it makes the non-randomness
less obvious. <g> But with the same sample ranges it still produces
twenty values twice as often as the rest; it's just that they're not
the twenty lowest ones any more.
You are right - the other method also produces twenty numbers twice as
frequent as the first one - just spread evenly. To my embarassment, I
must admit that I did not realise this when I replied. So much for
perfection!

/Peter
 
A

acehreli

The number of values that you start with has to be an exact multiple of
the number of values that you want to get. The way to do that is to
throw away some values from rand(), namely, those that exceed the
maximum multiple of range that's less than RAND_MAX+1. Like this:

int mult = (RAND_MAX + 1) / range;

There is a problem if RAND_MAX equals INT_MAX. Adding one would wrap
the value and mult would become zero. So RAND_MAX could be casted to a
suitable type and I think 'unsigned int' works. Or an unsigned 1 would
work:

int mult = (RAND_MAX + 1u) / range;

I remember reading about this algorithm and that correction to it by
Andrew Koenig on clc++m posts.
int max = range * mult;
int temp = rand();
while (max < temp)
temp = rand();
return temp % range;

Ali
 
Z

zaimoni

Hi,

I've found several sites on google telling me that I shouldn't use

rand() % range+1

and I should, instead, use something like:

lowest+int(range*rand()/(RAND_MAX + 1.0))

Or function doing roughly the equivalent using integer math, which
might be useful if range is large (to avoid signed integer overflow).
They fail to make it clear why. There seems to be less randomness in the
lower bits or something like that. I don't know what less randomness
would mean. Would it not be normally distributed or something like that?

If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). Directly verifying algebraically is harder,
but a standard homework exercise in the field. The lowest separation
required to demonstrate correlation is usually two through six
invocations apart.

More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]

It's moderately unusual to use a high-quality RNG to implement rand().

The overall results would still be as approximately uniformly
distributed as rand(), except for unavoidable discretization errors as
mentioned in earlier replies (which wouldn't be that large for small
moduli since RAND_MAX should be at least 32767).
 
Z

zaimoni

More sophisticated algorithms are harder to verify, but reputedly have
similar problems unless explicitly designed otherwise. [As an extreme
case, the Mersenne Twister has been formally verified to not to be
correlated across 637 consecutive invocations.]

Wrong: 623 consecutive invocations. Please accept my apologies.
 
J

Jorgen Grahn

If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). ....
It's moderately unusual to use a high-quality RNG to implement rand().

"High-quality" is a bit too vague here. Are you saying that rand() is
still broken in the low-order bits in many libraries? (Yeah, "broken"
is also vague, but you know what I mean.)

The Linux rand(3) manual says this:

The versions of rand() and srand() in the Linux C Library use
the same random number generator as random() and srandom(), so
the lower-order bits should be as random as the higher-order
bits. However, on older rand() implementations, and on current
implementations on different systems, the lower-order bits are
much less random than the higher- order bits. Do not use this
function in applications intended to be portable when good
randomness is needed.

but I have never seen anyone explicitly list over current
implementation with this problem. I kind of assumed they were gone by
now. Or at least, I wouldn't be surprised if they were, and what used
to be a fact ended up as urban legent.

See also a discussion in and around Message-ID:
<[email protected]> back in 2003.

/Jorgen
 
J

Jorgen Grahn

If rand() is implemented with a linear congruential generator f(x)=ax
+b mod m, then it is a relatively easy automated check that the lower
bits are in general highly correlated (given the first, the second is
somewhat predictable). ....
It's moderately unusual to use a high-quality RNG to implement rand().

"High-quality" is a bit too vague here. Are you saying that rand() is
still broken in the low-order bits in many libraries? (Yeah, "broken"
is also vague, but you know what I mean.)

The Linux rand(3) manual says this:

The versions of rand() and srand() in the Linux C Library use
the same random number generator as random() and srandom(), so
the lower-order bits should be as random as the higher-order
bits. However, on older rand() implementations, and on current
implementations on different systems, the lower-order bits are
much less random than the higher- order bits. Do not use this
function in applications intended to be portable when good
randomness is needed.

but I have never seen anyone explicitly list over current
implementation with this problem. I kind of assumed they were gone by
now. Or at least, I wouldn't be surprised if they were, and what used
to be a fact ended up as urban legent.

See also a discussion in and around Message-ID:
<[email protected]> back in 2003.

/Jorgen
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,042
Latest member
icassiem

Latest Threads

Top