# How can get random number in your range

Discussion in 'C Programming' started by bipi, Jul 27, 2007.

1. ### bipiGuest

Dear all,

I found function rand(), it can create random number but this function
can not define the range of number which I want to get it, such as, I
want to get random number in the range from 0 to 100 [0-100].

Many thanks,

bipi, Jul 27, 2007

2. ### jacob naviaGuest

13.16: How can I get random integers in a certain range?

A: The obvious way,

rand() % N /* POOR */

(which tries to return numbers from 0 to N-1) is poor, because
the low-order bits of many random number generators are
distressingly *non*-random. (See question 13.18.) A better
method is something like

(int)((double)rand() / ((double)RAND_MAX + 1) * N)

If you're worried about using floating point, you could use

rand() / (RAND_MAX / N + 1)

Both methods obviously require knowing RAND_MAX (which ANSI
#defines in <stdlib.h>), and assume that N is much less than
RAND_MAX.

(Note, by the way, that RAND_MAX is a *constant* telling you
what the fixed range of the C library rand() function is. You
cannot set RAND_MAX to some other value, and there is no way of
requesting that rand() return numbers in some other range.)

If you're starting with a random number generator which returns
floating-point values between 0 and 1, all you have to do to get
integers from 0 to N-1 is multiply the output of that generator
by N.

References: K&R2 Sec. 7.8.7 p. 168; PCS Sec. 11 p. 172.

jacob navia, Jul 27, 2007

3. ### Peter J. HolzerGuest

See the FAQ, question 13.16.

hp

Peter J. Holzer, Jul 27, 2007
4. ### Ben BacarisseGuest

It may be worth adding to this FAQ entry. rand() % N is poor even in
some cases where the low-order bits are fine. If N does not divide
RAND_MAX + 1, there will be a bias towards the numbers between 0 and
RAND_MAX % N. The bias is small but grows as N gets closer to
RAND_MAX.

Ben Bacarisse, Jul 27, 2007
5. ### Keith ThompsonGuest

[snip]

jacob, thanks for making use of the FAQ, but surely it would be better
to provide a *pointer* to it rather than just quoting it. And you
forgot to mention what you were quoting, which fails to give credit to
the FAQ and its author and potentially denies the OP the opportunity
to read the rest of it (not that it's difficult to find).

When I cite the FAQ, I usually just provide the FAQ's URL
<http://www.c-faq.com/> and a question number (13.16 in this case).

Ben Bacarisse also made a good point about another problem with using
one result from rand() to generate one number is a specified range.
If there are, for example, 32768 possible results (RAND_MAX==32767),
and you want numbers in the range 0..99, you can't possibly produce a
uniform distribution, because 32768 is not a multiple of 100. In some
applications, the slight skew is acceptable. If it isn't, you can
reduce the range to 32700 possible values by rejecting any results in
the range 32700..32767, and map 327 distinct rand() results to each
value in the range 0..9. With a bit of work, the method can be
generalized to arbitrary ranges and arbitrary values of RAND_MAX, as
long as the desired range contains no more than RAND_MAX values. In
the very worst case, this will reject fewer than half of the values
returned by rand(), on average. This is random, of course, so you
might wait arbitrarily long for a valid result.

All this assumes that the rand() implementation is good enough that
all this work is worthwhile.

With some more arithmetic, I *think* there are ways to get a sequence
of results in a specified range from a sequence of results in
different specified range without rejecting anything. That's probably
more than the OP needs, though, and I'm too lazy to work it out.

Keith Thompson, Jul 27, 2007
6. ### Walter RobersonGuest

couple of days ago,

"After that... find some power of 2 that is "just a bit" above a
multiple of r (minimize the difference between the power of 2
and the multiple), mask off that many bits from the returned
random number, then reject the result if it is in the upper portion
of the range (between the last full multiple and 2**N - 1)."

I didn't specifically speak of bias the way Ben did, but that
bias is the reason for the stated algorithm.

Walter Roberson, Jul 27, 2007
7. ### Keith ThompsonGuest

I think for best effect (fewest rejected random numbers), you'd want
to miminize the *ratio* of the chosen power of 2 and a multiple of the
size of your range, not the difference.

Using a power of 2, I think, makes part of the operation quicker (you
can use bitwise operations rather than division), but I think that by
ignoring powers of 2 you can reject fewer of the input random numbers.

I think you can avoid rejecting *any* input random numbers at the
expense of some fairly heavy calculations. For example, if your
desired range has 100 elements, you can treat a sequence of, say, 1000
16-bit numbers as a single, very large, base-100 number, and grab one
"digit" at a time. Generalizing this to arbitrarily long sequences is
likely to require operations on arbitrarily large integer (bignums),
and will be worthwhile only if the input random numbers are
*extremely* expensive, or if you absolutely can't afford to reject too
many in a row. But for such an application, it would probably be much
easier to write your own random number generator in the first place.

If RAND_MAX+1 is a multiple of the number of elements in your desired
range, all these problems go away.

Keith Thompson, Jul 27, 2007
8. ### Army1987Guest

This is already pointed out in the Web version of the FAQ.
http://www.c-faq.com/lib/randrange.html
Also, if N is very small, if it isn't a divisor of RAND_MAX + 1,
then RAND_MAX % N is better than if it is a divisor of it.
For example, RAND_MAX % 2 alternates in some RNG, but RAND_MAX % 3
uses all of the bits, so it is less predictable.

Army1987, Aug 1, 2007
9. ### Ben BacarisseGuest

I can't find it. I am happy to assume it is me and leave it at that.
It is not a major issue, so if it is covered somewhere, that is more

Ben Bacarisse, Aug 1, 2007
10. ### Keith ThompsonGuest

It's covered in the cited web page (which is question 13.16 of the FAQ):

When N is close to RAND_MAX, and if the range of the random number
generator is not a multiple of N (i.e. if (RAND_MAX+1) % N != 0),
all of these methods break down: some outputs occur more often
than others. (Using floating point does not help; the problem is
that rand returns RAND_MAX+1 distinct values, which cannot always
be evenly divvied up into N buckets.) If this is a problem, about
the only thing you can do is to call rand multiple times,

unsigned int x = (RAND_MAX + 1u) / N;
unsigned int y = x * N;
unsigned int r;
do {
r = rand();
} while(r >= y);
return r / x;

Keith Thompson, Aug 1, 2007
11. ### websnarfGuest

The FAQ is deceptive and incomplete on this issue (and what the hell
is drand48?). You can get a better explanation here:

http://www.pobox.com/~qed/random.html

websnarf, Aug 2, 2007
12. ### Ben BacarisseGuest

So it is. Must pay more attention.

Ben Bacarisse, Aug 2, 2007
13. ### Keith ThompsonGuest

How so?
That's answered in question 13.21, which question 13.16 directly
refers to.
You raise 3 major points on that page.

1. If the desired range does not divide the range of values returned
by rand() a bias is introduced. FAQ 13.16 directly addresses that.

2. The quality of rand() is often not very good. FAQ 13.16 also

3. There are problems if the desired range is wider than RAND_MAX. I
think that's a valid criticism, though it certainly doesn't justify
using the word "deceptive". (And in this particular context, the OP
wanted range of 0..100, and RAND_MAX is guaranteed to be at least
32767.)

Your page appears to contain a lot of good information, and I hope to
have the time to read the whole thing at some point, but it's probably
more than would be appropriate for the clc FAQ. Adding a pointer to
<http://www.c-faq.com/lib/sd15.html> might be a good idea.

Perhaps the FAQ has been updated since the last time you read it?

Keith Thompson, Aug 2, 2007
14. ### CBFalconerGuest

However people should be aware that some RNGs will never return the
value 0. If this is so, and it is probably not documented, then
you need some additional constructs to ensure even handed
operation.

The lack of a zero is the result of a shift algorithm, which will
lock up on the value 0.

CBFalconer, Aug 2, 2007
15. ### websnarfGuest

It gives an "answer", that is insufficient, and accompanied by
incorrect explanations.
Right, and its off topic for this newsgroup, right? I mean, what is
it even doing in the FAQ?
The FAQ suggests that if the range is small enough, then you are ok.
This is untrue if accuracy is required, regardless of of what the
range is. Specifically, if you take 1000 * (RAND_MAX / RANGE)
samples, you will see a statistically significant deviation from the
samples from the techniques they suggest and from a proper evenly
distributed random sample. This is tantamount to the FAQ saying
"don't sample this too much". On modern machines, given that RAND_MAX
is commonly 32767, that number of samples is going to pretty much
always be "smallish".
It addresses the wrong things. In order to get more bits out of it,
you have to call it a successive number of times, however, rand()
implementations with too small of a state cannot produce all possible
permutations of successive calls -- in fact successive calls are
typically deterministic after 32 bits are output. For this reason, to
be ideal, you need to use a PRNG that has a large enough state (such
as the Mersenne Twister) to avoid this issue.

If you know anything about computer graphics; using a small state
rand() successively is like using (stochastic) dithering for images,
while using a large state rand() is like having truly higher
resolution. Having the higher resolution is generally better than
dithering which is basically trying to gloss over the fundamental
weakness of having lower resolution.
It said there are problems if N is close to RAND_MAX (they should
point out that its devastatingly wrong if RAND_MAX > N > RAND_MAX/2),
which includes N being slightly larger than RAND_MAX. If it said "if
N approaches RAND_MAX from below" or something like that you could
almost forgive them, but they didn't.
Give a man a fish vs teaching him to fish ...
Well the FAQ could simply give my implementation of randbiased () and
randrange () (the two together are not prohibitively long) and say
that why it works is an exercise to the reader, or point to my page
Looks like more half-answer kind of things going on in there. It kind
of reminds me of being forced to vote for the democratic nominee or
the republican nominee instead of the candidate you actually want for
the office.

It was, but insufficiently.

websnarf, Aug 2, 2007
16. ### Ernie WrightGuest

<ot>
I know something about graphics, and this analogy misses the mark.

Higher resolution with uniform samples is like a bad PRNG with lots of
bits. Stochastic sampling at lower resolution is like a good PRNG with
fewer bits. Uniform sampling produces aliasing; higher resolution just
changes the artifact wavelengths. Stochastic sampling randomizes (!)
the unavoidable error of discrete sampling, trading aliasing for much
less visually objectionable noise.

They aren't mutually exclusive, either. Just like you'd want your PRNG
with lots of bits to be good, you can stochastically sample at high
resolution. But for CG, if I have to choose, I'll take sampling quality
over raw resolution.

See

http://graphics.pixar.com/StochasticSampling/paper.pdf

So I think you need a different analogy.
</ot>

- Ernie http://home.comcast.net/~erniew

Ernie Wright, Aug 2, 2007
17. ### David ThompsonGuest

This is a recurring topic on sci.crypt, where it is sometimes applied
to _true_ random sources (which can be expensive, at least in time) as
well as PRNGs.

IIRC the final outcome of the most recent round of discussions was to
immediately use (uniformly) values below the highest exact multiple of
R not over RAND_RANGE, as you noted upthread, and use 'rejected'
values over that as base RAND_RANGE-N*R digits collected into an
auxiliary accumulator, from which a value is taken when the
digits=entropy in it get up to R and any overflow reused similarly.

Unfortunately that group has been under a massive hipcrime attack for
about two months now, which may make finding any useful information
difficult, unless gooja has done an atypically good job of filtering.

- formerly david.thompson1 || achar(64) || worldnet.att.net

David Thompson, Aug 26, 2007