Quality of rand()

T

Tom McCallum

Can any tell me why rand() is such a bad pseudo-random number generator.
In all the articles I have read they say that you can predict the outcome
of rand() but when I used its output with NIST's random number testsuite
it passed all of the tests thrown at it for randomness. Can anyone
suggest a test I could use that it would fail? A pointer to a suitable
article would be appreciated.

Thanks for your help in advance,

Tom
 
I

Ivan Vecerina

Tom McCallum said:
Can any tell me why rand() is such a bad pseudo-random number generator.
In all the articles I have read they say that you can predict the outcome
of rand() but when I used its output with NIST's random number testsuite
it passed all of the tests thrown at it for randomness. Can anyone
suggest a test I could use that it would fail? A pointer to a suitable
article would be appreciated.
You cannot make a general statement about the quality of rand, as its actual
performance and properties will vary from one library implementation to the
other.
But as a rule of thumb, rand() is designed to be fast and uses a simple
Linear Congruential Generator. Historically, there has also been many
complaints that the LCG constants used by some implementations were poor.
So the real problem is: you cannot rely on rand() producing good quality
random numbers.

For an intro to LCG and other random number generators, see for example
http://www.taygeta.com/rwalks/node1.html, or read Knuth's TAOCP
http://www-cs-faculty.stanford.edu/~knuth/taocp.html.

Games and many casual applications can do with a 'predictable'
random number generator (RNG), but many security-sensitive
applications (network protocols, encryption) require truly
random sources. I'm not familiar with the NIST suite you used,
does it look for things such as attractors and bits of randomness?
The following article may provide an interesting illustration
of tests that evaluate the security weaknesses of a RNGs:
http://lcamtuf.coredump.cx/newtcp/

hth -Ivan
 
T

Tom McCallum

You cannot make a general statement about the quality of rand, as its
actual
performance and properties will vary from one library implementation to
the
other.
But as a rule of thumb, rand() is designed to be fast and uses a simple
Linear Congruential Generator. Historically, there has also been many
complaints that the LCG constants used by some implementations were poor.
So the real problem is: you cannot rely on rand() producing good quality
random numbers.

For an intro to LCG and other random number generators, see for example
http://www.taygeta.com/rwalks/node1.html, or read Knuth's TAOCP
http://www-cs-faculty.stanford.edu/~knuth/taocp.html.

Games and many casual applications can do with a 'predictable'
random number generator (RNG), but many security-sensitive
applications (network protocols, encryption) require truly
random sources. I'm not familiar with the NIST suite you used,
does it look for things such as attractors and bits of randomness?
The following article may provide an interesting illustration
of tests that evaluate the security weaknesses of a RNGs:
http://lcamtuf.coredump.cx/newtcp/

hth -Ivan

Thanks for your reply, in answer to your question the NIST suite uses the
following tests:
[01] Frequency [02] Block Frequency
[03] Cumulative Sums [04] Runs
[05] Longest Run of Ones [06] Rank
[07] Discrete Fourier Transform [08] Nonperiodic Template
Matchings
[09] Overlapping Template Matchings [10] Universal Statistical
[11] Approximate Entropy [12] Random Excursions
[13] Random Excursions Variant [14] Serial
[15] Lempel-Ziv Complexity [16] Linear Complexity

I am not sure if these cover 'attractors and bits of randomness' but as
far as I can tell they seem to be a reasonable collection.

Cheers

Tom
 
M

Marco Manfredini

Tom said:
Can any tell me why rand() is such a bad pseudo-random number
generator. In all the articles I have read they say that you can
predict the outcome of rand() but when I used its output with NIST's
random number testsuite

You are mixing up two different things. The outcome of rand() is
predictable for the sheer reason, that it is implementing an
deterministic algorithm. From a given starting configuration (which may
be unknown) it generates number that look random and pass the tests.
Knowing the algorithm, it can be concluded from a couple of given
values, what will follow, but the property 'how hard is it to predict
the next number if you consider that rand() is deterministic' is
nothing that a randomness test checks (and it would be too hard to be
done, btw.), because it doesn't play a role for the usual purposes of a
random generator, the numbers alone should look random, that's enough.

The notable exception is cryptography, where "real" randomness is a
security issue.

Marco
 
J

Joe C

Tom McCallum said:
Can any tell me why rand() is such a bad pseudo-random number generator.
In all the articles I have read they say that you can predict the outcome
of rand() but when I used its output with NIST's random number testsuite
it passed all of the tests thrown at it for randomness.

This is curious. I tested rand()'s(gcc 3.2) output some time back using the
Diehard battery of tests, and it failed miserably, giving p-values of 1.0000
for loads of the tests. Can you show me the program you used to write your
random data and tell me what compiler you are using? Thanks, Joe
 
O

osmium

Tom said:
Can any tell me why rand() is such a bad pseudo-random number generator.
In all the articles I have read they say that you can predict the outcome
of rand() but when I used its output with NIST's random number testsuite
it passed all of the tests thrown at it for randomness. Can anyone
suggest a test I could use that it would fail? A pointer to a suitable
article would be appreciated.

The problem is that rand it not an algorithm, it is the name of a function.
Writers who say such things are showing off how erudite they are, a bit like
correcting soneone when someone say "random number generator" and the
sophisticated person responds, "Of course this uninformed doofus actually
means *pseudo* random number generator.

Nowhere is it written that rand() must be bad, or is necessarily bad or that
any of the current generators are necessarily bad. What they really mean
is:"Once upon a time there was a bad implementation of a PRNG that was
called by rand(). Being able to predict the output of a generator is -
given sufficient effort - , a property of *all* PRNGs, it goes with the
territory.
 
K

Karl Heinz Buchegger

Joe said:
This is curious.

Not at all. The standard doesn't dictate which formula to use for rand()
There are good ones and there are bad ones.
 
I

Ivan Vecerina

Tom McCallum said:
Thanks for your reply, in answer to your question the NIST suite uses the
following tests: ....
I am not sure if these cover 'attractors and bits of randomness' but as
far as I can tell they seem to be a reasonable collection.

If the rand() implementation you are using passes the NIST tests,
it is likely to be reasonably good for many casual applications.
But the scope of these tests is quite limited (and on another platform,
rand() may not perform as well).

I really think that you should read the TCP/IP spoofing article I sent a
link to, or even better, the original article it refers to:
http://minilien.com/?LtytjTcByE (.pdf, describes the approach in detail)
To summarize things: the plots are basically the position of points whose
coordinates are generated from consecutive values obtained from a "random"
source. A truly random source would generate a uniform cloud of points. When
points tend to concentrate in some areas ("attractors"), it is obvious that
the data source is not as random as it would seem: the attractors allow you
to make a guess of the next value if the previous sequence is similar to a
previously encountered one. For a given analysis technique applied to a
source stream of data, the actual bits of randomness are defined by your
probability of making a correct guess.

Going back to your original question and rand() itself, let's see how easy
it would be to predict the output of rand():
First of all, unless you call srand(), you will notice that the sequence of
values returned by rand() is always the same -- and this is a requirement of
the ISO C/C++ standard.
If you do call srand() with a given value, the resulting sequence is
required to always be the same. So the set of possible sequences is limited
to the set of values your program may pass to srand().
Now let's assume you are actually calling srand() with a truly random 32-bit
value:
On a cheap hard disk, with access to your library implementation, it might
take me less than an hour to store a look-up table that matches all the
initial value(s) returned by rand() to a seed value passed to srand().
Given the first value(s) returned by rand(), I could then instantly compute
the seed and generate/predict the rest of the random sequence.

So would you rely on rand() to implement any security-sensitive application
?

Cheers, Ivan
 
K

Kelsey Bjarnason

[snips]
Nowhere is it written that rand() must be bad, or is necessarily bad or that
any of the current generators are necessarily bad. What they really mean
is:"Once upon a time there was a bad implementation of a PRNG that was
called by rand().

Not quite. The issue is that since there are no guarantees about the
performance or characteristics of rand(), an implementation can get away
with very poor quality PRNGs if it wants - and some have. As a result
of this lack of guarantees, one cannot simultaneously assume that a)
rand() will be good and b) code relying on rand() being good will be
readily usable under other implementations.

Net result: if it's okay to possibly wind up with a poor PRNG, go ahead
and use rand(). If you need something that is going to produce better
results, don't rely on rand(), use something else.
 
A

AngleWyrm

Kelsey Bjarnason said:
Not quite. The issue is that since there are no guarantees about the
performance or characteristics of rand(), an implementation can get away
with very poor quality PRNGs if it wants - and some have. As a result
of this lack of guarantees, one cannot simultaneously assume that a)
rand() will be good and b) code relying on rand() being good will be
readily usable under other implementations.

Net result: if it's okay to possibly wind up with a poor PRNG, go ahead
and use rand(). If you need something that is going to produce better
results, don't rely on rand(), use something else.

One of the problems with the standard implementation of rand() is that on
most systems it uses a fast formula called Linear Congruential. This
formula, when coupled with a time seed, produces a sequential number as it's
first output after seeding. On my system, the first number generated after
seeding steps by three once a second. This problem relates to time(NULL)
being a number that doesn't change much.

So if you use rand(), and srand( time(NULL) ), then it's a good idea to toss
the first result.
 
O

osmium

AngleWyrm said:
One of the problems with the standard implementation of rand() is that on
most systems it uses a fast formula called Linear Congruential.

But there IS NO STANDARD implementation of rand(). It is the name of a
FUNCTION, not an ALGORITHM!!!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,768
Messages
2,569,575
Members
45,053
Latest member
billing-software

Latest Threads

Top