You might like to explain that to the average casino.
The explanation goes "Even differences that are usually
considered small, e.g. 3%, can be very, very important."
If the casino is still in business they will tell
me to teach my Grandmother to suck eggs.
Card counting in blackjack gives the player around a 1% advantage
over the house,
Teach your Grandmother to suck eggs
and the casinos kick
out such people whenever they are discovered. People who attack
defective casino games rely on people with attitudes like yours.
Even if you are implementing something as simple as a 1d20 (where each
choice itself is only 5%) in a Dungeons and Dragons game, the players
will easily see that bias over time.
Piffle (even if we are talking about a 3% bias and not
the less than .1% bias you get with rand()%20 ).
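
For reference, the size of the rand()%20 bias can be worked out by
counting rather than by simulation. The sketch below assumes an ideal
generator that is exactly uniform on 0 .. rand_max; the name
modulo_bias and its parameters are mine, for illustration only.

```c
#include <assert.h>

/* Hypothetical helper (not from any library): relative excess
   probability of the most likely result of (r % n) over the least
   likely one, when r is exactly uniform on 0 .. rand_max.
   Assumes 1 <= n <= rand_max and that rand_max + 1 does not wrap. */
double modulo_bias(unsigned long rand_max, unsigned long n)
{
    unsigned long range, low, extra;

    assert(n >= 1 && n <= rand_max);
    range = rand_max + 1UL;   /* number of distinct generator outputs */
    low   = range / n;        /* hits on each least likely result */
    extra = range % n;        /* how many results get one extra hit */

    if (extra == 0)
        return 0.0;           /* n divides the range: no counting bias */
    return 1.0 / (double)low; /* (low + 1) / low - 1 */
}
```

With a 15-bit generator, modulo_bias(32767, 20) is 1/1638, about
0.06%: 32768 = 1638*20 + 8, so eight of the twenty results turn up
1639 times per full cycle and the other twelve 1638 times.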
And even if someone took the trouble to notice (e.g. tabulated
1000's of rolls and applied statistical techniques) they would notice
that the bias had no practical import.
[...] If your
problem is such that you are worried about the 3% you should probably
be more worried about the fact that the rand() you are using has an
output space of only 16 bits.
That statement doesn't follow any line of logic of any relevance. If
you care, then you care, and you want to get a correct ranged random
number generator. If you go up to 32 bits, but still have bias that's
just a little smaller, how can you be happy?
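
For anyone who does want the counting bias gone entirely, the usual
approach is rejection sampling: throw away the few generator outputs
that do not fit into n equal-sized buckets. This is a minimal sketch of
that general technique, not something taken from the FAQ or from either
poster; ranged_rand is a hypothetical name, and it assumes
1 <= n <= RAND_MAX.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: integer in 0 .. n-1 built on rand(), with every
   result backed by the same number of rand() values.
   Assumes 1 <= n <= RAND_MAX. It removes the counting bias only; it
   does nothing for whatever other defects the system rand() has. */
int ranged_rand(int n)
{
    int bucket, limit, r;

    assert(n >= 1 && n <= RAND_MAX);
    bucket = RAND_MAX / n;   /* rand() values per result */
    limit  = bucket * n;     /* values >= limit are rejected */

    do {
        r = rand();
    } while (r >= limit);

    return r / bucket;       /* uses the high-order part of r */
}
```

Dividing RAND_MAX rather than RAND_MAX + 1 avoids integer overflow
where RAND_MAX equals INT_MAX, at the cost of rejecting a few values
more than strictly necessary.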
E.g. I am interested in tail distributions. The performance of
my random generator has gone from terrible to reasonable.
And if you want to write portable code, then what are you going to do?
Either I don't need much, in which case I can use
the system rand() or I provide my_rand().
True enough, but this fundamentally comes from the lack of analysis.
The C.L.C. FAQ just continues this tradition by failing to give
effective analysis of the problem.
Did you know that a simple counter will produce numbers that are
exactly uniformly distributed in 0 ... RAND_MAX?
Indeed, one needs more than uniformly distributed.
The basic point, that the rand() implementation needs
to be really bad to produce unreasonable results
with the FAQ technique, but the rand() implementation
only needs to be a bit bad to produce unreasonable
results with the rand()%n technique remains.
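
To make the "a bit bad" case concrete, here is a sketch using a
deliberately weak, hypothetical generator: a power-of-two linear
congruential generator that hands back its low 15 bits unchanged. It is
not the rand() of any particular system, and weak_rand, by_modulo,
by_scaling and count_repeats are names invented for this example. The
lowest bit of such a generator simply alternates, so reducing with %2
gives 0,1,0,1,... forever, while the scaling technique reads the top
bit instead.

```c
#include <assert.h>

#define WEAK_RAND_MAX 32767

static unsigned long weak_state = 1;

/* Deliberately weak, hypothetical generator: returns the LOW 15 bits
   of a 32-bit linear congruential state. */
int weak_rand(void)
{
    weak_state = (weak_state * 1103515245UL + 12345UL) & 0xffffffffUL;
    return (int)(weak_state & 0x7fffUL);
}

/* The rand()%n technique: depends on the low-order bits. */
int by_modulo(int r, int n)
{
    return r % n;
}

/* The scaling technique: depends on the high-order bits. */
int by_scaling(int r, int n)
{
    return (int)(r / (WEAK_RAND_MAX + 1.0) * n);
}

/* Reseed, make `draws` draws reduced to 0 .. n-1, and count how often
   a result equals the one before it. */
int count_repeats(int (*reduce)(int, int), int n, int draws)
{
    int i, prev, cur, repeats = 0;

    assert(n >= 1 && draws >= 1);
    weak_state = 1;
    prev = reduce(weak_rand(), n);
    for (i = 1; i < draws; i++) {
        cur = reduce(weak_rand(), n);
        if (cur == prev)
            repeats++;
        prev = cur;
    }
    return repeats;
}
```

Over a full cycle of 32769 draws, count_repeats(by_modulo, 2, 32769)
is 0, meaning the results strictly alternate, whereas
count_repeats(by_scaling, 2, 32769) is greater than 0. A fair coin
would repeat about half the time.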
You know, basic understanding is sometimes actually useful.
The technique shown in the CLC FAQ also has no such guarantee. It's
totally beside the point.
Define small. If you want to test how often a hash function will map
to a common bucket either (rand() % n) or
(rand() * (double) n / (RAND_MAX + 1.0)) will make no difference. It
will produce worthless results no matter what.
No. A test that looks for perfection vs bias will find a bias,
but since there are lots and lots of ways of introducing an
insignificant (note I did _not_ say "statistically insignificant")
bias, a test that looks for perfection vs bias is stupid.
[...] Do not confuse detectability with importance.
I assure you, I am not the one confused. The C.L.C. FAQ is giving a
solution that assumes a policy where low bit determinism is a worse
problem than pure measurable bias and also a worse problem than a
simple range issue. In fact the CLC FAQ is promoting confusion by not
explaining the issue correctly and consequently how one might deal
with the problem.
(The use of "significance" in the term "statistical significance"
leads many people astray).
What has that got to do with anything? If you wish to test something
with a very small probability which is lower than the bias being
introduced by such short-sighted techniques then what good is the
C.L.C. FAQ's discussion on the subject?
Who actually wants to use a PRNG which is biased?
[I recall a wonderful poem about an archer who claimed
he was best, because, although he never came near
the target, he was unbiased. Lack of bias is not
everything!]
Lots of people don't care a fig. If I want to shuffle cards for a
bridge game then I don't care about a 3% bias. (If I want to shuffle
cards for a computer poker game, then the fact that the average rand()
is about as cryptographically secure as a Caesar cipher is more
important than a 3% bias). If the wumpus alternates between
being in the left half of the maze and the right half of the maze,
I care a lot!
The CLC FAQ solves a real (although probably now historical) problem.
The system rand() may not be suitable for many applications.
Fixing one problem, which is not a problem in most applications
where the system rand() is suitable, does not magically make rand()
produce high quality random numbers.
- William Hughes