I have written a function to shuffle a deck of cards, as follows:
[...]
void shuffle(deck_t * deckptr)
{
static unsigned int times = 0;
int i;
srand((unsigned)clock() ^ (unsigned)time(0) + times++);
Uh ... you are reseeding every shuffle? This is ok if your entropy
has high quality and you don't care about performance.
Well, it didn't occur to me initially that there are at most
UINT_MAX + 1 distinct things srand() can do. The point which CBFalconer
made downthread applies. But anyway, on a system where the number
of states of rand() is less than UINT_MAX + 1, reseeding each time
*would* make it less predictable, I think. (But I don't think such
systems are common.)
I know, but it is negligible IMO.
deckptr->size - i is at least 630 times smaller than RAND_MAX,
and it will become even smaller as the shuffle proceeds.
A rough (over)estimate is (1 + 1/630)**52 = 1.086, so with this
bias, even assuming rand() were perfect, some shuffles would be 9% more
likely than others. This pales in comparison with the fact that
too few states cause the probability of a shuffle to be either 0
or zillions of times larger than it should be.
[...]
The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer states,
the pigeonhole principle shows that not all permutations are possible.)
Actually any number of reachable states < 52! will always leave some
shuffles unavailable to you. If you actually want all 52! shuffles to be
possible with roughly equal proportion, then you can use the Mersenne
Twister, which has far more states (it has 2**19937 states). But you
still have to come up with log(2,52!) bits worth of entropy to seed the
thing and that is not coming from clock(), time() and a counter alone.
The simplest way of creating/generating that much entropy is probably to
maintain an array a rotating offset pointer, and simply xor one of
Cannot parse that sentence. But I did consider using
(unsigned long)&some_auto_variable among the entropy sources (yes,
I think it is way insufficient on its own, but if I need 226 bits
of seed, it can help). Also a hash of *(unsigned char *)stdin,
*((unsigned char *)stdin + 1), ... *((unsigned char *)stdin +
sizeof(FILE) - 1) could help, couldn't it?
time() or clock() into that pointer's location, and maintain the state
of this array on disk (so that it is incrementally updated from run to
run.) The array would be the entropic input, which you would only need
to set once (not once per shuffle).
[...]
Well that bias will color things more than anything else really. Getting
an accurate discrete random sample is actually a bit of an exercise on
binary computers as I explain here:
http://www.pobox.com/~qed/random.html
Well, for deckptr->card[0] (which, given the way I implemented
card_t draw_card(deck_t *deckptr), is actually the *last* card
which will be drawn), in a full deck (deckptr->size ==
MAX_DECKSIZE, i.e. 52), deckptr->size - i will be 52, and RAND_MAX
cannot be less than 32767, so 1000 * RAND_MAX / range will be at
least 630000. This is potentially almost 10 times larger than
UINT_MAX (and will become even larger as the shuffle proceeds),
so it is quite possible that we wouldn't be able to see the bias
even by examining the outcome of the shuffle seeded with srand(0),
the one with srand(1), and so on through srand(UINT_MAX). (Yes,
UINT_MAX is very likely to be larger than that, but so is
RAND_MAX.) Anyway the point is that "more than anything else
really" sounds like a little bit of an exaggeration to me. Using a
loop to discard the return values of rand() larger than
RAND_MAX - (RAND_MAX + 1) % range would only solve the smallest
part of the problem.