portably shuffle a deck


Army1987

I have written a function to shuffle a deck of cards, as follows:

#include <time.h>
#include <stdlib.h>
#define MAX_DECKSIZE 52

typedef struct card {
    enum rank { Joker, Ace, Jack = 11, Queen, King } rank;
    enum suit { Spades, Clubs, Diamonds, Hearts } suit;
} card_t;

typedef struct deck {
    int size;
    card_t card[MAX_DECKSIZE];
} deck_t;

void shuffle(deck_t *deckptr)
{
    static unsigned int times = 0;
    int i;
    srand((unsigned)clock() ^ (unsigned)time(0) + times++);
    for (i = 0; i < deckptr->size; i++) {
        int r = rand() / (RAND_MAX / (deckptr->size - i) + 1);
        if (r) {
            card_t tmp = deckptr->card[i];
            deckptr->card[i] = deckptr->card[i + r];
            deckptr->card[i + r] = tmp;
        }
    }
}

The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.) Should I call the function several times, hoping
that the increment of clock() (and possibly of time(0)) will be
unpredictable enough? Is there anything less ridiculous I could
do? Should I resort to system-specific ways <ot>e.g. reading from
/dev/urandom</ot>? I don't need it to be cryptographically secure
(I didn't even bother to cope with the bias due to RAND_MAX + 1
not always being a multiple of deckptr->size - i), but I wouldn't
like some permutations to be more likely than others by several
orders of magnitude.
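The modulo bias mentioned in passing can at least be removed portably by rejection sampling: discard the top sliver of rand()'s range so every residue is equally likely. A minimal sketch (the function name `uniform_below` is illustrative, not from the original post):

```c
#include <stdlib.h>

/* Return a uniform value in [0, n), assuming 0 < n <= RAND_MAX.
   Values above `limit` are rejected so the accepted range holds a
   whole number of copies of each residue class mod n. */
int uniform_below(int n)
{
    int excess = (RAND_MAX % n + 1) % n;  /* == (RAND_MAX + 1) % n, without overflow */
    int limit = RAND_MAX - excess;
    int r;
    do {
        r = rand();
    } while (r > limit);
    return r % n;
}
```

This fixes the bias, but of course not the 52!-states problem the rest of the thread is about.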
 

Keith Thompson

Army1987 said:
I have written a function to shuffle a deck of cards, as follows:
[code snipped]

The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.)
[...]

Is that really a problem?

To support 52! states, you need at least 226 bits of state. The
sample rand() implementation in the standard has only 32 bits of
state, assuming unsigned long is 32 bits.

That means you can only get about 4 billion shuffles out of about 80
unvigintillion possible shuffles. (Yes, that's the word; look it up.)
But unless you run the program a whole bunch of times, you're not even
going to use up all 4 billion shuffles. Perhaps it doesn't matter.

On the other hand, the odds of a specific shuffle *should* be one in
80 unvigintillion; with only 32 bits of state, the actual odds will be
either one in 4 billion (way too high) or zero. It's possible that
you can't possibly deal four royal flushes, for example.

If this is a problem for you, I think your best bet is to resort to
non-portable code <OT>such as reading /dev/random or
/dev/urandom</OT>. Or you could implement an alternative random
number generator in portable C (I understand the "Mersenne Twister" is
good), but you still have the problem of initializing its state.
 

Ben Bacarisse

Army1987 said:
I have written a function to shuffle a deck of cards, as follows:

#include <time.h>
#include <stdlib.h>
#define MAX_DECKSIZE 52

typedef struct card {
enum rank { Joker, Ace, Jack = 11, Queen, King } rank;
enum suit { Spades, Clubs, Diamonds, Hearts } suit;
} card_t;

typedef struct deck {
int size;
card_t card[MAX_DECKSIZE];
} deck_t;

void shuffle(deck_t * deckptr)
{
static unsigned int times = 0;
int i;
srand((unsigned)clock() ^ (unsigned)time(0) + times++);
for (i = 0; i < deckptr->size; i++) {
int r = rand() / (RAND_MAX / (deckptr->size - i) + 1);
if (r) {
card_t tmp = deckptr->card[i];
deckptr->card[i] = deckptr->card[i + r];
deckptr->card[i + r] = tmp;
}
}
}

The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.) Should I call the function several times, hoping
that the increment of clock() (and possibly of time(0)) will be
unpredictable enough?


No! The only portable way is to implement your own PRNG using an
algorithm that has a lot of internal state. There are many suggestions
on the web, and some freely available implementations. Even then, I
don't think you will be able to be *sure* of getting all possible
shufflings equally often. However...
Is there anything less ridiculous I could
do? Should I resort to system-specific ways <ot>e.g. reading from
/dev/urandom</ot>? I don't need it to be cryptographically secure
(I didn't even bother to cope with the bias due to RAND_MAX + 1
not always being a multiple of deckptr->size - i), but I wouldn't
like some permutations to be more likely than others by several
orders of magnitude.

No indeed, but you need to ask how you will ever tell. You only need
a PRNG that is "good enough" -- one that gives you a set of shufflings
that is not likely to lead to a bias in whatever you do with the
cards. To take an extreme case, if you want to find the most likely
five-card poker hand that will be drawn from a small deck, you can
make do with no PRNG at all -- just enumerate all the possible hands
and count. With a full deck, you will do better sampling the possible
hands, but you only need a generator that does not bias the end
result. Even one with very little internal state might give you a
reasonable sample.

<OT>If there is a normal practice in this area, I would guess it
involves picking a good PRNG (one backed by sound research), testing
your implementation using standard statistical tests, and then just
hoping that there is no hidden interaction between the algorithm and
what you sample with it. More on this would have to move to, say,
comp.programming?</OT>
 

Gene

I have written a function to shuffle a deck of cards, as follows:

#include <time.h>
#include <stdlib.h>
#define MAX_DECKSIZE 52

typedef struct card {
enum rank { Joker, Ace, Jack = 11, Queen, King } rank;
enum suit { Spades, Clubs, Diamonds, Hearts } suit;

} card_t;

typedef struct deck {
int size;
card_t card[MAX_DECKSIZE];

} deck_t;

void shuffle(deck_t * deckptr)
{
static unsigned int times = 0;
int i;
srand((unsigned)clock() ^ (unsigned)time(0) + times++);
for (i = 0; i < deckptr->size; i++) {
int r = rand() / (RAND_MAX / (deckptr->size - i) + 1);
if (r) {
card_t tmp = deckptr->card[i];
deckptr->card[i] = deckptr->card[i + r];
deckptr->card[i + r] = tmp;
}
}

}

The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.) Should I call the function several times, hoping
that the increment of clock() (and possibly of time(0)) will be
unpredictable enough? Is there anything less ridiculous I could
do? Should I resort to system-specific ways <ot>e.g. reading from
/dev/urandom</ot>? I don't need it to be cryptographically secure
(I didn't even bother to cope with the bias due to RAND_MAX + 1
not always being a multiple of deckptr->size - i), but I wouldn't
like some permutations to be more likely than others by several
orders of magnitude.


With a rand() period approaching 2^32, there are 4 billion possible
shuffles. If it really matters that 52! - 2^32 shuffles are
inaccessible, seed with /dev/random. In Windows you can use
CryptGenRandom() instead.
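For what it's worth, a minimal (and deliberately non-portable) sketch of that seeding approach; the fallback and the function name are illustrative:

```c
#include <stdio.h>
#include <time.h>

/* <OT> Try to read an unsigned's worth of bytes from the system
   entropy pool; fall back to time() where no such device exists.
   POSIX-only, not ISO C. </OT> */
unsigned seed_from_urandom(void)
{
    unsigned seed = (unsigned)time(NULL);
    FILE *f = fopen("/dev/urandom", "rb");
    if (f != NULL) {
        if (fread(&seed, sizeof seed, 1, f) != 1)
            seed = (unsigned)time(NULL);  /* short read: keep the fallback */
        fclose(f);
    }
    return seed;
}
```

As Eric Sosman points out below, this only helps with *choosing* among the generator's sequences; it cannot widen a 2^32-state generator.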
 

Eric Sosman

Gene said:
With a rand() period approaching 2^32, there are 4 billion possible
shuffles. If it really matters that 52! - 2^32 shuffles are
inaccessible, seed with /dev/random. In Windows you can use
CryptGenRandom() instead.

You haven't worked through the arithmetic. `52! - 2^32' is
80658175170943878571660636856403766975289505440883277823995705032704
hands. To fifty-seven decimal places, that's 100% of the total
(unity minus 5E-59, times 100%).

Also, seeding the generator with a high-quality source of
randomness does *not* help. If the generator is only capable of
2^32 distinct sequences, it doesn't matter how fairly or how
unpredictably you choose between them: only 2^32 sequences are
possible. Flip a coin, taking extraordinary care to randomize
the initial conditions, and you will get either Heads or Tails,
never Bellybuttons.

2^32 is utterly negligible compared to 52! -- hell, 2^128
is utterly negligible compared to 52!.
 

Bill Reid

Ben Bacarisse said:
Army1987 said:
I have written a function to shuffle a deck of cards, as follows:

#include <time.h>
#include <stdlib.h>
#define MAX_DECKSIZE 52

typedef struct card {
enum rank { Joker, Ace, Jack = 11, Queen, King } rank;
enum suit { Spades, Clubs, Diamonds, Hearts } suit;
} card_t;

typedef struct deck {
int size;
card_t card[MAX_DECKSIZE];
} deck_t;

void shuffle(deck_t * deckptr)
{
static unsigned int times = 0;
int i;
srand((unsigned)clock() ^ (unsigned)time(0) + times++);
for (i = 0; i < deckptr->size; i++) {
int r = rand() / (RAND_MAX / (deckptr->size - i) + 1);
if (r) {
card_t tmp = deckptr->card[i];
deckptr->card[i] = deckptr->card[i + r];
deckptr->card[i + r] = tmp;
}
}
}

The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.) Should I call the function several times, hoping
that the increment of clock() (and possibly of time(0)) will be
unpredictable enough?


No! The only portable way is to implement your own PRNG using an
algorithm that has a lot of internal state. There are many suggestions
on the web, and some freely available implementations. Even then, I
don't think you will be able to be *sure* of getting all possible
shufflings equally often. However...
Is there anything less ridiculous I could
do? Should I resort to system-specific ways <ot>e.g. reading from
/dev/urandom</ot>? I don't need it to be cryptographically secure
(I didn't even bother to cope with the bias due to RAND_MAX + 1
not always being a multiple of deckptr->size - i), but I wouldn't
like some permutations to be more likely than others by several
orders of magnitude.

No indeed, but you need to ask how you will ever tell. You only need
a PRNG that is "good enough" -- one that gives you a set of shufflings
that is not likely to lead to a bias in whatever you do with the
cards. To take an extreme case, if you want to find the most likely
five-card poker hand that will be drawn from a small deck, you can
make do with no PRNG at all -- just enumerate all the possible hands
and count. With a full deck, you will do better sampling the possible
hands, but you only need a generator that does not bias the end
result. Even one with very little internal state might give you a
reasonable sample.

<OT>If there is a normal practice in this area, I would guess it
involves picking a good PRNG (one backed by sound research), testing
your implementation using standard statistical tests, and then just
hoping that there is no hidden interaction between the algorithm and
what you sample with it. More on this would have to move to, say,
comp.programming?</OT>


There used to be a fair amount of discussion on this topic on the
old rec.gambling.blackjack group, which is now completely dead
except for the spammers...maybe there are some gambling groups
still working where they discuss this type of stuff, or try the archives.

In any event, I'm starting to get d-chills remembering the debates
about the sufficiency of random number generators (although
occasionally with good reasoning for rejecting truly pathological
RNG schemes).

Here was my bottom line: I just tend to use "realistic" shuffles
and card pickup routines, modeling the card flow as closely as
possible as it actually occurs in the game.

This has several advantages for a simulation, not the least of
which it reduces the critique of the RNG anal-retentives down
to a simple assertion that I don't KNOW that I am accurately
modeling any specific actual shuffle. Of course I don't, and in
the "real world" they don't know this either; they just kind of mix
the cards up using a set of shuffle techniques that are more or
less mandated by "the boss".

But in both cases, you get some well and truly mixed-up
cards (studies of shuffling have shown that a little shuffling
goes a LONG way towards fully randomizing the cards no
matter how you do it, and introducing another element of
randomness by picking the cards out of a discard deck that
is already mixed up by a previous round of play just makes
it that much more "random").

Of course, the other great advantage for a MONTE-CARLO
SIMULATION is that I am re-creating the actual game play as
much as possible. They do NOT use a 52!-state RNG to pick
cards out of a statically-ordered array in any cardroom, even
those using the mechanical shuffling machines. So if your goal
is to SIMULATE the game, you shouldn't either...

....except in general, you CAN use a reasonably good RNG,
with occasional "random" reseeding, and at least picking from
the previously-shuffled array, and get results that are indistinguishable
from a "realistic" shuffle (I speak from the experience of running
several hundred billion hand game simulations using both "realistic"
and a reasonably "random" RNG algorithm and getting the same
results no matter what). It should be obvious that the possible
unique card orderings are much greater than 2^32 using an RNG
strategy like that, and though I can't vouch for the full 52! possible
states, I can say it seems to be "good enough".

Also, we have to be careful as to what our goals are, because
I noted this:
To take an extreme case, if you want to find the most likely
five-card poker hand that will be drawn from a small deck, you can
make do with no PRNG at all -- just enumerate all the possible hands
and count. With a full deck, you will do better sampling the possible
hands, but you only need a generator that does not bias the end
result.

Actually, with a full deck, you'd be "better off" just writing a
"combinatorial analyzer" to perform this relatively trivial task,
unless I'm missing the point here (for any truly "random" deck,
the odds of drawing any particular hand are just the results of
a "Bayesian chain" of conditional probabilities). However,
the important point is that you want to define what you're
trying to accomplish in the first place, and you may NOT
need a RNG at all...but if you do, I hope you got some
ideas about how to be reasonably assured you are not
introducing errors due to a poor RNG algorithm.
 

Richard Heathfield

Eric Sosman said:
You haven't worked through the arithmetic. `52! - 2^32' is
80658175170943878571660636856403766975289505440883277823995705032704
hands. To fifty-seven decimal places, that's 100% of the total
(unity minus 5E-59, times 100%).

Also, seeding the generator with a high-quality source of
randomness does *not* help. If the generator is only capable of
2^32 distinct sequences, it doesn't matter how fairly or how
unpredictably you choose between them: only 2^32 sequences are
possible. Flip a coin, taking extraordinary care to randomize
the initial conditions, and you will get either Heads or Tails,
never Bellybuttons.

2^32 is utterly negligible compared to 52! -- hell, 2^128
is utterly negligible compared to 52!.

Fortunately, this is quite solvable in ISO C, by writing your own
LCPRNG, and then having several different instances of it, each of
which maintains its own state (preferably in a struct, a pointer to
which you give to your customised PRNG function, and even more
preferably having independently configurable 'constants' for the LC
part of the LCPRNG).

That way, you can get all the bits you could possibly want - well, 226
wouldn't be any big deal, anyway. If each LCPRNG produces, say, 16 bits
of state, then all you need to do is glue (226+15)/16= 15 PRNs
together. Note, however, that your PRNG cannot call rand(). If it does
so, then you're simply finding another way to express the original
problem, because there will be internal relationships between your
"entropy" bits. But if you have 15 truly independent 16-bit PRN
streams, then you have a fighting chance of a fair deal, so to speak.
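A sketch of that shape, one struct per stream (the constants shown are Knuth's 64-bit MMIX pair, picked purely for illustration, not vetted for multi-stream independence):

```c
#include <stdint.h>

/* One LCPRNG instance; each stream carries its own state and its
   own multiplier/increment pair, as suggested above. */
typedef struct {
    uint64_t state;
    uint64_t a;   /* multiplier */
    uint64_t c;   /* increment  */
} lcg_t;

/* Advance one stream and hand back its top 16 bits: the high bits
   of a power-of-two-modulus LCG are far better distributed than
   the low ones. */
uint16_t lcg_next16(lcg_t *g)
{
    g->state = g->state * g->a + g->c;   /* reduction mod 2^64 is free */
    return (uint16_t)(g->state >> 48);
}
```

Fifteen such streams, independently seeded, give 15 * 16 = 240 > 226 bits per draw; whether those bits are *independent* is the hard part Ben raises below.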
 

Army1987

Army1987 said:
I have written a function to shuffle a deck of cards, as follows:
[code snipped]
[relevant line unsnipped]
srand((unsigned)clock() ^ (unsigned)time(0) + times++);
The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.)
[...]

Is that really a problem?

To support 52! states, you need at least 226 bits of state. The
sample rand() implementation in the standard has only 32 bits of
state, assuming unsigned long is 32 bits.

That means you can only get about 4 billion shuffles out of about 80
unvigintillion possible shuffles. (Yes, that's the word; look it up.)
But unless you run the program a whole bunch of times, you're not even
going to use up all 4 billion shuffles. Perhaps it doesn't matter.
Depends on what I'm going to do. For now, it is a Blackjack game,
where suits are irrelevant, so actually the combinations are
52!/(52C13 * 39C13 * 26C13) (unless I am missing something, but
let's not go too far off topic...). Hey, that's a huge number
again! (About 1.5e+39.) But jacks, queens, and kings behave all
alike, so... OK, got the point. But, if I want to extend it to
another game... (Actually it was more a theoretical curiosity than
anything else.)
On the other hand, the odds of a specific shuffle *should* be one in
80 unvigintillion; with only 32 bits of state, the actual odds will be
either one in 4 billion (way too high) or zero. It's possible that
you can't possibly deal four royal flushes, for example.

Here's the point.
If this is a problem for you, I think your best bet is to resort to
non-portable code <OT>such as reading /dev/random or
/dev/urandom</OT>. Or you could implement an alternative random
number generator in portable C (I understand the "Mersenne Twister" is
good), but you still have the problem of initializing its state.
I had missed this last point. No matter how many states rand()
has, even on my system, there are only UINT_MAX + 1 things which
srand((unsigned)clock() ^ (unsigned)time(0) + times++);
can do. On my system this is not entirely ridiculous (even if it
is by many tens of orders of magnitude less than 52!), but it
could be as little as 65536 somewhere else.
 

Eric Sosman

Richard said:
Eric Sosman said:


Fortunately, this is quite solvable in ISO C, by writing your own
LCPRNG, and then having several different instances of it, each of
which maintains its own state (preferably in a struct, a pointer to
which you give to your customised PRNG function, and even more
preferably having independently configurable 'constants' for the LC
part of the LCPRNG).

<topicality level=marginal>

Is there a particular reason to insist on LC? I can't
think of one, and it seems to me there are other PRNG types
that more easily encompass >=226 bits of internal state. MT
is often mentioned, and George Marsaglia posted a few to this
newsgroup before the insult artists drove him off.
[...] Note, however, that your PRNG cannot call rand(). [...]

Well, it *could*, but that'd probably just make things
messier. rand() could be the source for up to lg(UINT_MAX+1)
bits of the PRNG's initial state, but the rest would need to
come from elsewhere. (Using rand() this way really just
pushes the problem back to choosing an srand() argument.)
Also, I think it would be all right to use rand() to "stir"
the output of some other PRNG, a la TAOCP algorithm 3.2.2M.

But you're certainly right to warn against using repeated
rand() calls to seed a fancier generator: If the fancy PRNG's
large internal state is really just rand()'s own internal
state "decompressed," we're right back where we started.

</topicality>
 

Ben Bacarisse

Richard Heathfield said:
Fortunately, this is quite solvable in ISO C, by writing your own
LCPRNG, and then having several different instances of it, each of
which maintains its own state (preferably in a struct, a pointer to
which you give to your customised PRNG function, and even more
preferably having independently configurable 'constants' for the LC
part of the LCPRNG).

That way, you can get all the bits you could possibly want - well, 226
wouldn't be any big deal, anyway. If each LCPRNG produces, say, 16 bits
of state, then all you need to do is glue (226+15)/16= 15 PRNs
together. Note, however, that your PRNG cannot call rand(). If it does
so, then you're simply finding another way to express the original
problem, because there will be internal relationships between your
"entropy" bits. But if you have 15 truly independent 16-bit PRN
streams, then you have a fighting chance of a fair deal, so to
speak.

I may, just, be able to squeeze this on-topic in the last few lines...

The trouble is the throw away "truly independent" part. That is
impossible to guess and hard to prove (so I am told). Fortunately
there are PRNGs with very long periods and large amounts of state that
are very fast to implement. The biggest of Marsaglia's CMWC
(Complementary Multiply With Carry) generators has 4096 32-bit longs
for its state and has a period of 1.72e39466 and can be implemented in
a few lines of C:

See:
http://school.anhb.uwa.edu.au/personalpages/kwessen/shared/Marsaglia03.html
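For reference, those few lines look roughly like this, transcribed with fixed-width types so it behaves the same whether long is 32 or 64 bits; Q (and ideally the carry) must be filled with seed values before use:

```c
#include <stdint.h>

/* Marsaglia's CMWC4096: 4096 words of lag plus a carry. */
static uint32_t Q[4096];
static uint32_t carry = 362436;   /* seed; must be < 809430660 */

uint32_t cmwc4096(void)
{
    static uint32_t i = 4095;
    uint64_t t;
    uint32_t x;

    i = (i + 1) & 4095;
    t = 18782ULL * Q[i] + carry;      /* multiply-with-carry step */
    carry = (uint32_t)(t >> 32);
    x = (uint32_t)t + carry;
    if (x < carry) { x++; carry++; }  /* propagate the wraparound */
    return Q[i] = 0xfffffffeu - x;    /* "complementary" part */
}
```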
 

kieran

I have written a function to shuffle a deck of cards, as follows:

[code snipped]

The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.) Should I call the function several times, hoping
that the increment of clock() (and possibly of time(0)) will be
unpredictable enough? Is there anything less ridiculous I could
do?

This paper doesn't answer your question directly (and is probably
rather off-topic for this group), but it's a very good read:

"How We Learned to Cheat at Online Poker: A Study in Software
Security" by Brad Arkin, Frank Hill, Scott Marks, Matt Schmid and
Thomas John Walls. September 1999.

http://www.cigital.com/papers/download/developer_gambling.php

The link describes a real world exploit of an online poker system -
the attackers were able to determine the card deck based on the player
cards, the flop and the current server time.

It relied on a flawed shuffle algorithm, a poor PRNG and the PRNG
being seeded with the system clock.

Regards,
Kieran Elby
 

websnarf

I have written a function to shuffle a deck of cards, as follows:
[...]

void shuffle(deck_t * deckptr)
{
static unsigned int times = 0;
int i;
srand((unsigned)clock() ^ (unsigned)time(0) + times++);

Uh ... you are reseeding every shuffle? This is ok if your entropy
has high quality and you don't care about performance.
for (i = 0; i < deckptr->size; i++) {
int r = rand() / (RAND_MAX / (deckptr->size - i) + 1);

This has a bias problem.

[...]
The problem is: this cannot portably shuffle a deck, because it
requires rand() to have at least 52! states. (If it has fewer
states, the pigeonhole principle shows that not all permutations
are possible.)

Actually, any generator with fewer than 52! states will always leave
some shuffles unavailable to you. If you actually want all 52! states
to be possible in roughly equal proportion, then you can use the
Mersenne Twister, which has far more states (about 2**19937). But you
still have to come up with log(2, 52!) bits worth of entropy to seed
the thing, and that is not coming from clock(), time() and a counter
alone.

The simplest way of creating/generating that much entropy is probably
to maintain an array and a rotating offset pointer: simply xor one of
time() or clock() into the pointer's current location, and maintain
the state of this array on disk (so that it is incrementally updated
from run to run). The array would be the entropic input, which you
would only need to set once (not once per shuffle).
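A sketch of that pool; the pool size, file name, and function names here are arbitrary choices for illustration, not from the post:

```c
#include <stdio.h>
#include <time.h>

#define POOL_WORDS 64

/* Xor fresh timer bits into the pool at a rotating offset.  The
   pool would be loaded at startup and saved at exit so whatever
   entropy accumulates survives across runs. */
void stir_pool(unsigned pool[POOL_WORDS], unsigned *offset)
{
    pool[*offset] ^= (unsigned)clock() ^ (unsigned)time(NULL);
    *offset = (*offset + 1) % POOL_WORDS;
}

/* Best-effort load: a missing file just means an empty pool. */
void load_pool(const char *name, unsigned pool[POOL_WORDS])
{
    FILE *f = fopen(name, "rb");
    if (f != NULL) {
        size_t n = fread(pool, sizeof pool[0], POOL_WORDS, f);
        (void)n;  /* a short read leaves the tail as-is */
        fclose(f);
    }
}

void save_pool(const char *name, const unsigned pool[POOL_WORDS])
{
    FILE *f = fopen(name, "wb");
    if (f != NULL) {
        fwrite(pool, sizeof pool[0], POOL_WORDS, f);
        fclose(f);
    }
}
```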
[...] Should I call the function several times, hoping
that the increment of clock() (and possibly of time(0)) will be
unpredictable enough? Is there anything less ridiculous I could
do? Should I resort to system-specific ways <ot>e.g. reading from
/dev/urandom</ot>? I don't need it to be cryptographically secure
(I didn't even bother to cope with the bias due to RAND_MAX + 1
not always being a multiple of deckptr->size - i), but I wouldn't
like some permutations to be more likely than others by several
orders of magnitude.

Well, that bias will color things more than anything else, really.
Getting an accurate discrete random sample is actually a bit of an
exercise on binary computers as I explain here:

http://www.pobox.com/~qed/random.html
 

Keith Thompson

Richard Heathfield said:
Fortunately, this is quite solvable in ISO C, by writing your own
LCPRNG, and then having several different instances of it, each of
which maintains its own state (preferably in a struct, a pointer to
which you give to your customised PRNG function, and even more
preferably having independently configurable 'constants' for the LC
part of the LCPRNG).

LCPRNG means linear congruential pseudo-random number generator, yes?
That way, you can get all the bits you could possibly want - well, 226
wouldn't be any big deal, anyway. If each LCPRNG produces, say, 16 bits
of state, then all you need to do is glue (226+15)/16= 15 PRNs
together. Note, however, that your PRNG cannot call rand(). If it does
so, then you're simply finding another way to express the original
problem, because there will be internal relationships between your
"entropy" bits. But if you have 15 truly independent 16-bit PRN
streams, then you have a fighting chance of a fair deal, so to speak.

But you still have to provide 226 or more bits of state as *input* to
your PRNG(s) in order to be able to (potentially) produce any of the
52! possible shuffles. In portable C, the time() and clock()
functions are the only sources of entropy I can think of, and neither
is very good (time() is typically too coarse and predictable, and
clock() typically returns CPU time since the start of the program, so
you're likely to get exactly the same result on successive runs).

There are almost certain to be system-specific sources of entropy that
you can use. And you can approach some degree of portability by
reading from a file with a specified (or provided) name, and somehow
arranging outside the program for the contents of the file to be a
good source of entropy. /dev/random, on systems that support it, is
the obvious solution. Another possibility, if you have networking
capability, is to download images from a webcam pointed at a lava
lamp.

The gory details are perhaps better discussed in sci.crypt (which may
well cover this in its FAQ).
 

Keith Thompson

I have written a function to shuffle a deck of cards, as follows:
[...]

void shuffle(deck_t * deckptr)
{
static unsigned int times = 0;
int i;
srand((unsigned)clock() ^ (unsigned)time(0) + times++);

Uh ... you are reseeding every shuffle? This is ok if your entropy
has high quality and you don't care about performance.

[big snip]

I agree about the entropy, but I don't see how this necessarily hurts
performance significantly. I see no reason to assume that the calls
to srand(), clock(), and time() are going to be much slower than a
call to rand(). Of course you're making those calls in addition to
calling rand(), so there is some overhead, but I'd expect it to be
proportionally quite small -- and if you output something on each
shuffle, the I/O operations will take much longer than the
computations.
 

Richard Tobin

Keith Thompson said:
I agree about the entropy, but I don't see how this necessarily hurts
performance significantly. I see no reason to assume that the calls
to srand(), clock(), and time() are going to be much slower than a
call to rand().

On many operating systems, clock() and time() are system calls while
rand() and srand() are simple function calls. So I would expect
clock() and time() to be much slower.

I tried calling each 10 million times (on a Mac Powerbook) and the
times were

rand() - 0.45s
srand() - 0.35s
clock() - 18.2s
time() - 1.61s

As expected, clock() and time() are slower, but I don't know why clock()
is so much slower than time().

-- Richard
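The measurement is easy to reproduce with a crude harness along these lines (timing with clock() itself, which limits precision; the function name is mine):

```c
#include <stdlib.h>
#include <time.h>

/* Seconds of CPU time consumed by n calls to rand().  The
   volatile sink keeps the compiler from optimising the loop
   away; swap the body to time clock() or time() instead. */
double time_rand_calls(long n)
{
    volatile int sink = 0;
    clock_t start = clock();
    long i;
    for (i = 0; i < n; i++)
        sink ^= rand();
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```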
 

Richard Heathfield

Keith Thompson said:
Richard Heathfield said:
Fortunately, this is quite solvable in ISO C, by writing your own
LCPRNG, and then having several different instances of it, each of
which maintains its own state (preferably in a struct, a pointer to
which you give to your customised PRNG function, and even more
preferably having independently configurable 'constants' for the LC
part of the LCPRNG).

LCPRNG means linear congruential pseudo-random number generator, yes?
Yes.
[...] if you have 15 truly independent
16-bit PRN streams, then you have a fighting chance of a fair deal,
so to speak.

But you still have to provide 226 or more bits of state as *input* to
your PRNG(s) in order to be able to (potentially) produce any of the
52! possible shuffles.

Yes, I believe that's necessary and sufficient.
In portable C, the time() and clock()
functions are the only sources of entropy I can think of,

Well, you could shove some system-generated entropy into a file and have
the C program read the file.
There are almost certain to be system-specific sources of entropy that
you can use. And you can approach some degree of portability by
reading from a file with a specified (or provided) name, and somehow
arranging outside the program for the contents of the file to be a
good source of entropy. /dev/random, on systems that support it, is
the obvious solution.

Ah, I see you're ahead of me as usual.
 

websnarf

On many operating systems, clock() and time() are system calls while
rand() and srand() are simple function calls. So I would expect
clock() and time() to be much slower.

Indeed, and my point was more generally about entropy sources, not
just those calls. In general entropy will be obtained by some kind of
system device or derivative of some complex OS object. Some good
entropy sources, such as key polling and mouse interrupts, for
example, are pretty slow to access and don't generate a whole hell of
a lot of entropy per second.
I tried calling each 10 million times (on a Mac Powerbook) and the
times were

rand() - 0.45s
srand() - 0.35s
clock() - 18.2s
time() - 1.61s

As expected, clock() and time() are slower, but I don't know why
clock() is so much slower than time().

In most multitasking OSes you would imagine that clock() needs to add
the total number of clock ticks accumulated thus far up until the last
time slice (probably a field in the task state somewhere, but which
needs to be obtained by an OS call anyways) and add the total number
of ticks elapsed in the current slice (in order to remain correctly
monotonic) which would be a device call and probably another system
call (for the tick offset of the current slice). It doesn't surprise
me that it would be slow.

The OS scheduler needs fast access to the global tick counter, or
system timers itself. So it's very likely that the OS calls backing
time() have a cached copy of the current time (a volatile updated on
timer interrupt) in memory that it has direct access to.
 

CBFalconer

Keith said:
.... snip ...
Uh ... you are reseeding every shuffle? This is ok if your entropy
has high quality and you don't care about performance.

[big snip]

I agree about the entropy, but I don't see how this necessarily hurts
performance significantly. I see no reason to assume that the calls
to srand(), clock(), and time() are going to be much slower than a
call to rand(). Of course you're making those calls in addition to
calling rand(), so there is some overhead, but I'd expect it to be
proportionally quite small -- and if you output something on each
shuffle, the I/O operations will take much longer than the
computations.

Each reseed has, probably, one of 2^32 possible sequences to
follow. This is not enough. However, if you don't reseed then the
actual sequences can be multiplied by the possibilities inherent in
the RNG internal sequencing. This may or may not be enough (see
the Mersenne Twister), but at least you have gotten past the 2^32
possibility limit.
 

Gene

You haven't worked through the arithmetic. `52! - 2^32' is
80658175170943878571660636856403766975289505440883277823995705032704
hands. To fifty-seven decimal places, that's 100% of the total
(unity minus 5E-59, times 100%).

Also, seeding the generator with a high-quality source of
randomness does *not* help. If the generator is only capable of
2^32 distinct sequences, it doesn't matter how fairly or how
unpredictably you choose between them: only 2^32 sequences are
possible. Flip a coin, taking extraordinary care to randomize
the initial conditions, and you will get either Heads or Tails,
never Bellybuttons.

2^32 is utterly negligible compared to 52! -- hell, 2^128
is utterly negligible compared to 52!.

You're right of course. Should have said shuffle with /dev/random,
not seed.
 
