A question about program logic... program provided

G

G G

This program is an example from Deitel and Deitel, "C How to Program"
My question is about the logic in the shuffle function.

j = rand() % 52;

What keeps j from having the same value more than once?

------------------------------------------------------
/* The program comes from "C How to Program"
* by Deitel and Deitel
*/

/* Fig. 10.3: fig10_03.c
The card shuffling and dealing program using structures
*/

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* card structure definition */
struct card
{
const char *face; /* defind pointer face */
const char *suit; /* defind pointer suit */

}; /* end structure card */

typedef struct card Card; /* new type name for structure */

/* prototypes */
void fillDeck( Card * const wDeck, const char * wFace[],
const char * wSuit[] );

void shuffle( Card * const wDeck );
void deal( const Card * const wDeck );

int main( void )
{
Card deck[ 53 ]; /* define array of Cards */

/* initialize array of pointers */
const char *face[] = { "Ace", "Deuce", "Three", "Four", "Five",
"Six", "Seven", "Eight", "Nine",
"Ten", "Jack", "Queen", "King" };

/* initialize array of pointers */
const char *suit[] = { "Hearts", "Diamonds", "Clubs", "Spades" };

srand( (unsigned int) time( NULL ) ); /* randomize */

fillDeck( deck, face, suit ); /* load the deck with Card */
shuffle( deck ); /* deal all 52 Cards */
deal( deck ); /*deal all 52 Cards */
return 0; /* indicates successful termination */

} /* end main */


/* Place string into Card structures */
void fillDeck( Card * const wDeck, const char * wFace[],
const char * wSuit[] )
{
int i; /* counter */

/* loop through wDeck */
for ( i = 0; i <= 51; i++ )
{
wDeck[ i ].face = wFace[ i % 13 ];
wDeck[ i ].suit = wSuit[ i / 13 ];
} /* end for */
}

/********************** shuffle() *******************************/
/* shuffle cards */
void shuffle( Card * const wDeck )
{
int i; /* counter */
int j; /* variable to hold random value between 0 - 51 */

Card temp; /* define temporary structure for swapping Cards */

/* loop through wDeck randomly swapping Cards */
for ( i = 0; i <= 51; i++ )
{
j = rand() % 52;
temp = wDeck[ i ];
wDeck[ i ] = wDeck[ j];
wDeck[ j ] = temp;

} /* end for */

} /* end function shuffle */
/*****************************************************************/

/* deal cards */
void deal( const Card * const wDeck )
{
int i; /* counter */

/* loop through wDeck */
for ( i = 0; i <= 51; i++ )
{
printf( "%5s of %-8s%s", wDeck[ i ].face, wDeck[ i ].suit,
( i + 1 ) % 4 ? " ": "\n" );
} /* end for */

} /* end function deal */
 
M

Mark Storkamp

G G said:
This program is an example from Deitel and Deitel, "C How to Program"
My question is about the logic in the shuffle function.

j = rand() % 52;

What keeps j from having the same value more than once?

Nothing. But it doesn't matter if it does or not. The cards have already
been assigned, so you won't have any duplicate cards. j is only the
position in the deck that you will be switching with the card in
position i. It's not an optimal shuffling routing, but without using a
better PRNG than rand(), it won't make any difference.
 
B

Ben Bacarisse

G G said:
This program is an example from Deitel and Deitel, "C How to Program"
My question is about the logic in the shuffle function.

j = rand() % 52;

What keeps j from having the same value more than once?

Nothing, if rand() is a good pseudo-random generator.

If, by some cleverness, you could be sure that you could evaluate that
expression 52 times with no duplicates, then you would not really need a
shuffling algorithm -- just put cards 0, 1, 2 ... 51 in the positions
return by the magic expression.

That is in effect what a shuffling algorithm must do. It must take some
sequence of uniform random numbers and produce the numbers 0 to 51 in
some order, with every order being equally likely.

The shuffling algorithm you show is not very good at doing that. It's
main problem is that it is not uniform -- not all permutations of the
deck are equally probable.

/********************** shuffle() *******************************/
/* shuffle cards */
void shuffle( Card * const wDeck )
{
int i; /* counter */
int j; /* variable to hold random value between 0 - 51 */

Card temp; /* define temporary structure for swapping Cards */

/* loop through wDeck randomly swapping Cards */
for ( i = 0; i <= 51; i++ )
{
j = rand() % 52;
temp = wDeck[ i ];
wDeck[ i ] = wDeck[ j];
wDeck[ j ] = temp;

} /* end for */

} /* end function shuffle */
/*****************************************************************/
<snip>
 
E

Eric Sosman

This program is an example from Deitel and Deitel, "C How to Program"
My question is about the logic in the shuffle function.

j = rand() % 52;

What keeps j from having the same value more than once?

Nothing.

Any more questions?
 
B

Ben Bacarisse

Mark Storkamp said:
Nothing. But it doesn't matter if it does or not. The cards have already
been assigned, so you won't have any duplicate cards. j is only the
position in the deck that you will be switching with the card in
position i. It's not an optimal shuffling routing, but without using a
better PRNG than rand(), it won't make any difference.

Do you mean that the quality of the shuffle routine won't matter if rand
is bad? I don't see that. Even if rand() is bad, why introduce more
bias with a non-uniform shuffle?
 
G

G G

Mark said:
Nothing. But it doesn't matter if it does or not. The cards have already
been assigned, so you won't have any duplicate cards. j is only the
position in the deck that you will be switching with the card in
position i. It's not an optimal shuffling routing, but without using a
better PRNG than rand(), it won't make any difference.

Ok, so what may happen is that as i goes to 51, there may be some cards
that are switch back to there original position?
 
W

Willem

Mark Storkamp wrote:
) In article <[email protected]>,
)
)> This program is an example from Deitel and Deitel, "C How to Program"
)> My question is about the logic in the shuffle function.
)>
)> j = rand() % 52;
)>
)> What keeps j from having the same value more than once?
)>
)
) Nothing. But it doesn't matter if it does or not. The cards have already
) been assigned, so you won't have any duplicate cards. j is only the
) position in the deck that you will be switching with the card in
) position i. It's not an optimal shuffling routing, but without using a
) better PRNG than rand(), it won't make any difference.

Yes it will, a huge difference even. I once made a histogram for a
couple thousand runs of a 10-card shuffle, and there were clear outliers.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
S

Seebs

Ok, so what may happen is that as i goes to 51, there may be some cards
that are switch back to there original position?

That is certainly possible.

If it weren't, that would be a pretty bad shuffle...

-s
 
G

G G

That is certainly possible.
If it weren't, that would be a pretty bad shuffle...

Ok, thanks everyone.

i haven't gotten to the exercise section of this chapter yet, but there
probably is a problem, an exercise, that ask for a better random
generator to be created.. :)

g.
 
W

Willem

Seebs wrote:
)> Ok, so what may happen is that as i goes to 51, there may be some cards
)> that are switch back to there original position?
)
) That is certainly possible.
)
) If it weren't, that would be a pretty bad shuffle...

No, it wouldn't. Of course, a card should have a chance to remain in its
original position (a 1/52 chance), but there is no reason why a shuffling
algorithm should switch a card to another position, and then back to its
original position.

For shits and giggles, you should make a program that uses the OP's
algorithm to shuffle a deck, and then runs that a large number of times
to build up a 52x52 matrix where each cell holds the probability of
card X ending up in position Y.

Or for bonus points, do it quantummechanically. I.E., after each step of
the algorithm, record the probability that each card is in each position.
You can use the 52x52 matrix for that.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
M

Mark Storkamp

Ben Bacarisse said:
Do you mean that the quality of the shuffle routine won't matter if rand
is bad? I don't see that. Even if rand() is bad, why introduce more
bias with a non-uniform shuffle?

As I understand the problem, you are looking at either 52**52
combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
0, there is some pigeon holing, and my guess is that any such error is
swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
(2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
shuffles anyway.
 
B

Ben Bacarisse

Mark Storkamp said:
As I understand the problem, you are looking at either 52**52
combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
0, there is some pigeon holing, and my guess is that any such error is
swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
(2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
shuffles anyway.

My point is that it seems unwise to say that since rand is, by
specification, unable to give you every sequence, you should not care
about adding yet more bias. It may be that the one form of bias is more
or less significant to whatever the program is ultimately doing, so I
don't think you can be casual about one just because of the other.
 
M

Mark Storkamp

Ben Bacarisse said:
My point is that it seems unwise to say that since rand is, by
specification, unable to give you every sequence, you should not care
about adding yet more bias. It may be that the one form of bias is more
or less significant to whatever the program is ultimately doing, so I
don't think you can be casual about one just because of the other.

True, but I wouldn't want to try to explain, through USENET, the details
to someone just learning to program. IMO, better to let them know of the
potential problem, and they can research it on their own, or ask for
clarification, if they wish. And even though I haven't tried to prove
it, I still believe that rand() is the greater source of bias. So it's
being casual based on intuitions. It might be an interesting exercise to
find out for sure. I've got a PRNG with enough states to get a proper
shuffle, but I'm not sure how long I'd have to run it before noticing an
effect from the two shuffle routines.
 
J

James Dow Allen

As I understand the problem, you are looking at either 52**52
combinations, vs 52! combinations...
Since 52! / (2**64) = 4.37x10^48, you're getting a miniscule
subset of the possible shuffles anyway.

You're missing the point. As Willem points out,
the shuffle routine is simply defective. You don't
need simulations to see this; you can count cases for
small decksizes. There are 4^4 cases for a 4-deck and
one will see
0th goes to 0'th 64 times
1st goes to 0'th 75 times
2nd goes to 0'th 63 times
3rd goes to 0'th 54 times
This bias will dwarf any flaw in your PRNG.

James
 
B

Ben Bacarisse

Mark Storkamp said:
True, but I wouldn't want to try to explain, through USENET, the details
to someone just learning to program. IMO, better to let them know of the
potential problem, and they can research it on their own, or ask for
clarification, if they wish. And even though I haven't tried to prove
it, I still believe that rand() is the greater source of bias.

I don't think that measuring the bias and deciding how much is due to
what is necessarily very useful. As I say below, the exact *kind* may
be much more significant.
So it's
being casual based on intuitions. It might be an interesting exercise to
find out for sure. I've got a PRNG with enough states to get a proper
shuffle, but I'm not sure how long I'd have to run it before noticing an
effect from the two shuffle routines.

No, I think it should be easy. The killer is that the kind of bias
introduced by the "simple shuffle" algorithm is easy to analyse. For
example, the relative probability of the last card ending up in the
first position is 2 * (n-1)^(n-1) * n^(1-n) for an n-card deck. That's
0.742913... (rather than 1) when n is 52. That's an edge I'd be
prepared to bet on in some circumstances!

The bias introduced by a limited-state rand is going to be huge (there
will be many impossible sequences, for example) but I would not want to
try to work out any particular exploitable property given the PRNG and
shuffle algorithm. (Actually, I'd have a go if I knew that the rand()
was *really* awful and had say, say, an alternating low bit, but it's
still going to be quote hard and, anyway, are we even considering that
kind of bad?)

Of course, your original point was about which is worse when the two are
combined -- a bad rand *and* the simple shuffle -- and that's hard to
predict, but I'd guess that the order bias would still be detectable. I
might try it and see (there's lots of bad rands out there!) if I have
time.

If there is a big picture here, it's that the amount of bias (from one
source or another) is often far less important that how that bias
interacts with the overall task.
 
T

Tim Rentsch

Mark Storkamp said:
As I understand the problem, you are looking at either 52**52
combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
0, there is some pigeon holing, and my guess is that any such error is
swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
(2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
shuffles anyway.

Your guess is wrong. If you try actually playing around with
various RNG implementations I expect you'll find the observed
behaviors contradict your intuition in this case. Even the
notoriously bad RANDU(), with only 30 bits of state, does a fair
job of producing good 52-element shuffles using an "ideal"
shuffle algorithm, but miserable results using the "bad" shuffle
algorithm. It's not hard to get tolerable results out of RNG's
having as little as 20 bits of internal state (I didn't try
anything smaller) using an unbiased shuffle, but basically
impossible using the "just swap each element once to anywhere"
algorithm.
 
K

Keith Thompson

Tim Rentsch said:
Your guess is wrong. If you try actually playing around with
various RNG implementations I expect you'll find the observed
behaviors contradict your intuition in this case. Even the
notoriously bad RANDU(), with only 30 bits of state, does a fair
job of producing good 52-element shuffles using an "ideal"
shuffle algorithm, but miserable results using the "bad" shuffle
algorithm. It's not hard to get tolerable results out of RNG's
having as little as 20 bits of internal state (I didn't try
anything smaller) using an unbiased shuffle, but basically
impossible using the "just swap each element once to anywhere"
algorithm.

Hmm. I think it depends on what you mean by a good shuffle.

Suppose you have an algorithm that produces a perfect shuffle given
an ordered deck of cards and a sequence of truly random numbers,
where I define "perfect shuffle" to mean that all 52! (about 8.1e67)
possible orders are equally probable.

Using that algorithm with a PRNG that has only 30 bits of state
can generate no more than 2**30 distinct shuffles, which is smaller
than the set of all possible orderings by a factor of about 7.5e58.

Perhaps those 2**30 shuffles will be satisfactory in some sense,
but they're only a tiny fraction of all possible orderings.
For example, it will very probably *never* produce an ordering
that's the exact reverse of the input (or, much less likely, it
will do so far more often than it should); a perfect shuffle would
produce such an ordering one out of 52! times.

Depending on the application, this may or may not matter.
 
P

Paul N

Tim Rentsch said:
[...]
As I understand the problem, you are looking at either 52**52
combinations, vs 52! combinations. and since (52**52) MOD (52!) is not
0, there is some pigeon holing, and my guess is that any such error is
swamped by rand() being limited to 2**32 or 2**64 states. Since 52! /
(2**64) = 4.37x10^48, you're getting a miniscule subset of the possible
shuffles anyway.
Your guess is wrong. If you try actually playing around with
various RNG implementations I expect you'll find the observed
behaviors contradict your intuition in this case. Even the
notoriously bad RANDU(), with only 30 bits of state, does a fair
job of producing good 52-element shuffles using an "ideal"
shuffle algorithm, but miserable results using the "bad" shuffle
algorithm. It's not hard to get tolerable results out of RNG's
having as little as 20 bits of internal state (I didn't try
anything smaller) using an unbiased shuffle, but basically
impossible using the "just swap each element once to anywhere"
algorithm.



Hmm. I think it depends on what you mean by a good shuffle.



Suppose you have an algorithm that produces a perfect shuffle given

an ordered deck of cards and a sequence of truly random numbers,

where I define "perfect shuffle" to mean that all 52! (about 8.1e67)

possible orders are equally probable.



Using that algorithm with a PRNG that has only 30 bits of state

can generate no more than 2**30 distinct shuffles, which is smaller

than the set of all possible orderings by a factor of about 7.5e58.



Perhaps those 2**30 shuffles will be satisfactory in some sense,

but they're only a tiny fraction of all possible orderings.

For example, it will very probably *never* produce an ordering

that's the exact reverse of the input (or, much less likely, it

will do so far more often than it should); a perfect shuffle would

produce such an ordering one out of 52! times.



Depending on the application, this may or may not matter.

I remember reading an article on just this point recently (though forget where!)

If your random number generator has only a 32-bit seed, then the number of possible shuffles is very low compared to the actual possibilities, such that it is theoretically possible to deduce the the whole pack after seeing only the first six or seven cards. Clearly this would be unsuitable for a gambling application. Indeed, as the most normal way of getting a seed involves reading the time, the number of shuffles becomes relatively tiny. The article considered the case where a gambling application rashly published theprogram that did the shuffle. If the game is played once, it is possible (after about two days of number-crunching) to establish what seed was used, and thus to see what the time is on the shuffling computer and how much it differs from yours. So next time, if you know the time of the shuffle say to within 10 seconds, there are only about 10000 possible shuffles, and yourcomputer can run through them in the time allowed for playing the game.
 
E

Eric Sosman

[...]
Hmm. I think it depends on what you mean by a good shuffle.

Suppose you have an algorithm that produces a perfect shuffle given
an ordered deck of cards and a sequence of truly random numbers,
where I define "perfect shuffle" to mean that all 52! (about 8.1e67)
possible orders are equally probable.

Using that algorithm with a PRNG that has only 30 bits of state
can generate no more than 2**30 distinct shuffles, which is smaller
than the set of all possible orderings by a factor of about 7.5e58.

Perhaps those 2**30 shuffles will be satisfactory in some sense,
but they're only a tiny fraction of all possible orderings.
For example, it will very probably *never* produce an ordering
that's the exact reverse of the input (or, much less likely, it
will do so far more often than it should); a perfect shuffle would
produce such an ordering one out of 52! times.

<topicality level="low" better_suited_for="comp.sci.math">

Consider an experiment in which you are confronted with two
shuffle routines A and B. One of them (you don't know which) uses
the well-known "fair" shuffle, the other uses the faulty shuffle
from the O.P.'s code. Both use PRNG's of identical quality.

Now, the question: How many shuffles must you perform with
each routine to have a better-than-even chance of identifying
which is which?

Follow-up question: How many shuffles ... to make the
identification with X% confidence, instead of the 50% of the
original question?

Follow-up follow-up question: If you could perform one
shuffle per nanosecond (with whichever routine you select for
that particular trial), what confidence level could you have
attained if you'd been shuffling continuously since the Big
Bang, which in this context we might call the Big Deal? (I
estimate you could do not quite 5e26 shuffles in that time;
would that be enough to discern which routine achieved the
fairer distribution over 8e67 possibilities?)

Depending on the application, this may or may not matter.

A-yeh.
 
T

Tim Rentsch

Keith Thompson said:
Tim Rentsch said:
Your guess is wrong. If you try actually playing around with
various RNG implementations I expect you'll find the observed
behaviors contradict your intuition in this case. Even the
notoriously bad RANDU(), with only 30 bits of state, does a fair
job of producing good 52-element shuffles using an "ideal"
shuffle algorithm, but miserable results using the "bad" shuffle
algorithm. It's not hard to get tolerable results out of RNG's
having as little as 20 bits of internal state (I didn't try
anything smaller) using an unbiased shuffle, but basically
impossible using the "just swap each element once to anywhere"
algorithm.

Hmm. I think it depends on what you mean by a good shuffle.
[snip elaboration]

True, it does. Here's a thought experiment.

Compare a PRNG-based shuffle against a truly random shuffle (eg,
based on the outcome of quantum mechanical events). Do this by
perfoming a million iterations of shuffling followed by dealing
out seven five-card poker hands, tabulating results for each of
the two methods. Compute some set of statistical measures, eg,
how many of each kind of hand, how often one pair is beaten by
two pair, etc. If the PRNG-based shuffle is less than, say,
one standard deviation away from the truly random shuffle, then
I think it's reasonable to say the PRNG has done a fair job of
producing a good shuffle (all the foregoing made with the
understanding that the PRNG-based shuffle is done using one of
the standard unbiased algorithms).

As the number of iterations increases, the truly random shuffle
will get closer and closer to an ideal distribution, whereas
a PRNG-based shuffle will not have this property if its internal
state is substantially less than, eg, 52! in this case.

However, if all we're interested in is a few tens of millions
of shuffles, there's no reason a PRNG with only 30 bits of
internal state can't do a reasonable job of approximating
a truly random shuffle.

Note that a million shuffles is more than enough to clearly
differentiate an unbiased shuffle algorithm and the "just swap
each element once to anywhere" algorithm, using the simple
measure of which card ends up in which position. The results
are quite stark, and not at all hard to reproduce if one is
interested in doing so.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,768
Messages
2,569,574
Members
45,051
Latest member
CarleyMcCr

Latest Threads

Top