Smith college diploma problem

Surya Dutt · Feb 11, 2007

Hello,

Did someone happen to come accross and algorithm or a C++ program that simulates the Smith Collge Diploma problem?

"At Smith College, the graduation exercises traditionally proceed as follows: Although
each diploma is made out to a particular girl, all the diplomas are initially given out at random. All of the girls who do not get their own diplomas then form a circle, and each passes the diploma she has to the girl on her right. Those who now have their own diplomas drop out, and the remaining girls again pass their diplomas to the right, and so on. This procedure is repeated until each girl has her own diploma. If there are n girls in the graduating class what is the probability that it takes precisely k passes before each girl has her own diploma?"

Thanks.

John Harrison · Feb 11, 2007

Surya said:
Hello,

Did someone happen to come accross and algorithm or a C++ program that
simulates the Smith Collge Diploma problem?

"At Smith College, the graduation exercises traditionally proceed as
follows: Although
each diploma is made out to a particular girl, all the diplomas are
initially given out at random. All of the girls who do not get their own
diplomas then form a circle, and each passes the diploma she has to the
girl on her right. Those who now have their own diplomas drop out, and
the remaining girls again pass their diplomas to the right, and so on.
This procedure is repeated until each girl has her own diploma. If there
are n girls in the graduating class what is the probability that it
takes precisely k passes before each girl has her own diploma?"

Thanks.

This doesn't seem like a particularly difficult coding problem (I pass
on the math problem however). I suggest you do your own homework and
write the code. If you are having problems with the code then by all
means post the code here and you will get some help.

But try for yourself first.

John

osmium · Feb 14, 2007

Surya Dutt said:
Did someone happen to come accross and algorithm or a C++ program that
simulates the Smith Collge Diploma problem?

"At Smith College, the graduation exercises traditionally proceed as
follows: Although
each diploma is made out to a particular girl, all the diplomas are
initially given out at random. All of the girls who do not get >their own
diplomas then form a circle, and each passes the diploma she has to the
girl on her right. Those who now have >their own diplomas drop out, and the
remaining girls again pass their diplomas to the right, and so on. This
procedure is >repeated until each girl has her own diploma. If there are n
girls in the graduating class what is the probability that it takes

I don't think a computer simulation provides a useful way of answering the
question asked. Consider the results of running a million simulations for n
= 500 students. Now make a histogram of the resulting k's with the
probability of k on the y axis. For most of its length the graph will be a
line close to zero with a fairly narrow and very high spike around the
expected number of interchanges. Now do another million simulations. Very
little information is added. There is no assurance at all that the answers
are even *relatively* right, that is a k value of 100 could show a lower
probability of occurring than a k of 75. I would expect most of the
probabilities to start out with two or three zeroes after the decimal point
before encountering a significant digit

If the question were "What is the expected number of interchanges required,
a simulation would be fine. But IMO to get the "probability of precisely k
passes" via simulation is doomed to failure Even if we are willing to
settle for any reasonable approximation of precisely, it is still doomed.

The problem is not so inordinately difficult that analyzing it is plain
impossible; as a matter of fact the very first hit I got on Google provided
an analytic solution. But note that it might be necessary to use a big
number library to come up with numerical answers.

Moral: Programmers do not eliminate the need for mathematicians.

hrlnaik · Feb 14, 2007

This doesn't seem like a particularly difficult coding problem (I pass
on the math problem however). I suggest you do your own homework and
write the code. If you are having problems with the code then by all
means post the code here and you will get some help.

But try for yourself first.

John

Hello John,

Myself Hiral Naik
I am student of Master Of Computer Science

I phased the same situation...... so if u understand the problem then
please help me
can u explain me what this program say ?????????????????????

Please i m waitting for your response

hrlnaik · Feb 14, 2007

This doesn't seem like a particularly difficult coding problem (I pass
on the math problem however). I suggest you do your own homework and
write the code. If you are having problems with the code then by all
means post the code here and you will get some help.

But try for yourself first.

John

Hello John,

Myself Hiral Naik
I am student of Master Of Computer Science

I phased the same situation...... so if u understand the problem then
please help me
can u explain me what this program say ?????????????????????

Please i m waitting for your response

Victor Bazarov · Feb 14, 2007

[..]
I phased the same situation...... so if u understand the problem then
please help me
can u explain me what this program say ?????????????????????

You need a list of "girls" each holding a pointer to "diploma" (self
is the most convenient), which will represent the student body. You
need a function to "pass the diploma". You need a function to remove
the "girls" who have "their own diploma". You need a running counter
of "passes". The list will keep the counter of remaining "girls".
Allocate "girls", place them in the list, and in the loop: remove
those that have "their own diploma", then if any left in the list,
call the "pass to the next" function, otherwise you're done. Count
the number of loops. Implement the "pass to the next" and "remove
the matched" yourself. We don't do homeworks.

V

Mensanator · Feb 15, 2007

I don't think a computer simulation provides a useful way of answering the
question asked.
Really?

Consider the results of running a million simulations for n
= 500 students.

Ok, but I did 1000 so it would fit on my spreadsheet.

Now make a histogram of the resulting k's with the
probability of k on the y axis.
Ok.

For most of its length the graph will be a
line close to zero with a fairly narrow and very high spike around the
expected number of interchanges.

Not if you autoscale the chart.

Now do another million simulations.
Why?

Very little information is added.

Right, that's why the first run is sufficient.

There is no assurance at all that the answers
are even *relatively* right, that is a k value of 100 could show a lower
probability of occurring than a k of 75.

Huh? The mean for N students is N/2 (that's what the
simulation will tell if you didn't already know it),
so k=100 will not have a lower probability than k=75,
that much is obvious.

I would expect most of the
probabilities to start out with two or three zeroes after the decimal point
before encountering a significant digit

Yeah, but what does that have to do with determining the
probability of k passes?

If the question were "What is the expected number of interchanges required,
a simulation would be fine. But IMO to get the "probability of precisely k
passes" via simulation is doomed to failure
Really?

Even if we are willing to
settle for any reasonable approximation of precisely, it is still doomed.

My reasonable approximation looked pretty good.

The problem is not so inordinately difficult that analyzing it is plain
impossible; as a matter of fact the very first hit I got on Google provided
an analytic solution. But note that it might be necessary to use a big
number library to come up with numerical answers.

Uh, maybe you're looking at it the wrong way?

Moral: Programmers do not eliminate the need for mathematicians.

If I use the Excel function for Normal Distribution using
Mean and StandardDeviation obtained from the simulation for
N=500, wouldn't you say that this is a "reasonable approximation"
when compared to the actual probabilities from the simulation?

k Actual NORMDIST(k,Mean,StandardDeviation,FALSE)
233 0.10% 0.11%
234 0.20% 0.18%
235 0.60% 0.27%
236 0.10% 0.40%
237 0.60% 0.57%
238 1.10% 0.80%
239 1.00% 1.10%
240 1.40% 1.47%
241 1.70% 1.92%
242 2.80% 2.44%
243 3.40% 3.01%
244 3.40% 3.62%
245 4.70% 4.25%
246 5.30% 4.86%
247 4.20% 5.42%
248 5.90% 5.88%
249 6.00% 6.21%
250 5.50% 6.40%
251 7.80% 6.43%
252 6.10% 6.28%
253 4.60% 5.99%
254 5.50% 5.56%
255 5.30% 5.03%
256 4.90% 4.43%
257 4.80% 3.80%
258 3.50% 3.18%
259 2.50% 2.59%
260 2.00% 2.06%
261 1.10% 1.59%
262 1.40% 1.20%
263 0.70% 0.88%
264 0.70% 0.63%
265 0.30% 0.44%
266 0.20% 0.30%
267 0.20% 0.20%
268 0.10% 0.13%
269 0.20% 0.08%

osmium · Feb 17, 2007

Mensanator said:
Ok, but I did 1000 so it would fit on my spreadsheet.

Not if you autoscale the chart.

Right, that's why the first run is sufficient.

Huh? The mean for N students is N/2 (that's what the
simulation will tell if you didn't already know it),
so k=100 will not have a lower probability than k=75,
that much is obvious.

Then why does your simulation show that p(235) is .6 and p(236) is .1? 236
is closer to what I call the "spike", why is it not the larger number? Make
a few more million runs and *this particular* glitch will most likely go
away. But it
won't solve new, similar glitches for much smaller values of k - such as
100.

Yeah, but what does that have to do with determining the
probability of k passes?

I think you glossed over that sentence, your response is all about the cases
which I explicitly bless in that sentence. I was simply warning the OP that
he could not answer the question as written via a simulation. He *could*,
however, answer a somewhat similar question - which is what you did.

But IMO to get the "probability of precisely k

My reasonable approximation looked pretty good.

You only provide values for 37 of the 500 possible k's and leave the reader
to assume all the other cases are zero. But in fact, they just happened to
be 0 in only 1000 cases you used to simulate 500! cases.

When I see the word "precisely", as in this problem, my ears perk up.
There is no permission to ignore or slough off on certain values of k.
Consider
a similar situation. If someone said "Toss a fair coin 500 times.
Precisely how mnany times do you get 27 heads?" What would the answer be?
I see these possible answers:

o I don't know
o a really small number
o I don't care
o zero
o I'm tired and want to go to sleep
o a number which answers the question.

Is there somne reason you would not answer "zero" as you did in the Smith
College situation? The only useful right
answer is the last one, computed from knowledge of the binomail distributon.
You can't get the answer from simulation, you won't live long enough.
*That* was my point.

Uh, maybe you're looking at it the wrong way?

It really comes down to, what does the word "precisely" mean, if anything,
in the problem statement. I took its
literal meaning, since I have no reason to believe it was a "noise" word
inserted to confuse me.

Mensanator · Feb 17, 2007

Then why does your simulation show that p(235) is .6 and p(236) is .1?

Because it's a simulation. You don't use the results
of the simulation to determine the probability, you
use the simulation to determine the distribution.

236
is closer to what I call the "spike", why is it not the larger number?

That's called "luck" in the trade.

Make
a few more million runs and *this particular* glitch will most likely go
away.

At 10000, it didn't go away, but it was less pronounced.

But it
won't solve new, similar glitches for much smaller values of k - such as
100.

And it's not supposed to. You're not seeing the forest
for the trees. The individual trees (data points from
the simulation) don't tell you the probability. But
since collectively they show that the simulation is a
normal distribution (forest), we can determine
mathematically what the probability is for k=100 or
k=75.

To wit:

3.4838E-130 for k=100
2.7294E-176 for k=75

Run your own simulation and prove me wrong.

I think you glossed over that sentence, your response is all about the cases
which I explicitly bless in that sentence.

No, it wasn't. It was about using a normal distribution
to determine the probabilities. I guess I took for
granted that you could see how to use this for k=100
and k=75.

I was simply warning the OP that
he could not answer the question as written via a simulation.

IF you only look at the trees.

He *could*,
however, answer a somewhat similar question - which is what you did.

No, I answered the original question
(the NORMDIST(k,Mean,StandardDeviation,FALSE)
column shown below) by looking at the forest.

You only provide values for 37 of the 500 possible k's and leave the reader
to assume all the other cases are zero.

I didn't leave the reader to assume that. Again, I took
for granted that the reader would realize that he can
simply plug any value of k into
NORMDIST(k,Mean,StandardDeviation,FALSE) to see what the
probability is.

But in fact, they just happened to
be 0

They came out 0 in the simulation, but we don't
use the simulation to find the probability, we use
the simulation to find the distribution from which
we can find the probability for any k. For example,
here are the probabilities for some k values outside
my simulation results.

k probability
300 0.00000000000010928528115%
301 0.00000000000002984982860%
302 0.00000000000000794360258%
303 0.00000000000000205962746%
304 0.00000000000000052030181%
305 0.00000000000000012806118%
306 0.00000000000000003070967%
307 0.00000000000000000717511%
308 0.00000000000000000163334%
309 0.00000000000000000036226%
310 0.00000000000000000007828%
311 0.00000000000000000001648%
312 0.00000000000000000000338%
313 0.00000000000000000000068%

See, no simulation necessary.

in only 1000 cases you used to simulate 500! cases.

You never took a statistics class, eh?

When I see the word "precisely", as in this problem, my ears perk up.
There is no permission to ignore or slough off on certain values of k.

I didn't ignore or slough off those other k values.
I gave you an Excel function, all you have to do is
plug in any value you want (provided you know the
mean and the standard deviation).

Consider
a similar situation. If someone said "Toss a fair coin 500 times.
Precisely how mnany times do you get 27 heads?"

Wrong use of "precisely" here. You _can_ ask prcisely
how many 500-bit numbers have a pop-count of 27:
334810521009929044447947514934325903028914000

But you cannot say precisely which of the
3273390607896141870013189696827599152216642046043
0647894832913680961337964046745548832700923259041
5715088668412756007100921725654588539305332852758
9376
outcomes you'll get.

What would the answer be?

(The probability is left as an excercize for the student.)

I see these possible answers:

o I don't know
o a really small number
o I don't care
o zero
o I'm tired and want to go to sleep
o a number which answers the question.

Is there somne reason you would not answer "zero" as you did in the Smith
College situation?

I never answered 0, I left the values of k not covered
by the simulation undetermined. But undetermined
doesn't mean they are 0 as I showed above.

The only useful right
answer is the last one, computed from knowledge of the binomail distributon.
You can't get the answer from simulation, you won't live long enough.
*That* was my point.

And my point was you aren't supposed to get the answer
from the simulation, that's why I said you're looking
at it he wrong way.

It really comes down to, what does the word "precisely" mean, if anything,
in the problem statement. I took its
literal meaning, since I have no reason to believe it was a "noise" word
inserted to confuse me.

What part of NORMDIST(k,Mean,StandardDeviation,FALSE)
don't you understand?

DIPLOMAXPRESS	0	Dec 16, 2007
Problem with a login script, SESSION user rights and put this together so it works with the other pages and MySQL. Code examples.	2	May 5, 2023
Company time spent of Gadget the Canadian Hacker	0	Jun 27, 2008
phony French doc defrauding holistic healthcare practitioners via web	0	Apr 12, 2009
TF-IDF	1	Aug 19, 2021
News: College For The Baby-On-Board Generation	2	Apr 21, 2005
cannot find symbol?	1	May 5, 2013
Life	0	Sep 22, 2010

Smith college diploma problem

Surya Dutt

John Harrison

osmium

hrlnaik

hrlnaik

Victor Bazarov

Mensanator

osmium

Mensanator

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads