Then why does your simulation show that p(235) is .6 and p(236) is .1?
Because it's a simulation. You don't use the results
of the simulation to determine the probability, you
use the simulation to determine the distribution.
236 is closer to what I call the "spike", why is it not the larger number?
That's called "luck" in the trade.
Make a few more million runs and *this particular* glitch will most
likely go away.
At 10000, it didn't go away, but it was less pronounced.
But it won't solve new, similar glitches for much smaller values of k -
such as 100.
And it's not supposed to. You're not seeing the forest
for the trees. The individual trees (data points from
the simulation) don't tell you the probability. But
since collectively they show that the results follow a
normal distribution (the forest), we can determine
mathematically what the probability is for k=100 or
k=75.
To wit:
3.4838E-130 for k=100
2.7294E-176 for k=75
Run your own simulation and prove me wrong.
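
For readers who want to try the forest-not-trees approach, here is a
minimal Python sketch. It does not reproduce the simulation discussed in
this thread; as a stand-in it counts heads in 500 fair coin tosses (the
example that comes up later in the thread), fits a normal distribution to
1000 runs, and then evaluates the fitted density at a k that never
occurred in the sample. The seed, the run count, and the variable names
are assumptions made for illustration.

```python
import random
from statistics import NormalDist

rng = random.Random(12345)  # arbitrary seed, only for repeatability

# Stand-in simulation (an assumption, not the thread's problem):
# each run tosses 500 fair coins and records the number of heads,
# computed as the pop-count of 500 random bits.
runs = [bin(rng.getrandbits(500)).count("1") for _ in range(1000)]

# The "forest": estimate mean and standard deviation from all runs.
fitted = NormalDist.from_samples(runs)

# The "trees": k=100 does not show up in the sample ...
k = 100
observed = runs.count(k)

# ... but the fitted density still gives a small nonzero estimate.
estimate = fitted.pdf(k)

print(f"mean={fitted.mean:.2f} sd={fitted.stdev:.2f}")
print(f"runs with k={k}: {observed}, fitted density: {estimate:.4e}")
```

How well the fitted density tracks the true probability this far from the
mean depends on how close to normal the underlying distribution really is.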
I think you glossed over that sentence, your response is all about the cases
which I explicitly bless in that sentence.
No, it wasn't. It was about using a normal distribution
to determine the probabilities. I guess I took for
granted that you could see how to use this for k=100
and k=75.
I was simply warning the OP that
he could not answer the question as written via a simulation.
IF you only look at the trees.
He *could*,
however, answer a somewhat similar question - which is what you did.
No, I answered the original question
(the NORMDIST(k,Mean,StandardDeviation,FALSE)
column shown below) by looking at the forest.
You only provide values for 37 of the 500 possible k's and leave the reader
to assume all the other cases are zero.
I didn't leave the reader to assume that. Again, I took
for granted that the reader would realize that he can
simply plug any value of k into
NORMDIST(k,Mean,StandardDeviation,FALSE) to see what the
probability is.
But in fact, they just happened to be 0
They came out 0 in the simulation, but we don't
use the simulation to find the probability, we use
the simulation to find the distribution from which
we can find the probability for any k. For example,
here are the probabilities for some k values outside
my simulation results.
k probability
300 0.00000000000010928528115%
301 0.00000000000002984982860%
302 0.00000000000000794360258%
303 0.00000000000000205962746%
304 0.00000000000000052030181%
305 0.00000000000000012806118%
306 0.00000000000000003070967%
307 0.00000000000000000717511%
308 0.00000000000000000163334%
309 0.00000000000000000036226%
310 0.00000000000000000007828%
311 0.00000000000000000001648%
312 0.00000000000000000000338%
313 0.00000000000000000000068%
See, no simulation necessary.
in only 1000 cases you used to simulate 500! cases.
You never took a statistics class, eh?
When I see the word "precisely", as in this problem, my ears perk up.
There is no permission to ignore or slough off on certain values of k.
I didn't ignore or slough off those other k values.
I gave you an Excel function, all you have to do is
plug in any value you want (provided you know the
mean and the standard deviation).
Consider a similar situation. If someone said "Toss a fair coin 500 times.
Precisely how many times do you get 27 heads?"
Wrong use of "precisely" here. You _can_ ask precisely
how many 500-bit numbers have a pop-count of 27:
334810521009929044447947514934325903028914000
But you cannot say precisely which of the
3273390607896141870013189696827599152216642046043
0647894832913680961337964046745548832700923259041
5715088668412756007100921725654588539305332852758
9376
outcomes you'll get.
What would the answer be?
(The probability is left as an exercise for the student.)
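
For anyone who wants to check the exercise, here is a short Python sketch.
It recomputes the two integers quoted above and divides them, assuming a
fair coin and independent tosses. The variable names are illustrative, and
`math.comb` needs Python 3.8 or later.

```python
import math

n, heads = 500, 27

# Number of 500-bit strings with exactly 27 ones: C(500, 27).
count = math.comb(n, heads)

# Total number of equally likely outcomes: 2**500.
total = 2 ** n

# Dividing two exact integers gives a correctly rounded float.
probability = count / total

print(count)
print(total)
print(f"{probability:.4e}")
```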
I see these possible answers:
o I don't know
o a really small number
o I don't care
o zero
o I'm tired and want to go to sleep
o a number which answers the question.
Is there some reason you would not answer "zero" as you did in the Smith
College situation?
I never answered 0. I left the values of k not covered
by the simulation undetermined. But undetermined
doesn't mean they are 0, as I showed above.
The only useful right answer is the last one, computed from knowledge of
the binomial distribution.
You can't get the answer from simulation, you won't live long enough.
*That* was my point.
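
To put a rough number on "you won't live long enough", the sketch below
estimates how many 500-toss runs a simulation would need, on average,
before it sees exactly 27 heads even once. It assumes a fair coin and
independent runs; the speed of one billion runs per second is a made-up
figure for illustration.

```python
import math

# Probability of exactly 27 heads in 500 fair tosses.
p = math.comb(500, 27) / 2 ** 500

# Mean of a geometric distribution: expected runs until the first hit.
expected_runs = 1 / p

# Hypothetical speed: one billion 500-toss runs per second.
runs_per_second = 1e9
seconds_per_year = 365.25 * 24 * 3600
years = expected_runs / runs_per_second / seconds_per_year

print(f"expected runs: {expected_runs:.3e}")
print(f"years at 1e9 runs/s: {years:.3e}")
```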
And my point was you aren't supposed to get the answer
from the simulation, that's why I said you're looking
at it the wrong way.
It really comes down to, what does the word "precisely" mean, if anything,
in the problem statement. I took its
literal meaning, since I have no reason to believe it was a "noise" word
inserted to confuse me.
What part of NORMDIST(k,Mean,StandardDeviation,FALSE)
don't you understand?