Probability Problem

E

Elliot Temple

Problem: Randomly generate 10 integers from 0-100 inclusive, and sum
them. Do that twice. What is the probability the two sums are 390 apart?

I have code to do part of it (below), and I know how to write code to
do the rest. The part I have calculates the number of ways the dice
can come out to a given number. The problem is the main loop has 9
iterations and it takes about 2.5 minutes to begin the 4th one, and
each iteration is about 101 times longer than the previous one. So:
5045631.5622908585

It'd take 5,000 millennia. (If my computer didn't run out of memory
after about 4 minutes, that is.)

Any suggestions? Either a way to do the same thing much more
efficiently (enough that I can run it) or a different way to solve
the problem.

Code:

li = range(101)
li2 = []
range101 = range(101)
for x in xrange(9):
print "x is %s" % x
li2 = []
for y in li:
for z in range101:
li2 += [y+z]
li = li2
print li.count(800)
# prints how many ways the dice can add to 800


This link may help:
http://www.math.csusb.edu/faculty/stanton/m262/intro_prob_models/
calcprob.html

-- Elliot Temple
http://www.curi.us/blog/
 
L

Lawrence D'Oliveiro

Elliot Temple said:
Problem: Randomly generate 10 integers from 0-100 inclusive, and sum
them. Do that twice. What is the probability the two sums are 390 apart?

I think the sum would come close to a normal distribution.
 
A

Alex Martelli

Lawrence D'Oliveiro said:
I think the sum would come close to a normal distribution.

Yes, very close indeed, by the law of large numbers.

However, very close (in a math course at least) doesn't get the cigar.

You can compute the requested answer exactly with no random number
generation whatsoever: compute the probability of each result from 0 to
1000, then sum the probabilities of entries that are exactly 390 apart.


Alex
 
E

Elliot Temple

Yes, very close indeed, by the law of large numbers.

However, very close (in a math course at least) doesn't get the cigar.

You can compute the requested answer exactly with no random number
generation whatsoever: compute the probability of each result from
0 to
1000, then sum the probabilities of entries that are exactly 390
apart.

That was the plan, but how do I get the probability of any given
result? (in a reasonable amount of time)

BTW I'm not in a math course, just curious.

-- Elliot Temple
http://www.curi.us/blog/
 
A

Alex Martelli

Elliot Temple said:
That was the plan, but how do I get the probability of any given
result? (in a reasonable amount of time)

BTW I'm not in a math course, just curious.

OK, I'll trust that last assertion (sorry for the hesitation, but it's
all too easy to ``help'' somebody with a homework assignment and
actually end up damaging them by doing it FOR them!-).


I'm still going to present this in a way that stimulates thought, rather
than a solved problem -- humor me...!-)


You're generating a uniformly distributed random number in 0..100 (101
possibilities), 10 times, and summing the 10 results.

How do you get a result of 0? Only 1 way: 0 at each attempt --
probability 1 (out of 1010 possibilities).

How do you get a result of 1? 10 ways: 1 at one attempt and 0 at each
of the others - probability 10 (again in 1010'ths;-).

How do you get a result of 2? 10 ways for '2 at one attempt and 0 at
each of the others', plus, 10*9/2 ways for '1 at two attempts and 0 at
each of the others' -- probability 55 (ditto).

....and so forth, but you'd rather not work it out...


So, suppose you start with a matrix of 101 x 10 entries, each of value 1
since all results are equiprobable (or, 1/1010.0 if you prefer;-).

You want to compute the number in which you can combine two rows. How
could you combine the first two rows (each of 101 1's) to make a row of
201 numbers corresponding to the probabilities of the sum of two throws?

Suppose you combine the first entry of the first row with each entry of
the second, then the second entry of the first row with each entry of
the second, etc; each time, you get a sum (of two rolls) which gives you
an index of a entry (in an accumulator row starting at all zeros) to
increment by the product of the entries you're considering...


Can you generalize that? Or, do you need more hints? Just ask!


Alex
 
T

Tim Peters

[Alex Martelli]
[Elliot Temple]
That was the plan, but how do I get the probability of any given
result?

As the link you gave suggests, the number of ways to get a given sum s
is the coefficient of x**s in (1 + x + x**2 + ... +x**100)**10.
Divide that by 101**10 to compute the probability of getting sum s.
(in a reasonable amount of time)

BTW I'm not in a math course, just curious.

Computing exact results is probably too hard for "just curious". This
little Python program computes the exact number of ways to get each
sum in 0 .. 1000 quickly (fraction of a second on my box):

"""
def mul(p1, p2):
result = [0] * (len(p1) + len(p2) - 1)
for i, x1 in enumerate(p1):
for j, x2 in enumerate(p2):
result[i+j] += x1 * x2
while result and result[-1] == 0:
del result[-1]
return result

base = [1] * 101
result = [1]
for dummy in range(10):
result = mul(result, base)

assert len(result) == 1001
assert sum(result) == 101**10
"""

Then result is the exact number of ways to get sum s, for each s in
range(1001).

Figuring out why that works is the "probably too hard for 'just
curious'" part. If you want to pursue that, it's a straightforward
implementation of fully multiplying out:

(1 + x + x**2 + ... +x**100)**10

A polynomial here is represented by a list L of coefficients, where
L is the coefficient of x**i. `mul()` implements polynomial
multiplication using this format. For example,
[-1, 0, 1]

or, IOW, (x+1)*(x-1) = (x**2 - 1).
 
E

Elliot Temple

I think I got it. I noticed my code is essentially the same as Tim
Peter's (plus the part of the problem he skipped). I read his code 20
minutes before recreating mine from Alex's hints. Thanks!

def main():
ways = ways_to_roll()
total_ways = float(101**10)
running_total = 0
for i in range(1000-390+1):
j = i + 390
running_total += ways * ways[j]
print running_total / total_ways**2
print ways[:10]

def ways_to_roll():
result = [1]
for i in xrange(10):
result = combine([1] * 101, result)
return result

def combine(a, b):
results = [0] * (len(a) + len(b) - 1)
for i, ele in enumerate(a):
for j, ele2 in enumerate(b):
results[i+j] += ele * ele2
return results

main()
# output: 3.21962542309e-05 and
# [1, 10, 55, 220, 715, 2002, 5005, 11440, 24310, 48620]
# 3.21962542309e-05 is 32 out of a million



OK, I'll trust that last assertion (sorry for the hesitation, but it's
all too easy to ``help'' somebody with a homework assignment and
actually end up damaging them by doing it FOR them!-).


I'm still going to present this in a way that stimulates thought,
rather
than a solved problem -- humor me...!-)


You're generating a uniformly distributed random number in 0..100 (101
possibilities), 10 times, and summing the 10 results.

How do you get a result of 0? Only 1 way: 0 at each attempt --
probability 1 (out of 1010 possibilities).

How do you get a result of 1? 10 ways: 1 at one attempt and 0 at each
of the others - probability 10 (again in 1010'ths;-).

How do you get a result of 2? 10 ways for '2 at one attempt and 0 at
each of the others', plus, 10*9/2 ways for '1 at two attempts and 0 at
each of the others' -- probability 55 (ditto).

...and so forth, but you'd rather not work it out...


So, suppose you start with a matrix of 101 x 10 entries, each of
value 1
since all results are equiprobable (or, 1/1010.0 if you prefer;-).

You want to compute the number in which you can combine two rows. How
could you combine the first two rows (each of 101 1's) to make a
row of
201 numbers corresponding to the probabilities of the sum of two
throws?

Suppose you combine the first entry of the first row with each
entry of
the second, then the second entry of the first row with each entry of
the second, etc; each time, you get a sum (of two rolls) which
gives you
an index of a entry (in an accumulator row starting at all zeros) to
increment by the product of the entries you're considering...


Can you generalize that? Or, do you need more hints? Just ask!


Alex

-- Elliot Temple
http://www.curi.us/blog/
 
P

Peter Tillotson

I had a possibly similar problem calculating probs related to premium
bond permutation. With 10^12 memory ran out v quickly. In the end I got
round it by writing a recursive function and quantising the probability
density function.

Elliot said:
Problem: Randomly generate 10 integers from 0-100 inclusive, and sum
them. Do that twice. What is the probability the two sums are 390 apart?

I have code to do part of it (below), and I know how to write code to do
the rest. The part I have calculates the number of ways the dice can
come out to a given number. The problem is the main loop has 9
iterations and it takes about 2.5 minutes to begin the 4th one, and each
iteration is about 101 times longer than the previous one. So:
5045631.5622908585

It'd take 5,000 millennia. (If my computer didn't run out of memory
after about 4 minutes, that is.)

Any suggestions? Either a way to do the same thing much more efficiently
(enough that I can run it) or a different way to solve the problem.

Code:

li = range(101)
li2 = []
range101 = range(101)
for x in xrange(9):
print "x is %s" % x
li2 = []
for y in li:
for z in range101:
li2 += [y+z]
li = li2
print li.count(800)
# prints how many ways the dice can add to 800


This link may help:
http://www.math.csusb.edu/faculty/stanton/m262/intro_prob_models/calcprob.html


-- Elliot Temple
http://www.curi.us/blog/
 
T

Tim Peters

[Elliot Temple]
I think I got it. I noticed my code is essentially the same as Tim
Peter's (plus the part of the problem he skipped). I read his code 20
minutes before recreating mine from Alex's hints. Thanks!

def main():
ways = ways_to_roll()
total_ways = float(101**10)
running_total = 0
for i in range(1000-390+1):
j = i + 390
running_total += ways * ways[j]
print running_total / total_ways**2
print ways[:10]

def ways_to_roll():
result = [1]
for i in xrange(10):
result = combine([1] * 101, result)
return result

def combine(a, b):
results = [0] * (len(a) + len(b) - 1)
for i, ele in enumerate(a):
for j, ele2 in enumerate(b):
results[i+j] += ele * ele2
return results

main()
# output: 3.21962542309e-05 and
# [1, 10, 55, 220, 715, 2002, 5005, 11440, 24310, 48620]
# 3.21962542309e-05 is 32 out of a million


You should sanity-check the computation by generalizing it, then
applying it to a case so small you can easily work out the result via
exhaustive enumeration by hand.

For example, suppose you took integers from the much smaller set {0,
1}, and did that only twice. Then the possible sums and their
probabilities are clearly:

0 1/4
1 1/2
2 1/4

If you did this twice, what's the probability that the sums differ by
1? Suitably generalized, your program above would compute 1/4. Is
that actually right? It depends on what exactly "the sums differ by
1" means. If it means the second sum is one larger than the first
sum, 1/4 is correct. Ditto if it means the second sum is one smaller
than the first sum. But if it means 1 is the absolute value of the
difference of the sums, the right answer is 1/2. I'm not sure which
meaning you have in mind, but the last one was my guess.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top