# Re: implement random selection in Python

Discussion in 'Python' started by J. Clifford Dyer, Nov 17, 2007.

1. ### J. Clifford DyerGuest

On Fri, 2007-11-16 at 16:47 -0800, Bruza wrote:
I think I need to explain on the probability part: the "prob" is a
> relative likelihood that the object will be included in the output
> list. So, in my example input of
>
> items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]
>
> So, for any size of N, 'Tom' (with prob of 45) will be more likely to
> be included in the output list of N distinct member than 'Mary' (prob
> of 30) and much more likely than that of 'John' (with prob of 10).
>
> I know "prob" is not exactly the "probability" in the context of
> returning a multiple member list. But what I want is a way to "favor"
> some member in a selection process.
>

My understanding of what you are looking for is that on each individual selection, the probability of picking a given name is that name's prob value, divided by the sum of all prob values. That is, in the following case:

items = [('Mary',96),('Shimon',1),('Aisha',1),('Toshiro',2)]
N = 3

On the first pass Mary has a 96% chance of getting selected, Shimon a 1% Chance, Aisha a 1% chance and Toshiro a 2% chance.

If Mary gets selected, then on the second round Shimon and Aisha should each have a 25% chance of getting selected, and Toshiro a 50% chance.

If Shimon gets selected, then on round three, Aisha will have a ~33% chance, and Toshiro a little less than 67%.

If that is correct, then the following might meet your needs:

from random import choice

def select(weighted_list, n=1):
selected = set()
for iteration in xrange(n):
print iteration
selection_list = []
for item in weighted_list:
if item[0] not in selected:
selection_list.extend([item[0]] * item[1])
#print selection_list
this_choice = choice(selection_list)
#print this_choice
selected.add(this_choice)
return selected

if __name__ == '__main__':
items = [('Mary',96),('Shimon',1),('Aisha',1),('Toshiro',2)]
N = 3
print select(items, 3)

It creates a new selection list at every pass, but it should be a marked improvement over solutions that have to flail against already chosen values, especially in unbalanced cases like the one I've shown above. I have not run tests to establish this for certain. Exercise left to the reader.

If you care who was chosen first or last (like, say, if this is for selecting members of a kick ball team), you may want to replace the selected set with a list. Or you may want to keep both the set and a list. Depends on your needs.

Cheers,
Cliff

J. Clifford Dyer, Nov 17, 2007

2. ### Scott David DanielsGuest

J. Clifford Dyer wrote:
> My understanding of what you are looking for is that on each individual selection,
> the probability of picking a given name is that name's prob value, divided by the
> sum of all prob values. That is, in the following case:
> items = [('Mary', 96),('Shimon', 1), ('Aisha', 1), ('Toshiro', 2)]
> ... On the first pass Mary has a 96% chance of getting selected, ....
> If that is correct, then the following might meet your needs:
>
> from random import choice
>
> def select(weighted_list, n=1):
> selected = set()
> for iteration in xrange(n):
> print iteration
> selection_list = []
> for item in weighted_list:
> if item[0] not in selected:
> selection_list.extend([item[0]] * item[1])
> #print selection_list
> this_choice = choice(selection_list)
> #print this_choice
> selected.add(this_choice)
> return selected

or even:

def picks(weighted_list, N):
result = []
adict = dict(weighted_list)
total = sum(adict.values())
for i in range(N):
score = random.random() * total
for choice, weight in cp.iteritems():
if weight < score:
score -= weight
else:
result.append(choice)
total -= cp.pop(choice)
break
else:
raise ValueError('ran out of data')
return result

-Scott

Scott David Daniels, Nov 20, 2007

## Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.