Re: implement random selection in Python

Discussion in 'Python' started by J. Clifford Dyer, Nov 17, 2007.

  1. On Fri, 2007-11-16 at 16:47 -0800, Bruza wrote:
    I think I need to explain on the probability part: the "prob" is a
    > relative likelihood that the object will be included in the output
    > list. So, in my example input of
    >
    > items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)]
    >
    > So, for any size of N, 'Tom' (with prob of 45) will be more likely to
    > be included in the output list of N distinct member than 'Mary' (prob
    > of 30) and much more likely than that of 'John' (with prob of 10).
    >
    > I know "prob" is not exactly the "probability" in the context of
    > returning a multiple member list. But what I want is a way to "favor"
    > some member in a selection process.
    >

    My understanding of what you are looking for is that on each individual selection, the probability of picking a given name is that name's prob value, divided by the sum of all prob values. That is, in the following case:

    items = [('Mary',96),('Shimon',1),('Aisha',1),('Toshiro',2)]
    N = 3

    On the first pass Mary has a 96% chance of getting selected, Shimon a 1% Chance, Aisha a 1% chance and Toshiro a 2% chance.

    If Mary gets selected, then on the second round Shimon and Aisha should each have a 25% chance of getting selected, and Toshiro a 50% chance.

    If Shimon gets selected, then on round three, Aisha will have a ~33% chance, and Toshiro a little less than 67%.

    If that is correct, then the following might meet your needs:

    from random import choice

    def select(weighted_list, n=1):
    selected = set()
    for iteration in xrange(n):
    print iteration
    selection_list = []
    for item in weighted_list:
    if item[0] not in selected:
    selection_list.extend([item[0]] * item[1])
    #print selection_list
    this_choice = choice(selection_list)
    #print this_choice
    selected.add(this_choice)
    return selected

    if __name__ == '__main__':
    items = [('Mary',96),('Shimon',1),('Aisha',1),('Toshiro',2)]
    N = 3
    print select(items, 3)


    It creates a new selection list at every pass, but it should be a marked improvement over solutions that have to flail against already chosen values, especially in unbalanced cases like the one I've shown above. I have not run tests to establish this for certain. Exercise left to the reader.

    If you care who was chosen first or last (like, say, if this is for selecting members of a kick ball team), you may want to replace the selected set with a list. Or you may want to keep both the set and a list. Depends on your needs.

    Cheers,
    Cliff
     
    J. Clifford Dyer, Nov 17, 2007
    #1
    1. Advertising

  2. J. Clifford Dyer wrote:
    > My understanding of what you are looking for is that on each individual selection,
    > the probability of picking a given name is that name's prob value, divided by the
    > sum of all prob values. That is, in the following case:
    > items = [('Mary', 96),('Shimon', 1), ('Aisha', 1), ('Toshiro', 2)]
    > ... On the first pass Mary has a 96% chance of getting selected, ....
    > If that is correct, then the following might meet your needs:
    >
    > from random import choice
    >
    > def select(weighted_list, n=1):
    > selected = set()
    > for iteration in xrange(n):
    > print iteration
    > selection_list = []
    > for item in weighted_list:
    > if item[0] not in selected:
    > selection_list.extend([item[0]] * item[1])
    > #print selection_list
    > this_choice = choice(selection_list)
    > #print this_choice
    > selected.add(this_choice)
    > return selected


    or even:

    def picks(weighted_list, N):
    result = []
    adict = dict(weighted_list)
    total = sum(adict.values())
    for i in range(N):
    score = random.random() * total
    for choice, weight in cp.iteritems():
    if weight < score:
    score -= weight
    else:
    result.append(choice)
    total -= cp.pop(choice)
    break
    else:
    raise ValueError('ran out of data')
    return result

    -Scott
     
    Scott David Daniels, Nov 20, 2007
    #2
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Simon Niederberger
    Replies:
    2
    Views:
    16,617
    Christian Kaufhold
    Jan 7, 2005
  2. Andrew Crowe
    Replies:
    1
    Views:
    4,496
    Andrew Crowe
    Sep 13, 2004
  3. Bruza
    Replies:
    18
    Views:
    634
    Neil Cerutti
    Nov 19, 2007
  4. globalrev
    Replies:
    4
    Views:
    782
    Gabriel Genellina
    Apr 20, 2008
  5. VK
    Replies:
    15
    Views:
    1,210
    Dr J R Stockton
    May 2, 2010
Loading...

Share This Page