Weighted "random" selection from list of lists

Discussion in 'Python' started by Jesse Noller, Oct 8, 2005.

  1. Jesse Noller

    Jesse Noller Guest

    Hello -

    I'm probably missing something here, but I have a problem where I am
    populating a list of lists like this:

    list1 = [ 'a', 'b', 'c' ]
    list2 = [ 'dog', 'cat', 'panda' ]
    list3 = [ 'blue', 'red', 'green' ]

    main_list = [ list1, list2, list3 ]

    Once main_list is populated, I want to build a sequence from items
    within the lists, "randomly" with a defined percentage of the sequence
    coming from the various lists. For example, if I want a 6 item
    sequence, I might want:

    60% from list 1 (main_list[0])
    30% from list 2 (main_list[1])
    10% from list 3 (main_list[2])

    I know how to pull a random sequence (using random()) from the lists,
    but I'm not sure how to pick it with the desired percentages.

    Any help is appreciated, thanks

    -jesse
     
    Jesse Noller, Oct 8, 2005
    #1

  2. Ron Adam

    Ron Adam Guest

    Jesse Noller wrote:


    > 60% from list 1 (main_list[0])
    > 30% from list 2 (main_list[1])
    > 10% from list 3 (main_list[2])
    >
    > I know how to pull a random sequence (using random()) from the lists,
    > but I'm not sure how to pick it with the desired percentages.
    >
    > Any help is appreciated, thanks
    >
    > -jesse


    Just add up the total of all lists.

    total = len(list1)+len(list2)+len(list3)
    n1 = .60 * total # number from list 1
    n2 = .30 * total # number from list 2
    n3 = .10 * total # number from list 3

    You'll need to decide how to handle a list that has too few items in it.
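One way to read the exact-count idea above is sketched below. It assumes the fractions apply to the requested sequence length rather than to the combined length of the lists, and it handles short lists by sampling with replacement. The function name `exact_weighted_sample` and the crude rounding fix are illustrative assumptions, not something from the original post.

```python
import random

def exact_weighted_sample(pools, fractions, size, rng=random):
    """Draw `size` items, taking about fraction * size from each pool."""
    counts = [int(round(f * size)) for f in fractions]
    # Rounding can leave the total off by a little; as a crude fix
    # (an assumption of this sketch), give the difference to the first pool.
    counts[0] += size - sum(counts)
    sample = []
    for pool, count in zip(pools, counts):
        # Sample with replacement so a short pool can still supply `count` items.
        sample.extend(rng.choice(pool) for _ in range(count))
    rng.shuffle(sample)  # mix the items so they are not grouped by pool
    return sample

main_list = [['a', 'b', 'c'],
             ['dog', 'cat', 'panda'],
             ['blue', 'red', 'green']]
print(exact_weighted_sample(main_list, [0.6, 0.3, 0.1], 10))
```

With a size of 10 this gives exactly 6, 3 and 1 items per list; only the order and the items chosen within each list are random.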

    Cheers,
    Ron
     
    Ron Adam, Oct 8, 2005
    #2

  3. Peter Otten

    Peter Otten Guest

    Jesse Noller wrote:

    > I'm probably missing something here, but I have a problem where I am
    > populating a list of lists like this:
    >
    > list1 = [ 'a', 'b', 'c' ]
    > list2 = [ 'dog', 'cat', 'panda' ]
    > list3 = [ 'blue', 'red', 'green' ]
    >
    > main_list = [ list1, list2, list3 ]
    >
    > Once main_list is populated, I want to build a sequence from items
    > within the lists, "randomly" with a defined percentage of the sequence
    > coming from the various lists. For example, if I want a 6 item
    > sequence, I might want:
    >
    > 60% from list 1 (main_list[0])
    > 30% from list 2 (main_list[1])
    > 10% from list 3 (main_list[2])
    >
    > I know how to pull a random sequence (using random()) from the lists,
    > but I'm not sure how to pick it with the desired percentages.



    If the percentages can be normalized to small integers, just make a
    pool where each list is repeated according to its weight, e.g.
    list1 occurs 6 times, list2 3 times, and list3 once:

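    import random

    pools = [list1, list2, list3]
    weights = [6, 3, 1]
    sample_size = 10

    # Repeat each pool according to its weight.
    weighted_pools = []
    for p, w in zip(pools, weights):
        weighted_pools.extend([p] * w)

    # Pick a pool (weighted by repetition), then an item from that pool.
    sample = [random.choice(random.choice(weighted_pools))
              for _ in range(sample_size)]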


    Another option is to use bisect() to choose a pool:

    import bisect
    import random

    pools = [list1, list2, list3]
    sample_size = 10

    def isum(items, sigma=0.0):
        # Yield the running total of items.
        for item in items:
            sigma += item
            yield sigma

    cumulated_weights = list(isum([60, 30, 10], 0))
    sigma = cumulated_weights[-1]

    sample = []
    for _ in range(sample_size):
        # A uniform draw in [0, sigma) falls into one pool's weight range.
        pool = pools[bisect.bisect(cumulated_weights, random.random()*sigma)]
        sample.append(random.choice(pool))

    (all code untested)
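For readers on Python 3.6 or later, the standard library's `random.choices` accepts a `weights` argument and does the cumulative-weight lookup internally. The sketch below is one possible way to express the same two-step draw (pick a pool by weight, then an item from it). The helper name `weighted_sample` is an assumption made for illustration and is not part of the posts above.

```python
import random

list1 = ['a', 'b', 'c']
list2 = ['dog', 'cat', 'panda']
list3 = ['blue', 'red', 'green']
pools = [list1, list2, list3]
weights = [60, 30, 10]

def weighted_sample(pools, weights, size, rng=random):
    # Pick one pool per draw according to the weights...
    chosen_pools = rng.choices(pools, weights=weights, k=size)
    # ...then pick one item uniformly from each chosen pool.
    return [rng.choice(pool) for pool in chosen_pools]

sample = weighted_sample(pools, weights, 10)
print(sample)
```

The weights do not need to sum to 100 or be integers; only their relative sizes matter.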

    Peter
     
    Peter Otten, Oct 8, 2005
    #3
  4. Jesse Noller wrote:
    <paraphrased>
    > Once main_list is populated, I want to build a sequence from items
    > within the lists, "randomly" with a defined percentage of the sequence
    > coming from the various lists. For example:
    > 60% from list 1 (main_list[0]), 30% from list 2 (main_list[1]), 10% from list 3 (main_list[2])



    import bisect, random
    main_list = [['a', 'b', 'c'],
                 ['dog', 'cat', 'panda'],
                 ['blue', 'red', 'green']]
    weights = [60, 30, 10]

    # Build the running totals: [60, 90, 100].
    cumulative = []
    total = 0
    for value in weights:
        total += value
        cumulative.append(total)

    for i in range(20):
        score = random.random() * total
        index = bisect.bisect(cumulative, score)
        print(random.choice(main_list[index]), end=' ')
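The cumulative-weights lookup above can also be packaged as a reusable picker, roughly as follows. This is only a sketch: the name `make_weighted_picker` is hypothetical, `itertools.accumulate` requires Python 3.2 or later, and the `hi` bound passed to `bisect.bisect` is an extra guard added here, not something in the code above.

```python
import bisect
import itertools
import random

def make_weighted_picker(pools, weights, rng=random):
    """Return a zero-argument function that picks one item per call."""
    cumulative = list(itertools.accumulate(weights))  # running totals
    total = cumulative[-1]
    last = len(cumulative) - 1

    def pick():
        score = rng.random() * total
        # Limiting the search to [0, last) caps the result at `last`, which
        # guards against float rounding pushing `score` up to `total`.
        index = bisect.bisect(cumulative, score, 0, last)
        return rng.choice(pools[index])

    return pick

main_list = [['a', 'b', 'c'],
             ['dog', 'cat', 'panda'],
             ['blue', 'red', 'green']]
pick = make_weighted_picker(main_list, [60, 30, 10])
print([pick() for _ in range(20)])
```

Building the cumulative list once and reusing it keeps each draw at one binary search plus one `choice` call.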


    --
    -Scott David Daniels
     
    Scott David Daniels, Oct 8, 2005
    #4
  5. On Sat, 08 Oct 2005 12:48:26 -0400, Jesse Noller wrote:

    > Once main_list is populated, I want to build a sequence from items
    > within the lists, "randomly" with a defined percentage of the sequence
    > coming from the various lists. For example, if I want a 6 item
    > sequence, I might want:
    >
    > 60% from list 1 (main_list[0])
    > 30% from list 2 (main_list[1])
    > 10% from list 3 (main_list[2])


    If you are happy enough to match the percentages statistically rather than
    exactly, simply do something like this:

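    import random

    def pick_item(main_list):  # wrapper name is illustrative
        # Choose a list with probability 0.6 / 0.3 / 0.1, then an item from it.
        pr = random.random()
        if pr < 0.6:
            list_num = 0
        elif pr < 0.9:
            list_num = 1
        else:
            list_num = 2
        return random.choice(main_list[list_num])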

    or however you want to extract an item.

    On average, this will mean 60% of the items will come from list1 etc, but
    for small numbers of trials, you may have significant differences.
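To see how far small samples can drift from the target shares, one could tally which list the threshold test selects over different numbers of trials. The sketch below is a rough illustration; the function name `list_counts` and the chosen trial sizes are assumptions, not part of the original post.

```python
import random
from collections import Counter

def list_counts(n_draws, thresholds=(0.6, 0.9), rng=random):
    """Count how often each list index is chosen in n_draws trials."""
    counts = Counter()
    for _ in range(n_draws):
        pr = rng.random()
        if pr < thresholds[0]:
            counts[0] += 1
        elif pr < thresholds[1]:
            counts[1] += 1
        else:
            counts[2] += 1
    return counts

# Print the observed share of each list for a few trial sizes.
for n in (6, 60, 6000):
    counts = list_counts(n)
    shares = [counts[i] / n for i in range(3)]
    print(n, ["{:.0%}".format(s) for s in shares])
```

The output differs on every run; the shares for 6 draws typically wander well away from 60/30/10, while those for 6000 draws stay close.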



    --
    Steven.
     
    Steven D'Aprano, Oct 9, 2005
    #5