Random Drawing Simulation -- performance issue

Discussion in 'Python' started by Brendon Towle, Sep 12, 2006.

  1. I need to simulate scenarios like the following: "You have a deck of
    3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,
    replace it, and repeat N times."

    So, I wrote the following code, which works, but it seems quite slow
    to me. Can anyone point out some obvious thing that I'm doing
    inefficiently? Or, is this more or less as good as it gets?

    For reference, my typical numbers look like this:

    2 <= len(population) <= 7
    4 <= len(mapping) <= 50
    10 <= count <= 1000000

    B.

    <code>
    #!/usr/bin/env python

    import random

    def randomDrawing(count, population):
    """Simulates drawing <count> items from <population>, with
    replacement.
    population is a list of lists: [[count1, type1], [count2,
    type2], ...]

    Typical examples:
    >>>randomDrawing(100, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])

    [[28, 'orange'], [57, 'yellow'], [15, 'blue']]

    >>>randomDrawing(100000, [[3, 'orange'], [5, 'yellow'], [2,

    'blue']])
    [[29923, 'orange'], [50208, 'yellow'], [19869, 'blue']]

    """
    res = [[0, item[1]] for item in population]
    mapping = []
    for i in xrange(len(population)):
    mapping.extend(*population[0])
    for i in xrange(count):
    index = random.choice(mapping)
    res[index][0] += 1
    return res

    </code>

    --
    Brendon Towle, PhD
    Cognitive Scientist
    +1-412-690-2442x127
    Carnegie Learning, Inc.
    The Cognitive Tutor Company ®
    Helping over 375,000 students in 1000 school districts succeed in math.
    Brendon Towle, Sep 12, 2006
    #1
    1. Advertising

  2. Brendon Towle

    Yu-Xi Lim Guest

    Brendon Towle wrote:
    > I need to simulate scenarios like the following: "You have a deck of 3
    > orange cards, 5 yellow cards, and 2 blue cards. You draw a card, replace
    > it, and repeat N times."
    >
    > So, I wrote the following code, which works, but it seems quite slow to
    > me. Can anyone point out some obvious thing that I'm doing
    > inefficiently? Or, is this more or less as good as it gets?


    Well, a quick test with count=1000000 shows that random.choice takes up
    most of the time (about 85% on my computer). I don't think there's much
    else you can do about it. Coding the equivalent of random.choice
    yourself isn't going to be faster.

    You can try using a 1D array (a simple list) for res to save one lookup
    then transforming that list to the appropriate format just before
    returning it. But that's prob going to have minimal impact.
    Yu-Xi Lim, Sep 12, 2006
    #2
    1. Advertising

  3. Brendon Towle wrote:
    > I need to simulate scenarios like the following: "You have a deck of 3
    > orange cards, 5 yellow cards, and 2 blue cards. You draw a card, replace
    > it, and repeat N times."
    >
    > So, I wrote the following code, which works, but it seems quite slow to
    > me. Can anyone point out some obvious thing that I'm doing
    > inefficiently? Or, is this more or less as good as it gets?
    >
    > For reference, my typical numbers look like this:
    >
    > 2 <= len(population) <= 7
    > 4 <= len(mapping) <= 50
    > 10 <= count <= 1000000
    >
    > B.
    >
    > <code>
    > #!/usr/bin/env python
    >
    > import random
    >
    > def randomDrawing(count, population):
    > """Simulates drawing <count> items from <population>, with replacement.
    > population is a list of lists: [[count1, type1], [count2, type2], ...]
    >
    > Typical examples:
    > >>>randomDrawing(100, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])

    > [[28, 'orange'], [57, 'yellow'], [15, 'blue']]
    >
    > >>>randomDrawing(100000, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])

    > [[29923, 'orange'], [50208, 'yellow'], [19869, 'blue']]
    >
    > """
    > res = [[0, item[1]] for item in population]
    > mapping = []
    > for i in xrange(len(population)):
    > mapping.extend(*population[0])
    > for i in xrange(count):
    > index = random.choice(mapping)
    > res[index][0] += 1
    > return res
    >
    > </code>
    >
    > --Brendon Towle, PhD
    > Cognitive Scientist
    > +1-412-690-2442x127
    > Carnegie Learning, Inc.
    > The Cognitive Tutor Company ®
    > Helping over 375,000 students in 1000 school districts succeed in math.
    >
    >


    Given the data structure, might it not be faster to generate binomial
    variates for n-1 types, and set nth count = #draws - sum(other
    outcomes)? And for a "large" count, could you get by with a normal
    approximation? If you *do* feel the need for exact binomial, there are
    short, somewhat sluggish routines in Devroye (1986 - on the web!, pg
    525), and Random_binomial1, compiled by Alan Miller(AMrandom.zip0,
    available, xlated, at http://www.esbconsult.com/. Even though they are
    relatively slow, generating a few of these would seem to be a lot faster
    than your current approach.

    Sorry I can't help more - I'm just starting to learn Python, have yet to
    even get IDLE going.

    Regards,
    Dave Braden
    David J. Braden, Sep 12, 2006
    #3
  4. Brendon Towle

    Simon Forman Guest

    Brendon Towle wrote:
    > I need to simulate scenarios like the following: "You have a deck of
    > 3 orange cards, 5 yellow cards, and 2 blue cards. You draw a card,
    > replace it, and repeat N times."
    >
    > So, I wrote the following code, which works, but it seems quite slow
    > to me. Can anyone point out some obvious thing that I'm doing
    > inefficiently? Or, is this more or less as good as it gets?
    >
    > For reference, my typical numbers look like this:
    >
    > 2 <= len(population) <= 7
    > 4 <= len(mapping) <= 50
    > 10 <= count <= 1000000
    >
    > B.
    >
    > <code>
    > #!/usr/bin/env python
    >
    > import random
    >
    > def randomDrawing(count, population):
    > """Simulates drawing <count> items from <population>, with
    > replacement.
    > population is a list of lists: [[count1, type1], [count2,
    > type2], ...]
    >
    > Typical examples:
    > >>>randomDrawing(100, [[3, 'orange'], [5, 'yellow'], [2, 'blue']])

    > [[28, 'orange'], [57, 'yellow'], [15, 'blue']]
    >
    > >>>randomDrawing(100000, [[3, 'orange'], [5, 'yellow'], [2,

    > 'blue']])
    > [[29923, 'orange'], [50208, 'yellow'], [19869, 'blue']]
    >
    > """
    > res = [[0, item[1]] for item in population]
    > mapping = []
    > for i in xrange(len(population)):
    > mapping.extend(*population[0])
    > for i in xrange(count):
    > index = random.choice(mapping)
    > res[index][0] += 1
    > return res
    >
    > </code>
    >
    > --
    > Brendon Towle, PhD
    > Cognitive Scientist
    > +1-412-690-2442x127
    > Carnegie Learning, Inc.
    > The Cognitive Tutor Company ®
    > Helping over 375,000 students in 1000 school districts succeed in math.


    I got nearly a 2x speed up with this variant:

    def randomDrawing3(count, population):
    res = [[0, item[1]] for item in population]
    mapping = []
    for i in xrange(len(population)):
    mapping.extend(*population[0])

    n = len(mapping)
    for i in xrange(count):
    index = int(n * random.random())
    res[mapping[index]][0] += 1

    return res


    Apparently "int(n * random.random())" is faster than random.choice()
    or random.randint()

    sforman@garbage:~ $ python -mtimeit -s'import random; n=10' 'int(n *
    random.random())'
    100000 loops, best of 3: 3.22 usec per loop

    sforman@garbage:~ $ python -mtimeit -s'import random; n=10'
    'random.randint(1, n)'
    100000 loops, best of 3: 13 usec per loop

    sforman@garbage:~ $ python -mtimeit -s'import random; n=range(10)'
    'random.choice(n)'
    100000 loops, best of 3: 6.07 usec per loop

    (Note that the first and last examples produce values 0..9 while the
    middle one produces 1..10)


    I don't know for sure, but I think the random, uh, spread or whatever
    will be the same for random() as for choice(). If it's important, you
    should verify that. ;-)

    Peace,
    ~Simon
    Simon Forman, Sep 12, 2006
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    1
    Views:
    619
    Kevin Spencer
    Jan 9, 2006
  2. Travis E. Oliphant
    Replies:
    4
    Views:
    484
    David J. Braden
    Sep 13, 2006
  3. globalrev
    Replies:
    4
    Views:
    742
    Gabriel Genellina
    Apr 20, 2008
  4. defn noob
    Replies:
    1
    Views:
    369
    Mark Space
    Jun 28, 2008
  5. VK
    Replies:
    15
    Views:
    1,119
    Dr J R Stockton
    May 2, 2010
Loading...

Share This Page