Re: urn scheme

Discussion in 'Python' started by Terry Reedy, Jul 28, 2003.

  1. Terry Reedy

    Terry Reedy Guest

    "Duncan Smith" <> wrote in message
    news:bg38sa$e6r$...
    > Hello,
    > I'm currently doing some simulations where I need to sample

    without
    > replacement from a list of counts. (Actually it's a Numeric array,

    but I'm
    > currently converting the array to a list, using the following

    function and
    > converting back to an array.) I *desperately *need to speed this up
    > (simulations currently take over 24 hours). This sampling is a real
    > bottleneck, and accounts for about 90% of the time cost. If I can

    get a
    > 10-fold speed up I might be in business, and a 100-fold speed up

    might even
    > allow me to meet a deadline. Using psyco appears to give me no

    speed up
    > whatsoever (that can't be right, can it?). I am currently working

    on
    > reducing the number of function calls to urn(), but the biggest

    speed up
    > will still come from improving the performance of this function.

    Any ideas
    > for a significant speed up? TIA.


    Do the minimum needed for each call and minimize intermediate
    structures. Among other things, this means specializing the
    function -- no options. Instead separate with and without replacement
    functions. Consider writing urn() as a generator returning one draw
    at a time instead of a list.

    Assuming that 'bins' means 'colors' (number of each), I suggest
    constructing an single urn list as follows:

    contents = []
    for i in range(len(bins)): contents.extend(bins*)

    This replaces cum_bins and the multiple comparisons and adjustments.
    It can go inside or outside urn() depending upon whether a given
    configuration is used just once or muptiple times. Then:

    sampling one item with replacement: random.choice(contents).
    sampling all without replacement: random.shuffle(contents)
    sampling some without replacement: random.shuffle(contents)[:draws]
    or, if draws is small enough (test timings for switch point)

    def urn_swor(colors, draws):
    '''Construct multicolor urn and destructively sample without
    replacement
    (swor) 'draws' items by successively removing random element
    thereof.
    Colors is array of number of each color.
    '''
    ncolors = len(colors)
    contents = []
    for i in range(ncolors):
    contents.extend(colors*)
    n = len(contents)
    rand = random.randrange
    res = ncolors * [0]
    while draws:
    i = rand(n)
    res[contents] += 1
    contents = contents[-1]
    n -= 1
    draws -= 1
    return res

    >>> urn_swor((0,2,5,3,4), 10)

    [0, 2, 2, 0, 6]
    >>> urn_swor((0,2,5,3,4), 10)

    [0, 1, 4, 1, 4]
    >>> urn_swor((0,2,5,3,4), 10)

    [0, 1, 3, 1, 5]
    >>> urn_swor((0,2,5,3,4), 10)

    [0, 2, 1, 1, 6]

    > -----------------------------------------------------------
    > import random
    >
    > def urn(bins, draws, drawn=0, others=0):
    >
    > """Bins is a sequence containing the numbers
    > of balls in the urns. 'drawn' and 'others'


    Sampling one urn with k colors (variable number of each) is the same
    as a weighted sampling from k urns with 1 color each but should be
    much faster when done as described above.

    > define how the contents of the bins should
    > be updated after each draw in draws. The
    > default arguments correspond to sampling
    > with replacement. e.g. drawn = -1 corresponds
    > to sampling without replacement"""


    'others' not defined

    > len_bins = len(bins)
    > res = [0] * len_bins
    > cum_bins = [bins[0]] + [0] * (len_bins-1)
    > for i in range(1, len_bins):
    > cum_bins = cum_bins[i-1] + bins
    > for i in range(draws):
    > balls = random.random() * cum_bins[-1]
    > adj = 0
    > for j in range(len_bins):
    > if adj == 0 and cum_bins[j] > balls:
    > res[j] += 1
    > cum_bins[j] += drawn> else:
    > cum_bins[j] += (j - adj) * others + adj * drawn


    I have no idea what this is about.

    > return res
    > ----------------------------------------------------------------
    > >>> urn((0,2,5,3,4), 10, -1)

    > [0, 1, 4, 1, 4]


    Terry J. Reedy
    Terry Reedy, Jul 28, 2003
    #1
    1. Advertising

  2. Terry Reedy

    Duncan Smith Guest

    "Terry Reedy" <> wrote in message
    news:...
    >
    > "Duncan Smith" <> wrote in message
    > news:bg38sa$e6r$...


    [snip]

    > > cum_bins[j] += (j - adj) * others + adj * drawn

    >
    > I have no idea what this is about.
    >


    'others' defines the adjustment to be made to the bins not sampled from (so
    zero for most purposes).

    The following gives me about a 30% speed up. Should save me a day or two.
    Cheers.

    Duncan


    def gen_samples(colours, draws, iterations):
    ncolours = len(colours)
    contents = []
    for i in range(ncolours):
    contents.extend(colours*)
    if len(contents) / 2 > draws:
    draws = len(contents) - draws
    for i in range(iterations):
    random.shuffle(contents)
    drawn = contents[:draws]
    res = colours[:]
    for index in drawn:
    res[index] -= 1
    yield res
    else:
    for i in range(iterations):
    random.shuffle(contents)
    drawn = contents[:draws]
    res = ncolours * [0]
    for index in drawn:
    res[index] += 1
    yield res
    Duncan Smith, Jul 29, 2003
    #2
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Ken Varn
    Replies:
    12
    Views:
    3,606
    Joerg Jooss
    May 14, 2005
  2. C GIllespie

    urn's and wsdl file

    C GIllespie, Jan 27, 2004, in forum: XML
    Replies:
    0
    Views:
    821
    C GIllespie
    Jan 27, 2004
  3. Philippe Poulard

    [XUS] An XML URN Scheme

    Philippe Poulard, Feb 25, 2005, in forum: XML
    Replies:
    0
    Views:
    518
    Philippe Poulard
    Feb 25, 2005
  4. Philippe Poulard
    Replies:
    0
    Views:
    431
    Philippe Poulard
    Feb 25, 2005
  5. Lee
    Replies:
    3
    Views:
    2,802
Loading...

Share This Page