Generate a sequence of random numbers that sum up to 1?

Discussion in 'Python' started by Anthony Liu, Apr 22, 2006.

  1. Anthony Liu (Guest)

    I am at my wit's end.

    I want to generate a certain number of random numbers.
    This is easy, I can repeatedly do uniform(0, 1) for
    example.

    But, I want the random numbers just generated to sum
    up to 1.

    I am not sure how to do this. Any idea? Thanks.

    Anthony Liu, Apr 22, 2006
    #1

  2. Mel Wilson (Guest)

    Anthony Liu wrote:
    > I am at my wit's end.
    >
    > I want to generate a certain number of random numbers.
    > This is easy, I can repeatedly do uniform(0, 1) for
    > example.
    >
    > But, I want the random numbers just generated sum up
    > to 1 .
    >
    > I am not sure how to do this. Any idea? Thanks.


    numbers.append(random.uniform(0, 1.0 - sum(numbers)))

    might help, perhaps.

    or

    scaled = [x/sum(numbers) for x in numbers]
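
    (Neither line is a complete program by itself; here is a minimal
    runnable sketch of both ideas, where the surrounding loop, the counts,
    and the final forced append are additions, not Mel's:)

    import random

    # Subtraction-style: each draw is capped by whatever is left of 1.0.
    numbers = []
    for _ in range(9):
        numbers.append(random.uniform(0, 1.0 - sum(numbers)))
    numbers.append(1.0 - sum(numbers))  # force the total to 1.0 (up to float rounding)

    # Rescaling-style: draw freely, then divide everything by the total.
    raw = [random.uniform(0, 1) for _ in range(10)]
    total = sum(raw)
    scaled = [x / total for x in raw]

    print sum(numbers), sum(scaled)  # both at or very near 1.0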

    Mel.
    Mel Wilson, Apr 22, 2006
    #2

  3. Anthony Liu wrote:
    > But, I want the random numbers just generated sum up
    > to 1 .


    This seems like an odd request. Might I ask what it's for?

    Generating random numbers in [0,1) that are both uniform and sum to 1 looks
    like an unsatisfiable task. Each number you generate restricts the
    possibilities for future numbers. E.g. if the first number is 0.5, all
    future numbers must be < 0.5 (indeed, must *sum* to 0.5). You'll end up
    with a distribution increasingly skewed towards smaller numbers the more
    you generate. I can't imagine what that would be useful for.

    If that's not a problem, do this: generate the numbers, add them up, and
    divide each by the sum.

    import random

    nums = [random.uniform(0, 1) for x in range(100)]
    total = reduce(lambda x, y: x + y, nums)  # same as sum(nums); renamed to avoid shadowing the builtin
    norm = [x / total for x in nums]

    Of course now the numbers aren't uniform over [0,1) anymore.

    Also note that the sum of the normalized numbers will be very close to 1,
    but slightly off due to representation issues. If that level of accuracy
    matters, you might consider generating your rands as integers and then
    fp-dividing by the sum (or just store them as integers/fractions).
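
    For instance (a rough sketch of the integer idea; the range cap is
    arbitrary, and the fractions module only arrived later, in Python 2.6):

    import random
    from fractions import Fraction  # Python 2.6+

    ints = [random.randint(1, 10**6) for _ in xrange(100)]
    total = sum(ints)                        # exact integer arithmetic
    norm = [float(i) / total for i in ints]  # one fp division per value

    fracs = [Fraction(i, total) for i in ints]
    print sum(fracs) == 1                    # True, exactly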
    Edward Elliott, Apr 22, 2006
    #3
  4. On Sat, 2006-04-22 at 03:16 +0000, Edward Elliott wrote:
    > If that level of accuracy
    > matters, you might consider generating your rands as integers and then
    > fp-dividing by the sum (or just store them as integers/fractions).


    Or using decimal module: http://docs.python.org/lib/module-decimal.html
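
    For example (a small sketch; note that Decimal division still rounds,
    so the final sum can be off in the last place of the context precision):

    import random
    from decimal import Decimal

    nums = [Decimal(str(random.random())) for _ in xrange(100)]
    total = sum(nums)
    norm = [x / total for x in nums]
    print sum(norm)  # very close to Decimal('1')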

    --
    Felipe.
    Felipe Almeida Lessa, Apr 22, 2006
    #4
  5. Anthony Liu <> wrote:
    ...
    > As a matter of fact, given that we have to specify the
    > number of states for an HMM, I would like to create a
    > specified number of random floating numbers whose sum
    > is 1.0.


    import random

    def forAL(N):
        N_randoms = [random.random() for x in xrange(N)]
        total = sum(N_randoms)
        return [x/total for x in N_randoms]


    Does this do what you want? Of course, the resulting numbers are not
    independent, but then the constraints you pose would contradict that.
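
    (For example, a quick check:)

    probs = forAL(4)
    print probs
    print sum(probs)  # 1.0, up to float rounding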


    Alex
    Alex Martelli, Apr 22, 2006
    #5
  6. Anthony Liu wrote:
    > I am at my wit's end.
    >
    > I want to generate a certain number of random numbers.
    > This is easy, I can repeatedly do uniform(0, 1) for
    > example.
    >
    > But, I want the random numbers just generated sum up
    > to 1 .
    >
    > I am not sure how to do this. Any idea? Thanks.
    >


    --------------------------------------------------------------
    import random

    def partition(start=0, stop=1, eps=5):
        d = stop - start
        vals = [ start + d * random.random() for _ in range(2*eps) ]
        vals = [start] + vals + [stop]
        vals.sort()
        return vals

    P = partition()

    intervals = [ P[i:i+2] for i in range(len(P)-1) ]

    deltas = [ x[1] - x[0] for x in intervals ]

    print deltas

    print sum(deltas)
    ---------------------------------------------------------------

    Gerard
    Gerard Flanagan, Apr 22, 2006
    #6
  7. Gerard Flanagan wrote:
    > Anthony Liu wrote:
    > > I am at my wit's end.
    > >
    > > I want to generate a certain number of random numbers.
    > > This is easy, I can repeatedly do uniform(0, 1) for
    > > example.
    > >
    > > But, I want the random numbers just generated sum up
    > > to 1 .
    > >
    > > I am not sure how to do this. Any idea? Thanks.
    > >

    >
    > --------------------------------------------------------------
    > import random
    >
    > def partition(start=0,stop=1,eps=5):
    >     d = stop - start
    >     vals = [ start + d * random.random() for _ in range(2*eps) ]
    >     vals = [start] + vals + [stop]
    >     vals.sort()
    >     return vals
    >
    > P = partition()
    >
    > intervals = [ P[i:i+2] for i in range(len(P)-1) ]
    >
    > deltas = [ x[1] - x[0] for x in intervals ]
    >
    > print deltas
    >
    > print sum(deltas)
    > ---------------------------------------------------------------
    >


    import random

    def partition(N=5):
        vals = sorted( random.random() for _ in range(2*N) )
        vals = [0] + vals + [1]
        for j in range(2*N+1):
            yield vals[j:j+2]

    deltas = [ x[1]-x[0] for x in partition() ]

    print deltas

    print sum(deltas)

    >>>

    [0.10271966686994982, 0.13826576491042208, 0.064146913555132801,
    0.11906452454467387, 0.10501198456091299, 0.011732423830768779,
    0.11785369256442912, 0.065927165520102249, 0.098351305878176198,
    0.077786747076205365, 0.099139810689226726]
    1.0
    Gerard Flanagan, Apr 22, 2006
    #7
  8. Gerard Flanagan wrote:
    > Gerard Flanagan wrote:
    > > Anthony Liu wrote:
    > > > I am at my wit's end.
    > > >
    > > > I want to generate a certain number of random numbers.
    > > > This is easy, I can repeatedly do uniform(0, 1) for
    > > > example.
    > > >
    > > > But, I want the random numbers just generated sum up
    > > > to 1 .
    > > >
    > > > I am not sure how to do this. Any idea? Thanks.
    > > >

    > >
    > > --------------------------------------------------------------
    > > import random
    > >
    > > def partition(start=0,stop=1,eps=5):
    > >     d = stop - start
    > >     vals = [ start + d * random.random() for _ in range(2*eps) ]
    > >     vals = [start] + vals + [stop]
    > >     vals.sort()
    > >     return vals
    > >
    > > P = partition()
    > >
    > > intervals = [ P[i:i+2] for i in range(len(P)-1) ]
    > >
    > > deltas = [ x[1] - x[0] for x in intervals ]
    > >
    > > print deltas
    > >
    > > print sum(deltas)
    > > ---------------------------------------------------------------
    > >

    >
    > def partition(N=5):
    >     vals = sorted( random.random() for _ in range(2*N) )
    >     vals = [0] + vals + [1]
    >     for j in range(2*N+1):
    >         yield vals[j:j+2]
    >
    > deltas = [ x[1]-x[0] for x in partition() ]
    >
    > print deltas
    >
    > print sum(deltas)
    >


    finally:

    ---------------------------------------------------------------
    import random

    def distribution(N=2):
        p = [0] + sorted( random.random() for _ in range(N-1) ) + [1]
        for j in range(N):
            yield p[j+1] - p[j]

    spread = list(distribution(10))

    print spread
    print sum(spread)
    ---------------------------------------------------------------
    Gerard
    Gerard Flanagan, Apr 22, 2006
    #8
  9. Gerard Flanagan <> wrote:
    > def distribution(N=2):
    >     p = [0] + sorted( random.random() for _ in range(N-1) ) + [1]
    >     for j in range(N):
    >         yield p[j+1] - p[j]
    >
    > spread = list(distribution(10))
    >
    > print spread
    > print sum(spread)


    This is simpler, easier to prove correct and most likely quicker.

    import random

    def distribution(N=2):
        L = [ random.uniform(0,1) for _ in xrange(N) ]
        sumL = sum(L)
        return [ l/sumL for l in L ]

    spread = distribution(10)
    print spread
    print sum(spread)

    --
    Nick Craig-Wood <> -- http://www.craig-wood.com/nick
    Nick Craig-Wood, Apr 23, 2006
    #9
  10. Nick Craig-Wood wrote:
    > Gerard Flanagan <> wrote:
    > > def distribution(N=2):
    > >     p = [0] + sorted( random.random() for _ in range(N-1) ) + [1]
    > >     for j in range(N):
    > >         yield p[j+1] - p[j]
    > >
    > > spread = list(distribution(10))
    > >
    > > print spread
    > > print sum(spread)

    >
    > This is simpler, easier to prove correct and most likely quicker.
    >
    > def distribution(N=2):
    >     L = [ random.uniform(0,1) for _ in xrange(N) ]
    >     sumL = sum(L)
    >     return [ l/sumL for l in L ]
    >


    simpler:- ok

    easier to prove correct:- in what sense?

    quicker:- slightly slower in fact (using xrange in both functions).
    This must be due to 'uniform' - substituting random() for uniform(0,1),
    then yes, it's quicker. Roughly tested, I get yours (and Alex
    Martelli's) to be about twice as fast (2<=N<1000, probably greater
    difference as N increases).
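
    (For anyone wanting to reproduce this, a rough harness along these
    lines shows the gap; this is a sketch, not the exact test used above:)

    import timeit

    # uniform(0, 1) pays for an extra Python-level function call and a
    # multiply/add on top of the raw random() call.
    print timeit.Timer("random.random()", "import random").timeit(10**6)
    print timeit.Timer("random.uniform(0, 1)", "import random").timeit(10**6)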

    All the best.

    Gerard
    Gerard Flanagan, Apr 23, 2006
    #10
  11. fumanchu (Guest)

    I'm surprised no one has pursued a course of subtraction rather than
    division. Say you want 10 numbers:

    >>> import random
    >>> s = 1.0
    >>> n = []
    >>> for x in xrange(9):
    ...     value = random.random() * s
    ...     n.append(value)
    ...     s -= value
    ...
    >>> n.append(s)
    >>> n
    [0.7279111122901516, 0.082128708606867745, 0.0080516733577621798,
    0.12122060245902817, 0.0034460458833209676, 0.0021046234724371184,
    0.054109424914363845, 0.00035750970249204185, 0.00051175075536832372,
    0.00015854855820800087]
    >>> sum(n)
    1.0


    Either:
    1) Just because they're *ordered* doesn't mean they're not *random*,
    or
    2) You all now know why I'm not a mathematician. ;)

    It seems to me that the only constraint on the randomness of my results
    is the OP's constraint: that they sum to 1. I'd be fascinated to learn
    if and why that wouldn't work.


    Robert Brewer
    System Architect
    Amor Ministries
    fumanchu, Apr 23, 2006
    #11
  12. fumanchu <> wrote:

    > I'm surprised no one has pursued a course of subtraction rather than
    > division. Say you want 10 numbers:
    >
    > >>> import random
    > >>> s = 1.0
    > >>> n = []
    > >>> for x in xrange(9):
    > ...     value = random.random() * s
    > ...     n.append(value)
    > ...     s -= value
    > ...
    > >>> n.append(s)
    > >>> n
    > [0.7279111122901516, 0.082128708606867745, 0.0080516733577621798,
    > 0.12122060245902817, 0.0034460458833209676, 0.0021046234724371184,
    > 0.054109424914363845, 0.00035750970249204185, 0.00051175075536832372,
    > 0.00015854855820800087]
    > >>> sum(n)
    > 1.0
    >
    >
    > Either:
    > 1) Just because they're *ordered* doesn't mean they're not *random*,
    > or
    > 2) You all now know why I'm not a mathematician. ;)
    >
    > It seems to me that the only constraint on the randomness of my results
    > is the OP's constraint: that they sum to 1. I'd be fascinated to learn
    > if and why that wouldn't work.


    n[0] is uniformly distributed between 0 and 1; n[1] is not -- not sure
    how to characterize its distribution, but it's vastly skewed to favor
    smaller values -- and further n[x] values for x>1 are progressively more
    and more skewed similarly.

    Such total disuniformity, where the very distribution of each value is
    skewed by the preceding one, may still be "random" for some sufficiently
    vague meaning of "random", but my intuition suggests it's unlikely to
    prove satisfactory for the OP's purposes.
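
    (A rough Monte Carlo check of that intuition; this sketch only looks
    at the averages of the first two values of the subtraction method:)

    import random

    trials = 100000
    first = second = 0.0
    for _ in xrange(trials):
        n0 = random.random()             # uniform on [0, 1)
        n1 = random.random() * (1 - n0)  # the second subtraction-method value
        first += n0
        second += n1
    print first / trials   # about 0.5
    print second / trials  # about 0.25 -- already skewed small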


    Alex
    Alex Martelli, Apr 23, 2006
    #12
  13. Robert Kern (Guest)

    Alex Martelli wrote:
    > fumanchu <> wrote:
    >
    >> I'm surprised no one has pursued a course of subtraction rather than
    >> division. Say you want 10 numbers:
    >>
    >> >>> import random
    >> >>> s = 1.0
    >> >>> n = []
    >> >>> for x in xrange(9):
    >> ...     value = random.random() * s
    >> ...     n.append(value)
    >> ...     s -= value
    >> ...
    >> >>> n.append(s)
    >> >>> n
    >> [0.7279111122901516, 0.082128708606867745, 0.0080516733577621798,
    >> 0.12122060245902817, 0.0034460458833209676, 0.0021046234724371184,
    >> 0.054109424914363845, 0.00035750970249204185, 0.00051175075536832372,
    >> 0.00015854855820800087]
    >> >>> sum(n)
    >> 1.0
    >>
    >> Either:
    >>  1) Just because they're *ordered* doesn't mean they're not *random*,
    >> or
    >>  2) You all now know why I'm not a mathematician. ;)
    >>
    >> It seems to me that the only constraint on the randomness of my results
    >> is the OP's constraint: that they sum to 1. I'd be fascinated to learn
    >> if and why that wouldn't work.

    >
    > n[0] is uniformly distributed between 0 and 1; n[1] is not -- not sure
    > how to characterize its distribution, but it's vastly skewed to favor
    > smaller values -- and further n[x] values for x>1 are progressively more
    > and more skewed similarly.
    >
    > Such total disuniformity, where the very distribution of each value is
    > skewed by the preceding one, may still be "random" for some sufficiently
    > vague meaning of "random", but my intuition suggests it's unlikely to
    > prove satisfactory for the OP's purposes.


    [digression]

    All of this discussion about whether the distribution of values is uniform or
    not doesn't mean much until one has defined "uniformity," or equivalently,
    "distance" in the space we're talking about. In this case, we're talking about
    the unit n-simplex space (SS^n) which has elements S=(s_1, s_2, ... s_n) where
    sum(S) = 1 and s_i >= 0. I favor the Aitchison distance:

    import numpy as np

    def aitchison_distance(x, y):
        """ Compute the Aitchison distance between two vectors in simplex space.
        """
        lx = np.log(x)
        ly = np.log(y)
        lgx = np.mean(lx, axis=-1)
        lgy = np.mean(ly, axis=-1)
        diff = (lx - lgx) - (ly - lgy)
        return np.sqrt(np.sum(diff*diff))

    Note that zeros yield infinities, so the borders of the unit simplex are
    infinitely farther away from other points in the interior. Consequently,
    generating "uniform" random samples from this space is as impractical as it is
    to draw "uniform" random samples from the entire infinite real number line. It's
    also not very interesting. However, one can transform SS^n into RR^(n-1) and
    back again, so drawing numbers from a multivariate normal of whatever mean and
    covariance you like will give you "nice" simplicial data and quite possibly even
    realistic data, too, depending on your model. I like using the isometric
    log-ratio transform ("ilr" transform) for this.
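
    (For the curious, a sketch of that round trip, assuming numpy; the
    Helmert-style basis and the helper names here are one arbitrary
    choice among several, not the only way to define the ilr transform:)

    import numpy as np

    def helmert_basis(n):
        # Orthonormal basis (as rows) of the sum-to-zero hyperplane in R^n;
        # a standard choice of contrasts for the ilr transform.
        H = np.zeros((n - 1, n))
        for i in range(1, n):
            H[i - 1, :i] = 1.0 / i
            H[i - 1, i] = -1.0
            H[i - 1] /= np.sqrt(np.sum(H[i - 1] ** 2))
        return H

    def ilr(x):
        # Centered log-ratio, then coordinates w.r.t. the n-1 basis vectors.
        lx = np.log(x)
        clr = lx - np.mean(lx)
        return np.dot(helmert_basis(len(x)), clr)

    def ilr_inv(z, n):
        # Lift back to clr coordinates, exponentiate, renormalize.
        clr = np.dot(helmert_basis(n).T, z)
        w = np.exp(clr)
        return w / np.sum(w)

    # "Nice" simplicial data: multivariate normal in R^(n-1), mapped back.
    x = ilr_inv(np.random.normal(size=3), 4)
    print x, np.sum(x)  # four positive components summing to 1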

    Good Google search terms: "compositional data", Aitchison

    But of course, for the OP's purpose of creating synthetic Markov chain
    transition tables, generating some random vectors uniformly on [0, 1)^n and
    normalizing them to sum to 1 works a treat. Don't bother with anything else.
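
    (In code, that recipe for a synthetic transition table might look like
    the following sketch; the function name is mine:)

    import numpy as np

    def random_transition_matrix(n_states):
        # Each row: uniform draws on [0, 1), normalized to sum to 1.
        m = np.random.uniform(size=(n_states, n_states))
        return m / m.sum(axis=1)[:, np.newaxis]

    T = random_transition_matrix(5)
    print T.sum(axis=1)  # every row sums to 1, up to float rounding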

    --
    Robert Kern


    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
    Robert Kern, Apr 23, 2006
    #13
  14. Alex Martelli wrote:
    > Such total disuniformity, where the very distribution of each value is
    > skewed by the preceding one, may still be "random" for some sufficiently
    > vague meaning of "random", but my intuition suggests it's unlikely to
    > prove satisfactory for the OP's purposes.


    It does seem very odd. If you could restrict the range, you could get an
    unskewed distribution. Set range = (0, 2*sum/cnt) and you can generate cnt
    numbers whose sum will tend towards sum (because the average value will be
    sum/cnt):

    import random

    target_sum = 1
    cnt = 100
    cap = 2.0 * target_sum / cnt  # 0.02; renamed from 'max' to avoid shadowing the builtin
    nums = [random.uniform(0, cap) for x in range(cnt)]
    real_sum = sum(nums)  # 0.975... in one sample run

    If the sum has to be exact, you can set the last value to reach it:

    nums[-1] = target_sum - sum(nums[:-1])
    print sum(nums) # 1.0

    which skews the sample ever so slightly. And check for negatives in case
    the sum exceeded the target.

    If the exact count doesn't matter, just generate random nums until you're
    within some delta of the target sum.
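
    (Something like this sketch, say; the delta and the per-draw cap are
    arbitrary, but capping each draw at 2*delta keeps the loop from ever
    jumping past the acceptance window:)

    import random

    delta = 0.01
    nums = []
    while abs(1.0 - sum(nums)) > delta:
        nums.append(random.uniform(0, 2 * delta))
    print len(nums), sum(nums)  # the count varies from run to run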

    Basically, there are usually better options for the problem as originally
    posed. Actually, now that I reread it, the OP never said the range had to
    be [0,1). So maybe we read too much into the original phrasing. If you
    need anything resembling a uniform distribution, scaling the results
    afterward is not the way to go.
    Edward Elliott, Apr 23, 2006
    #14
  15. Terry Reedy (Guest)

    "fumanchu" <> wrote in message
    news:...
    > I'm surprised noone has pursued a course of subtraction rather than
    > division.


    I believe someone did mention the subtraction method in one of the initial
    responses. But the problem is this. If you independently sample n numbers
    from a given distribution and then rescale, then the scaled numbers still
    all have the same distribution (and are uniform in that sense). In the
    subtraction method, each comes from a different distribution, as others
    explained, with the nth being radically different from the first.
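
    (A quick simulation along these lines illustrates the point; in this
    sketch the per-position averages after rescaling all come out the same:)

    import random

    trials, n = 100000, 5
    means = [0.0] * n
    for _ in xrange(trials):
        raw = [random.random() for _ in xrange(n)]
        total = sum(raw)
        for i in xrange(n):
            means[i] += raw[i] / total
    print [m / trials for m in means]  # all close to 1.0 / n = 0.2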

    Terry Jan Reedy
    Terry Reedy, Apr 24, 2006
    #15
