Still too slow

E

elsa

Hello again,

Thanks for the tips r.e random.ranint(). This improved matters
somewhat, however my program is still too slow. If anyone has any
further tips on how to speed it up, they would be much appreciated!

So, I'm calling evolve(L,limit) from the interactive prompt. L is
initally [[100],['NA']]. Ideally limit would be 10^7.

Here is my program:

import random
n=100

def evolve(L,limit):
global n
while n<limit:
evnt = event()
if evnt!="None":
ind = chooseInd(L,n)
action(evnt,L,ind)

def chooseInd(L,n):
choiceSum=0
index=0
choice = random.randint(1,n)
while choiceSum < choice:
choiceSum+=L[index][0]
index +=1
return (index-1)

def event():
choice = random.random()
if choice <= .3:
event='b'
elif choice <= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event

def action(event, L, index):
global n
if event == 'b':
L[index][0]+=1
n +=1
elif event == 'd':
L[index][0]-=1
n -=1
elif event == 'm':
L.append([1,index])
n +=1


thanks in advance,

Elsa.
 
P

Philip Semanchuk

Hello again,

Thanks for the tips r.e random.ranint(). This improved matters
somewhat, however my program is still too slow. If anyone has any
further tips on how to speed it up, they would be much appreciated!

So, I'm calling evolve(L,limit) from the interactive prompt. L is
initally [[100],['NA']]. Ideally limit would be 10^7.

Here is my program:

import random
n=100

def evolve(L,limit):
global n
while n<limit:
evnt = event()
if evnt!="None":
ind = chooseInd(L,n)
action(evnt,L,ind)

def chooseInd(L,n):
choiceSum=0
index=0
choice = random.randint(1,n)
while choiceSum < choice:
choiceSum+=L[index][0]
index +=1
return (index-1)

def event():
choice = random.random()
if choice <= .3:
event='b'
elif choice <= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event

def action(event, L, index):
global n
if event == 'b':
L[index][0]+=1
n +=1
elif event == 'd':
L[index][0]-=1
n -=1
elif event == 'm':
L.append([1,index])
n +=1

Hi Elsa,
I didn't follow the earlier discussion on this, so pardon me if I'm
repeating what others have said.

First, have you profiled the code? That will help you find bottlenecks.

I haven't profiled it myself so I am shooting from the hip, but a
couple of obvious speedups in event() jump out at me. I'd rewrite it
like this:

def event():
choice = random.random()
if choice <= .3:
return 'b'
elif choice <= .4:
return 'd'
elif choice <= .5:
return 'm'

return None

Returning immediately when you've found your choice will save
evaluating a couple of elifs -- not a big help, but it is a little. It
comes at the cost having multiple exits to the function which is
something I prefer to avoid, but this function is small enough that
having multiple exits won't get confusing.

Second, returning the Python object None rather than the string "None"
allows you to change evolve() from this string comparison:
if evnt!="None":

to this comparison to the Python singleton None:
if evnt != None:

or this, which is even simpler (although it may not be faster):
if evnt:

Hope this helps
Philip
 
J

John Posner

Hello again,

Thanks for the tips r.e random.ranint(). This improved matters
somewhat, however my program is still too slow. If anyone has any
further tips on how to speed it up, they would be much appreciated!

So, I'm calling evolve(L,limit) from the interactive prompt. L is
initally [[100],['NA']]. Ideally limit would be 10^7.

Here is my program:

import random
n=100

def evolve(L,limit):
global n
while n<limit:
evnt = event()
if evnt!="None":
ind = chooseInd(L,n)
action(evnt,L,ind)

def chooseInd(L,n):
choiceSum=0
index=0
choice = random.randint(1,n)
while choiceSum< choice:
choiceSum+=L[index][0]
index +=1
return (index-1)

def event():
choice = random.random()
if choice<= .3:
event='b'
elif choice<= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event

def action(event, L, index):
global n
if event == 'b':
L[index][0]+=1
n +=1
elif event == 'd':
L[index][0]-=1
n -=1
elif event == 'm':
L.append([1,index])
n +=1


thanks in advance,

Elsa.

Elsa,

1. You changed the subject line from "For loop searching takes too
long!" to "Still too slow". This causes newsreader programs to start a
new discussion thread, which makes life difficult for people who need to
refer back to previous messages. Please don't change the subject line
any more.

2. You provided a very clear description of your original question:

Now, what I need to do is randomly choose one myList, however the
distribution of my random choice needs to be proportional to the
values of myList[0].

This description should be the doc string for the chooseInd() function
-- for example:

def chooseInd(L,n):
"""
randomly choose one L, so that the distribution
of choices is proportional to the values of L[0]
"""

It is not clear (to me, anyway) what the other functions are trying to
accomplish. So please add a doc string to each of these functions, with
descriptions of similar clarity. This will help us a lot. And it will
help you, too, if you return to the code after a few days/weeks/months
of directing your attention elsewhere.

3. Please provide a *complete* transcript of an interactive session that
exercises this code.

Tx,
John
 
A

Alf P. Steinbach

* elsa:
Thanks for the tips r.e random.ranint(). This improved matters
somewhat, however my program is still too slow. If anyone has any
further tips on how to speed it up, they would be much appreciated!

So, I'm calling evolve(L,limit) from the interactive prompt. L is
initally [[100],['NA']].

This would produce an unhandled exception since your code tries to compute the
sum of the values in the list. You can't add strings and numbers.

Ideally limit would be 10^7.

Here is my program:

import random
n=100

def evolve(L,limit):
global n
while n<limit:
evnt = event()
if evnt!="None":
ind = chooseInd(L,n)
action(evnt,L,ind)

def chooseInd(L,n):
choiceSum=0
index=0
choice = random.randint(1,n)
while choiceSum < choice:
choiceSum+=L[index][0]
index +=1
return (index-1)

This is the problematic part of your program, because it uses time linear in the
length of L.

Consider if you know the sum s_low of the lower half of the array, and the sum
s_high of the higher half of the array. Then you can choose the lower half with
probability s_low/(s_low+s_high), otherwise the higher half. Then repeat this
recursively for the half chosen, until you get down to a single array element.

This requires keeping track of the sums of halfs of the array, e.g. in a tree
like structure or a set of arrays of diminishing lengths. Generally it will use
some constant factor times twice the storage that you're now using. But it
reduces the time for choosing an index from O(n) to O(log n).

For example, if you have a logical structure like

*
/ \
* 1
/ \
98 1

then at the top level * you choose the left branch with probability 99/(99+1) =
99/100 = 0.99. At the second level * you choose the right branch with
probability 1/(98+1) = 1/99. The combined probability of choosing that lower 1
is then 0.99*(1/99) = 0.01, as it should be.

def event():
choice = random.random()
if choice <= .3:
event='b'
elif choice <= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event

Note that you can double the speed of your program by removing the 'None' value.
Adjust the limits you're checking against so as to retain the same probabilities
of the choices. Further constant factor speedup is possible by simply returning
a function to Do Stuff instead of returning a character saying what to do.


def action(event, L, index):
global n
if event == 'b':
L[index][0]+=1
n +=1
elif event == 'd':
L[index][0]-=1
n -=1
elif event == 'm':
L.append([1,index])
n +=1


thanks in advance,

An observation about design: you have a global n (the current total sum) and an
array L that you're passing around everywhere, so that it's effectively also a
global. This indicates that they should ideally instead be attributes of an
object, with your routines above as methods. However in Python this may cause
some overhead, so, perhaps first make it Work (and then make a Copy). :)


Cheers & hth.,

- Alf
 
A

Alf P. Steinbach

* Alf P. Steinbach:
* elsa:
Thanks for the tips r.e random.ranint(). This improved matters
somewhat, however my program is still too slow. If anyone has any
further tips on how to speed it up, they would be much appreciated!

So, I'm calling evolve(L,limit) from the interactive prompt. L is
initally [[100],['NA']].

This would produce an unhandled exception since your code tries to
compute the sum of the values in the list. You can't add strings and
numbers.

Ideally limit would be 10^7.

Here is my program:

import random
n=100

def evolve(L,limit):
global n
while n<limit:
evnt = event()
if evnt!="None":
ind = chooseInd(L,n)
action(evnt,L,ind)

def chooseInd(L,n):
choiceSum=0
index=0
choice = random.randint(1,n)
while choiceSum < choice:
choiceSum+=L[index][0]
index +=1
return (index-1)

This is the problematic part of your program, because it uses time
linear in the length of L.

Consider if you know the sum s_low of the lower half of the array, and
the sum s_high of the higher half of the array. Then you can choose the
lower half with probability s_low/(s_low+s_high), otherwise the higher
half. Then repeat this recursively for the half chosen, until you get
down to a single array element.

This requires keeping track of the sums of halfs of the array, e.g. in a
tree like structure or a set of arrays of diminishing lengths. Generally
it will use some constant factor times twice the storage that you're now
using. But it reduces the time for choosing an index from O(n) to O(log n).

For example, if you have a logical structure like

*
/ \
* 1
/ \
98 1

then at the top level * you choose the left branch with probability
99/(99+1) = 99/100 = 0.99. At the second level * you choose the right
branch with probability 1/(98+1) = 1/99. The combined probability of
choosing that lower 1 is then 0.99*(1/99) = 0.01, as it should be.

Oh, forgot to add: since your array's first item will be the one that's far most
often chosen a "blind" sort of literal implementation of the above will perhaps
slow down the program instead of speeding it up, because it only speeds up those
searches that progress beyond the first item. But anyway, I'm guessing that
checking the first item first, as a special case, will really improve things.
Only when it turns out that it's not the first item, apply binary search.

def event():
choice = random.random()
if choice <= .3:
event='b'
elif choice <= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event

Note that you can double the speed of your program by removing the
/s/double/increase/g


'None' value. Adjust the limits you're checking against so as to retain
the same probabilities of the choices. Further constant factor speedup
is possible by simply returning a function to Do Stuff instead of
returning a character saying what to do.


def action(event, L, index):
global n
if event == 'b':
L[index][0]+=1
n +=1
elif event == 'd':
L[index][0]-=1
n -=1
elif event == 'm':
L.append([1,index])
n +=1


thanks in advance,

An observation about design: you have a global n (the current total sum)
and an array L that you're passing around everywhere, so that it's
effectively also a global. This indicates that they should ideally
instead be attributes of an object, with your routines above as methods.
However in Python this may cause some overhead, so, perhaps first make
it Work (and then make a Copy). :)

Cheers & hth.,

- Alf
 
E

elsa

Hi John and others,

sorry about my etiquette errors. As you can tell I'm a newbie, and
appreciate all the help I can get. I'm trying to master this thing
with only the net and a couple of books as tutors.

Here is what I'm running at the interactive prompt:
import myBDM
L=[[100,'NA']]
myBDM.evolve(L,100000)

Ideally, I'd like to bump it up to myBDM.evolve(L,10000000). L keeps
track of the number of subpopulations, how many individuals are in
each subpopulation, and the parent subpopulation each subpopulation
arose from.

Here is my code again:


import random
n=100

def evolve(L,limit):
"""
evolves the population until the population size reaches limit, by
choosing an individual of a particular subpopulation
type, then randomly performing
a birth, death, or mutation on this individual
"""
global n
while n<limit:
evnt = event()
if evnt!="None":
ind = chooseInd(L,n)
action(evnt,L,ind)

def chooseInd(L,n):
"""
randomly choose one subpopulation L , so that the distribution
of choices is proportional to the values of L[0]
"""
choiceSum=0
index=0
choice = random.randint(1,n)
while choiceSum < choice:
choiceSum+=L[index][0]
index +=1
return (index-1)

def event():
"""
randomly chooses a birth (prob .3), death (prob .1) or mutation
(prob .1) event
"""
choice = random.random()
if choice <= .3:
event='b'
elif choice <= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event

def action(event, L, index):
"""
decides what happens to the population/subpopulation, based on
whether
the event is a birth, death or mutation
"""
global n
if event == 'b':
L[index][0]+=1
n +=1
elif event == 'd':
L[index][0]-=1
n -=1
elif event == 'm':
L.append([1,index])
n +=1
 
M

MRAB

elsa said:
Hi John and others,

sorry about my etiquette errors. As you can tell I'm a newbie, and
appreciate all the help I can get. I'm trying to master this thing
with only the net and a couple of books as tutors.

Here is what I'm running at the interactive prompt:
[snip]

def event():
"""
randomly chooses a birth (prob .3), death (prob .1) or mutation
(prob .1) event
"""
choice = random.random()
if choice <= .3:
event='b'
elif choice <= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event
[snip]
Here's my line of thought:

Currently, 0 <= choice < 1.0.

If you multiplied the random number by 10 then the probability
thresholds could all be ints, and if you're using ints then 'choice'
could be an index into a list:

event_choices = ['b', 'b', 'b', 'd', 'm'] + ['None'] * 5

which then suggests using random.choice() instead:

def event():
"""
randomly chooses a birth (prob .3), death (prob .1) or mutation (prob
..1) event
"""
return random.choice(event_choices)

and you could put the event code inline instead of in a function, which
saves the cost of a function call.

Incidentally, for a probability of 0.3 the test should be "choice <
0.3", etc, not "choice <= 0.3", etc, although I suspect that you won't
notice any difference in the results! :)
 
D

Dave Angel

John said:
<div class="moz-text-flowed" style="font-family: -moz-fixed">On
Hello again,

Thanks for the tips r.e random.ranint(). This improved matters
somewhat, however my program is still too slow. If anyone has any
further tips on how to speed it up, they would be much appreciated!

So, I'm calling evolve(L,limit) from the interactive prompt. L is
initally [[100],['NA']]. Ideally limit would be 10^7.

Here is my program:

import random
n=100

def evolve(L,limit):
global n
while n<limit:
evnt = event()
if evnt!="None":
ind = chooseInd(L,n)
action(evnt,L,ind)

def chooseInd(L,n):
choiceSum=0
index=0
choice = random.randint(1,n)
while choiceSum< choice:
choiceSum+=L[index][0]
index +=1
return (index-1)

def event():
choice = random.random()
if choice<= .3:
event='b'
elif choice<= .4:
event ='d'
elif choice<= .5:
event = 'm'
else:
event = 'None'
return event

def action(event, L, index):
global n
if event == 'b':
L[index][0]+=1
n +=1
elif event == 'd':
L[index][0]-=1
n -=1
elif event == 'm':
L.append([1,index])
n +=1


thanks in advance,

Elsa.

Elsa,

1. You changed the subject line from "For loop searching takes too
long!" to "Still too slow". This causes newsreader programs to start a
new discussion thread, which makes life difficult for people who need
to refer back to previous messages. Please don't change the subject
line any more.

2. You provided a very clear description of your original question:

Now, what I need to do is randomly choose one myList, however the
distribution of my random choice needs to be proportional to the
values of myList[0].

This description should be the doc string for the chooseInd() function
-- for example:

def chooseInd(L,n):
"""
randomly choose one L, so that the distribution
of choices is proportional to the values of L[0]
"""

It is not clear (to me, anyway) what the other functions are trying to
accomplish. So please add a doc string to each of these functions,
with descriptions of similar clarity. This will help us a lot. And it
will help you, too, if you return to the code after a few
days/weeks/months of directing your attention elsewhere.

3. Please provide a *complete* transcript of an interactive session
that exercises this code.

Tx,
John

Remarks aimed at elsa, not John.

You used tabs in your source code, which don't survive mail very well.
So indentation is different in my quoted copy than presumably in your
original. No bug there, but I much prefer having my editor expand all
tabs into spaces (4 colums)

You have a typo in the initialization of L. Try pasting it from a
working session, instead of retyping it. Presumably you wanted
L= [[100,'NA']]. rather than [[100],['NA']].
The latter value throws an exception in the code.

You could start simply by changing the values in the event() function.
No point in returning None half the time when that'll accomplish
nothing. So just double the fractions you're checking against, or scale
the random value.

The pattern produced is interesting. The only count that gets very big
is L[0][0]. ("Them what has, gets") Lots of the other counts reach
zero, and once they do, they won't ever change.

The algorithm is O(N*N), and it already takes a long time for 10**4. So
10**7 would be enormous.

If the improvement needed were minor, you should profile it, figure
where all the time is spent, and optimize that. But I think that's
hopeless with this data structure.

At a guess, I'd think you should build a single list that grows to size
'limit', where you start with 100 items of value "NA". No counts
needed, because they're all implicitly count of 1. Then write your code
to manipulate that list. When you finish, construct the list you really
want, by using something like the groupby() function.

DaveA
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,432
Messages
2,571,680
Members
48,796
Latest member
Greg L.

Latest Threads

Top