Hello,
This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.
This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.
import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
Returns None if no element has the property
'''
random.shuffle(a_list)
for i in a_list:
if property(i): return i
but that requires to shuffle the list every time.
A second approach, that works if we know that at least one element of
the list has the property, is:
import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
Loops forever if no element has the property
'''
while 1:
i=random.choice(a_list)
if property(i): return i
which is more efficient (on average) if many elements of the list have
the property and less efficient if only few elements of the list has
the property (and goes crazy if no element has the property)
import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
'''
b_list=[x for x in a_list if property(x)]
try:
return random.choice(b_list)
finally: return None
but this one checks the property on all the elements, which is no
good.
I don't need strong random numbers, so a simple solution like:
import random
globalRNG=random.Random()
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
Works only if len(a_list)+1 is prime: uses Fermat's little theorem
'''
a=globalRNG(1,len(a_list))
ind=a
for i in xrange(len(a_list)):
x=a_list[a-1]
if property(x):return x
ind*=a
but this works only if len(a_list)+1 is prime!!! Now this one could be
saved if there were an efficient method to find a prime number given
than a number n but not very much greater...
Any other ideas? Thanks everybody
Caching might help.
If random_pick is called several times with the same list(s) then
cache the result of
[property(i) for i in a_list] against a_list.
If random_pick is called several times with list(s) whith multiple
instances of list items then cache
property(i) against i for i in a_list .
You could do both.
You might investigate wether property(i) could be implemented using a
faster algorithm, maybe table lookup, or interpolation from initial
table lookup.
- Paddy.