random playing soundfiles according to rating.

kp87

I am a little bit stuck ....

I want to play a bunch of soundfiles randomly, but I want to give each
soundfile a rating (say 0-100) and have the likelihood that a file is
chosen be tied to its rating, so that the higher the rating, the more
likely a file is to be chosen. Then I need some additional flags for
repetition, and some other business. I am guessing a dictionary would
be a great way to do this, with the key being the soundfile name and
the values being my ratings and other flags & associated data.

# soundfile name : [rating, % chance it will repeat/loop]
sfiles = {"sf001": [85, 15],
          "sf002": [25, 75],
          "sf003": [95, 45],
          "sf004": [35, 95]}


But I am stuck on how to do a random chooser that works according to my
idea of choosing according to a rating system. It seems to me to be a bit
different from just making a weighted choice like so:

import random

def windex(lst):
    '''an attempt to make a random.choice() function that makes weighted
    choices

    accepts a list of tuples with the item and probability as a pair
    like: >>> x = [('one', 0.25), ('two', 0.25), ('three', 0.5)]
    >>> y=windex(x)'''
    n = random.uniform(0, 1)
    for item, weight in lst:
        if n < weight:
            break
        n = n - weight
    return item


And I am not sure I want to have to go through what will be hundreds of
sound files and scale their ratings by hand so that they all add up to
100%. I just want to have a long list that I can add to whenever I
want, and assign each file a grade/rating according to my whims!

cheers,

-kp---
 
Steve Holden

A really cheesy scheme: decide you will allocate each file an integer
weight between 0 and 10 or 1 and 10 - whatever. 100 may be a bit much - do you
really need such fine discrimination?

If you have your list [(filename, prob), (filename, prob), ... ] then
construct a filename list as follows:

flist = []
for filename, prob in lst:
    flist += [filename] * prob

Then choosing a filename becomes quite simple:

file_to_play = random.choice(flist)

The rest is up to you ...
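
A minimal sketch of the list-expansion idea above, written for Python 3 and assuming integer ratings. The ratings are borrowed from the original post's sfiles example (rating only); build_pool is a hypothetical helper name, not something from the thread. Each name is repeated once per rating point, so random.choice on the pool should pick names in proportion to their ratings.

```python
import random

# Ratings taken from the original post's example (rating only, no repeat flag).
ratings = {"sf001": 85, "sf002": 25, "sf003": 95, "sf004": 35}

def build_pool(ratings):
    """Repeat each name once per rating point (ratings must be non-negative ints)."""
    pool = []
    for name, rating in ratings.items():
        pool += [name] * rating
    return pool

pool = build_pool(ratings)
file_to_play = random.choice(pool)  # uniform pick from the expanded list
print(len(pool), file_to_play)
```

The pool holds one entry per rating point (240 here), which is why a coarser rating scale keeps this approach cheap.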

regards
Steve
 
Michael Spencer

kp87 said:
But I am stuck on how to do a random chooser that works according to my
idea of choosing according to a rating system. It seems to me to be a bit
different from just making a weighted choice like so:
....

And I am not sure I want to have to go through what will be hundreds of
sound files and scale their ratings by hand so that they all add up to
100%. I just want to have a long list that I can add to whenever I
want, and assign each file a grade/rating according to my whims!

Perhaps something like this:

from bisect import bisect_left
from random import randrange

def w_choice(range, weights):
    """Return a choice from range, given *cumulative* relative frequencies"""
    total = weights[-1]
    # randrange excludes its upper bound, so use total + 1 to allow pick == total
    pick = randrange(1, total + 1)
    ix = bisect_left(weights, pick)
    return range[ix]


def cum_sum(iterable, start = 0):
    """Return the list of running totals of iterable"""
    cum = []
    for i in iterable:
        start += i
        cum.append(start)
    return cum

Test it:

...     names = files.keys()
...     weights = cum_sum(files.values())
...
...     plays = {}
...     for i in range(N):
...         track = w_choice(names, weights)
...         plays[track] = plays.get(track, 0) + 1
...     return plays
...
>>> test(files, N=10000)
{'File3': 7039, 'File2': 1049, 'File1': 1912}
>>> files["File4"] = 50
>>> test(files, N=150000)
{'File3': 70502, 'File2': 9988, 'File1': 20009, 'File4': 49501}
>>>
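
For readers on Python 3, the same cumulative-sum idea might look like the sketch below. It assumes positive integer weights. The files values here are made-up illustration data (the thread's own files definition is not shown), and index_for is a hypothetical helper split out so the lookup can be checked without any randomness.

```python
from bisect import bisect_left
from itertools import accumulate
from random import randrange

# Hypothetical illustration data: name -> integer weight.
files = {"File1": 20, "File2": 10, "File3": 70}

names = list(files)                             # dict views are not indexable in Python 3
cum_weights = list(accumulate(files.values()))  # running totals: [20, 30, 100]

def index_for(pick, cum_weights):
    """Map a pick in 1..total to the index of the subinterval containing it."""
    return bisect_left(cum_weights, pick)

def w_choice(names, cum_weights):
    """Return a name chosen in proportion to its weight."""
    total = cum_weights[-1]
    pick = randrange(1, total + 1)              # uniform integer in 1..total inclusive
    return names[index_for(pick, cum_weights)]

print(w_choice(names, cum_weights))
```

Picks 1..20 land on the first name, 21..30 on the second and 31..100 on the third, so each name gets exactly as many picks as its weight.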


Michael
 
JW

I think of it this way: you randomly pick an entry out of a dictionary,
then roll a 100-sided die to see if the pick is "good enough". Repeat
until you find one, or give up.

import random

def rand_weighted_pick(weighted_picks):
    for i in range(100):
        name, prob = random.choice(weighted_picks)
        # roll 1..100 so that an entry is accepted with probability prob/100
        if prob >= random.randint(1, 100): return name
    # Give up and return a random choice
    return random.choice(weighted_picks)[0]

if __name__ == "__main__":
    test_vals = [("A",50),("B",30),("C",20)]
    dist = dict()
    for name, prob in test_vals: dist[name] = 0
    for x in xrange(1000): dist[rand_weighted_pick(test_vals)] += 1
    print "Expected: A = 500, B = 300, C = 200"
    print "Actual : A = %d, B = %d, C = %d"%(dist['A'], dist['B'],
                                             dist['C'])
 
Ben Cartwright

kp87 said:
But I am stuck on how to do a random chooser that works according to my
idea of choosing according to a rating system. It seems to me to be a bit
different from just making a weighted choice like so:
....

And I am not sure I want to have to go through what will be hundreds of
sound files and scale their ratings by hand so that they all add up to
100%. I just want to have a long list that I can add to whenever I
want, and assign each file a grade/rating according to my whims!

Indeed, manually normalizing all those weights would be a downright
sinful waste of time and effort.

The solution (to any problem, really) starts with how you conceptualize
it. For this problem, consider the interval [0, T), where T is the sum
of all the weights. This interval is made up of adjacent subintervals,
one for each weight. Now pick a random point in [0, T). Determine
which subinterval this point is in, and you're done.

import random

def choose_weighted(zlist):
    point = random.uniform(0, sum(weight for key, weight in zlist))
    for key, weight in zlist: # which subinterval is point in?
        point -= weight
        if point < 0:
            return key
    return None # will only happen if sum of weights <= 0

You'll get bogus results if you use negative weights, but that should
be obvious. Also note that by using random.uniform instead of
random.randrange, floating point weights are handled correctly.

Test it:

...     counts[choose_weighted(data)] += 1
...
>>> [(key, counts[key]) for key, weight in data]
[('foo', 749), ('bar', 1513), ('skipme', 0), ('baz', 7738)]
>>>
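
To make the subinterval picture concrete, here is a small deterministic sketch in Python 3. It uses the ratings from the original post as weights, so T = 85 + 25 + 95 + 35 = 240. locate is a hypothetical helper that takes the point as an argument instead of drawing it at random, which makes the mapping easy to inspect.

```python
# Ratings from the original post, used as weights; T = 240.
zlist = [("sf001", 85), ("sf002", 25), ("sf003", 95), ("sf004", 35)]

def locate(point, zlist):
    """Return the key whose subinterval of [0, T) contains point."""
    for key, weight in zlist:
        point -= weight
        if point < 0:
            return key
    return None  # point was outside [0, T)

# Subintervals: [0, 85), [85, 110), [110, 205), [205, 240)
for point in (0, 84.9, 85, 109.9, 110, 205, 239.9):
    print(point, locate(point, zlist))
```

Each subinterval is as wide as its weight, so a uniformly random point lands in it with probability weight / T, and no manual normalizing is needed.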

--Ben
 
kpp9c

I've been looking at some of the suggested approaches and looked a
little at Michael's bit, which works well.... bisect is a module I
always struggle with (hee hee)

I am intrigued by Ben's solution, and Ben has distilled my problem quite
nicely, but, well.... I don't understand what "point" is doing with
weight for key, weight in zlist. Furthermore, it barfs in my
interpreter... (Python 2.3)
 
Ed Singleton

kp87 said:
I am a little bit stuck ....

I want to play a bunch of soundfiles randomly, but I want to give each
soundfile a rating (say 0-100) and have the likelihood that a file is
chosen be tied to its rating, so that the higher the rating, the more
likely a file is to be chosen. Then I need some additional flags for
repetition, and some other business. I am guessing a dictionary would
be a great way to do this, with the key being the soundfile name and
the values being my ratings and other flags & associated data.

It depends how accurate you want the likelihood to be and how
important performance is.

I was thinking about this in respect of queueing mp3s from my
collection (10,000+) to be played based on a 1-5 star rating (five
stars is five times more likely to be played than 1 star).

If speed is no issue (for example you can queue an mp3 while the
current one is playing), then Ben's solution is the classic one.
Store the total of all your scores (or calculate it on the fly if you
don't have too many files), pick a random number up to that total, and
then iterate through all your scores, subtracting each score from the
random number until it drops below zero, and then play that file.

However that approach gets slower and slower the more files you have
(slower to calculate the total and slower to iterate through the
files).

If speed is more of an issue, you could give up on trying to use a
perfect probability, and just pick 10 (or 50, or 100) files at random,
and then play one of those based on the above approach.
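
The shortcut described above could be sketched as follows in Python 3. The names pick_from_sample and sample_size and the generated library are assumptions for illustration, not code from the thread. As the post says, this gives up exact rating-proportional selection in exchange for looking at only a handful of files per pick.

```python
import random

def pick_from_sample(tracks, sample_size=10):
    """Draw a small uniform sample, then make a weighted pick within it.

    tracks is a list of (name, score) pairs with non-negative scores.
    """
    candidates = random.sample(tracks, min(sample_size, len(tracks)))
    total = sum(score for _, score in candidates)
    if total <= 0:
        return random.choice(candidates)[0]  # all-zero sample: fall back to uniform
    point = random.uniform(0, total)
    for name, score in candidates:
        point -= score
        if point < 0:
            return name
    return candidates[-1][0]                 # guard against floating-point edge cases

# Hypothetical library: 10,000 tracks with 1-5 star ratings.
library = [("track%05d" % i, random.randint(1, 5)) for i in range(10000)]
print(pick_from_sample(library, sample_size=50))
```

Keeping the tracks in a list lets random.sample draw the candidates without touching the rest of the collection, so the work per pick depends on sample_size rather than on the number of files.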

Ed
 
Christos Georgiou

Ed Singleton said:
If speed is no issue (for example you can queue an mp3 while the
current one is playing), then Ben's solution is the classic one.
Store the total of all your scores (or calculate it on the fly if you
don't have too many files), pick a random number up to that total, and
then iterate through all your scores, subtracting each score from the
random number until it drops below zero, and then play that file.

However that approach gets slower and slower the more files you have
(slower to calculate the total and slower to iterate through the
files).

Hm... just playing:

import random, itertools

scored=[('bad',1), ('not that bad',2),('ok',3),('better',4),('best',5)]

def player(lst):
    def forever(lst):
        while 1:
            for item in lst:
                yield item
    total_score= sum(x[1] for x in lst)
    scanner= forever(lst)
    while 1:
        next_score= random.randrange(total_score)
        for item in scanner:
            if next_score <= item[1]:
                yield item[0]
                next_score+= random.randrange(total_score)
            else:
                next_score-= item[1]


print list(itertools.islice(player(scored), 0, 20))

['better', 'ok', 'best', 'not that bad', 'best', 'best', 'best',
 'not that bad', 'ok', 'best', 'best', 'bad', 'better', 'better', 'better',
 'ok', 'ok', 'not that bad', 'best', 'best']
 
Ben Cartwright

kpp9c said:
I've been looking at some of the suggested approaches and looked a
little at Michael's bit which works well.... bisect is a module i
always struggle with (hee hee)

I am intrigued by Ben's solution and Ben's distilled my problem quite
nicely

Thanks!-) Actually, you should use Michael's solution, not mine. It
uses the same concept, but it finds the correct subinterval in O(log n)
steps (by using bisect on a cached list of cumulative sums). My code
takes O(n) steps -- this is a big difference when you're dealing with
thousands of items.
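
On Python 3.6 and later, the standard library's random.choices accepts relative weights directly, or precomputed cumulative weights via cum_weights, so the cached-cumulative-sums idea is available without hand-written lookup code. The sketch below is an illustration under stated assumptions rather than anything posted in the thread: it reuses the sfiles layout from the original post, and next_track is a hypothetical helper that also rolls the per-file repeat chance.

```python
import random
from itertools import accumulate

# soundfile name : [rating, % chance it will repeat/loop], as in the original post
sfiles = {"sf001": [85, 15], "sf002": [25, 75],
          "sf003": [95, 45], "sf004": [35, 95]}

names = list(sfiles)
# Cache the cumulative ratings once; rebuild them only when sfiles changes.
cum_ratings = list(accumulate(rating for rating, _ in sfiles.values()))

def next_track():
    """Return (name, repeat) with name chosen in proportion to its rating."""
    name = random.choices(names, cum_weights=cum_ratings, k=1)[0]
    repeat_chance = sfiles[name][1]
    repeat = random.uniform(0, 100) < repeat_chance  # roll the repeat/loop flag
    return name, repeat

print(next_track())
```

Adding a file then means adding one dictionary entry and rebuilding names and cum_ratings; the ratings never need to add up to 100.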

kpp9c said:
but, well.... I don't understand what "point" is doing with
weight for key, weight in zlist

This line:
    point = random.uniform(0, sum(weight for key, weight in zlist))
Is shorthand for:
    total = 0
    for key, weight in zlist:
        total += weight
    point = random.uniform(0, total)

kpp9c said:
furthermore, it barfs in my
interpreter... (Python 2.3)

Oops, that's because it uses generator expressions
(http://www.python.org/peps/pep-0289.html), a 2.4 feature. Try
rewriting it longhand (see above). The second line of the test code
will have to be changed too, i.e.:
>>> counts = dict([(key, 0) for key, weight in data])

--Ben
 
