time series calculation in list comprehension?

falcon

Is there a way to do time series calculations, such as a moving
average, in list comprehension syntax? I'm new to Python, but it looks
like a list comprehension's 'head' can only work on one value at a time.
I also tried the reduce function, passing in my list and a separate
function that calculates a moving average outside the list
comprehension, but I'm not clear how to do it. Any ideas? Thanks.
 
beliavsky

I suggest that statistical data, including time series, be stored and
processed in arrays, such as the one found in NumPy. You can compute
averages using the "sum" function and array slices.
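As a rough illustration of the array approach suggested above, here is a minimal Python 3 sketch. It assumes NumPy is installed; the function names `moving_average_slices` and `moving_average_cumsum` and the sample data are made up for this example.

```python
import numpy as np

def moving_average_slices(values, n):
    """Average each length-n window using array slices and sum()."""
    a = np.asarray(values, dtype=float)
    return np.array([a[i:i + n].sum() / n for i in range(len(a) - n + 1)])

def moving_average_cumsum(values, n):
    """Same averages from a running total: window sum = c[i + n] - c[i]."""
    a = np.asarray(values, dtype=float)
    c = np.concatenate(([0.0], np.cumsum(a)))
    return (c[n:] - c[:-n]) / n

data = [4, 8, 15, 16, 23, 42]  # hypothetical sample series
print(moving_average_slices(data, 3))
print(moving_average_cumsum(data, 3))
```

The slice version re-sums every window; the cumulative-sum version does one pass but can lose precision on long series with large values.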
 
Terry Reedy

Write explicit for loops, possibly with nested if conditionals, that do
exactly what you want. The functional expressions are abbreviations for
certain patterns of induction. Except as an educational exercise, I do not
think it worthwhile to go through contortions to force fit a problem to a
pattern it does not really fit.

Terry Jan Reedy
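One way the explicit-loop advice might look in Python 3 is sketched below. The function name `moving_average_loop` and the sample data are assumptions for illustration, not code from the post above.

```python
def moving_average_loop(values, n):
    """Trailing n-point moving average written as an explicit loop."""
    if n <= 0:
        raise ValueError("window length n must be positive")
    averages = []
    total = 0.0
    for i, x in enumerate(values):
        total += x                      # add the newest value
        if i >= n:
            total -= values[i - n]      # drop the value that left the window
        if i >= n - 1:
            averages.append(total / n)  # window is full, record its average
    return averages

print(moving_average_loop([4, 8, 15, 16, 23, 42], 3))  # [9.0, 13.0, 18.0, 27.0]
```

The two `if` tests carry the state that a list comprehension has no natural place for.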
 
johnzenger

I agree with others that reduce is not the best way to do this. But,
to satisfy your curiosity, I offer this horribly inefficient way to use
"reduce" to calculate the average of a list:

from __future__ import division

def reduceaverage(acc, x):
    return [acc[0] + x, acc[1] + 1, (acc[0] + x) / (acc[1] + 1)]

numbers = [4, 8, 15, 16, 23, 42]
print reduce(reduceaverage, numbers, [0, 0, 0])[2]

Basically, the idea is to write a function that takes as its first
argument the accumulated values, and as its second argument the next
value in the list. In Python, this is almost always the wrong way to
do something, but it is kind of geeky and LISP-ish.
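For readers on Python 3, where `reduce` lives in `functools`, the same accumulator idea can be sketched as follows. The helper names `average_step` and `moving_step` are hypothetical; the second helper is one possible way to stretch the pattern to a moving average, not something proposed in the post above.

```python
from functools import reduce

def average_step(acc, x):
    """Fold one value into a (running total, count) pair."""
    total, count = acc
    return (total + x, count + 1)

def moving_step(n):
    """Build a reducer whose accumulator is (last n values, averages so far)."""
    def step(acc, x):
        window, out = acc
        window = (window + (x,))[-n:]        # keep only the newest n values
        if len(window) == n:
            out = out + [sum(window) / n]    # window is full, record its average
        return (window, out)
    return step

numbers = [4, 8, 15, 16, 23, 42]
total, count = reduce(average_step, numbers, (0, 0))
print(total / count)                         # 18.0

_, averages = reduce(moving_step(3), numbers, ((), []))
print(averages)                              # [9.0, 13.0, 18.0, 27.0]
```

It works, but the accumulator juggling shows why the replies in this thread steer away from `reduce` for this job.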
 
Paul Rubin

Do you mean something like this?

for i in xrange(5, len(ts)):
    # compute and print moving average from i-5 to i
    print i, sum(ts[i-5:i]) / 5.
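A Python 3 rendering of this loop might look like the sketch below. The series `ts` and the name `width` are made up for illustration. One detail worth noting: a range that stops at `len(ts)` never reaches the final window `ts[-5:]`, so this sketch runs to `len(ts) + 1`, on the assumption that the last window is wanted too.

```python
ts = [2, 4, 5, 1, 6, 6, 3]   # hypothetical sample series
width = 5

rows = []
for i in range(width, len(ts) + 1):
    # average of the width values ending just before index i
    rows.append((i, sum(ts[i - width:i]) / width))

for i, avg in rows:
    print(i, avg)
```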
 
Lonnie Princehouse

Well, you could iterate over an index into the list:

from __future__ import division

def moving_average(sequence, n):
    return [sum(sequence[i:i+n])/n
            for i in xrange(len(sequence)-n+1)]

Of course, that's hardly efficient. You really want to use the value
calculated for the i_th term in the (i+1)th term's evaluation. While
it's not easy (or pretty) to store state between iterations in a list
comprehension, this is the perfect use for a generator:

def generator_to_list(f):
    return lambda *args, **keywords: list(f(*args, **keywords))

@generator_to_list
def moving_average(sequence, n):
    assert len(sequence) >= n and n > 0
    average = sum(sequence[:n]) / n
    yield average
    for i in xrange(1, len(sequence)-n+1):
        average += (sequence[i+n-1] - sequence[i-1]) / n
        yield average
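A possible Python 3 version of the same generator is sketched below. The use of `functools.wraps` and of `ValueError` in place of `assert` are choices made for this sketch, not part of the post above.

```python
from functools import wraps

def generator_to_list(f):
    """Decorator: call the generator function f and collect its output in a list."""
    @wraps(f)
    def wrapper(*args, **kwargs):
        return list(f(*args, **kwargs))
    return wrapper

@generator_to_list
def moving_average(sequence, n):
    """Trailing n-point averages; each one is derived from the previous one."""
    if not 0 < n <= len(sequence):
        raise ValueError("need 0 < n <= len(sequence)")
    average = sum(sequence[:n]) / n
    yield average
    for i in range(1, len(sequence) - n + 1):
        # shift the window: add the entering value, drop the leaving one
        average += (sequence[i + n - 1] - sequence[i - 1]) / n
        yield average

print(moving_average([4, 8, 15, 16, 23, 42], 3))  # [9.0, 13.0, 18.0, 27.0]
```

The local variable `average` is the state carried between iterations, which is the part a list comprehension cannot express cleanly.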
 
Dennis Lee Bieber

Something like:

AVELEN = 5
DATA = [ 2, 4, 5, 1, 6, 6, 3, 2, 6, 3,
         2, 3, 3, 5, 6, 6, 5, 4, 3, 6,
         1, 4, 3, 4, 1, 6, 6, 5, 6, 5,
         5, 6, 5, 3, 3, 5, 1, 4, 1, 3,
         2, 5, 5, 4, 2, 5, 5, 2, 2, 4,
         1, 3, 5, 1, 4, 1, 1, 3, 3, 6 ]

moving = [ sum(DATA[x-AVELEN:x]) / float(AVELEN)
           for x in xrange(AVELEN, len(DATA) + 1) ]

print moving
....
[3.6000000000000001, 4.4000000000000004, 4.2000000000000002,
3.6000000000000001, 4.5999999999999996, 4.0, 3.2000000000000002,
3.2000000000000002, 3.3999999999999999, 3.2000000000000002,
3.7999999999999998, 4.5999999999999996, 5.0, 5.2000000000000002,
4.7999999999999998, 4.7999999999999998, 3.7999999999999998,
3.6000000000000001, 3.3999999999999999, 3.6000000000000001,
2.6000000000000001, 3.6000000000000001, 4.0, 4.4000000000000004,
4.7999999999999998, 5.5999999999999996, 5.4000000000000004,
5.4000000000000004, 5.4000000000000004, 4.7999999999999998,
4.4000000000000004, 4.4000000000000004, 3.3999999999999999,
3.2000000000000002, 2.7999999999999998, 2.7999999999999998,
2.2000000000000002, 3.0, 3.2000000000000002, 3.7999999999999998,
3.6000000000000001, 4.2000000000000002, 4.2000000000000002,
3.6000000000000001, 3.2000000000000002, 3.6000000000000001,
2.7999999999999998, 2.3999999999999999, 3.0, 2.7999999999999998,
2.7999999999999998, 2.7999999999999998, 2.3999999999999999, 2.0,
2.3999999999999999, 2.7999999999999998]

{Note: I've not actually validated the results -- the first and last
seem right... }
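Since the results above were posted unvalidated, one way to cross-check them is sketched here in Python 3. It recomputes the list comprehension and compares it with an independent running-total calculation in exact rational arithmetic; the name `exact` and the use of `fractions.Fraction` are choices made for this sketch.

```python
from fractions import Fraction

AVELEN = 5
DATA = [2, 4, 5, 1, 6, 6, 3, 2, 6, 3,
        2, 3, 3, 5, 6, 6, 5, 4, 3, 6,
        1, 4, 3, 4, 1, 6, 6, 5, 6, 5,
        5, 6, 5, 3, 3, 5, 1, 4, 1, 3,
        2, 5, 5, 4, 2, 5, 5, 2, 2, 4,
        1, 3, 5, 1, 4, 1, 1, 3, 3, 6]

# Python 3 form of the list comprehension from the post
moving = [sum(DATA[x - AVELEN:x]) / AVELEN
          for x in range(AVELEN, len(DATA) + 1)]

# Independent check: running integer total, exact fractions
total = sum(DATA[:AVELEN])
exact = [Fraction(total, AVELEN)]
for old, new in zip(DATA, DATA[AVELEN:]):
    total += new - old
    exact.append(Fraction(total, AVELEN))

matches = all(abs(m - float(e)) < 1e-12 for m, e in zip(moving, exact))
print(len(moving), matches)
```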
 
Peter Otten

Lonnie said:
You really want to use the value calculated for the i_th term in the
(i+1)th term's evaluation.  

It may sometimes be necessary to recalculate the average for every iteration
to avoid error accumulation. Another tradeoff with your optimization is
that it becomes harder to switch the accumulation function from average to
max, say.
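The error-accumulation concern can be illustrated with a deliberately extreme input. This is a Python 3 sketch assuming IEEE 754 double-precision floats; the function names `incremental` and `recomputed` and the data are invented for the demonstration.

```python
def incremental(sequence, n):
    """Update the previous average instead of re-summing each window."""
    average = sum(sequence[:n]) / n
    out = [average]
    for i in range(1, len(sequence) - n + 1):
        average += (sequence[i + n - 1] - sequence[i - 1]) / n
        out.append(average)
    return out

def recomputed(sequence, n):
    """Re-sum every window from scratch."""
    return [sum(sequence[i:i + n]) / n for i in range(len(sequence) - n + 1)]

# One huge value followed by small ones: 1e16 + 1.0 rounds back to 1e16,
# so the 1.0 is lost and the incremental version never recovers it.
data = [1e16, 1.0, 1.0, 1.0]
print(incremental(data, 2))   # later windows come out as 0.0
print(recomputed(data, 2))    # later windows come out as 1.0
```

Real data is rarely this hostile, but the same effect builds up slowly over long series.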

Lonnie said:
While it's not easy (or pretty) to store state between iterations in a
list comprehension, this is the perfect use for a generator:

def generator_to_list(f):
    return lambda *args, **keywords: list(f(*args, **keywords))

@generator_to_list
def moving_average(sequence, n):
    assert len(sequence) >= n and n > 0
    average = sum(sequence[:n]) / n
    yield average
    for i in xrange(1, len(sequence)-n+1):
        average += (sequence[i+n-1] - sequence[i-1]) / n
        yield average

Here are two more that work with arbitrary iterables:

from __future__ import division

from itertools import islice, tee, izip
from collections import deque

def window(items, n):
    it = iter(items)
    w = deque(islice(it, n-1))
    for item in it:
        w.append(item)
        yield w # for a robust implementation:
                # yield tuple(w)
        w.popleft()

def moving_average1(items, n):
    return (sum(w)/n for w in window(items, n))

def moving_average2(items, n):
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n-1))
    for first, last in izip(first_items, last_items):
        accu += last
        yield accu/n
        accu -= first

While moving_average1() is even slower than your inefficient variant,
moving_average2() seems to be a tad faster than the efficient one.

Peter
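For Python 3, where `izip` no longer exists, the two recipes might be ported as in the sketch below. The `maxlen` argument on the deque, the tuple copy, and the generic helper `moving` are additions made for this sketch; `moving` shows how the window-based form lets the aggregation be swapped from an average to `max`.

```python
from collections import deque
from itertools import islice, tee

def window(items, n):
    """Yield each run of n consecutive items as a tuple."""
    it = iter(items)
    w = deque(islice(it, n - 1), maxlen=n)
    for item in it:
        w.append(item)        # maxlen=n discards the oldest item automatically
        yield tuple(w)        # copy, so callers never see later mutations

def moving(items, n, func):
    """Apply func (max, min, a mean, ...) to every length-n window."""
    return (func(w) for w in window(items, n))

def moving_average2(items, n):
    """Running-total average; the built-in zip replaces Python 2's izip."""
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n - 1))
    for first, last in zip(first_items, last_items):
        accu += last
        yield accu / n
        accu -= first

data = [4, 8, 15, 16, 23, 42]
print(list(moving(data, 3, max)))                   # [15, 16, 23, 42]
print(list(moving(data, 3, lambda w: sum(w) / 3)))  # [9.0, 13.0, 18.0, 27.0]
print(list(moving_average2(iter(data), 3)))         # [9.0, 13.0, 18.0, 27.0]
```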
 
Jim Segrave

I used the following to return an array of the average of the last n
values. It's not particularly pretty, but it works:

import random

# set number of values to average
weighting = 10

# an array of values we want to calculate a running average on
ratings = []
# an array of running averages
running_avg = []

# some routine to fill ratings with the values
r = random.Random()
for i in range(0, 20):
    ratings.append(float(r.randint(0, 99)))

for i in range(1, 1 + len(ratings)):
    if i < weighting:
        # not enough history yet: pass the raw value through
        running_avg.append(ratings[i - 1])
    else:
        running_avg.append(reduce(lambda s, a: s + a,
                                  ratings[i - weighting : i]) /
                           len(ratings[i - weighting : i]))

for i in range(0, len(ratings)):
    print "%3d: %3d %5.2f" % (i, ratings[i], running_avg[i])


sample output:
0: 34 34.00
1: 28 28.00
2: 58 58.00
3: 16 34.00
4: 74 44.00
5: 32 45.00
6: 74 49.00
7: 21 50.25
8: 78 51.25
9: 28 50.25
10: 32 39.75
11: 93 57.75
12: 2 38.75
13: 7 33.50
14: 8 27.50
15: 30 11.75
16: 1 11.50
17: 8 11.75
18: 40 19.75
19: 8 14.25

This sample output was produced with weighting = 4: for all but the
first 3 rows, the third column is the average of the values in the 2nd
column for this and the preceding 3 rows.
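A compact Python 3 sketch of the same warm-up behaviour follows. The function name `running_average` is hypothetical, and the input is just the first six values from the sample output, with a window of 4 to match it.

```python
def running_average(ratings, weighting):
    """Trailing average over `weighting` values; earlier rows pass through as-is."""
    result = []
    for i in range(1, len(ratings) + 1):
        if i < weighting:
            result.append(ratings[i - 1])          # not enough history yet
        else:
            window = ratings[i - weighting:i]
            result.append(sum(window) / len(window))
    return result

ratings = [34.0, 28.0, 58.0, 16.0, 74.0, 32.0]     # first rows of the sample output
averages = running_average(ratings, 4)
for i, (value, avg) in enumerate(zip(ratings, averages)):
    print("%3d: %3d %5.2f" % (i, value, avg))
```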
 
Raymond Hettinger

[Peter Otten]
from __future__ import division

from itertools import islice, tee, izip
. . .
def moving_average2(items, n):
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n-1))
    for first, last in izip(first_items, last_items):
        accu += last
        yield accu/n
        accu -= first

While moving_average1() is even slower than your inefficient variant,
moving_average2() seems to be a tad faster than the efficient one.

This is nicely done and scales up well. Given an n-average of m-items,
it has O(n) memory consumption and O(m) running time. In contrast, the
other variants do more work than necessary by pulling the whole
sequence into memory or by re-summing all n items at every step,
resulting in O(m) memory consumption and O(m*n) running time.

This recipe gets my vote for the best solution.


Raymond
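The memory claim can be made concrete by feeding the recipe an endless input. The sketch below is a Python 3 port of `moving_average2` (with `zip` in place of `izip`); using `itertools.count` as the data source is an assumption chosen for the demonstration.

```python
from itertools import count, islice, tee

def moving_average2(items, n):
    """Running-total moving average over any iterable."""
    first_items, last_items = tee(items)
    accu = sum(islice(last_items, n - 1))
    for first, last in zip(first_items, last_items):
        accu += last
        yield accu / n
        accu -= first

# count(1) never ends, so the whole input can never be held in memory;
# tee() only buffers the items lying between the two iterators.
first_five = list(islice(moving_average2(count(1), 3), 5))
print(first_five)  # [2.0, 3.0, 4.0, 5.0, 6.0]
```

A version that slices a list could not run on this input at all.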
 
