Standard Deviation One-liner

Billy Mays

I'm trying to shorten a one-liner I have for calculating the standard
deviation of a list of numbers. I have something so far, but I was
wondering if it could be made any shorter (without imports).


Here's my function:

a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5


The function is invoked as follows:

>>> a([1,2,3,4])
1.2909944487358056
 
Alain Ketterlin

Billy Mays said:
I'm trying to shorten a one-liner I have for calculating the standard
deviation of a list of numbers. I have something so far, but I was
wondering if it could be made any shorter (without imports).
a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5

You should make it two half-liners, because this one repeatedly computes
sum(d). I would suggest:

aux = lambda s1,s2,n: (s2 - s1*s1/n)/(n-1)
sv = lambda d: aux(sum(d),sum(x*x for x in d),len(d))

(after some algebra). Completely untested, assumes data come in as
floats. You get the idea.

-- Alain.
 
Alain Ketterlin

Alain Ketterlin said:
aux = lambda s1,s2,n: (s2 - s1*s1/n)/(n-1)
sv = lambda d: aux(sum(d),sum(x*x for x in d),len(d))

Err, sorry, the final square root is missing.
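
For reference, a sketch of the two helpers with the square root put back. The float(n) call is an addition that is not in the original post; it guards against integer division when the data are ints on Python 2. Like the original, it is only lightly checked.

```python
# Alain's two half-liners with the final square root restored.
# float(n) is an added guard (not in the original post) so that
# integer input does not trigger integer division on Python 2.
aux = lambda s1, s2, n: ((s2 - s1 * s1 / float(n)) / (n - 1)) ** 0.5
sv = lambda d: aux(sum(d), sum(x * x for x in d), len(d))

print(sv([1, 2, 3, 4]))  # roughly 1.29099, matching a([1,2,3,4])
```

Each of sum(d), the sum of squares, and len(d) is computed once, which is the point of the split.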

-- Alain.
 
Raymond Hettinger

I'm trying to shorten a one-liner I have for calculating the standard
deviation of a list of numbers.  I have something so far, but I was
wondering if it could be made any shorter (without imports).

Here's my function:

a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5

The function is invoked as follows:

 >>> a([1,2,3,4])
1.2909944487358056

Besides trying to do it in one line, it is also interesting to write a
one-pass version with incremental results:

http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html
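
One common way to get incremental results in a single pass is Welford's update, sketched below. This is an illustration only; it is not claimed to be the recurrence given on the linked page, and the name running_stdev is made up for the example.

```python
def running_stdev(data):
    """Yield the sample standard deviation after each new value (Welford's update)."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in data:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        # The sample standard deviation needs at least two values.
        yield (m2 / (n - 1)) ** 0.5 if n > 1 else float("nan")

results = list(running_stdev([1, 2, 3, 4]))
print(results[-1])  # roughly 1.29099, the same as the two-pass answer
```

The data are read once, and only three numbers (n, mean, m2) are kept between values.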

Another interesting avenue is to aim for the highest possible accuracy.
Consider using math.fsum() to avoid rounding errors in the summation
of large numbers of nearly equal values.
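
A minimal sketch of what that could look like, assuming an import is acceptable (which bends the original "without imports" constraint). The name stdev_fsum is made up for the example.

```python
import math

def stdev_fsum(d):
    """Two-pass sample standard deviation using math.fsum for both sums."""
    n = len(d)
    mean = math.fsum(d) / n
    # fsum tracks the partial sums exactly, so rounding happens only once per sum.
    return math.sqrt(math.fsum((x - mean) ** 2 for x in d) / (n - 1))

print(stdev_fsum([1, 2, 3, 4]))  # roughly 1.29099
```

For short lists of small integers this gives the same answer as the plain sum() version; the difference is expected to show up only on long inputs or values of very different magnitude.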


Raymond
 
Steven D'Aprano

I'm trying to shorten a one-liner I have for calculating the standard
deviation of a list of numbers.  I have something so far, but I was
wondering if it could be made any shorter (without imports).

Here's my function:

a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5

The function is invoked as follows:

 >>> a([1,2,3,4])
1.2909944487358056

Besides trying to do it in one line, it is also interesting to write a
one-pass version with incremental results:

http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html

I'm not convinced that's a good approach, although I haven't tried it. In
general, the so-called "computational formula" for variance is optimized
for pencil and paper calculations of small amounts of data, but is
numerically unstable.

See

http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/

http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
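
A small sketch of the instability, using a made-up data set with a small spread on top of a large offset. Both function names are hypothetical. The functions return the variance rather than the standard deviation, because the computational formula can come out negative and the square root would then fail to give a real number.

```python
def var_computational(d):
    """Sample variance via the computational formula: (sum(x^2) - sum(x)^2/n) / (n-1)."""
    n = len(d)
    s1 = sum(d)
    s2 = sum(x * x for x in d)
    return (s2 - s1 * s1 / n) / (n - 1)

def var_two_pass(d):
    """Sample variance via the two-pass formula: mean first, then squared deviations."""
    n = len(d)
    mean = sum(d) / n
    return sum((x - mean) ** 2 for x in d) / (n - 1)

# Small spread on top of a large offset; the exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
naive = var_computational(data)
stable = var_two_pass(data)
print(naive, stable)
```

The squares are near 1e18, where adjacent doubles are 128 apart, so the subtraction in the computational formula cannot recover the true numerator of 90. The two-pass version subtracts the mean first and works with small numbers, which is why it returns 30.0 here.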



I'll also take this opportunity to plug my experimental stats package,
which includes coroutine-based running statistics, including standard
deviation:
1.4999999999999998

The non-running calculation of stdev gives this:
1.5


http://pypi.python.org/pypi/stats/
http://code.google.com/p/pycalcstats/

Be warned that the version on Google Code is unstable, and currently
broken.

Feedback is welcome!
 
Ethan Furman

Steven said:
I'm trying to shorten a one-liner I have for calculating the standard
deviation of a list of numbers. I have something so far, but I was
wondering if it could be made any shorter (without imports).

Here's my function:

a=lambda d:(sum((x-1.*sum(d)/len(d))**2 for x in d)/(1.*(len(d)-1)))**.5

The function is invoked as follows:

>>> a([1,2,3,4])
1.2909944487358056

Besides trying to do it in one line, it is also interesting to write a
one-pass version with incremental results:

http://mathcentral.uregina.ca/QQ/database/QQ.09.06/h/murtaza2.html

I'm not convinced that's a good approach, although I haven't tried it. In
general, the so-called "computational formula" for variance is optimized
for pencil and paper calculations of small amounts of data, but is
numerically unstable.

See

http://www.johndcook.com/blog/2008/09/26/comparing-three-methods-of-computing-standard-deviation/

http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance



I'll also take this opportunity to plug my experimental stats package,
which includes coroutine-based running statistics, including standard
deviation:

--> s = stats.co.stdev()
--> s.send(3)
nan

Look! A NaN in the wild! :)
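
For comparison, the one-liner from the top of the thread does not produce a NaN for a single value; it divides 0.0 by 0.0 and raises. A small sketch follows, where stdev_or_nan is a hypothetical wrapper and not part of the stats package.

```python
# The original one-liner from the first post.
a = lambda d: (sum((x - 1.*sum(d)/len(d))**2 for x in d) / (1.*(len(d) - 1)))**.5

def stdev_or_nan(d):
    """Return NaN instead of raising when fewer than two values are given."""
    return a(d) if len(d) > 1 else float("nan")

# With one value, both the numerator and n-1 are zero.
try:
    a([3])
    raised = False
except ZeroDivisionError:
    raised = True

print(raised)             # True
print(stdev_or_nan([3]))  # nan
```

Either behaviour is defensible: the sample standard deviation of one value is undefined, so the choice is between signalling that with an exception or with a NaN.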

~Ethan~
 
