ANN: stats 0.1a calculator statistics for Python

Steven D'Aprano

I am pleased to announce the first public release of stats for Python.

http://pypi.python.org/pypi/stats

stats is a pure-Python module providing basic statistics functions
similar to those found on scientific calculators. It currently includes:

Univariate statistics including:
* arithmetic, harmonic, geometric and quadratic means
* median, mode
* standard deviation and variance (sample and population)

Multivariate statistics including:
* Pearson's correlation coefficient
* covariance (sample and population)
* linear regression

and others.

This is an unstable alpha release of the software. Feedback and
contributions are welcome.
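
For readers unsure what the univariate statistics listed above compute, here is a minimal pure-Python sketch. The function names and the sample flag are made up for illustration; this is not the API of the stats module, only the textbook definitions.

```python
import math

# Illustrative helpers only; these names are NOT taken from the stats module.

def arithmetic_mean(data):
    return sum(data) / float(len(data))

def harmonic_mean(data):
    # Reciprocal of the mean of reciprocals; assumes no zero values.
    return len(data) / sum(1.0 / x for x in data)

def geometric_mean(data):
    # exp of the mean of logs; assumes strictly positive values.
    return math.exp(sum(math.log(x) for x in data) / len(data))

def quadratic_mean(data):
    # Root mean square.
    return math.sqrt(sum(x * x for x in data) / float(len(data)))

def variance(data, sample=True):
    # Sample variance divides by n - 1, population variance by n.
    m = arithmetic_mean(data)
    ss = sum((x - m) ** 2 for x in data)
    return ss / (len(data) - 1 if sample else len(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(arithmetic_mean(data))
print(variance(data, sample=False), variance(data, sample=True))
```

The standard deviation in each case is the square root of the corresponding variance.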

Jean-Michel Pichavant

Steven said:
I am pleased to announce the first public release of stats for Python.
[snip]

I already have a stats module:
/usr/lib/python2.5/site-packages/stats.py

"""
stats.py module

(Requires pstat.py module.)

#################################################
####### Written by: Gary Strangman ###########
#################################################

A collection of basic statistical functions for python. The function
names appear below.
[snip]
"""

It looks like it is part of the standard Debian Python distribution
(Python 2.5). That would mean that the name 'stats' is already used.

JM

Vlastimil Brom

2010/10/17 Steven D'Aprano said:
I am pleased to announce the first public release of stats for Python.
[snip]

Thanks for this useful module!
I just wanted to report a marginal error triggered in the doctests:

Failed example:
    isnan(float('nan'))
Exception raised:
    Traceback (most recent call last):
      File "C:\Python25\lib\doctest.py", line 1228, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.isnan[0]>", line 1, in <module>
        isnan(float('nan'))
    ValueError: invalid literal for float(): nan

(Python 2.5.4 on Windows XP; this might be OS-specific. float() was
probably updated in newer versions, since the tests pass on 2.6 and 2.7.)

I too would be interested in a comparison with the older module with
the same name:
http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html
which is likely to be the same as the above mentioned one.

vbr

Steven D'Aprano

Jean-Michel Pichavant said:
I already have a stats module:
/usr/lib/python2.5/site-packages/stats.py

The name of my module is not set in stone.

I can't help what site-packages you have, but the above is not on PyPI,
and it's certainly not part of the standard library.

If my module (or one like it) gets accepted for the standard library, I
don't think that could happen until version 3.3. When that happens, a
name will be chosen. Until then, I've got the first package on PyPI
called stats, and I'm keeping it.

It may not stay stats forever, since it is uncomfortably close to the
stat module.
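
For anyone wanting to see which file a bare name such as stats or stat resolves to on their own installation, a rough sketch using importlib from current Python 3 (not the 2.5 discussed in this thread) follows. The helper name module_origin is made up for illustration.

```python
import importlib.util

def module_origin(name):
    """Return where a top-level module name resolves from, or None.

    module_origin is a hypothetical helper. The result is usually a file
    path, but can be a label such as "built-in" or "frozen".
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

for name in ("stat", "stats"):
    print(name, "->", module_origin(name))
```

What this prints depends entirely on what is installed. A None result means no module of that name is importable.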

Steven D'Aprano

How did you determine that? Unfortunately, the name of the package
listed on PyPI bears no relation to the name of packages installed into
Python.

Fair point.

Jean-Michel Pichavant

Steven said:
Fair point.
It's part of the Debian distribution. Just sayin', so you know.

apt-cache showpkg python-stats
Package: python-stats
Versions:
0.6-7
(/var/lib/apt/lists/apt-cache.sequans.com_debian_dists_lenny_main_binary-i386_Packages)
(/var/lib/dpkg/status)

JM

Chris Torek

Vlastimil Brom said:
Thanks for this useful module!
I just wanted to report a marginal error triggered in the doctests:

Failed example:
    isnan(float('nan'))
Exception raised:
    Traceback (most recent call last):
      File "C:\Python25\lib\doctest.py", line 1228, in __run
        compileflags, 1) in test.globs
      File "<doctest __main__.isnan[0]>", line 1, in <module>
        isnan(float('nan'))
    ValueError: invalid literal for float(): nan

(Python 2.5.4 on Windows XP; this might be OS-specific. float() was
probably updated in newer versions, since the tests pass on 2.6 and 2.7.)

Indeed it was; in older versions float() just invoked the C library
routines, so float('nan') works on Mac OS X Python 2.5, for instance.
But then you run into the fact that math.isnan() is only in 2.6 and
later.

Workaround, assuming an earlier "from math import *":

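    try:
        isnan(0.0)
    except NameError:
        # math.isnan() is missing before 2.6; NaN is the only float
        # value that compares unequal to itself.
        def isnan(x):
            return x != x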

Of course you are still stuck with float('nan') failing on Windows.
I have no quick and easy workaround for that one.
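
One possible way around the float('nan') problem, offered only as a sketch: on platforms with IEEE 754 doubles, a NaN can be produced arithmetically instead of being parsed from a string. The names NAN and INF are illustrative, and this has not been checked on Python 2.5 for Windows, so treat that part as an assumption.

```python
# Sketch only: assumes IEEE 754 doubles; NAN and INF are illustrative names.
try:
    from math import isnan      # available from Python 2.6 on
except ImportError:
    def isnan(x):
        # NaN is the only float value that is not equal to itself.
        return x != x

try:
    NAN = float('nan')          # the call that fails in the traceback above
except ValueError:
    INF = 1e308 * 10            # float multiplication overflows to infinity
    NAN = INF - INF             # infinity minus infinity gives NaN

print(isnan(NAN), isnan(0.0))
```

The fallback branch only runs where float('nan') raises ValueError, so on newer versions the string form is still used.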