standard deviation

Bill Cunningham · Jun 5, 2011

I have some code here and a snippet of unfinished, untested code which
is an attempt at a function called stddev. This is of course meant to
calculate a standard deviation. I am trying to build small helper functions
that can be built into tech analysis tools. Something I've been attempting
and thinking about for a long time. stddev's first parameter is passed the
return value of the function mean(). It may not need a second parameter but
this is what I have so far. stddev needs to do the following things.
1) find the difference in prices from mean. Whether negative or positive
numbers.
2) square those numbers
3) sum those squares
4) calculate the square root of the total from 3 above.

header called "tech.h"

#include <stdio.h>
#include <stdlib.h>
#ifdef M
#include <math.h>
#endif

double mean(double *, int);
double stddev(double, double *);

mean.c

#include "tech.h"

double mean(double *avg, int num)
{
    double sum = 0.0;
    int i;

    if (num <= 0)
        return 0.0; /* avoid dividing by zero */
    for (i = 0; i < num; ++i)
        sum = sum + avg[i]; /* add each element, not the pointer itself */
    return sum / num; /* divide once, after the loop */
}

stddev.c /*the attempt*/

#include "tech.h"

double stddev(double mean, double *prices)
{
double price = 0.0;
int i = 0;
for (; i < prices; ++i) {
if (prices > mean) {
price = prices - mean;
return prices;
} else if (prices < mean) {
price = mean - prices;
return prices;
}

I really have no way to code this but I don't want anyone to do my
homework. Can someone offer tips or citations as to what I might need to do
here?

Bill

Lew Pitcher · Jun 5, 2011

Sorry, Bill, but your code doesn't really reflect the accepted way that you
calculate standard deviation. I'm not mathematician enough to tell whether
you've written equivalent code or not, so I'll just assume that your code
isn't correct, and move on.

I suggest that you read the first few paragraphs of the Wikipedia article on
Standard Deviation, especially the start of the "Basic Examples" section
(http://en.wikipedia.org/wiki/Standard_deviation#Basic_examples).
There, you'll find an excellent algorithm for calculating standard deviation
that is easily transformable into C code.

Let me summarize their algorithm:
1) Compute the mean of the population
2) For each element of the population,
2a) compute the difference between the element and the mean
2b) square this value
2c) call this new value the "squared deviation"
3) Find the mean of the squared deviations (sum them, then divide by the
number of elements); this mean is the "variance"
4) Compute the square root of the variance

This square root is the "standard deviation"

Kleuskes & Moos · Jun 5, 2011

Erwin Kreyszig has a pretty good rundown in 'Introduction to
mathematical statistics, principles and methods', sections 3.2 and 3.3.
It used to be pretty standard when I was in college, so I guess it
should still be available in the library.

Lew Pitcher · Jun 5, 2011

FWIW, from the algorithm and data given on the Wikipedia page, I coded this

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double StdDev(unsigned int samplesize, double population[])
{
    double sum, mean;
    unsigned int index;

    if (samplesize == 0) return 0.0; /* catch obvious error */

    /* compute mean of sample population */
    for (index = 0, sum = 0.0; index < samplesize; ++index)
        sum += population[index];
    mean = sum / samplesize;

    /* sum the squared deviations from the mean */
    for (index = 0, sum = 0.0; index < samplesize; ++index)
    {
        double delta;

        delta = population[index] - mean;
        sum += (delta * delta);
    }
    return sqrt(sum/samplesize); /* standard deviation */
}

/*
** Population values taken from the Wikipedia example
*/
int main(void)
{
    double pop[] = {2,4,4,4,5,5,7,9};
    unsigned int popsize = (sizeof(pop) / sizeof(pop[0]));

    printf("The standard deviation = %f\n",StdDev(popsize,pop));

    return EXIT_SUCCESS;
}

When I compile and run this code
$ cc -o stddev stddev.c -lm
$ ./stddev
The standard deviation = 2.000000
$
I get the same standard deviation value as the Wikipedia article's example.
HTH

Bill Cunningham · Jun 5, 2011

Lew Pitcher wrote:

[snip]

delta = population[index] - mean;
sum += (delta * delta);

Is this really saying sum=sum+(delta*delta);
And are the parentheses for precedence?

double pop[] = {2,4,4,4,5,5,7,9};
unsigned int popsize = (sizeof(pop) / sizeof(pop[0]));

Is the above code the standard thing to use if you have an array and
really don't want to count the number of elements? Using sizeof?

Lew Pitcher · Jun 5, 2011

delta = population[index] - mean;
sum += (delta * delta);

Is this really saying sum=sum+(delta*delta);

Yes.

And are the parentheses for precedence?

Not really. The parentheses here are a visual cue to the programmer. They
are unnecessary for the logic: multiplication binds more tightly than +=,
so the expression would compute the same without the parentheses.

double pop[] = {2,4,4,4,5,5,7,9};
unsigned int popsize = (sizeof(pop) / sizeof(pop[0]));

Is the above code the standard thing to use if you have an array and
really don't want to count the number of elements? Using sizeof?

(sizeof(array) / sizeof(array[0])) is a fairly standard way to determine the
number of elements in an array. You could call it an idiom.
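
One caveat is worth a short sketch: the idiom only works where the array
itself is in scope. Inside a function that receives the array as a
parameter, the name is a pointer, so the element count has to be passed
along. The names ARRAY_LEN, sum_values and sum_wikipedia_example below are
made up for illustration and are not from the code earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro: element count of a true array. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* The count must be passed in: inside this function `values` is a
   pointer, so sizeof(values) would give the pointer size instead. */
double sum_values(const double *values, size_t count)
{
    double sum = 0.0;
    size_t i;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; ++i)
        sum += values[i];
    return sum;
}

/* Example caller: the array is in scope here, so ARRAY_LEN works. */
double sum_wikipedia_example(void)
{
    double pop[] = {2, 4, 4, 4, 5, 5, 7, 9};

    return sum_values(pop, ARRAY_LEN(pop));
}
```

Calling sum_wikipedia_example() adds up the eight sample values without
the element count ever being typed in by hand.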

Fred · Jun 6, 2011

The above algorithm, while mathematically correct, is not good enough
for a computer. If your population is very large, or the individual
items in the population vary greatly in magnitude, you may run into
severe truncation and roundoff errors.

A more accurate way is to include the Leveque computational correction,
computing the variance as:

var = { sum[(x - mean)^2] - (1/n)*sum[(x - mean) } / (n-1)
then stddev = sqrt(var)

Note that computing the mean is not really as simple as summing the
items and dividing by the number of items. What happens on a 32-bit
machine if the first item is of magnitude 10^18, followed by 10^20
items that are of magnitude 1? None of the latter items will
contribute to your sum, and your answer will be a couple of orders of
magnitude in error.
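
A scaled-down sketch of that effect, assuming double is the usual IEEE 754
64-bit format (neighbouring values near 10^18 are then 128 apart, so adding
1 changes nothing). The function names are invented for this illustration,
and the item count is cut to something that runs instantly.

```c
#include <stddef.h>

/* Mean of one `big` item followed by `count` items equal to 1.0,
   accumulated in that order. */
double mean_big_first(double big, size_t count)
{
    double sum = big;
    size_t i;

    for (i = 0; i < count; ++i)
        sum += 1.0; /* lost when sum is so large that 1.0 is below half its spacing */
    return sum / (double)(count + 1);
}

/* Same data, but the small items are accumulated before the big one. */
double mean_small_first(double big, size_t count)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < count; ++i)
        sum += 1.0;
    return (sum + big) / (double)(count + 1);
}
```

With big = 1e18 and count = 1000, the first function behaves as if the
thousand small items were not there, while the second picks up their
combined contribution (rounded to the nearest representable value).
Reordering is only one possible remedy and is shown here purely to make
the loss visible.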

Fred · Jun 6, 2011

Oops, missing a square. The variance with Leveque correction is

{ sum[(x - mean)^2] - (1/n)* sum[(x - mean)]^2 } / (n-1)

i.e., in the first term you sum the squares of x-mean,
and in the second term you square the sum of x-mean

See the Stanford Computer Science report by Chan, Golub, and Leveque
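
A minimal C sketch of the corrected formula as described in words above
(sum the squares of x-mean, then subtract the square of the sum of x-mean
divided by n). The function name is hypothetical. Note that it divides by
n-1, so it gives the sample variance, not the population variance used in
the StdDev code earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Corrected two-pass sample variance; requires n >= 2. */
double variance_corrected(const double *x, size_t n)
{
    double mean = 0.0, sum_d = 0.0, sum_d2 = 0.0;
    size_t i;

    assert(x != NULL && n >= 2);

    /* Pass 1: ordinary mean. */
    for (i = 0; i < n; ++i)
        mean += x[i];
    mean /= (double)n;

    /* Pass 2: sum of deviations and sum of squared deviations. */
    for (i = 0; i < n; ++i) {
        double d = x[i] - mean;

        sum_d += d;
        sum_d2 += d * d;
    }

    /* sum_d is zero in exact arithmetic; subtracting its square over n
       compensates for rounding error in the computed mean. */
    return (sum_d2 - sum_d * sum_d / (double)n) / (double)(n - 1);
}
```

Taking sqrt() of the result (from math.h) gives the corresponding standard
deviation.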

-- Fred K

Nobody · Jun 6, 2011

Fred said:
The above algorithm, while mathematically correct, is not good enough
for a computer. If your population is very large, or the individual
items in the population vary greatly in magnitude, you may run into
severe truncation and roundoff errors.

A more accurate way is to include the Leveque computational
correction, [snip]

While that may be true, it's a minor detail given the amount of software
I've seen which uses the single-pass algorithm:

var := (sum(x^2) - sum(x)^2/n)/n

This can be rather inaccurate if the standard deviation is small compared
to the mean (i.e. the data has a relatively large constant offset).
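
For comparison, here is a sketch of that single-pass formula next to a
plain two-pass version, both dividing by n. The function names are
invented for this illustration. The single-pass form subtracts two large,
nearly equal numbers, which is where the accuracy goes when the data sits
on a large offset.

```c
#include <assert.h>
#include <stddef.h>

/* Single-pass population variance: (sum(x^2) - sum(x)^2/n)/n. */
double variance_single_pass(const double *x, size_t n)
{
    double sum_x = 0.0, sum_x2 = 0.0;
    size_t i;

    assert(x != NULL && n > 0);
    for (i = 0; i < n; ++i) {
        sum_x += x[i];
        sum_x2 += x[i] * x[i];
    }
    /* Both terms can be huge while their difference is tiny. */
    return (sum_x2 - sum_x * sum_x / (double)n) / (double)n;
}

/* Two-pass population variance: mean first, then squared deviations. */
double variance_two_pass(const double *x, size_t n)
{
    double mean = 0.0, sum_d2 = 0.0;
    size_t i;

    assert(x != NULL && n > 0);
    for (i = 0; i < n; ++i)
        mean += x[i];
    mean /= (double)n;

    for (i = 0; i < n; ++i) {
        double d = x[i] - mean;

        sum_d2 += d * d;
    }
    return sum_d2 / (double)n;
}
```

On small values such as {2,4,4,4,5,5,7,9} the two agree. On values like
{1e9+4, 1e9+7, 1e9+13, 1e9+16} the two-pass version still works with small
deviations, while the single-pass version has to square numbers near 1e9
and can lose most of the significant digits in the subtraction.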

Dann Corbit · Jun 7, 2011

{snip}
The Welford method cited here is quite good numerically. It is the
method that I use:
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance
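
A rough C sketch of a Welford-style running update, for readers who want
to see the shape of it. The struct and function names are made up here,
and this is only one way to arrange it, not necessarily how the article or
any particular library lays it out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator for a running mean and variance. */
typedef struct {
    size_t n;    /* number of values seen so far */
    double mean; /* running mean */
    double m2;   /* running sum of squared deviations from the mean */
} RunningStats;

/* Fold one more value into the accumulator. */
void running_stats_add(RunningStats *s, double x)
{
    double delta = x - s->mean; /* deviation from the old mean */

    s->n += 1;
    s->mean += delta / (double)s->n;
    s->m2 += delta * (x - s->mean); /* old deviation times new deviation */
}

/* Population variance of an array, computed in a single pass. */
double welford_variance(const double *x, size_t n)
{
    RunningStats s = {0, 0.0, 0.0};
    size_t i;

    assert(x != NULL && n > 0);
    for (i = 0; i < n; ++i)
        running_stats_add(&s, x[i]);
    return s.m2 / (double)s.n;
}
```

The standard deviation is sqrt() of the returned value. Dividing m2 by
n - 1 instead of n would give the sample variance.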

Fred · Jun 7, 2011

{snip}
The Welford method cited here is quite good numerically. It is the
method that I use:
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance

And note that it cites the article by LeVeque et al.

Bill Cunningham · Jun 7, 2011

Dann said:
{snip}
The Welford method cited here is quite good numerically. It is the
method that I use:
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance

What kind of language is that in the article?

Bill

Lew Pitcher · Jun 7, 2011

What kind of language is that in the article?

It's not a formal computer language; it is pseudo-code. Pseudo-code is a
programming-language-like way to express algorithms.

Keith Thompson · Jun 7, 2011

Lew Pitcher said:
It's not a formal computer language; it is pseudo-code. Pseudo-code is a
programming-language like way to express algorithms.

No, it's not pseudo-code, it's Python. (It's odd that the article
never mentions that fact.)

Bill, I've suggested to you before that Python might be a better
language for you than C. I now repeat that suggestion.

Bill Cunningham · Jun 9, 2011

Bill, I've suggested to you before that Python might be a better
language for you than C. I now repeat that suggestion.

Oh why? It doesn't look sensible to me at all. C++ or java might be more
understandable than python. Even perl.

Bill

Keith Thompson · Jun 9, 2011

Bill Cunningham said:
Oh why? It doesn't look sensible to me at all. C++ or java might be more
understandable than python. Even perl.

Because the things that have been causing you grief in C all these years
are, to large extent, things that you wouldn't have to worry about in
Python.

Seriously, does C "look sensible" to you?

Bill Cunningham · Jun 9, 2011

Keith said:
Because the things that have been causing you grief in C all these
years are, to large extent, things that you wouldn't have to worry
about in Python.

Seriously, does C "look sensible" to you?

Not really serious stuff no. But I am learning functions. I can use
those real well. Syntactic things seem to be complicated in C. Quite a bit
in C doesn't look sensible to me actually.

Bill

Nobody · Jun 9, 2011

Bill, I've suggested to you before that Python might be a better language
for you than C. I now repeat that suggestion.

Python has a few pitfalls of its own. Probably the most common one is
the fact that everything is passed by reference.

Bill Cunningham · Jun 9, 2011

Keith said:
Seriously, does C "look sensible" to you?

I'm just a humble hobbyist. I do want to learn C, or even C++ if I have
to go to higher-level things. The tutorials that I've studied cover some of
the most basic things about C. Algorithms and ways to do things seem to be
an altogether different matter. They don't seem to teach that in tutorials.

Bill

Michael Press · Jun 10, 2011

Fred said:
Oops, missing a square. The variance with Leveque correction is

{ sum[(x - mean)^2] - (1/n)* sum[(x - mean)]^2 } / (n-1)

i.e., in the first term you sum the squares of x-mean,
and in the second term you square the sum of x-mean

Looks hinky. Perhaps

{ sum[(x - mean)^2] - (1/n)* [sum(x - mean)]^2 } / (n-1)
