Numerical result variation for cos near pi/2


M

mathog

This problem is probably related to this thread:

https://groups.google.com/forum/#!topic/comp.lang.c/K10prwCtRV4

This program:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#define MY_PI 3.14159265358979323846
int main(void){
double dang = MY_PI /2.0;
printf("dang:%le\n",dang);
double dresult = cos(dang);
printf("dresult:%le\n",dresult);
exit(EXIT_SUCCESS);
}

compiled with:

gcc -Wall -std=c99 -o problem problem.c -lm

on a linux gcc 4.7.2 system emits:

(A)
dang:1.570796e+00
dresult:6.123234e-17


compiled the same way on a gcc 4.6.3 system emits:

(B)
dang:1.570796e+00
dresult:6.123032e-17

but compiled with -O2 on the same system emits (A), like the first
system. Thinking it might be an "excess precision" problem I put
"volatile" in front of "double dang" and tested again with -O0 and -O2
on the 4.6.3 system. That made the result consistent on the 4.6.3
system, but it was always (B)!

It would be a big help if this calculation was always the _same_ value,
at least on such closely related systems, because when that is not true
"diff" cannot be used to easily detect changes on the much, much larger
binary file this example was extracted from.

This particular issue could be papered over with something like:

double dresult = cos(dang);
if(dresult < 1.0e-15 && dresult > -1.0e-15)dresult=0.0;

but presumably there are similar issues at 1.0e-14 too.

The two test systems are both generating 32 bit x86 binaries and are two
different linux distros running on AMD chips (an Opteron and an Athlon X2).

Ideas?

Thanks,

David Mathog
 
Ad

Advertisements

M

mathog

Let the machines be A and B

Do

1. compile on A (-O0)
2. copy binary to B
3. run on B

and find that the output is the same as if the binary had been compiled on B

So the problem seems to be (mostly) attached to the math libraries,
which are slightly different versions on the two machines, 2.15 vs.
2.17.

Odd that cos(M_PI/2) changed with library versions, even if the
difference is way down in the "not significant" range. Seems like
a value that would have been set in stone at about beta 0.0.2!

Regards,

David Mathog
 
G

glen herrmannsfeldt

mathog said:
This problem is probably related to this thread:

This program:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#define MY_PI 3.14159265358979323846
int main(void){
double dang = MY_PI /2.0;
printf("dang:%le\n",dang);
double dresult = cos(dang);
printf("dresult:%le\n",dresult);
exit(EXIT_SUCCESS);
}
(snip)


(snip)

dresult:6.123032e-17

My first thought is x87 extended precision. Tradition of the x87 is
to keep internal (register stack) values with 64 significant bits,
where normal values have 53 (double) or 24 (single). Might be able to
tell by converting the two values to their internal representation.

But another possibility is rounding mode. The x87 (and maybe others)
have bits to set the rounding mode, with choices such as down, up,
nearest, and maybe more. It isn't always so obvious how the rounding
mode gets set, so maybe it is different in the two cases.

You can compile with -S and see any differences in the generated code.

It might be that once calls an actual cos() subroutine, returning a
(double), where the other directly executes either FCOS or FSINCOS,
keeping the result on the stack with full precsion. It is supposed
to be that you call an actual routine that checks that the argument
is within the valid range, and either complains or does something
else before executing FCOS.

-- glen
 
G

Geoff

This particular issue could be papered over with something like:

double dresult = cos(dang);
if(dresult < 1.0e-15 && dresult > -1.0e-15)dresult=0.0;

but presumably there are similar issues at 1.0e-14 too.

I believe it's supposed to be papered over with something like:

double dresult = cos(dang);
if(dresult < DBL_EPSILON && dresult > -DBL_EPSILON) dresult = 0.0;

On my system
printf("%le\n", DBL_EPSILON);
yields 2.220446e-016, which is far greater than your 6.xxxxe-17.

This is a FAQ when dealing with floating point differences.

http://c-faq.com/fp/fpequal.html
 
J

J. Clarke

This problem is probably related to this thread:

https://groups.google.com/forum/#!topic/comp.lang.c/K10prwCtRV4

This program:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#define MY_PI 3.14159265358979323846
int main(void){
double dang = MY_PI /2.0;
printf("dang:%le\n",dang);
double dresult = cos(dang);
printf("dresult:%le\n",dresult);
exit(EXIT_SUCCESS);
}

compiled with:

gcc -Wall -std=c99 -o problem problem.c -lm

on a linux gcc 4.7.2 system emits:

(A)
dang:1.570796e+00
dresult:6.123234e-17


compiled the same way on a gcc 4.6.3 system emits:

(B)
dang:1.570796e+00
dresult:6.123032e-17

but compiled with -O2 on the same system emits (A), like the first
system. Thinking it might be an "excess precision" problem I put
"volatile" in front of "double dang" and tested again with -O0 and -O2
on the 4.6.3 system. That made the result consistent on the 4.6.3
system, but it was always (B)!

It would be a big help if this calculation was always the _same_ value,
at least on such closely related systems, because when that is not true
"diff" cannot be used to easily detect changes on the much, much larger
binary file this example was extracted from.

This particular issue could be papered over with something like:

double dresult = cos(dang);
if(dresult < 1.0e-15 && dresult > -1.0e-15)dresult=0.0;

but presumably there are similar issues at 1.0e-14 too.

The two test systems are both generating 32 bit x86 binaries and are two
different linux distros running on AMD chips (an Opteron and an Athlon X2).

Ideas?

Thanks,

David Mathog

Just out of curiosity I tried this and got the same results you reported
on 4.6.3 on a Core 2 Quad, which pretty much rules out it being some
vagary of AMD's floating point implementation.

I did some more poking around and looking at the assembler output, with
the -02 optimization and _without_ the "volatile" keyword, it appears
that the compiler optimizes the calculation right out of the program--
instead of a call to the "cos" routine, it references a stored constant.
With the "volatile" keyword, however, it calls the "cos" routine, as it
does for both versions of -O0. By the way, with and without "volatile"
the -O0 versions are identical except that at one point two lines come
in reverse order.

I don't have 4.7.2 installed anywhere, so it's inconvenient to do
further testing, however looking at the assembler output from 4.7 and
comparing might at least shed some light.

Incidentally, the Microsoft compiler calls a "cos" routine and always
yields the 6.123234e-17 result (I tried all -O levels, 32 vs 64 bit,
SSE2 on and off, ran on AMD and Intel, and used two different
versions)--I don't know what that signifies but figured it might be
worht mentioning.
 
M

mathog

Geoff said:
I believe it's supposed to be papered over with something like:

double dresult = cos(dang);
if(dresult < DBL_EPSILON && dresult > -DBL_EPSILON) dresult = 0.0;

On my system
printf("%le\n", DBL_EPSILON);
yields 2.220446e-016, which is far greater than your 6.xxxxe-17.

But, as it turns out, will not work to resolve this difference. Here is
another small test program:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#define MY_PI 3.14159265358979323846 /* pi */
int main(void){
int i;
double eps;
double pi=MY_PI;
printf("pi (asin) = %20.20le\n",2.0*asin(1));
printf("pi (MY_PI) = %20.20le\n",pi);
eps = 1/8.0;
for(i=1;i<20;i++){
printf("i:%2d eps:%18.14le dsin:%18.14le\n",i,eps, sin(pi - eps));
eps /= 8.0;
}
exit(EXIT_SUCCESS);
}

On the 4.7.2 system it emits:

pi (asin) = 3.14159265358979311600e+00
pi (MY_PI) = 3.14159265358979311600e+00
i: 1 eps:1.25000000000000e-01 dsin:1.24674733385228e-01
i: 2 eps:1.56250000000000e-02 dsin:1.56243642248835e-02
i: 3 eps:1.95312500000000e-03 dsin:1.95312375823693e-03
i: 4 eps:2.44140625000000e-04 dsin:2.44140622574803e-04
i: 5 eps:3.05175781250000e-05 dsin:3.05175781203855e-05
i: 6 eps:3.81469726562500e-06 dsin:3.81469726573821e-06
i: 7 eps:4.76837158203125e-07 dsin:4.76837158325572e-07
i: 8 eps:5.96046447753906e-08 dsin:5.96046448978553e-08
i: 9 eps:7.45058059692383e-09 dsin:7.45058071938851e-09
i:10 eps:9.31322574615479e-10 dsin:9.31322697080158e-10
i:11 eps:1.16415321826935e-10 dsin:1.16415444291615e-10
i:12 eps:1.45519152283669e-11 dsin:1.45520376930468e-11
i:13 eps:1.81898940354586e-12 dsin:1.81911186822577e-12
i:14 eps:2.27373675443232e-13 dsin:2.27496140123147e-13
i:15 eps:2.84217094304040e-14 dsin:2.85441741103187e-14
i:16 eps:3.55271367880050e-15 dsin:3.67517835871524e-15
i:17 eps:4.44089209850063e-16 dsin:5.66553889764798e-16
i:18 eps:5.55111512312578e-17 dsin:1.22464679914735e-16
i:19 eps:6.93889390390723e-18 dsin:1.22464679914735e-16


and when the _same_ 32 bit binary is run on the 4.6.3 system it emits:

pi (asin) = 3.14159265358979311600e+00
pi (MY_PI) = 3.14159265358979311600e+00
i: 1 eps:1.25000000000000e-01 dsin:1.24674733385228e-01
i: 2 eps:1.56250000000000e-02 dsin:1.56243642248835e-02
i: 3 eps:1.95312500000000e-03 dsin:1.95312375823693e-03
i: 4 eps:2.44140625000000e-04 dsin:2.44140622574803e-04
i: 5 eps:3.05175781250000e-05 dsin:3.05175781203855e-05
i: 6 eps:3.81469726562500e-06 dsin:3.81469726573821e-06
i: 7 eps:4.76837158203125e-07 dsin:4.76837158325568e-07
i: 8 eps:5.96046447753906e-08 dsin:5.96046448978512e-08
i: 9 eps:7.45058059692383e-09 dsin:7.45058071938446e-09
i:10 eps:9.31322574615479e-10 dsin:9.31322697076114e-10
i:11 eps:1.16415321826935e-10 dsin:1.16415444287570e-10
i:12 eps:1.45519152283669e-11 dsin:1.45520376890022e-11
i:13 eps:1.81898940354586e-12 dsin:1.81911186418124e-12
i:14 eps:2.27373675443232e-13 dsin:2.27496136078614e-13
i:15 eps:2.84217094304040e-14 dsin:2.85441700657862e-14
i:16 eps:3.55271367880050e-15 dsin:3.67517431418274e-15
i:17 eps:4.44089209850063e-16 dsin:5.66549845232300e-16
i:18 eps:5.55111512312578e-17 dsin:1.22460635382238e-16
i:19 eps:6.93889390390723e-18 dsin:1.22460635382238e-16

Notice that the dsin values diverge from i = 7, with a result
in the 10^-7 range, so DBL_EPSILON is too small for this problem.

If the function sin(eps) is tested instead the results are identical on
both platforms. Not surprisingly, even though mathematically
the results for sin(pi-eps) should be the same as the preceeding,
numerically they are not. The interesting thing is that they diverge
very early, at the 3rd step. For sin(eps) we know the expected result
for very small eps: eps. And that is what it returns from 10-8 on down.

i: 1 eps:1.25000000000000e-01 dsin:1.24674733385228e-01
i: 2 eps:1.56250000000000e-02 dsin:1.56243642248834e-02
i: 3 eps:1.95312500000000e-03 dsin:1.95312375823680e-03
i: 4 eps:2.44140625000000e-04 dsin:2.44140622574681e-04
i: 5 eps:3.05175781250000e-05 dsin:3.05175781202630e-05
i: 6 eps:3.81469726562500e-06 dsin:3.81469726561575e-06
i: 7 eps:4.76837158203125e-07 dsin:4.76837158203107e-07
i: 8 eps:5.96046447753906e-08 dsin:5.96046447753906e-08
i: 9 eps:7.45058059692383e-09 dsin:7.45058059692383e-09
i:10 eps:9.31322574615479e-10 dsin:9.31322574615479e-10
i:11 eps:1.16415321826935e-10 dsin:1.16415321826935e-10
i:12 eps:1.45519152283669e-11 dsin:1.45519152283669e-11
i:13 eps:1.81898940354586e-12 dsin:1.81898940354586e-12
i:14 eps:2.27373675443232e-13 dsin:2.27373675443232e-13
i:15 eps:2.84217094304040e-14 dsin:2.84217094304040e-14
i:16 eps:3.55271367880050e-15 dsin:3.55271367880050e-15
i:17 eps:4.44089209850063e-16 dsin:4.44089209850063e-16
i:18 eps:5.55111512312578e-17 dsin:5.55111512312578e-17
i:19 eps:6.93889390390723e-18 dsin:6.93889390390723e-18

So, which, if either version of the sin(pi-eps) calculation
is actually the right value? Looking at step 16 eps is
3.55271367880050e-15 so

3.67517835871524e-15 expected sin(pi-eps)[== sin(eps) == eps)
3.67517835871524e-15 observed, newer library (2.18, correct)
3.67517431418274e-15 observed, older library (2.15, incorrect)

Apparently they cleaned up a problem in the math library recently
affecting trig calculations that gave results near zero. This is
probably one of the trig cases referred to in the 2.16 release notes
here: http://sourceware.org/ml/libc-alpha/2012-06/msg00807.html

Regards,

David Mathog
 
Ad

Advertisements

N

Nobody

Numerical result variation for cos near pi/2

Around pi/2, the absolute error in the result is proportional to the
absolute error in the input. But the value of cos(pi/2) is zero, so the
relative error of the result (absolute error divided by the correct value)
will tend to infinity as the input tends to pi/2 (from either direction).

Exponential notation (x-times-ten-to-the-y) tries to limit relative
error, which is a poor choice for the case where f(x)=0 and x!=0.

A simple solution is to output the result using %f rather than %e or %g
so that the rounding error disappears off the right-hand side.

However, there will always be cases where the correct result is on the
boundary between representations, so the most miniscule difference will
result in a difference of one in the least-significant digit, which in
turn will propagate to the left if the least-significant digit changes
from 9 to 0 or vice versa.

The real solution is not to compare numbers with "diff".

Getting numerical results which are repeatable to the last bit is hard
(and can have a very significant performance penalty), particularly when
dealing with transcendental functions.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top