Small numerical differences in floating point results between Wintel and Sun/SPARC

JS

We have the same floating point intensive C++ program that runs on
Windows on Intel chips and on Sun Solaris on SPARC chips. The program
reads exactly the same input files on the two platforms. However,
they generate slightly different results for floating point numbers.

Are they really supposed to generate exactly the same results? I
guess so, because both platforms are supposed to be IEEE floating point
standard (754?) compliant. I have turned on the Visual C++ compiler
flag which should make the Windows build produce standard-compliant
code (the /Op flag). However, they still produce different results. I
suspect that this may be due to a commercial mathematical library that
we use which can't be compiled using the /Op option. If I recompiled
everything using the /Op option, the two should produce the same
results.

Am I right?

Thanks a lot.
 
Steven G. Kargl

We have the same floating point intensive C++ program that runs on
Windows on Intel chips and on Sun Solaris on SPARC chips. The program
reads exactly the same input files on the two platforms. However,
they generate slightly different results for floating point numbers.

What does this have to do with Fortran (or C)? Don't
cross-post off-topic threads.

PS: Google on "floating goldberg"
 
Rich Townsend

JS said:
We have the same floating point intensive C++ program that runs on
Windows on Intel chips and on Sun Solaris on SPARC chips. The program
reads exactly the same input files on the two platforms. However,
they generate slightly different results for floating point numbers.

Are they really supposed to generate exactly the same results? I
guess so, because both platforms are supposed to be IEEE floating point
standard (754?) compliant. I have turned on the Visual C++ compiler
flag which should make the Windows build produce standard-compliant
code (the /Op flag). However, they still produce different results. I
suspect that this may be due to a commercial mathematical library that
we use which can't be compiled using the /Op option. If I recompiled
everything using the /Op option, the two should produce the same
results.

Am I right?

Yes and no. As I understand it, the IEEE floating point standard places
reasonably tight constraints on how *atomic* floating-point operations
are undertaken. For instance, given the bit patterns making up two
floating point numbers a and b, the standard says how to find the bit
pattern making up their product a*b.

However, a typical computer program is not a single atomic operation; it
is a whole sequence of them. Take for example this assignment:

d = a + b + c

What order should the sum be undertaken in? Should it be

d = (a + b) + c,

or

d = a + (b + c)?

Mathematically, these are identical. But in a computer program the "+"
operation does not represent true mathematical addition, but rather a
floating-point approximation of it. Even if this approximation conforms
to the IEEE standard, the results of the two assignments above will
differ in many situations. Consider, for instance, when:

a = 1.e10
b = -1.e10
c = 1.

Assuming floating-point math with a precision of less than ten decimal
significant digits, the first expression above will give d = 1,
but the second expression will give d = 0.
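To make this concrete, here is a minimal C sketch of the effect (it uses
float rather than double, so that the ten-significant-digit threshold in
the example is actually exceeded; the volatile qualifiers merely stop a
compiler from folding the sums at compile time):

#include <stdio.h>

int main(void)
{
    /* float carries ~7 significant decimal digits, so adding 1 to 1e10
       falls below the rounding threshold */
    volatile float a = 1.e10f, b = -1.e10f, c = 1.f;

    float left  = (a + b) + c;  /* 0 + 1 -> 1 */
    float right = a + (b + c);  /* b + c rounds to -1e10, so the sum -> 0 */

    printf("(a + b) + c = %g\n", left);
    printf("a + (b + c) = %g\n", right);
    return 0;
}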

Therefore, the result of the *original* assignment above (the one
without the parentheses) depends on how the compiler decides to join the
two atomic addition operations. Even though these operations might
individually conform to the IEEE standard, their result will vary
depending on the order in which the compiler decides to perform them.
This has nothing to do with the IEEE standard per se; it is a fundamental
limitation of finite-precision floating-point math.

Hopefully, this should underline the idiocy in rubbishing one compiler
because it produces slightly different results to another.

cheers,

Rich

--
Dr Richard H D Townsend
Bartol Research Institute
University of Delaware

[ Delete VOID for valid email address ]
 
Dik T. Winter

> Am I right?

No, you are wrong. The slight differences are there because the Intel
chips calculate expressions with 80 bits of precision, while SPARC uses
64 bits of precision. Both are allowed by the IEEE standard.
 
James D. Veale

I have a customer who runs a simulation on both a PC and a UNIX box,
with the results written to an ASCII data file. He has used
TFuzSerf, a fuzzy number file comparison utility, to verify similar results.

Demonstration versions and contact information are available
at the Complite File Comparison Family web page, at ...

http://world.std.com/~jdveale

This fuzzy number comparison utility allows you to specify both absolute
and relative tolerances (ranges) for numbers. ASCII numbers are
recognized and treated as true numbers, not just character strings.
As a result, 2-digit and 3-digit exponents are handled automatically.

Please feel free to contact me if I can answer any questions.

Jim Veale

JS said:
We have the same floating point intensive C++ program that runs on
Windows on Intel chips and on Sun Solaris on SPARC chips. The program
reads exactly the same input files on the two platforms. However,
they generate slightly different results for floating point numbers.
Are they really supposed to generate exactly the same results? I
guess so, because both platforms are supposed to be IEEE floating point
standard (754?) compliant. I have turned on the Visual C++ compiler
flag which should make the Windows build produce standard-compliant
code (the /Op flag). However, they still produce different results. I
suspect that this may be due to a commercial mathematical library that
we use which can't be compiled using the /Op option. If I recompiled
everything using the /Op option, the two should produce the same
results.
 
Gordon Burditt

We have the same floating point intensive C++ program that runs on
Windows on Intel chips and on Sun Solaris on SPARC chips. The program
reads exactly the same input files on the two platforms. However,
they generate slightly different results for floating point numbers.

Are they really supposed to generate exactly the same results?

I don't believe the == operator applied to calculated floating-point
results is ever required to return 1. Nor does it have to be consistent
about it. An implementation like that probably won't sell well, but
ANSI C allows it.

int i;
double d1, d2;
/* ... put values in d1 and d2 ... */

for (i = 0; (d1 + d2) == (d2 + d1); i++)
    /* empty */ ;
printf("Iterations: %d\n", i);

There's nothing wrong with the code printing "Iterations: 13" here.

I guess so, because both platforms are supposed to be IEEE floating point
standard (754?) compliant.

Just because the hardware is IEEE floating point doesn't mean
the compiler has to keep the intermediate values in 80-bit long
double or has to lop off the extra precision consistently.

I have turned on the Visual C++ compiler
flag which should make the Windows build produce standard-compliant
code (the /Op flag). However, they still produce different results. I
suspect that this may be due to a commercial mathematical library that
we use which can't be compiled using the /Op option. If I recompiled
everything using the /Op option, the two should produce the same
results.

C offers no guarantee that an Intel platform will produce the
same results as an Intel platform with the same CPU serial number
and the same compiler serial number.
Am I right?

No.

Gordon L. Burditt
 
Lawrence Kirby

On Mon, 13 Dec 2004 22:35:42 -0500, Rich Townsend wrote:

....

Yes and no. As I understand it, the IEEE floating point standard places
reasonably tight constraints on how *atomic* floating-point operations
are undertaken. For instance, given the bit patterns making up two
floating point numbers a and b, the standard says how to find the bit
pattern making up their product a*b.

It also depends on the language. C, and presumably C++, allows
intermediate results to be held at higher precision than indicated by the
type. IEEE supports a number of precisions but doesn't mandate which
should be used; it just says what will happen given a particular
precision.

However, a typical computer program is not a single atomic operation; it
is a whole sequence of them. Take for example this assignment:

d = a + b + c

What order should the sum be undertaken in? Should it be

d = (a + b) + c,

or

d = a + (b + c)?


That depends on the language. In C (and again presumably C++) d = a + b + c
is equivalent to d = (a + b) + c. A C optimiser cannot rearrange this
unless it can prove the result is consistent with this for all possible
input values. That's rarely possible in floating point.

....
Hopefully, this should underline the idiocy in rubbishing one compiler
because it produces slightly different results to another.

Nevertheless, many if not most C compilers for x86 platforms violate the C
standard (not the IEEE standard) when it comes to floating point. C
requires that values held in objects and the results of casts be held at
the correct precision for the type. However, on x86 this requires the
value to be stored to and reloaded from memory/cache, which is horrendously
inefficient compared to keeping the value in a register. Compiler writers
often take the view that generating faster code that keeps values
represented at a higher precision is the lesser of evils. Not everybody
agrees in all circumstances. I will leave the readers of comp.lang.c++ and
comp.lang.fortran to comment on those languages.
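A minimal C sketch of the kind of surprise this can produce (whether it
actually fires depends on the compiler, optimisation level, and x87 vs.
SSE code generation, so treat it as an illustration rather than a
guaranteed demonstration):

#include <stdio.h>

int main(void)
{
    volatile double a = 1.0, b = 3.0;  /* volatile: force a runtime divide */
    double x = a / b;                  /* C requires x to be a 53-bit double */

    /* On x87 code that keeps the fresh quotient in an 80-bit register,
       the right-hand side can carry extra precision, so two textually
       identical expressions may compare unequal. */
    if (x != a / b)
        printf("differ\n");
    else
        printf("equal\n");
    return 0;
}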

Lawrence
 
Herman D. Knoble

First, we are assuming that the program in question is not displaying
or writing results past the precision of the corresponding variables.

How many "places" (mantissa bits) are you talking about here?
If the results differ by only the last mantissa bit or two I would not
worry about this. If results differ by more than the least significant
bit or two, there could be reason to suspect compiler options, like /OP
(or, for example, XLF under AIX the compiler option -qfloat=nomaf).

For the case where results differ by more than a mantissa bit or two,
it could also be possible that the code itself is not completely
numerically stable. One of the ways to determine whether a model is stable
is in fact to perturb the input (or platform) slightly and
compare results.

Skip


-|We have the same floating point intensive C++ program that runs on
-|Windows on Intel chips and on Sun Solaris on SPARC chips. The program
-|reads exactly the same input files on the two platforms. However,
-|they generate slightly different results for floating point numbers.
-|
-|Are they really supposed to generate exactly the same results? I
-|guess so, because both platforms are supposed to be IEEE floating point
-|standard (754?) compliant. I have turned on the Visual C++ compiler
-|flag which should make the Windows build produce standard-compliant
-|code (the /Op flag). However, they still produce different results. I
-|suspect that this may be due to a commercial mathematical library that
-|we use which can't be compiled using the /Op option. If I recompiled
-|everything using the /Op option, the two should produce the same
-|results.
-|
-|Am I right?
-|
-|Thanks a lot.


Herman D. (Skip) Knoble, Research Associate
(a computing professional for 38 years)
Email: SkipKnobleLESS at SPAMpsu dot edu
Web: http://www.personal.psu.edu/hdk
Penn State Information Technology Services
Academic Services and Emerging Technologies
Graduate Education and Research Services
Penn State University
214C Computer Building
University Park, PA 16802-21013
Phone:+1 814 865-0818 Fax:+1 814 863-7049
 
websnarf

Sun and Intel should produce identical results for
negation, +, -, *, /, and sqrt() operations, because this is required for
IEEE 754 compliance, and both implement double and float with the same
sized floating point values (older x86 compilers used to use 80 bits for
all FP temporaries, which could definitely be a source of differences, but
VC++ doesn't do this anymore). You should also expect identical output for
things like fprem(), ceil(), floor(), modf(), and so on. All of these
operations have well known and finite ways of being calculated to the
best and closest possible result, which is what IEEE 754 specifies.
Results can be different if you put the processors into different
rounding modes -- I thought that the ANSI C specification was supposed
to specify a consistent rounding mode, so there shouldn't be any
differences because of that, but I could be wrong.

The problem is everything else. sin(), cos(), log(), exp(), atanh(),
etc. ... these guys are not even consistent between Athlons and
Pentiums. The reason is that there is no known finite algorithm for
computing these guys to the correctly rounded result. There
are plenty of ways to compute them within an accuracy of "1 ulp"
(i.e., off by at most one unit in the last significant bit).
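To check whether two platforms really differ by only a bit or two, it
helps to measure the gap in ulps. A minimal C sketch, assuming finite
doubles of the same sign (ulp_diff is a hypothetical helper; nextafter
merely manufactures a value exactly one ulp away, standing in for another
platform's library result):

#include <math.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Distance in units in the last place between two finite doubles of the
   same sign: their IEEE 754 bit patterns, read as integers, are ordered
   the same way as the values themselves. */
static int64_t ulp_diff(double a, double b)
{
    int64_t ia, ib;
    memcpy(&ia, &a, sizeof ia);
    memcpy(&ib, &b, sizeof ib);
    return ia > ib ? ia - ib : ib - ia;
}

int main(void)
{
    double r1 = sin(1.0);            /* this platform's library result */
    double r2 = nextafter(r1, 1.0);  /* pretend another platform is 1 ulp off */

    printf("difference: %lld ulps\n", (long long)ulp_diff(r1, r2));
    return 0;
}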

But I am not the foremost expert on this stuff (and it looks like the
other posters here know even less than me). This question is more
appropriate for a group like comp.arch.arithmetic, where there are
plenty of posters who are expert on this.
 
Gerry Thomas

JS said:
We have the same floating point intensive C++ program that runs on
Windows on Intel chips and on Sun Solaris on SPARC chips. The program
reads exactly the same input files on the two platforms. However,
they generate slightly different results for floating point numbers.

Are they really supposed to generate exactly the same results?

For fp intensive programs, whatever the language, no.
I guess so, because both platforms are supposed to be IEEE floating point
standard (754?) compliant. I have turned on the Visual C++ compiler
flag which should make the Windows build produce standard-compliant
code (the /Op flag). However, they still produce different results. I
suspect that this may be due to a commercial mathematical library that
we use which can't be compiled using the /Op option. If I recompiled
everything using the /Op option, the two should produce the same
results.

Am I right?

In all probability, no, :-(.

For an update to the over-hackneyed Goldberg variations, see Parker,
Pierce, and Eggert, "Monte Carlo Arithmetic: exploiting randomness in
floating-point arithmetic," Comp. Sci. & Eng., pp. 58-68, July 2000.

> Thanks a lot.

--
You're Welcome,
Gerry T.
______
"Competent engineers rightly distrust all numerical computations and seek
corroboration from alternative numerical methods, from scale models, from
prototypes, from experience... ." -- William V. Kahan.
 
JS

First, we are assuming that the program in question is not displaying
or writing results past the precision of the corresponding variables.

How many "places" (mantissa bits) are you talking about here?

All the calculations are done in double. The output result has
precision (in the format of 999999999.666666) that is well within the
precision limit of double. So this is not due to incorrect precision
used for displaying or outputting the result.
If the results differ by only the last mantissa bit or two I would not
worry about this. If results differ by more than the least significant
bit or two, there could be reason to suspect compiler options, like /OP
(or, for example, XLF under AIX the compiler option -qfloat=nomaf).

Some posters don't seem to know about the /Op option. From the Microsoft doc:

"By default, the compiler uses the coprocessor’s 80-bit registers to
hold the intermediate results of floating-point calculations. This
increases program speed and decreases program size. However, because
the calculation involves floating-point data types that are
represented in memory by less than 80 bits, carrying the extra bits of
precision (80 bits minus the number of bits in a smaller
floating-point type) through a lengthy calculation can produce
inconsistent results.

With /Op, the compiler loads data from memory prior to each
floating-point operation and, if assignment occurs, writes the results
back to memory upon completion. Loading the data prior to each
operation guarantees that the data does not retain any significance
greater than the capacity of its type."

Thus the difference I got doesn't seem to come from the 80-bit vs. 64-bit
issue mentioned by some posters.
 
Gerry Thomas

[Stuff from the ever-helpful Skip Knoble elided without prejudice]
Some posters don't seem to know about the /Op option.

Not me, and a bunch besides!

FYI, most clf's posters are rabidly anti-Microsoft and couldn't give a toss
what it has to say about anything, least of all in respect of its own
products, :).
From the Microsoft doc: (sic DOC!)
"By default, the compiler uses the coprocessor's 80-bit registers to
hold the intermediate results of floating-point calculations. This
increases program speed and decreases program size. However, because
the calculation involves floating-point data types that are
represented in memory by less than 80 bits, carrying the extra bits of
precision (80 bits minus the number of bits in a smaller
floating-point type) through a lengthy calculation can produce
inconsistent results.

With /Op, the compiler loads data from memory prior to each
floating-point operation and, if assignment occurs, writes the results
back to memory upon completion. Loading the data prior to each
operation guarantees that the data does not retain any significance
greater than the capacity of its type."

IIRC (and I'm sure I do) this is a verbatim quote from Microsoft's Fortran
PowerStation documentation (sic DOC!). This product commands the least of
respect in these environs. (Word to the wise, look into VC++ ~2 for what
you seek!)
Thus the difference I got doesn't seem to come from the 80-bit vs 64
bit issue mentioned by some posters.

How come? I've no idea why you're beginning to remind me of Ozymandias, but
I'm wondering if it's that Alexander the Wimp movie I endured on the
weekend.

--
You're Welcome,
Gerry T.
______
"Things are not what they seem; or, to be more accurate, they are not only
what they seem, but very much else besides." -- Aldous Huxley.
 
Richard Maine

I don't believe the == operator applied to calculated floating-point
results is ever required to return 1. Nor does it have to be consistent
about it. An implementation like that probably won't sell well, but
ANSI C allows it.

I'd argue that the same thing applies to Fortran (ok, except that
I'd talk about == returning .true. instead of 1). There are probably
people who would disagree with me, but that would be my interpretation
of the standard.

The comment about such a compiler not selling well would also
apply. I'd likely not get too much argument on that part. :)
 
James Giles

Richard said:
I'd argue that the same thing applies to Fortran (ok, except that
I'd talk about == returning .true. instead of 1). There are probably
people who would disagree with me, but that would be my interpretation
of the standard.

I'd argue that any language that claims IEEE compliance must
get a true result under some predictable circumstances. Not as
many such circumstances as originally intended by the IEEE
committee! But, *some* such circumstances exist. For example,
I think that 0.1 == 0.1 must be true, since the IEEE standard
places minimum requirements on the precision of decimal
to binary conversion. Of course, whether the value of a literal
constitutes a "calculated" result is maybe a matter of opinion.

--
J. Giles

"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
 
Gordon Burditt

I'd argue that any language that claims IEEE compliance must
get a true result under some predictable circumstances. Not as
many such circumstances as originally intended by the IEEE
committee! But, *some* such circumstances exist. For example,

For example, d == d where d is not a NaN.
I think that 0.1 == 0.1 must be true, since the IEEE standard
places minimum requirements on the precision of decimal
to binary conversion.

But does it place MAXIMUM requirements? 0.1 is an infinitely repeating
fraction in binary. Any extra precision on one side but not the other
is likely to cause a mismatch.
Of course, whether the value of a literal
constitutes a "calculated" result is maybe a matter of opinion.

Ok, I didn't define "calculated result", but the intent was something
like "the result of a floating-point binary operator such as +, -, *, /".
Although multiplication by 0.0 should come out exact.

0.1 == 0.1 has a much better chance than, say, (0.1 + 0.0) == (0.1 + 0.0) .

Gordon L. Burditt
 
James Giles

Gordon Burditt wrote:
....
But does it place MAXIMUM requirements? 0.1 is an infinitely repeating
fraction in binary. Any extra precision on one side but not the other
is likely to cause a mismatch.

The IEEE standard requires that, for numbers in this range, the
result must be correctly rounded to the target precision. Since they're
both the same type and KIND (hence, the same target precision),
and since there is but one value that's closest to one tenth that's
representable as an IEEE single precision number, they must
give the same result.
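A small C illustration of the same point (C rather than Fortran, purely
for convenience): both literals denote the single double closest to one
tenth, so the comparison holds on any implementation that rounds decimal
conversions correctly, even though the stored value is not exactly 0.1:

#include <stdio.h>

int main(void)
{
    /* Both occurrences of the literal convert to the one double value
       nearest to 1/10. */
    printf("0.1 == 0.1 -> %d\n", 0.1 == 0.1);

    /* 17 significant digits are enough to show that the stored value
       is slightly more than one tenth. */
    printf("0.1 is stored as %.17g\n", 0.1);
    return 0;
}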

--
J. Giles

"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
 
Gerry Thomas

[...]
The IEEE standard requires that, for numbers in this range, the
result must be correctly rounded to the target precision. Since they're
both the same type and KIND (hence, the same target precision),
and since there is but one value that's closest to one tenth that's
representable as an IEEE single precision number, they must
give the same result.

How profound!

--
You're Welcome,
Gerry T.
______
"Facts are meaningless. You could use facts to prove anything that's even
remotely true." -- Homer Simpson.
 
Christian Bau

JS said:
All the calculations are done in double. The output result has
precision (in the format of 999999999.666666) that is well within the
precision limit of double. So this is not due to incorrect precision
used for displaying or outputting the result.


Some posters don't seem to know about /Op option. From Microsoft doc:

"By default, the compiler uses the coprocessor¹s 80-bit registers to
hold the intermediate results of floating-point calculations. This
increases program speed and decreases program size. However, because
the calculation involves floating-point data types that are
represented in memory by less than 80 bits, carrying the extra bits of
precision (80 bits minus the number of bits in a smaller
floating-point type) through a lengthy calculation can produce
inconsistent results.

With /Op, the compiler loads data from memory prior to each
floating-point operation and, if assignment occurs, writes the results
back to memory upon completion. Loading the data prior to each
operation guarantees that the data does not retain any significance
greater than the capacity of its type."

That isn't enough. First, the x86 FPU must be switched to 64-bit mode,
otherwise you will occasionally get different results due to double
rounding (about a 1 in 2000 chance for a single random operation). The
load/store will make sure that overflows and underflows are reproduced:
let x = 1e300; then (x * x) / x should be Inf / x = Inf, but without storing
and reloading the result of x * x, x * x = 1e600 and (x * x) / x = 1e300. You
still run into trouble because underflow in a multiplication or
division can give different results due to double rounding. You can
always switch to Java + strict FP mode to get reproducible results.
(Which will be damned slow, but it will get everything right; that is why
non-strict FP mode had to be introduced: it is faster but can give
different results.)
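A C sketch of the overflow example (the volatile store forces the product
out to a 64-bit double, so this prints inf; x87 code that keeps x * x in
an 80-bit register, where 1e600 is representable, can print 1e300
instead):

#include <stdio.h>

int main(void)
{
    volatile double x = 1e300;
    volatile double xx = x * x;  /* overflows double: stored value is +Inf */

    printf("%g\n", xx / x);      /* Inf / 1e300 -> inf */
    return 0;
}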
 
Tim Prince

Christian Bau said:
That isn't enough. First, the x86 FPU must be switched to 64-bit mode,
otherwise you will occasionally get different results due to double
rounding (about a 1 in 2000 chance for a single random operation). The
load/store will make sure that overflows and underflows are reproduced:
let x = 1e300; then (x * x) / x should be Inf / x = Inf, but without storing
and reloading the result of x * x, x * x = 1e600 and (x * x) / x = 1e300. You
still run into trouble because underflow in a multiplication or
division can give different results due to double rounding.
That Microsoft doc is misleading, given that they always (at least since
VS6) set the FPU to 53-bit precision, which I assume is what you mean by
"64-bit mode." I couldn't find any reference in the thread as to whether
SSE code generation was invoked (no such option in VS6, but presumably
preferred in later versions).
 
Lawrence Kirby

All the calculations are done in double. The output result has
precision (in the format of 999999999.666666) that is well within the
precision limit of double. So this is not due to incorrect precision
used for displaying or outputting the result.

Can you create a small program that demonstrates the problem? Perhaps a
little summation loop or something. With a particular example it would be
possible to pin down the source of any inconsistency.
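For instance, a sketch along these lines (the harmonic-series loop is
just a hypothetical workload; %.17g prints enough digits for a double to
round-trip, so differences in the last bit show up in the output):

#include <stdio.h>

int main(void)
{
    double sum = 0.0;
    int i;

    /* Run the identical source on both platforms and diff the output. */
    for (i = 1; i <= 1000000; i++)
        sum += 1.0 / i;

    printf("sum = %.17g\n", sum);
    return 0;
}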

....
Some posters don't seem to know about the /Op option. From the Microsoft doc:

"By default, the compiler uses the coprocessor's 80-bit registers to
hold the intermediate results of floating-point calculations. This
increases program speed and decreases program size. However, because
the calculation involves floating-point data types that are
represented in memory by less than 80 bits, carrying the extra bits of
precision (80 bits minus the number of bits in a smaller
floating-point type) through a lengthy calculation can produce
inconsistent results.

I think somebody else pointed out that recent Microsoft compilers set the
FPU to 53 bits of precision anyway. Another thing to consider is rounding
mode. The modes may well be the same, but this needs to be checked.

Also, I have an inherent mistrust of options like /Op. gcc certainly
doesn't get everything right with its version of that option; maybe other
compilers do, but it can be hard on x86 especially, and I wouldn't trust
it without some verification.

Lawrence
 
