Bug in floating-point addition: is anyone else seeing this?


Carl Banks

 Are you running your simulations on a system that does or does not
 support the "useless bell and whistle" of correct rounding? If not,
 how do you prevent regression towards 0?

The "useless bell and whistle" is switching to multiprecision.

I'm not sure whether our hardware has a rounding bias or not, but I
doubt it would matter if it did.

 For example, one of the things that caused the PS3 to be in 3rd place
 behind the Wii and XBox 360 is that to save a cycle or two, the PS3
 cell core does not support rounding of single precision results -- it
 truncates them towards 0. That led to horrible single-pixel errors in
 the early demos I saw, which in turn helped contribute to game release
 delays, which has turned into a major disappointment for Sony.

And you believe that automatically detecting rounding errors and
switching to multi-precision in software would have saved Sony all
this?


Carl Banks
 

Henrique Dante de Almeida

 10000000000000000.0

Notice that 1e16-1 doesn't exist in IEEE double precision:
1e16-2 == 0x1.1c37937e07fffp+53
1e16 == 0x1.1c37937e08p+53

(that is, the hex representation ends with "7fff", then goes to
"8000").

So, it's just rounding. It could go up, to 1e16, or down, to 1e16-2.
This is not a bug, it's a feature.
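
To see this concretely, here is a minimal C sketch (my illustration, not
from the thread) that prints the two neighboring doubles; link with -lm
for nextafter:

#include <stdio.h>
#include <math.h>

int main(void)
{
        /* Near 1e16 the spacing between consecutive doubles is 2.0,
           so 1e16-1 has no representation of its own. */
        printf("%a\n", 1e16 - 2);             /* 0x1.1c37937e07fffp+53 */
        printf("%a\n", nextafter(1e16, 0.0)); /* the same value        */
        printf("%a\n", 1e16);                 /* 0x1.1c37937e08p+53    */
        printf("%.1f\n", 1e16 - 1);           /* the tie rounds to 1e16 */
        return 0;
}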
 

Henrique Dante de Almeida

 Notice that 1e16-1 doesn't exist in IEEE double precision:
 1e16-2 == 0x1.1c37937e07fffp+53
 1e16 == 0x1.1c37937e08p+53

 (that is, the hex representation ends with "7fff", then goes to
"8000").

 So, it's just rounding. It could go up, to 1e16, or down, to 1e16-2.
This is not a bug, it's a feature.

I didn't answer your question. :-/

Adding a small number to 1e16-2 should round to the nearest double
(1e16-2) under the default rounding mode. So that's strange.

The following code compiled with gcc 4.2 (without optimization) gives
the same result:

#include <stdio.h>

int main (void)
{
        double a;

        while(1) {
                scanf("%lg", &a);
                printf("%a\n", a);
                printf("%a\n", a + 0.999);
                printf("%a\n", a + 0.9999);
        }
}
 

Henrique Dante de Almeida

 I didn't answer your question. :-/

 Adding a small number to 1e16-2 should be rounded to nearest (1e16-2)
by default. So that's strange.

 The following code compiled with gcc 4.2 (without optimization) gives
the same result:

#include <stdio.h>

int main (void)
{
        double a;

        while(1) {
                scanf("%lg", &a);
                printf("%a\n", a);
                printf("%a\n", a + 0.999);
                printf("%a\n", a + 0.9999);
        }

}

However, compiling it with "-mfpmath=sse -msse2" makes it work (it
doesn't work with -msse alone).
 

Henrique Dante de Almeida

 However, compiling it with "-mfpmath=sse -msse2" makes it work (it
doesn't work with -msse alone).

Finally (and the answer is obvious): the 387 breaks the standard and
doesn't use IEEE double precision when requested to do so.

It reads the 64-bit doubles and converts them to 80-bit long doubles.
In extended precision, 1e16-2 + 0.9999 == 1e16-1. When requested by the
printf call, this 80-bit number (1e16-1) is converted to a double,
which rounds to 1e16.
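
To make that double rounding visible, here is a small C sketch (my
illustration, assuming long double is the x87 80-bit type) that performs
the two roundings explicitly:

#include <stdio.h>

int main(void)
{
        /* First rounding: in 80-bit extended precision the exact sum
           9999999999999998.9999 rounds to exactly 1e16 - 1, which is
           representable with a 64-bit significand. */
        long double extended = (long double)(1e16 - 2) + 0.9999L;
        printf("%La\n", extended);

        /* Second rounding: 1e16 - 1 is exactly halfway between the
           doubles 1e16 - 2 and 1e16, and ties-to-even picks 1e16.  A
           single correctly rounded double addition gives 1e16 - 2. */
        printf("%a\n", (double)extended);
        return 0;
}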
 

Ross Ridge

Henrique Dante de Almeida said:
Finally (and the answer is obvious). 387 breaks the standards and
doesn't use IEEE double precision when requested to do so.

Actually, the 80387 and the '87 FPU in all other IA-32 processors
do use IEEE 754 double-precision arithmetic when requested to do so.
The problem is that GCC doesn't request that it do so. It's a
long-standing problem with GCC that will probably never be fixed. You can
work around this problem the way the Microsoft C/C++ compiler does
by requesting that the FPU always use double-precision arithmetic.
That way your answers are only wrong when you use long double or float.
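
A minimal sketch of that workaround, assuming glibc on x86 (the
fpu_control.h macros below are glibc-specific, and this is an
illustration rather than what the Microsoft compiler actually emits):

#include <stdio.h>
#include <fpu_control.h>        /* glibc, x86 only */

int main(void)
{
        fpu_control_t cw;

        /* Switch the x87 precision-control field from extended (the
           Linux default) to double, as the Microsoft runtime does. */
        _FPU_GETCW(cw);
        cw = (fpu_control_t)((cw & ~_FPU_EXTENDED) | _FPU_DOUBLE);
        _FPU_SETCW(cw);

        volatile double a = 1e16 - 2;   /* volatile defeats constant folding */
        printf("%a\n", a + 0.9999);     /* now 0x1.1c37937e07fffp+53 == 1e16-2 */
        return 0;
}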

Ross Ridge
 

Diez B. Roggisch

Dave said:
Are you running your simulations on a system that does or does not
support the "useless bell and whistle" of correct rounding? If not,
how do you prevent regression towards 0?

For example, one of the things that caused the PS3 to be in 3rd place
behind the Wii and XBox 360 is that to save a cycle or two, the PS3
cell core does not support rounding of single precision results -- it
truncates them towards 0. That led to horrible single-pixel errors in
the early demos I saw, which in turn helped contribute to game release
delays, which has turned into a major disappointment for Sony.

First of all, calling the PS3 technologically behind the Wii (which is
on par with the PS2 with respect to its computational power) is
preposterous.

And that put aside, I don't get what a discussion about single- and
double-precision floats that SHARE THE SAME ROUNDING BEHAVIOR (just at
different scales) has to do with automatically adapting calculations to
higher-precision number formats such as decimals or any other
arbitrary-precision format.

Diez
 

Diez B. Roggisch

 This person who started this thread posted the calculations showing
 that Python was doing the wrong thing, and filed a bug report on it.

 If someone pointed out a similar problem in Flaming Thunder, I would
 agree that Flaming Thunder was doing the wrong thing.

 I would fix the problem a lot faster, though, within hours if
 possible. Apparently this particular bug has been lurking on Bugzilla
 since 2003: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323

I wonder how you would accomplish that, given that there is no fix.

http://hal.archives-ouvertes.fr/hal-00128124

Diez
 

Mark Dickinson

 Actually, the 80387 and the '87 FPU in all other IA-32 processors
 do use IEEE 754 double-precision arithmetic when requested to do so.
 The problem is that GCC doesn't request that it do so.  It's a
 long-standing problem with GCC that will probably never be fixed.  You can
 work around this problem the way the Microsoft C/C++ compiler does
 by requesting that the FPU always use double-precision arithmetic.

Even this isn't a perfect solution, though: for one thing, you can only
change the precision used for rounding, not the exponent range, which
remains the same as for extended precision. That means you still don't
get strict IEEE 754 compliance when working with very large or very
small numbers. In practice, I guess it's fairly easy to avoid the
extremes of the exponent range, so this seems like a workable fix.
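
A sketch of that exponent-range caveat, again assuming glibc on x86;
whether the intermediate survives depends on the compiler keeping it on
the FPU stack (compile with -mfpmath=387 -O0):

#include <stdio.h>
#include <float.h>
#include <fpu_control.h>        /* glibc, x86 only */

int main(void)
{
        /* Set the precision-control field to double... */
        fpu_control_t cw;
        _FPU_GETCW(cw);
        _FPU_SETCW((fpu_control_t)((cw & ~_FPU_EXTENDED) | _FPU_DOUBLE));

        /* ...but the exponent range is still that of extended
           precision, so an intermediate that should overflow to +inf
           in strict IEEE double can survive on the FPU stack. */
        volatile double m = DBL_MAX;
        printf("%g\n", (m + m) / 2.0);  /* x87: 1.79769e+308; IEEE double: inf */
        return 0;
}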

More seriously, it looks as though libm (and hence the Python
math module) might need the extended precision: on my machine
there's a line in /usr/include/fpu_control.h that says

#define _FPU_EXTENDED 0x300 /* libm requires double extended precision. */

Mark
 

Henrique Dante de Almeida

 Actually, the 80387 and the '87 FPU in all other IA-32 processors
 do use IEEE 754 double-precision arithmetic when requested to do so.

True. :-/

It seems that the FPU uses a flag to control the precision. So, a
conformant implementation would require saving/restoring the flag
between calls. No wonder gcc doesn't try to do this.

There are two possible options for Python, in that case:

- Leave it as it is. The Python language states that floating point
operations are based on the underlying C implementation. Also, the
relative error in this case is around 1e-16, which is smaller than the
expected error for IEEE doubles (~2e-16), so the result is
non-standard, but acceptable (in the general case, I believe the
rounding error could be marginally bigger than the expected error in
extreme cases, though).

- Use long doubles for architectures that don't support SSE2 and use
SSE2 IEEE doubles for architectures that do.

A third option would be for Python to set the x87 precision to double
and switch it back to extended precision when calling C code (but that
would be too much work for nothing).
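
For illustration only, a sketch of that third option on glibc/x86;
set_x87_precision is a hypothetical helper, not anything in CPython:

#include <stdio.h>
#include <math.h>
#include <fpu_control.h>        /* glibc, x86 only */

/* Hypothetical helper: set the x87 precision-control bits and return
   the previous control word so the caller can restore it. */
static fpu_control_t set_x87_precision(fpu_control_t pc)
{
        fpu_control_t old;
        _FPU_GETCW(old);
        _FPU_SETCW((fpu_control_t)((old & ~_FPU_EXTENDED) | pc));
        return old;
}

int main(void)
{
        /* Interpreter-level arithmetic runs in double precision... */
        fpu_control_t saved = set_x87_precision(_FPU_DOUBLE);
        volatile double a = 1e16 - 2;
        printf("%a\n", a + 0.9999);     /* correctly rounded: 1e16 - 2 */

        /* ...but extended precision is restored around libm calls,
           which fpu_control.h says require it. */
        set_x87_precision(_FPU_EXTENDED);
        printf("%.17g\n", sin(0.5));

        _FPU_SETCW(saved);
        return 0;
}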
 
