representational error in C

newgoat · Mar 11, 2006

Hello,

I wrote two program to test the inherent problem binary system has
with representing floating points values. Adding 0.1 to itself 1000
times results in a number slightly larger than 100. But to my surprise,
adding it to itself 100000 times results in a number smaller than
10000.
Can anybody tell me why? Thanks!

#include <stdio.h>

float add_dimes_fversion(int);
double add_dimes_dversion(int);

int main(void){
int add_times1 = 1000;
int add_times2 = 100000;

/* Add 0.1 to itself for 1000 times using float for all data would
result in 100.099045. */
printf("\nAdding 0.1 for 1000 times using float data results in
%.10f\n", add_dimes_fversion(add_times1));

/* Add 0.1 to itself for 100000 times using float for all data would
result in 9998.6562500000. */
printf("\nAdding 0.1 for 100000 times using float data results in
%.10f\n\n", add_dimes_fversion(add_times2));

/* Add 0.1 to itself for 1000 times using float for all data would
result in 100.09999999999858744104. */
printf("\nAdding 0.1 for 1000 times using double data results in
%.20lf\n",add_dimes_dversion(add_times1));

/* Add 0.1 to itself for 100000 times using float for all data would
result in 9998.65625000000000000000. */
printf("\nAdding 0.1 for 100000 times using double data results in
%.20lf\n\n", add_dimes_fversion(add_times2));

printf("0.1 * 1000 = %lf, 0.1 * 100000 = %lf\n\n", 0.1 * 1000.0, 0.1
* 100000);
return 0;
}

/* Test with float data type */
float add_dimes_fversion(int n){
float sum = 0.0;
int counter = 0;

while(counter <= n){
sum = sum + 0.1;
counter++;
}

return sum;
}

/* Test with double data type */
double add_dimes_dversion(int n){
double sum = 0.0;
int counter = 0;

while(counter <= n){
sum = sum + 0.1;
counter++;
}

return sum;
}

Walter Roberson · Mar 11, 2006

I wrote two program to test the inherent problem binary system has
with representing floating points values. Adding 0.1 to itself 1000
times results in a number slightly larger than 100. But to my surprise,
adding it to itself 100000 times results in a number smaller than
10000.
Can anybody tell me why? Thanks!

The code you included does not appear to contain anything
specific to C99. If you are using C89 (likely because true C99
is still fairly uncommon), then some of the details of floating
point implementation are not nailed down, and are thus implementation
dependant.

The exact answers produced by the code you provided would
depend upon the rounding behaviour in effect. There are
a few common -different- rounding behaviours (and probably some
obscure ones along the way). C89 does not provide any mechanism
to query or control rounding behaviours (but your OS might.)

One of the recommended rounding behaviours alternates
between rounding up and rounding down, in a deliberate attempt
to reduce accumulation of round-off errors.

My guess, though, is that your system is probably not using
decimal arithmetic, and is instead using a binary floating
point representation. The representation of 100000 probably
does not start with the same binary expansion as 100 does.
This could result in the least-significant bit being
"crowded off the end" as you accumulated values: if that
least significant bit was the difference between underflow of
100000 and overflow of that value, then Yes, the effect
you saw could definitely happen.

Me · Mar 11, 2006

newgoat said:
Hello,

I wrote two program to test the inherent problem binary system has
with representing floating points values. Adding 0.1 to itself 1000
times results in a number slightly larger than 100. But to my surprise,
adding it to itself 100000 times results in a number smaller than
10000.
Can anybody tell me why? Thanks!

Floats work like this (using whole numbers for niceness):

(use a fixed width font for this)
***** * * * * * * * * * ... *
01234 6 8 10121416 20 24 28 64

What I tried to represent here is that there are "goal posts" at
2**(2n) [note: this is not what C does]

0 == 0
4 == 2**2
16 = 2**4
64 = 2**6
256 = 2**8
....

where the distance between 0-4 is 1 unit, 4-16 is 2 units, 16-64 is 4
units, 64-256 is 8 units, etc. With the C standard, these "goal posts"
are actually powers of FLT_RADIX.

(I'm going to assume your computer uses IEEE-754 floats, so FLT_RADIX
== 2) The problem with your example is that 0.1 isn't a power of
FLT_RADIX (i.e. it's not one of the goal posts, it's a number between
the two goal posts).

There are three issues here:

a. 0.1 isn't exactly representable. The standard doesn't say what to do
in this case but odds are, your compiler will round it to the nearest
number. In our whole number example, it would be like what do to when
given number 18.

b. 0.1 isn't a goal post. Goal posts are good. If we start from 0, we
can keep adding by goal posts up to some number *exactly*. Our whole
number example: 0 to 16 by adding 2, 0 to 64 by adding 4.

c. If a number isn't exactly representable, the C standard doesn't say
what to do. Most machines will round to nearest. For our whole number
machine, lets use round to rand to approximate round to nearest because
our whole number machine has a very small range compared to real floats
to save typing.

So with all this in mind lets replace 0.1 with 3.5 and 1000 with 4

So 3.5 can either be 3 or 4. Lets pick 3.

0+3 = 7, lets round to 6
6+3 = 9, lets round to 10
10+3 = 13, lets round to 14
14+3 = 17, lets round to 16

3.5 * 4 == 14 but because of all the rounding that took place, we got
16.

If your computer was a (non-existant?) one that has FLT_RADIX == 10,
this would have worked. Since IEEE-754/FLT_RADIX==2 floats are pretty
much a defacto standard for C (and have desirable numerical
properties), there is a proposal to add a separate decimal float type
to C which would have ran your example with no surprises.

#include <stdio.h>

float add_dimes_fversion(int);
double add_dimes_dversion(int);

int main(void){
int add_times1 = 1000;
int add_times2 = 100000;

/* Add 0.1 to itself for 1000 times using float for all data would
result in 100.099045. */
printf("\nAdding 0.1 for 1000 times using float data results in
%.10f\n", add_dimes_fversion(add_times1));

p.s. .9 is enough for IEEE float (see
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1822.pdf for
explanation)

/* Add 0.1 to itself for 100000 times using float for all data would
result in 9998.6562500000. */
printf("\nAdding 0.1 for 100000 times using float data results in
%.10f\n\n", add_dimes_fversion(add_times2));

/* Add 0.1 to itself for 1000 times using float for all data would
result in 100.09999999999858744104. */
printf("\nAdding 0.1 for 1000 times using double data results in
%.20lf\n",add_dimes_dversion(add_times1));

p.s. .17 is enough for IEEE double

/* Add 0.1 to itself for 100000 times using float for all data would
result in 9998.65625000000000000000. */
printf("\nAdding 0.1 for 100000 times using double data results in
%.20lf\n\n", add_dimes_fversion(add_times2));

printf("0.1 * 1000 = %lf, 0.1 * 100000 = %lf\n\n", 0.1 * 1000.0, 0.1
* 100000);
return 0;
}

/* Test with float data type */
float add_dimes_fversion(int n){
float sum = 0.0;
int counter = 0;

while(counter <= n){
sum = sum + 0.1;

Note: 0.1 is a double literal. To get a float literal use 0.1f. What
you're code does here is:
sum = (double)sum + 0.1;
The standard allows a compiler to do this with float literals but using
a double literal forces it to happen when it might not happen
otherwise.

Michael Mair · Mar 11, 2006

newgoat said:
Hello,

I wrote two program to test the inherent problem binary system has
with representing floating points values. Adding 0.1 to itself 1000
times results in a number slightly larger than 100. But to my surprise,
adding it to itself 100000 times results in a number smaller than
10000.
Can anybody tell me why? Thanks!

> /* Test with float data type */
> float add_dimes_fversion(int n){
> float sum = 0.0;
> int counter = 0;
>
> while(counter <= n){
> sum = sum + 0.1;
> counter++;
> }
>
> return sum;
> }

For one thing, you are working with 1001 and 100001 summands,
respectively; assuming 0.1F*10 < 1, this is not as surprising
as it seems: Let us, for a moment, assume that 0.1F was equal
to 0.09991 in the real world; even if we have no rounding errors,
0.09991*1001 = 99.91+0.0991 = 100.0091 and
0.09991*1000001 = 99910+0.0991 = 99910.0991

Apart from that, the other replies explain most.
Have a look at
http://docs.sun.com/source/806-3568/ncg_goldberg.html
if you are interested in more information.

Cheers
Michael

Walter Roberson · Mar 11, 2006

(I'm going to assume your computer uses IEEE-754 floats, so FLT_RADIX
== 2)

Since IEEE-754/FLT_RADIX==2 floats are pretty
much a defacto standard for C

There are quite a few widespread IEEE-754 systems that use FLT_RADIX=16
or FLT_RADIX=4, so I would say that FLT_RADIX=2 is *not* a defacto
standard for C.

Chris Torek · Mar 11, 2006

There are quite a few widespread IEEE-754 systems that use FLT_RADIX=16
or FLT_RADIX=4, so I would say that FLT_RADIX=2 is *not* a defacto
standard for C.

Just out of curiosity: I know IBM S/370 (and S/360) architecture
uses 16 ("hex float"); but who uses 4? (And: Why? 4 would lose
the implied-1 advantage of base 2, and add the jitter seen in hex
float, yet not increase the exponent range significantly as 16
does.)

Joe Wright · Mar 11, 2006

Walter said:
There are quite a few widespread IEEE-754 systems that use FLT_RADIX=16
or FLT_RADIX=4, so I would say that FLT_RADIX=2 is *not* a defacto
standard for C.

By 'quite a few' you mean presumably more than two. Can you name them
for us please? Thanks.

Ben Pfaff · Mar 11, 2006

There are quite a few widespread IEEE-754 systems that use FLT_RADIX=16
or FLT_RADIX=4, so I would say that FLT_RADIX=2 is *not* a defacto
standard for C.

I was under the impression that IEEE 754 was a binary floating
point standard, i.e. FLT_RADIX=2. Why would an IEEE 754 system
claim FLT_RADIX of 4 or 16?

Jordan Abel · Mar 11, 2006

There are quite a few widespread IEEE-754 systems that use FLT_RADIX=16
or FLT_RADIX=4, so I would say that FLT_RADIX=2 is *not* a defacto
standard for C.

Just because some systems deviate from it doesn't mean it's not a de
facto standard. [Is this really IEEE 754? I thought "IEEE 754" meant a
system matching the 'ideal' of the Intel 387, which, i've read, is what
the IEEE standard is based on.]

jacob navia · Mar 11, 2006

Michael Mair a écrit :

For one thing, you are working with 1001 and 100001 summands,
respectively; assuming 0.1F*10 < 1, this is not as surprising
as it seems: Let us, for a moment, assume that 0.1F was equal
to 0.09991 in the real world; even if we have no rounding errors,
0.09991*1001 = 99.91+0.0991 = 100.0091 and
0.09991*1000001 = 99910+0.0991 = 99910.0991

Apart from that, the other replies explain most.
Have a look at
http://docs.sun.com/source/806-3568/ncg_goldberg.html
if you are interested in more information.

Cheers
Michael

Michael, you have a GREAT EYE!!!!!!!!!!

How EASY is to oversee the dammed <= in that program. I wondered
and wondered... WHAT THE HELL IS GOING ON... ?????

jacob

Jack Klein · Mar 12, 2006

There are quite a few widespread IEEE-754 systems that use FLT_RADIX=16
or FLT_RADIX=4, so I would say that FLT_RADIX=2 is *not* a defacto
standard for C.

Click to expand...

Just because some systems deviate from it doesn't mean it's not a de
facto standard. [Is this really IEEE 754? I thought "IEEE 754" meant a
system matching the 'ideal' of the Intel 387, which, i've read, is what
the IEEE standard is based on.]

Sigh.

The original IEEE 754 and 854 formats were based in large part on the
Intel 8087, which existed half a decade or so before the 387 did.

But since Intel released the part before standardization was complete,
the 8087 does not completely conform to the final standard in a few
areas.

Wants help in c program made with arrays and function	0	Aug 3, 2022
C program: memory leak/ segmentation fault/ memory limit exceeded	0	Nov 12, 2022
[C language] Issue in the Lotka-Volterra model.	0	Jun 28, 2023
Function is not worked in C	2	Jun 27, 2023
SENTINEL CONTROL LOOP WHEN DEALING WITH TWO ARRAYS	1	Oct 26, 2023
Filter sober in c++ don't pass test	0	Dec 2, 2023
Drawing missing in bitmap in a pure C win32 program	4	Jun 3, 2023
How to fix this code?	1	Sep 22, 2023

representational error in C

newgoat

Walter Roberson

Me

Michael Mair

Walter Roberson

Chris Torek

Joe Wright

Ben Pfaff

Jordan Abel

jacob navia

Jack Klein

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads