Floating point rounding error

Mukesh_Singh_Nick · Jun 16, 2007

Why does floating point have a rounding error? How to work around it?

For example, the following:

float f = 1234.12345678F;

printf("%2f\n", f); //prints 1234.123413

and

printf("%8.9f\n", f); //prints 1234.123413086

Flash Gordon · Jun 16, 2007

> Why does floating point have a rounding error?

Calculate 2/3 as a decimal number. Come back when you have understood
the answer to your question or when you have finished writing it without
any rounding or truncation. I'll even get you started 0.6667 (I've
rounded it here).

> How to work around it?

Use sufficient care and analysis for the problem at hand. This is a
general problem, not a C specific one, so comp.programming, and there is
no one correct solution for all situations.

Richard Heathfield · Jun 16, 2007

(e-mail address removed) said:

> Why does floating point have a rounding error?

Consider 1234.12345678

It's easy enough to deal with 1234. Here are the bits: 10011010010

So let's try to deal with 0.12345678, using binary notation.

So 0.1 (binary) is 0.5 (decimal), 0.01 (binary) is 0.25 (decimal), and
so on.
0.1 = 1/2 = 0.5 - too large
0.01 = 1/4 = 0.25 - too large
0.001 = 1/8 = 0.125 - too large
0.0001 = 1/16 = 0.0625 - too small
0.00011 = 3/32 = 0.09375 - too small
0.000111 = 7/64 = 0.109375 - too small
0.0001111 = 15/128 = 0.1171875 - too small
0.00011111 = 31/256 = 0.12109375 - too small
0.000111111 = 63/512 = 0.123046875 - too small
0.0001111111 = 127/1024 = 0.1240234375 - too large
0.00011111101 = 253/2048 = 0.12353515625 - too large
0.000111111001 = 505/4096 = 0.123291015625 - too small
0.0001111110011 = 1011/8192 = 0.1234130859375 - too small
0.00011111100111 = 2023/16384 = 0.12347412109375 - too large
0.000111111001101 = 4045/32768 = 0.123443603515625 - too small
0.0001111110011011 = 8091/65536 = 0.1234588623046875 - too large

So far, we've used 16 bits on this. Keep on calculatin', and find out
how many bits you need if you're to get an *exact* representation of
0.12345678. You might well be surprised by the result.

> How to work around it?

That depends on what you want to achieve.

Mukesh_Singh_Nick · Jun 16, 2007

Thank you for replying with a very elaborate example, Richard. I would
disappoint you if I told you I am intrigued by the representation of
non-integral decimal numbers in their binary form.

I know binary arithmetic with integers. I have sometimes wondered, but
never bothered to find out, how decimals are represented in binary.
I want to understand your example.

I can see a pattern in the representation.

0.1 is half.
0.01 is a right shift and you further halve it.
0.001 two right shifts further halving it and so on.

You lost me at 0.011. Can I please request you to explain.

Richard Heathfield · Jun 16, 2007

(e-mail address removed) said:

> I know binary arithmetic with integers. I have sometimes wondered, but
> never bothered to find out, how decimals are represented in binary.

The representation of floating-point numbers is implementation-defined.
IEEE 754 is common but not universal.

> I want to understand your example.
>
> I can see a pattern in the representation.
>
> 0.1 is half.
> 0.01 is a right shift and you further halve it.
> 0.001 two right shifts further halving it and so on.
>
> You lost me at 0.011. Can I please request you to explain.

You understand binary integers. Extend the concept.

13 in binary is 1101 (1 * eight + 1 * four + 0 * two + 1 * one).

So each column represents a multiplier half as big as that of the column
to its left.

So we might reasonably think of the columns beyond the binary point as
representing a half, a quarter, an eighth, etc.

So 0.11 would be (half + quarter) = (three-quarters) = 0.75

0.011 would be (quarter + eighth) = (three-eighths) = 0.375

and so on.

That isn't necessarily how they're stored internally, of course, but it
does give you a good idea of the number of bits you need for storing a
particular value to a particular precision. I recommend that you read
http://docs.sun.com/source/806-3568/ncg_goldberg.html

(Title: "What Every Computer Scientist Should Know About Floating-Point
Arithmetic")

Eric Sosman · Jun 16, 2007

> Why does floating point have a rounding error?

Knowledge of the Great Mysteries is reserved for those
who are worthy. To prove your worth, you must undertake a
Quest and complete it successfully. Your Quest, Mukesh, is
to discover what fraction one day is of one week, and express
the answer as a decimal number, using as many decimal places
as are needed for perfect accuracy. When you have done this
I will know you are indeed worthy, and I will reveal to you
the secret origin of rounding errors.

Mukesh_Singh_Nick · Jun 16, 2007

Thank you so very much, Richard. You just taught me something
*fantabulous*. I just learnt something terrific, something I could not
have learnt reading a thousand words. Actually, I think I've just
understood the IEEE 754-1985 implementation in a nutshell.

Is this rule of representing decimals in binary applicable only to
754?

And then I revisited your previous table wherein you try to reach a
precision for 1234.12345678 by heuristic computation. It suddenly
removed a big block in my head.

Thank you, everyday.

Mukesh_Singh_Nick · Jun 16, 2007

> I recommend that you read http://docs.sun.com/source/806-3568/ncg_goldberg.html
>
> (Title: "What Every Computer Scientist Should Know About Floating-Point
> Arithmetic")

Thank you. I certainly will.

Joe Wright · Jun 16, 2007

Here's something to chew on..

00111111 10111111 10011010 11011101 00010000 10010001 11001000 10010101
Exp = 1019 (-3)
111 11111101
Man = .11111 10011010 11011101 00010000 10010001 11001000 10010101
1.2345678000000000e-01

I hope it didn't wrap on you.

CBFalconer · Jun 16, 2007

> Why does floating point have a rounding error? How to work around
> it? For example, the following:
>
> float f = 1234.12345678F;
> printf("%2f\n", f); //prints 1234.123413
> and
> printf("%8.9f\n", f); //prints 1234.123413086

It doesn't have a rounding error. It has a precision limit.

CBFalconer · Jun 16, 2007

Eric said:
> Knowledge of the Great Mysteries is reserved for those
> who are worthy. To prove your worth, you must undertake a
> Quest and complete it successfully. Your Quest, Mukesh, is
> to discover what fraction one day is of one week, and express
> the answer as a decimal number, using as many decimal places
> as are needed for perfect accuracy. When you have done this
> I will know you are indeed worthy, and I will reveal to you
> the secret origin of rounding errors.

Easy. Just use a septal base (which is not decimal), get 0.1.

(Very useful for writing lock combinations).

Joe Wright · Jun 16, 2007

CBFalconer said:
> It doesn't have a rounding error. It has a precision limit.

Indeed. 1.23412341e+03 is all there is in 32 bits.

Eric Sosman · Jun 16, 2007

CBFalconer said:
> Easy. Just use a septal base (which is not decimal), get 0.1.
> (Very useful for writing lock combinations).

Since "as a decimal number" was clearly specified in the
rules of the Quest, your change of base is not septal but septic.
I'll have to look up the traditional rules for punishment of
failure. "Something lingering, with boiling oil in it, I fancy.
Something of that sort. I think boiling oil occurs in it, but
I'm not sure. I know it's something humorous, but lingering,
with either boiling oil or melted lead."

AND you'll never get to learn about rounding error. Nyaahh!

Peter 'Shaggy' Haywood · Jun 20, 2007

Groovy hepcat (e-mail address removed) was jivin' on Sat, 16 Jun
2007 05:07:47 -0700 in comp.lang.c.
Re: Floating point rounding error's a cool scene! Dig it!

> Is this rule of representing decimals in binary applicable only to
> 754?

This doesn't represent decimals. It represents values. Values are
neither decimal nor binary, but may be expressed in decimal or binary
or, indeed, any other number system, such as hexadecimal or octal. A
computer stores values expressed internally in binary, but may read in
or write out the values in decimal or other systems.

What Richard showed you is not a complete floating point
representation. It was simply a binary representation of a value. This
is referred to as "fixed point". Floating point representations have a
mantissa part and an exponent part. These parts are (typically)
expressed in binary. For example, instead of representing .75 as .11
in fixed point binary, it might be represented as (just to keep things
simple) an 8 bit binary mantissa, 00000011, and an 8 bit exponent,
11111110 (that's -2 expressed as an 8 bit two's complement number).
This represents 3 * 2 ** -2 (using ** for exponentiation), that is, 3
with its binary point moved two places to the left. Real
floating point implementations, however, use more bits for the
mantissa and exponent, often with separate sign bits.

--

Dig the even newer still, yet more improved, sig!

http://alphalink.com.au/~phaywood/
"Ain't I'm a dog?" - Ronny Self, Ain't I'm a Dog, written by G. Sherry & W. Walker.
I know it's not "technically correct" English; but since when was rock & roll "technically correct"?

David Thompson · Jul 1, 2007

> Knowledge of the Great Mysteries is reserved for those
> who are worthy. To prove your worth, you must undertake a
> Quest and complete it successfully. Your Quest, Mukesh, is
> to discover what fraction one day is of one week, and express
> the answer as a decimal number, using as many decimal places
> as are needed for perfect accuracy. When you have done this
> I will know you are indeed worthy, and I will reveal to you
> the secret origin of rounding errors.

<OT> Didn't the French Revolution, among its many variously intriguing
and alarming ideas, (try to) change the week to 10 days? </>

- formerly david.thompson1 || achar(64) || worldnet.att.net

Jean-Marc Bourguet · Jul 2, 2007

David Thompson said:
> <OT> Didn't the French Revolution, among its many variously intriguing
> and alarming ideas, (try to) change the week to 10 days? </>

Right. This calendar was in use 13 years or so.

Yours,
