float representation and precision

Sebastian

Hi,

Yesterday I discovered something that won't come as a surprise to more experienced C++ programmers, but I was genuinely amazed. I had this line of code:

float x = 0.9f;

and when I had a look at it in the VS.NET 2003 debugger, it said x had a value of 0.89999998! When I printf'ed x to a logfile, it was 0.899999976158142090000000000000. Now, why on earth is that? I understand that the results of floating-point operations will always be somewhat imprecise due to machine limitations, but 0.9? Can anybody explain this to me?
 
JH Trauntvein

Sebastian said:
Hi,

Yesterday I discovered something that won't come as a
surprise to more experienced C++ programmers, but I
was genuinely amazed. I had this line of code:

float x = 0.9f;

and when I had a look at it in the VS.NET 2003 debugger,
it said x had a value of 0.89999998! When I printf'ed x
to a logfile, it was 0.899999976158142090000000000000. Now,
why on earth is that? I understand that the results of
floating-point operations will always be somewhat imprecise
due to machine limitations, but 0.9? Can anybody explain
this to me?

IEEE 754, which is the floating-point representation used on most
platforms that I am aware of, is a base two encoding for floating point
values. Just as base 10 has irrational fractions (somewhat like 1/3)
so does base two. The adventure here is that the fractions that are
irrational in base two are different from those that are irrational in
base 10. Apparently 9/10 is one of those irrational fractions in base
2.
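
A quick way to see this (a minimal sketch, assuming IEEE 754 single
precision, which is what most desktop platforms use for float):

#include <cstdio>

int main()
{
    float x = 0.9f;
    // Asking for more digits than float guarantees exposes the nearest
    // representable value; the exact digits are platform-dependent.
    std::printf("%.17f\n", x);  // e.g. 0.89999997615814209
}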

Regards,

Jon Trauntvein
 
Rolf Magnus

Sebastian said:
Hi,

Yesterday I discovered something that won't come as a surprise to more
experienced C++ programmers, but I was genuinely amazed. I had this
line of code:

float x = 0.9f;

and when I had a look at it in the VS.NET 2003 debugger, it said x had a
value of 0.89999998! When I printf'ed x to a logfile, it was
0.899999976158142090000000000000. Now, why on earth is that? I understand
that the results of floating-point operations will always be somewhat
imprecise due to machine limitations, but 0.9? Can anybody explain this to
me?

Floating point numbers approximate a continuous range of numbers. Since that
range has an infinite count of different numbers, it isn't possible to
store every number exactly in a computer.
Remember that computers are not calculating in base 10 as humans usually
are, but in base 2. So a number that might look simple to us (like 0.9) can
in base 2 have an infinite number of digits, and thus be impossible to
store exactly in a base 2 floating point number.
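
A small sketch that makes this visible (assuming IEEE 754 float and
double):

#include <iostream>

int main()
{
    float  f = 0.9f;  // 0.9 rounded to a 24-bit significand
    double d = 0.9;   // 0.9 rounded to a 53-bit significand
    // Both are approximations of 0.9, but different ones, so they
    // compare unequal. 0.5 is exactly representable in both types,
    // so the same comparison succeeds.
    std::cout << std::boolalpha << (f == d) << '\n';      // false
    std::cout << std::boolalpha << (0.5f == 0.5) << '\n'; // true
}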
 
Sebastian

Wow. Thanks for the speedy answer. I like this - the idea that there would be irrational numbers in the binary system never crossed my mind, but it sounds absolutely logical. :)
 
Frederick Bruckman

IEEE 754, which is the floating-point representation used on most
platforms that I am aware of, is a base two encoding for floating point
values. Just as base 10 has irrational fractions (somewhat like 1/3)
so does base two. The adventure here is that the fractions that are
irrational in base two are different from those that are irrational in
base 10. Apparently 9/10 is one of those irrational fractions in base
2.

An "irrational fraction" is a contradiction in terms, "irrational"
meaning "cannot be expressed as a ratio", i.e. "cannot be expressed
as a fraction". You mean "repeating", as in "repeating decimal".
Now, what makes a decimal number repeat is that it has factors
other than "2" and "5" in the denominator. A non-repeating binary
ratio can only have powers of "2" in the denominator. So, evidently,
every repeating decimal number also repeats in binary, but there
are repeating binary numbers that won't repeat in decimal (e.g. 2/5).
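
That rule is easy to check mechanically (a sketch; terminates() is a
hypothetical helper, and std::gcd needs C++17):

#include <cstdio>
#include <numeric>

// For a fraction in lowest terms, the base-b expansion terminates
// exactly when every prime factor of the denominator divides b.
// Dividing by gcd(den, base) repeatedly strips those factors.
bool terminates(long den, long base)
{
    for (long g = std::gcd(den, base); g > 1; g = std::gcd(den, base))
        den /= g;
    return den == 1;
}

int main()
{
    std::printf("1/3 base 10: %s\n", terminates(3, 10) ? "terminates" : "repeats");
    std::printf("2/5 base 10: %s\n", terminates(5, 10) ? "terminates" : "repeats");
    std::printf("2/5 base 2:  %s\n", terminates(5, 2)  ? "terminates" : "repeats");
    std::printf("9/10 base 2: %s\n", terminates(10, 2) ? "terminates" : "repeats");
}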

In any case, that's not the reason for the loss of precision in
converting to floating point and back. What's "stored in binary"
is not the number itself, but the natural logarithm of the number.
Since _e_, the base of the natural logarithm, is irrational, any
rational number is going to have a non-rational representation in
floating point. The only numbers that come out even in floating
point are irrational themselves. If you ever don't get a rounding
error when converting to floating point and back, it's essentially
coincidence (although it will be reproducible, of course).
 
cadull

In any case, that's not the reason for the loss of precision in
converting to floating point and back. What's "stored in binary"
is not the number itself, but the natural logarithm of the number.
Since _e_, the base of the natural logarithm, is irrational, any
rational number is going to have a non-rational representation in
floating point. The only numbers that come out even in floating
point are irrational themselves. If you ever don't get a rounding
error when converting to floating point and back, it's essentially
coincidence (although it will be reproducible, of course).

I don't know where you get the natural log information from.

http://www.nuvisionmiami.com/books/asm/workbook/floating_tut.htm

Regards,
cadull
 
Frederick Bruckman

I don't know where you get the natural log information from.

Oops. It's log-2. Fractions involving powers of two will thus
convert back and forth without loss.
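
For example (assuming IEEE 754 single precision):

#include <cstdio>

int main()
{
    float a = 3.0f / 8.0f;   // denominator is a power of two: exact
    float b = 9.0f / 10.0f;  // denominator has a factor of 5: rounded
    std::printf("%.20f\n", a);  // 0.37500000000000000000
    std::printf("%.20f\n", b);  // 0.89999997615814208984
}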

Frederick
 
Rolf Magnus

Frederick said:
In any case, that's not the reason for the loss of precision in
converting to floating point and back. What's "stored in binary"
is not the number itself, but the natural logarithm of the number.

Where did you get that idea from? Floating point numbers are usually stored
in base 2, not base e. I have seen a format where they were stored in base
16, but never e.
 
Pete Becker

Frederick said:
Oops. It's log-2.

No, it's not. It's just like 1.3 x 10^3, but in base 2 instead of base
10. The exponent is the logarithm of the value, but the fraction is just
the representation of the scaled value in the appropriate base. So 1/2
is represented in binary floating point as .1 base 2, 1 is represented
as .1 x 2^1, and 2 is represented as .1 x 2^2.
 
Pete Becker

Rolf said:
Floating point numbers approximate a continuous range of numbers. Since that
range has an infinite count of different numbers, it isn't possible to
store every number exactly in a computer.

It doesn't even have to be infinite, just more than the floating point
type can hold.
 
Rolf Magnus

Pete said:
No, it's not. It's just like 1.3 x 10^3, but in base 2 instead of base
10. The exponent is the logarithm of the value, but the fraction is just
the representation of the scaled value in the appropriate base. So 1/2
is represented in binary floating point as .1 base 2,

No. It's represented as 1 * 2^-1. The mantissa is always >= 1 and < 2,
similar to what you would do in base 10. You don't write 0.13 x 10^4,
but 1.3 x 10^3.

Pete said:
1 is represented as .1 x 2^1, and 2 is represented as .1 x 2^2.

1 is represented as 1 * 2^0, 2 as 1 * 2^1.

Btw: How would you calculate the exponent from a given number?
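
For what it's worth, the standard library will do that split for you
(a sketch using std::frexp, which happens to follow the [0.5, 1)
convention Pete describes):

#include <cstdio>
#include <cmath>

int main()
{
    int exp;
    // std::frexp splits its argument into frac * 2^exp with frac in [0.5, 1).
    double frac = std::frexp(2.0, &exp);
    std::printf("2.0 = %g * 2^%d\n", frac, exp);  // 0.5 * 2^2
    frac = std::frexp(0.9, &exp);
    std::printf("0.9 = %g * 2^%d\n", frac, exp);  // 0.9 * 2^0
}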
 
Pete Becker

Rolf said:
No. It's represented as 1 * 2^-1. The mantissa is always >= 1 and < 2,

Not under IEEE 754 and its successors. Normalized non-zero fractions are
always less than 1 and greater than or equal to 1/base. This gets
confusing, though, because some architectures (in particular, Intel)
suppress the leading bit, since it's always 1. But when you stick it
back in, the fraction (as the name implies <g>) is always less than 1.
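
The two conventions differ only in where the point sits; the stored
bits can be inspected directly (a sketch assuming a 32-bit IEEE 754
float: 1 sign bit, 8 exponent bits, 23 stored fraction bits):

#include <cstdio>
#include <cstdint>
#include <cstring>

int main()
{
    float x = 0.9f;
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined type pun
    std::printf("sign=%u exponent=%u fraction=0x%06X\n",
                (unsigned)(bits >> 31),
                (unsigned)((bits >> 23) & 0xFF),
                (unsigned)(bits & 0x7FFFFF));
    // Prints: sign=0 exponent=126 fraction=0x666666
    // With the implicit leading 1 restored, the value is
    // (1 + 0x666666 / 2^23) * 2^(126 - 127) = 0.899999976...
}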
 
Frederick Bruckman

Where did you get that idea from? Floating point numbers are usually stored
in base 2, not base e. I have seen a format where they were stored in base
16, but never e.

I don't know where I got that idea from. It might be a nice way
to represent numbers on a scientific calculator -- many of the
"interesting" irrational numbers would be repeating decimals in
base _e_, but I don't know that anyone actually does that. Never
mind.
 
Andrey Tarasevich

Sebastian said:
...
Yesterday I discovered something that won't come as a surprise to more experienced C++ programmers, but I was genuinely amazed. I had this line of code:

float x = 0.9f;

and when I had a look at it in the VS.NET 2003 debugger, it said x had a value of 0.89999998! When I printf'ed x to a logfile, it was 0.899999976158142090000000000000. Now, why on earth is that? I understand that the results of floating-point operations will always be somewhat imprecise due to machine limitations, but 0.9? Can anybody explain this to me?
...

Decimal 0.9 is

0.111001100110011001100... = 0.1(1100)

in binary positional notation. The last part - 1100 - is repeated
indefinitely. This means that a binary machine that uses some form of
positional notation for floating-point numbers and does not provide any
means for representing periodic fractions won't be able to represent 0.9
precisely. The same applies to 0.1, 0.2, 0.8. That's what you observe in
your case.
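
The digits are easy to reproduce with exact integer arithmetic (a
small sketch; each doubling shifts out the next binary digit of 9/10):

#include <cstdio>

int main()
{
    int num = 9;  // numerator of 9/10
    std::printf("0.9 in base 2 = 0.");
    for (int i = 0; i < 21; ++i) {
        num *= 2;                     // shift one binary place
        std::printf("%d", num / 10);  // integer part is the next digit
        num %= 10;
    }
    std::printf("...\n");  // 0.111001100110011001100...
}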
 
