Converting Floats to Strings yields erratic results

BartC · Jun 9, 2011

BartC said:
Code to convert binary floating point to decimal is very tricky to write.
I've just had a go, and while the following sort of works, it's probably
full of numerical problems too.

My code introduces errors in the lower bits of the number. I'm trying to
compare the results to printf(), but if I write:

printf("%f\n",1.0e40);

I get an impossibly clean result of 1 followed by loads of zeros, no matter
what numbers I stick inside %f. Clearly 1e40 can't be stored exactly (can
it? I thought it would need over 120 bits of precision), so how can I tell
printf to show it as it is?

(If I print 1e50, then I've noticed there is a '1' amongst the zeros; that
can't be right either..)

James Kuyper · Jun 9, 2011

My code introduces errors in the lower bits of the number. I'm trying to
compare the results to printf(), but if I write:

printf("%f\n",1.0e40);

I get an impossibly clean result of 1 followed by loads of zeros, no matter
what numbers I stick inside %f. Clearly 1e40 can't be stored exactly (can
it? I thought it would need over 120 bits of precision), so how can I tell
printf to show it as it is?

(If I print 1e50, then I've noticed there is a '1' amongst the zeros; that
can't be right either..)

1e40 is 2^40*5^40. 2^40 is just absorbed into the exponent. 5^40
requires only 93 bits to be represented exactly, not 120. It's therefore
possible to store 1e40 exactly in a 128-bit type. I've heard of
implementations with 128-bit long double, but not 128-bit double.

With respect to %f, the standard says "The value is rounded to
the appropriate number of digits." (7.19.6.1p8). I think that this is a
little unclear, but it could be used to justify producing all zeros
after the last significant digit of the output string. That's what I've
usually seen real implementations do.

Noob · Jun 9, 2011

BartC said:
Clearly 1e40 can't be stored exactly (can it?
I thought it would need over 120 bits of precision)

1e40 = 10^40
= 5^40 * 2^40
= 0x1D6329F1C35CA4BFABB9F561 * 2^40

5^40 requires 93 bits to store precisely.

Thus, a double-precision IEEE 754 floating-point number
cannot store 1e40 exactly, as the mantissa is expressed
over (only) 52 bits.

http://en.wikipedia.org/wiki/Double_precision_floating-point_format

0x1.D6329F1C35CA4 * 2^132 < 1e40 < 0x1.D6329F1C35CA5 * 2^132

i.e. respectively
9999999999999999094860208812374492184576
10000000000000000000000000000000000000000
10000000000000000303786028427003666890752

The exponent (132) is stored as 132+1023 = 1155 = 0x483

And, in fact, the following program prints 483d6329f1c35ca3
(Note the rounding error).

#include <stdio.h>
#include <string.h>
int main(void)
{
int i;
double d = 1.0;
for (i=0; i < 40; ++i) d *= 10.0;
unsigned char buf[8];
memcpy(buf, &d, sizeof buf);
for (i=0; i < 8; ++i) printf("%02x", buf[7-i]);
putchar('\n');
return 0;
}

cf. also
What Every Computer Scientist Should Know About Floating-Point Arithmetic
http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html

Regards.

BartC · Jun 9, 2011

James Kuyper said:
1e40 is 2^40*5^40. 2^40 is just absorbed into the exponent. 5^40
requires only 93 bits to be represented exactly, not 120. It's therefore
possible to store 1e40 exactly in a 128-bit type. I've heard of
implementations with 128-bit long double, but not 128-bit double.

With respect to %f, the standard says "The value is rounded to
the appropriate number of digits." (7.19.6.1p8). I think that this is a
little unclear, but it could be used to justify producing all zeros
after the last significant digit of the output string. That's what I've
usually seen real implementations do.

OK. The problem is that with a double set to 1e40, printf gives:

10000000000000000000000000000000000000000.000000

But using a bigint library to calculate the actual value stored in the bit
pattern, it gives:

10000000000000000303786028427003666890752

(although I might have expected a result just below 1e40), while the simple
code I posted earlier gives this result:

10000000000000004440892098500626161694526

So is not as bad as I thought! However my point was I couldn't figure how to
get printf to give me the true value (in decimal; I can't do anything with
the hex).

Keith Thompson · Jun 9, 2011

BartC said:
My code introduces errors in the lower bits of the number. I'm trying to
compare the results to printf(), but if I write:

printf("%f\n",1.0e40);

I get an impossibly clean result of 1 followed by loads of zeros, no matter
what numbers I stick inside %f. Clearly 1e40 can't be stored exactly (can
it? I thought it would need over 120 bits of precision), so how can I tell
printf to show it as it is?

(If I print 1e50, then I've noticed there is a '1' amongst the zeros; that
can't be right either..)

What implementation are you using?

This program:

#include <stdio.h>
int main(void)
{
printf("%f\n", 1e40);
printf("%f\n", 1e41);
printf("%f\n", 1e42);
printf("%f\n", 1e43);
printf("%f\n", 1e44);
printf("%f\n", 1e45);
printf("%f\n", 1e46);
printf("%f\n", 1e47);
printf("%f\n", 1e48);
printf("%f\n", 1e49);
printf("%f\n", 1e50);
return 0;
}

gives me this output:

10000000000000000303786028427003666890752.000000
100000000000000000620008645040778319495168.000000
1000000000000000044885712678075916785549312.000000
10000000000000000139372116959414099130712064.000000
100000000000000008821361405306422640701865984.000000
999999999999999929757289024535551219930759168.000000
9999999999999999931398190359470212947659194368.000000
100000000000000004384584304507619735463404765184.000000
1000000000000000043845843045076197354634047651840.000000
9999999999999999464902769475481793196872414789632.000000
100000000000000007629769841091887003294964970946560.000000

both on Linux/x86 with gcc and glibc, and on Solaris/SPARC.

(The trailing ".000000" makes sense; for numbers that large, the
only representable values are exact integers.)

If your implementation supports it, you can see the exact values
using "%a" or "%A":

0x1.d6329f1c35ca5p+132
0x1.25dfa371a19e7p+136
0x1.6f578c4e0a061p+139
0x1.cb2d6f618c879p+142
0x1.1efc659cf7d4cp+146
0x1.66bb7f0435c9ep+149
0x1.c06a5ec5433c6p+152
0x1.18427b3b4a05cp+156
0x1.5e531a0a1c873p+159
0x1.b5e7e08ca3a8fp+162
0x1.11b0ec57e649ap+166

BartC · Jun 9, 2011

What implementation are you using?

This program:

#include <stdio.h>
int main(void)
{
printf("%f\n", 1e40);
printf("%f\n", 1e41);
printf("%f\n", 1e42);
printf("%f\n", 1e43);
printf("%f\n", 1e44);
printf("%f\n", 1e45);
printf("%f\n", 1e46);
printf("%f\n", 1e47);
printf("%f\n", 1e48);
printf("%f\n", 1e49);
printf("%f\n", 1e50);
return 0;
}

gives me this output:

10000000000000000303786028427003666890752.000000

(That matches the result in my other post; a good sign..)

100000000000000000620008645040778319495168.000000
1000000000000000044885712678075916785549312.000000
10000000000000000139372116959414099130712064.000000
100000000000000008821361405306422640701865984.000000
999999999999999929757289024535551219930759168.000000
9999999999999999931398190359470212947659194368.000000
100000000000000004384584304507619735463404765184.000000
1000000000000000043845843045076197354634047651840.000000
9999999999999999464902769475481793196872414789632.000000
100000000000000007629769841091887003294964970946560.000000

both on Linux/x86 with gcc and glibc, and on Solaris/SPARC.

....

lcc-win32 and gcc (all on x86-32) both give me:

10000000000000000000000000000000000000000.000000
100000000000000000000000000000000000000000.000000
1000000000000000000000000000000000000000000.000000
10000000000000000000000000000000000000000000.000000
100000000000000010000000000000000000000000000.000000
999999999999999930000000000000000000000000000.000000
.....

Using DMC, I do get the 303 in the middle of 1e40 for example, but the rest
is still zeros:

10000000000000000303000000000000000000000.000000

Odd. Perhaps some compiler switch is needed?

(The trailing ".000000" makes sense; for numbers that large, the
only representable values are exact integers.)

I think it's because you start with an integer (of 52/53 bits for example),
then end up multiplying by a power of two, at least for these bigger
numbers.

If your implementation supports it, you can see the exact values
using "%a" or "%A":

0x1.d6329f1c35ca5p+132

Trouble is I don't 'read' hex; that's why I used a separate program to
extract the bits and build the number.

Keith Thompson · Jun 9, 2011

BartC said:
What implementation are you using?

This program:

#include <stdio.h>
int main(void)
{
printf("%f\n", 1e40); [...]
return 0;
}

gives me this output:

10000000000000000303786028427003666890752.000000

Click to expand...

(That matches the result in my other post; a good sign..)

Oh? You said you got "an impossibly clean result of 1 followed by loads
of zeros"; that doesn't look "impossibly clean" to me.

lcc-win32 and gcc (all on x86-32) both give me:

10000000000000000000000000000000000000000.000000
100000000000000000000000000000000000000000.000000
1000000000000000000000000000000000000000000.000000
10000000000000000000000000000000000000000000.000000
100000000000000010000000000000000000000000000.000000
999999999999999930000000000000000000000000000.000000
....

gcc probably isn't relevant; printf is provided by the runtime library,
not by the compiler. For lcc-win32, apparently both are form the same
source, but gcc will use whatever runtime library is on the system.

gcc with glibc on Linux produces the results I posted. gcc on Cygwin
(which I think uses something called "newlib") produces subtly different
results for some values; the Cygwin results are identical for 1e40,
1e41, and 1e42, but have more trailing zeros before the decimal point
for the other values.

glibc seems to be showing the exact stored value, or something
close to it. The results you're getting seem to be the result of
something doing some rounding somewhere.

Using DMC, I do get the 303 in the middle of 1e40 for example, but the rest
is still zeros:

10000000000000000303000000000000000000000.000000

Odd. Perhaps some compiler switch is needed?

Unless a compiler switch affects the behavior of the runtime library
(which is possible), you're probably stuck with the behavior you're
getting unless you change systems.

I think it's because you start with an integer (of 52/53 bits for example),
then end up multiplying by a power of two, at least for these bigger
numbers.
Right.

Trouble is I don't 'read' hex; that's why I used a separate program to
extract the bits and build the number.

....

Ok, here's some new information. Using Perl's Math::BigInt, I've
converted the results (with the trailing ".000000" ignored) to
hexadecimal. The results I got with glibc are:

0x1d6329f1c35ca500000000000000000000
0x125dfa371a19e7000000000000000000000
0xb7abc627050308000000000000000000000
0x72cb5bd86321e40000000000000000000000
0x47bf19673df53000000000000000000000000
0x2cd76fe086b93c000000000000000000000000
0x1c06a5ec5433c60000000000000000000000000
0x118427b3b4a05c00000000000000000000000000
0xaf298d050e439800000000000000000000000000
0x6d79f82328ea3c000000000000000000000000000
0x446c3b15f992680000000000000000000000000000

showing about 53 significant bits followed by zeros, which makes sense.

The results I got on Cygwin are:

0x1d6329f1c35ca500000000000000000000
0x125dfa371a19e7000000000000000000000
0xb7abc627050308000000000000000000000
0x72cb5bd86321e3fffffffffffffffffffffc
0x47bf19673df53000000000000000000000010
0x2cd76fe086b93c000000000000000000000020
0x1c06a5ec5433c5ffffffffffffffffffffffe90
0x118427b3b4a05c00000000000000000000008800
0xaf298d050e439800000000000000000000055000
0x6d79f82328ea3c0000000000000000000000335c0
0x446c3b15f992680000000000000000000001bb5200

which implies some spurious rounding.

Your output of
100000000000000010000000000000000000000000000.000000
is
0x47bf19673df5303cef26e3f42126110000000
which has 118 significant bits.

I'm not suggesting that the implementations you're using are
non-conforming; I don't think printf is required to print more
significant digits than are really there. The standard says "The
value is rounded to the appropriate number of digits."; there may
be some disagreement among implementers about what this means.

My guess is that
100000000000000010000000000000000000000000000.000000
results from computing a rounded result, and then picking up an
insignificant digit in the conversion to a string. I'd probably
call that a bug, but not in the sense that it's non-conforming.

BartC · Jun 9, 2011

Keith Thompson said:
Oh? You said you got "an impossibly clean result of 1 followed by loads
of zeros"; that doesn't look "impossibly clean" to me.

I meant my reply to James; printf() gives me the clean result, but using
special code to calculate the stored value with bigints gave me the proper
result above.

gcc with glibc on Linux produces the results I posted. gcc on Cygwin

glibc seems to be showing the exact stored value, or something
close to it. The results you're getting seem to be the result of
something doing some rounding somewhere.

Unless a compiler switch affects the behavior of the runtime library
(which is possible), you're probably stuck with the behavior you're
getting unless you change systems.

It seems even odder now (if that makes sense...); some implementations will
print the exact result, some will do some internal rounding. And if a
certain precision is specified, it will be ignored if it's beyond some
limit. And it's all a matter of luck.

Ok, here's some new information. Using Perl's Math::BigInt, I've
converted the results (with the trailing ".000000" ignored) to
hexadecimal. The results I got with glibc are:

0x1d6329f1c35ca500000000000000000000

showing about 53 significant bits followed by zeros, which makes sense.

The results I got on Cygwin are:

0x1d6329f1c35ca500000000000000000000

OK, the actual stored value.

0x72cb5bd86321e3fffffffffffffffffffffc

which implies some spurious rounding.

Your output of
100000000000000010000000000000000000000000000.000000
is
0x47bf19673df5303cef26e3f42126110000000
which has 118 significant bits.

I'm not suggesting that the implementations you're using are
non-conforming; I don't think printf is required to print more
significant digits than are really there. The standard says "The
value is rounded to the appropriate number of digits."; there may
be some disagreement among implementers about what this means.

My guess is that
100000000000000010000000000000000000000000000.000000
results from computing a rounded result, and then picking up an
insignificant digit in the conversion to a string. I'd probably
call that a bug, but not in the sense that it's non-conforming.

It doesn't sound right to me. It should be possible to turn off rounding in
order to see exactly what's happening in a calculation,, and one way is by
specifying an extra-large precision. But some libraries apparently don't do
that.

Noob · Jun 10, 2011

[ Re-posting through E-S, as aioe seems heavily filtered ]
OT: How many see this post, and not the one through aioe?

Clearly 1e40 can't be stored exactly (can it?
I thought it would need over 120 bits of precision)

1e40 = 10^40
= 5^40 * 2^40
= 0x1D6329F1C35CA4BFABB9F561 * 2^40

5^40 requires 93 bits to store precisely.

Thus, a double-precision IEEE 754 floating-point number
cannot store 1e40 exactly, as the mantissa is expressed
over (only) 52 bits.

http://en.wikipedia.org/wiki/Double_precision_floating-point_format

0x1.D6329F1C35CA4 * 2^132 < 1e40 < 0x1.D6329F1C35CA5 * 2^132

i.e. respectively
9999999999999999094860208812374492184576
10000000000000000000000000000000000000000
10000000000000000303786028427003666890752

The exponent (132) is stored as 132+1023 = 1155 = 0x483

And, in fact, the following program prints 483d6329f1c35ca3
(Note the rounding error).

#include <stdio.h>
#include <string.h>
int main(void)
{
int i;
double d = 1.0;
for (i=0; i < 40; ++i) d *= 10.0;
unsigned char buf[8];
memcpy(buf, &d, sizeof buf);
for (i=0; i < 8; ++i) printf("%02x", buf[7-i]);
putchar('\n');
return 0;
}

cf. also
What Every Computer Scientist Should Know About Floating-Point Arithmetic
http://download.oracle.com/docs/cd/E19957-01/806-3568/ncg_goldberg.html

Regards.

Seebs · Jun 10, 2011

[ Re-posting through E-S, as aioe seems heavily filtered ]
OT: How many see this post, and not the one through aioe?

Me. I filter aioe because such a large number of the recurring trolls of
this group seem to use it for their socks.

cf. also
What Every Computer Scientist Should Know About Floating-Point Arithmetic

I think I see that recommended, on average, about twice a week. I think
I read it once, but either a very long time ago or not actually at all.
Gonna go do that now.

-s

Converting floats to Strings and back	7	Nov 21, 2005
An extension to Rational - friendlier with Floats and Strings	0	May 7, 2005
Is this setjmp/jongjmp usable legal?	23	Jan 3, 2013
How to use ufixed when it involves multiplication a number of times?(VHDL question)	0	Aug 22, 2016
Engineering a List container Part 2: Implementations	20	Dec 8, 2013
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Mar 15, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Feb 1, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Jan 15, 2008

Converting Floats to Strings yields erratic results

BartC

James Kuyper

Noob

BartC

Keith Thompson

BartC

Keith Thompson

BartC

Noob

Seebs

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads