Converting Floats to Strings yields erratic results

Dirk T. Shelley

Hey all,
I'm writing, for my own personal enlightenment, a little program
that will convert a floating-point number, passed as an argument on
the command line, into a character string. I'm now aware of
sprintf(), but I wasn't when I started writing this program, and I
feel it is my duty to finish it ;-)

Everything, in general, works fine, except that some values are
converted to one less than their actual value. For example, 15
returns 15, 14.123 returns 14, but 14 returns 13. I am _completely_
clueless, although I suspect it has something to do with rounding and
truncating happening behind my back. The important part is as
follows...

while (foo >= 10) {
    foo = foo/10.0;
    tenDivs++;
}

i = 0;
while (tenDivs >= 0) {
    while (!(foo < 1)) {
        foo--;
        oneDivs++;
    }
    str[i] = 48 + oneDivs;
    foo = foo * 10;
    oneDivs = 0;
    tenDivs--;
    i++;
}

Where "foo" is the value to be converted. Can anybody help, or point
me to a source that can?

In an unrelated question: Does anyone know of any link which describes
the (relative) performance of all kinds of C operations? e.g: how fast is
"add" comparing with "multiplication" on a typical machine.

Thanks!
 
Keith Thompson

Dirk T. Shelley said:
I'm writing, for my own personal enlightenment, a little program
that will convert a floating-point number, passed as an argument on
the command line, into a character string. I'm now aware of
sprintf(), but I wasn't when I started writing this program, and I
feel it is my duty to finish it ;-)

Typically you *can't* pass floating-point numbers on the command line.
You can only pass strings. Presumably you convert those strings to
floating-point numbers somehow, but you don't tell us how.
Everything, in general, works fine, except that some values are
converted to one less than their actual value. For example, 15
returns 15, 14.123 returns 14, but 14 returns 13. I am _completely_
clueless, although I suspect it has something to do with rounding and
truncating happening behind my back. The important part is as
follows...

If you don't understand what's happening, how do you know that this is
the important part?
while (foo >= 10) {
foo = foo/10.0;

This will yield an inexact result (and for all we know the original
value of foo was inexact). Floating-point numbers are represented in
binary, not decimal.

Suggested reading: section 14 of the comp.lang.c FAQ,
<http://www.c-faq.com/>. For more advanced reading, search for
Goldberg's "What Every Computer Scientist Should Know About
Floating-Point Arithmetic".
tenDivs++;
}

i = 0;
while (tenDivs >= 0) {
while (!(foo < 1)) {
foo--;
oneDivs++;
}
str[i] = 48 + oneDivs;


Where did the value 48 come from?

Oh, I see, it's the ASCII value of '0'. Just write '0' (yes, character
constants are of type int); it makes your code clearer and potentially
more portable. It assumes that the representations of '0' through '9'
are contiguous; as it happens, the language guarantees that.
foo = foo * 10;
oneDivs = 0;
tenDivs--;
i++;
}

Where "foo" is the value to be converted. Can anybody help, or point
me to a source that can?

I haven't taken the time to analyze what your program is doing, but
consider this. As I mentioned above, dividing a floating-point
number x by 10.0 is likely to give you an inexact result.
Multiplying that result by 10.0 could give you something other than
the original number, and truncating that result could give you x - 1.
In an unrelated question: Does anyone know of any link which describes
the (relative) performance of all kinds of C operations? e.g: how fast is
"add" comparing with "multiplication" on a typical machine.

I don't know. You can probably assume that addition is faster
than multiplication, and division is slower than either, but there
are no consistent guarantees. The relative performance of integer
and floating-point arithmetic can vary a great deal depending on
the hardware.
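
To see the effect described above, here is a minimal sketch that wraps the loop from the original post in a function. The name `digits_by_division` and the output buffer are assumptions made for illustration; the arithmetic is the poster's own. It assumes `float` is IEEE 754 single precision, where 14 divided by 10.0 is stored as a value just below 1.4.

```c
#include <stddef.h>

/* Hypothetical helper: the loop from the original post, wrapped in a
 * function. Writes the integer digits of foo (foo >= 0) into out, which
 * the caller must make large enough, and returns the digit count. */
size_t digits_by_division(float foo, char *out)
{
    int tenDivs = 0;
    int oneDivs = 0;
    size_t i = 0;

    /* Scale down until one digit is left before the decimal point.
     * Each division by 10.0 may round, because most decimal fractions
     * have no exact binary representation. */
    while (foo >= 10) {
        foo = (float)(foo / 10.0);
        tenDivs++;
    }

    while (tenDivs >= 0) {
        /* Peel off the leading digit by repeated subtraction. */
        while (foo >= 1) {
            foo--;
            oneDivs++;
        }
        out[i++] = (char)('0' + oneDivs);
        foo = foo * 10;     /* the rounding error is scaled back up here */
        oneDivs = 0;
        tenDivs--;
    }
    out[i] = '\0';
    return i;
}
```

Called with 15.0f this produces "15", because 1.5 is exact in binary. Called with 14.0f on IEEE 754 single precision it produces "13", matching the behaviour reported in the original post: the stored quotient is slightly below 1.4, so the second digit is extracted from a value slightly below 4.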
 
Eric Sosman

Hey all,
I'm writing, for my own personal enlightenment, a little program
that will convert a floating-point number, passed as an argument on
the command line, into a character string. I'm now aware of
sprintf(), but I wasn't when I started writing this program, and I
feel it is my duty to finish it ;-)

Everything, in general, works fine, except that some values are
converted to one less than their actual value. For example, 15
returns 15, 14.123 returns 14, but 14 returns 13. I am _completely_
clueless, although I suspect it has something to do with rounding and
truncating happening behind my back. The important part is as
follows...

while (foo>= 10) {
foo = foo/10.0;

This line is "behind your back." On nearly every computer these
days, floating-point numbers use a base-two representation. A few
use base-sixteen, but the days of base-ten floating-point are pretty
much over and gone. (You'll still find it in hand-held calculators,
but you'll be hard-pressed to find it anywhere else.)

Consequence: You're dividing by 10, which is to say you're dividing
by 2*5. The 2 makes no trouble for a base-two system (and only a little
and subtler trouble for base-sixteen), but the 5 is a major problem.
It's very much like dividing by 35=5*7 in decimal: The 5 is no problem,
but the 7 is likely to involve an infinite repeating decimal fraction.
But just as you lack the patience to write out an infinite number of
fraction digits, the computer lacks the patience (and memory space) to
do the same. So you both stop writing digits after a while, round off
the result, and content yourselves with an approximate answer.

... which is why when you divide 14 by 10 you do not get exactly
1.4, but something close to it like 1.3999999999999999111821580299875.
Since this is just a smidgen less than you expected (it might turn out
to be a smidgen greater, depending on the particular values involved),
your subsequent calculations are likely to be off by just a little.
As you've seen...
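
One way to look at the stored value is to print it with more digits than the type can meaningfully hold. The following is a small sketch, assuming IEEE 754 binary floating point; the helper names `show_stored` and `quotient_is_low` are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: format the value actually stored in a float with
 * 20 digits after the decimal point, so the representation error shows. */
int show_stored(float value, char *buf, size_t size)
{
    /* A float argument is promoted to double in a variadic call; the
     * promotion is exact, so the digits shown are those of the float. */
    return snprintf(buf, size, "%.20f", value);
}

/* Returns 1 if dividing n by 10 and storing the result in a float lands
 * below the intended decimal quotient, 0 otherwise. */
int quotient_is_low(float n)
{
    float q = (float)(n / 10.0);
    /* The product of a 24-bit float significand and 10 fits a double
     * exactly, so this comparison is not itself affected by rounding. */
    return (double)q * 10.0 < (double)n;
}
```

With these assumptions, `quotient_is_low(14.0f)` reports that the stored quotient is a smidgen low, while `quotient_is_low(15.0f)` does not, since 1.5 is exactly representable. A value such as 12 goes the other way: the nearest float to 1.2 is slightly above it.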
tenDivs++;
}

i = 0;
while (tenDivs>= 0) {
while (!(foo< 1)) {

This is *so* much clearer than `while(foo >= 1)' ...
foo--;
oneDivs++;
}
str[i] = 48 + oneDivs;


"48?" Funny choice of "digits" you've made. But then, you
haven't shown us how oneDivs was declared and initialized, so maybe
I'm worrying about nothing.
foo = foo * 10;
oneDivs = 0;

Okay, so for places beyond the first you're using PQRSTUVWXY as
"decimal digits" on a system with ASCII-compatible encodings, or
(unassigned)(unassigned)(SYN)(unassigned)(PN)(RS)(UC)(EOT)(unassigned)
(unassigned) on an EBCDIC system, or Lord knows what on some other
system. I repeat: Funny choice of "digits."
tenDivs--;
i++;
}

Where "foo" is the value to be converted. Can anybody help, or point
me to a source that can?

In an unrelated question: Does anyone know of any link which describes
the (relative) performance of all kinds of C operations? e.g: how fast is
"add" comparing with "multiplication" on a typical machine.

Yes! All sorts of people will tell you all sorts of things
about what operations are cheap or costly. And I'll let you in on
a secret: 99.44% of those people are full of fertilizer, and the
other 0.56% who aren't will say "Fuhgeddaboudit." Like the days of
decimal floating-point, the days of "multiply is slower than add"
or "shift is faster than multiply" or "pointers are faster than
array indices" are long gone.
 
Keith Thompson

Eric Sosman said:
This line is "behind your back." On nearly every computer these
days, floating-point numbers use a base-two representation. A few
use base-sixteen, but the days of base-ten floating-point are pretty
much over and gone. (You'll still find it in hand-held calculators,
but you'll be hard-pressed to find it anywhere else.)

Right [*].

[*] Here's the footnote. IBM has recently been
pushing decimal floating-point. See, for example,
<http://www.ibm.com/developerworks/wikis/display/WikiPtype/Decimal+Floating+Point>
(which claims that the C draft standard includes _Decimal32,
_Decimal64, and _Decimal128, but N1570 has nothing like that).

I thought I remembered that they were storing 3 decimal digits (1000
values) in 10 bits (1024 values), which is almost as efficient
space-wise as pure binary, but I can't find a reference to that.

[...]
 
Peter Nilsson

Keith Thompson said:
... Floating-point numbers are represented in
binary, not decimal.

Perhaps you were guessing the OP's implementation, but the
C language does not require that.
 
Keith Thompson

Peter Nilsson said:
Perhaps you were guessing the OP's implementation, but the
C language does not require that.

You're right.

I don't *think* there are any existing C implementations that use
anything other than binary (or base 16, but that still represents the
significand in binary). IBM has decimal floating-point on some of its
systems, but not for the predefined types.
 
Keith Thompson

Thad Smith said:
You have received at least three responses now with a combination of helpful
comments and cheap shots. It's awfully easy to say "that's a strange way" (in a
derogatory manner) to someone just learning what the language supports and not
knowing the idioms developed over time.

Ignore the pot shots. Pay attention to the posters that have the knowledge and
patience to help others without making cutting remarks.

Maybe my news server has better filtering than yours. I haven't seen
anything in this thread that I'd consider snide. In particular,
I just checked and there are no other responses with the phrase
"that's a strange way".
 
Thad Smith

Hey all,
I'm writing, for my own personal enlightenment, a little program
that will convert a floating-point number, passed as an argument on
the command line, into a character string. I'm now aware of
sprintf(), but I wasn't when I started writing this program, and I
feel it is my duty to finish it ;-)

Dirk,

You have received at least three responses now with a combination of helpful
comments and cheap shots. It's awfully easy to say "that's a strange way" (in a
derogatory manner) to someone just learning what the language supports and not
knowing the idioms developed over time.

Ignore the pot shots. Pay attention to the posters that have the knowledge and
patience to help others without making cutting remarks.
 
Eric Sosman

Eric Sosman said:
This line is "behind your back." On nearly every computer these
days, floating-point numbers use a base-two representation. A few
use base-sixteen, but the days of base-ten floating-point are pretty
much over and gone. (You'll still find it in hand-held calculators,
but you'll be hard-pressed to find it anywhere else.)

Right [*].

[*] Here's the footnote. IBM has recently been
pushing decimal floating-point. See, for example,
<http://www.ibm.com/developerworks/wikis/display/WikiPtype/Decimal+Floating+Point>
(which claims that the C draft standard includes _Decimal32,
_Decimal64, and _Decimal128, but N1570 has nothing like that).

Right [**].

[**] The last decimal F-P "computer" (as opposed to "calculator")
I used was an IBM system. In the Johnson administration.[***]

[***] Sorry, ambiguous: I mean the Johnson who became President
when his predecessor was assassinated.[****]

[****] Coincidence? YOU be the judge!
 
Eric Sosman

[...]
str[i] = 48 + oneDivs;


"48?" Funny choice of "digits" you've made.[...]


My mistake (and not my first, or last). I sort of thought
you probably meant '0' and double-checked by looking at an ASCII
table, and its formatting fooled me.

In any event, though, you should write '0' when you want the
integer that encodes the digit zero, not 48 or 240 or 30[*].

[*] Trivia question.
 
Shao Miller

On 6/7/2011 8:47 PM, Eric Sosman wrote:
In any event, though, you should write '0' when you want the
integer that encodes the digit zero, not 48 or 240 or 30[*].

[*] Trivia question.

East Indian?
 
BartC

In an unrelated question: Does anyone know of any link which describes
the (relative) performance of all kinds of C operations? e.g: how fast is
"add" comparing with "multiplication" on a typical machine.

You'll just have to measure these things on your machine. And perhaps turn
off optimisation if using a simple loop. And even then, you'll have to see
whether the differences are going to be significant in your application.

But in general, try and avoid division (for example, by using foo*0.1
instead of foo/10.0, bearing in mind that 0.1 will be only an approximation
to 1/10.0).
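
The caveat about 0.1 can be made concrete. This is a minimal sketch, assuming IEEE 754 doubles evaluated without extra intermediate precision; `mul_matches_div` is a name invented here.

```c
/* Hypothetical helper: returns 1 when multiplying by 0.1 gives the same
 * double as dividing by 10.0, 0 when the two results differ. */
int mul_matches_div(double x)
{
    /* volatile keeps the compiler from folding or fusing the operations */
    volatile double by_mul = x * 0.1;   /* 0.1 is itself already rounded */
    volatile double by_div = x / 10.0;  /* 10.0 is exact: one rounding only */

    return by_mul == by_div;
}
```

For x = 1.0 the two results agree. For x = 3.0 they do not: the product prints as 0.30000000000000004 with "%.17g", while the quotient is the double nearest to 0.3. Whether that difference matters depends on the application, which is a reason to measure and check before swapping one for the other.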
 
Dirk T. Shelley

Thanks for all the responses.

I think I now have a vague understanding of the nature of the
problem, though not enough to fix it. My complete code is as follows
if anyone cares to further pick it apart.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int tenDivs, oneDivs, i, neg_foo;
    char str[sizeof(float)+1];
    float foo;

    /* Make sure the user entered a number, and retrieve it's value.
    */
    switch (argc) {
    case 1:
        printf("Not enough arguments.\n");
        exit(0);
        break;
    case 2:
        foo = atof(argv[1]);
        break;
    default:
        printf("Too many arguments.\n");
        exit(0);
        break;
    }

    tenDivs = 0;
    oneDivs = 0;
    neg_foo = 0;

    /* if foo is a negative number, make it positive and take note */
    if (foo < 0) {
        foo = 0 - foo;
        neg_foo = 1;
    }

    while (foo >= 10) {
        foo = foo/10.0;
        tenDivs++;
    }

    i = 0;
    while (tenDivs >= 0) {
        while (foo >= 1) {
            foo--;
            oneDivs++;
        }
        str[i] = 48 + oneDivs;
        foo = foo * 10;
        oneDivs = 0;
        tenDivs--;
        i++;
    }

    str[i] = '\0';

    /* if foo was positive, we are done. */
    if (neg_foo == 0)
        printf("The value of str is: %s\n", str);

    /* if foo was negative, adjust our value. */
    if (neg_foo == 1) {
        while (i >= 0) {
            str[i+1] = str[i];
            i--;
        }
        str[0] = '-';
        printf("The value of str is: %s\n", str);
    }

    return 0;
}
 
Ian Collins

Thanks for all the responses.

I think I now have a vague understanding of the nature of the
problem, though not enough to fix it. My complete code is as follows
if anyone cares to further pick it apart.

You don't appear to have addressed any of the issues people have pointed
out to you. The most notable is the repeated division by 10.

Try and fix those and see how you get on.

<code snipped>
 
Dirk T. Shelley

Ian said:
You don't appear to have addressed any of the issues people have pointed
out to you. The most notable is the repeated division by 10.

Try and fix those and see how you get on.

<code snipped>

As I said, I could do with a bit more help, Ian.

Seeing that I want to get out the decimal representation, I don't really
see how I can avoid working with tens.
 
Edward A. Falk

Hey all,

Everything, in general, works fine, except that some values are
converted to one less than their actual value. For example, 15
returns 15, 14.123 returns 14, but 14 returns 13.

As others have pointed out by now, the divide-by-ten part
of the loop is problematic. I'll add that repeatedly dividing
by ten, and then going back and multiplying by ten is just
looking for roundoff error.

The usual approach to this particular problem is to split
foo into integer and fraction components and deal with
them separately.

Also: the use of 48 is poor style. You should use '0', which
is still an integer constant (once it gets promoted) and
makes your intent more clear. It also works in EBCDIC.

You also didn't show us how str[] was declared, but I'm going
to assume that it's a character array large enough to hold any
possible result.

Finally, you haven't dealt with the e.g. foo = 1.71E98 case.

But assuming that foo has a reasonable, positive value, say
less than 1e10, then your code should have looked something like
this:

float foo = someValue;
int ifoo = foo;
int idigits;
int i;
/* str[] and precision (the number of fraction digits wanted) are
   assumed to be declared elsewhere */

ifoo = foo; /* int part */
foo -= ifoo; /* fraction part */

/* count int digits */
idigits = 0;
for(i=ifoo; i > 0; i /= 10)
    ++idigits;
if( idigits == 0 ) idigits = 1;

/* Generate int digits in reverse order */
for(i=idigits; --i >= 0;) {
    str[i] = '0' + ifoo % 10;
    ifoo /= 10;
}

/* Now add the fraction part */
str[idigits++] = '.';
for(i=0; i < precision; ++i) {
    foo *= 10;
    str[idigits++] = '0' + (int)foo;
    foo -= (int)foo;
}

/* terminate the string */
str[idigits] = '\0';


Just for fun, if you like quick-and-dirty:

void
putInt(int foo)
{
    if( foo >= 10 )
        putInt(foo/10);
    putchar('0' + foo%10);
}
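
Putting the pieces of the approach above together, a self-contained sketch might look like the following. The function name `float_to_str`, its parameters and its error convention are assumptions made for illustration, and it keeps the post's restriction to non-negative values of modest magnitude (here, below 1e9).

```c
#include <stddef.h>

/* Hypothetical helper: write value (0 <= value < 1e9) into buf with
 * `precision` digits after the decimal point. Returns the string length,
 * or -1 if the value is out of range or the buffer is too small. */
int float_to_str(float value, int precision, char *buf, size_t size)
{
    unsigned long ipart, tmp;
    size_t idigits = 0, pos, needed;
    float frac;
    int i;

    if (!(value >= 0.0f && value < 1e9f) || precision < 0)
        return -1;                  /* this test also rejects NaN */

    ipart = (unsigned long)value;   /* conversion discards the fraction */
    frac = value - (float)ipart;    /* what is left, in [0, 1) */

    /* Count integer digits; zero still needs one digit. */
    tmp = ipart;
    do {
        idigits++;
        tmp /= 10;
    } while (tmp > 0);

    needed = idigits + (precision > 0 ? 1 + (size_t)precision : 0) + 1;
    if (size < needed)
        return -1;

    /* Integer digits, filled in from the right. Integer / and % by 10
     * are exact, unlike floating-point division by 10. */
    for (pos = idigits; pos-- > 0; ) {
        buf[pos] = (char)('0' + ipart % 10);
        ipart /= 10;
    }
    pos = idigits;

    if (precision > 0) {
        buf[pos++] = '.';
        for (i = 0; i < precision; i++) {
            int digit;
            frac *= 10;
            digit = (int)frac;      /* truncates; the last digit is not rounded */
            buf[pos++] = (char)('0' + digit);
            frac -= (float)digit;
        }
    }
    buf[pos] = '\0';
    return (int)pos;
}
```

With a 32-byte buffer, 14.0f at precision 0 gives "14" and 1234.75f at precision 2 gives "1234.75". Fraction digits are truncated rather than rounded, so a value whose binary representation sits just below the decimal you typed can still show a final digit one lower than expected.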
 
BartC

while (foo >= 10) {
foo = foo/10.0;
tenDivs++;
}

This code prints only the integral part? On my machine it shows 9 for foo up
to 9.999999, and 10 for 9.9999999.

Code to convert binary floating point to decimal is very tricky to write.
I've just had a go, and while the following sort of works, it's probably
full of numerical problems too.

It's not meant to be fast, so divisions etc are not optimised out, and
display of large/low magnitude numbers will be slow. Use of pow() is a bit
of a cheat (if there is no built-in way of stringifying a float, then you
probably won't have pow() either). But it can be replaced with a loop.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* print x using <dp> decimal places */

void printx(double x,int dp){
    double y;
    int digits=0;
    int i;

    if (dp<0) dp=0;
    y=0.5*pow(10,-dp); /* y is e.g. 0.00005 when dp=4 */

    if (x<0.0) {
        printf("-");
        x=-x;
    }

    x+=y; /* possible round-up of last digit */

    while (x>=1.0) { /* count digits before decimal point */
        x=x/10.0;
        ++digits;
    }

    if (digits==0) printf("0");

    for (i=0; i<(digits+dp); ++i) {
        if (i==digits) printf(".");
        x=x*10.0; /* x should be 1.0 to 9.999.. here */
        printf("%d",(int)x); /* assume (int)x rounds down */
        x=x-(int)x;
    }
}

int main(void){
    printx(1234.56789,3);
    puts("");
}
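
As noted in the post, `pow()` can be replaced with a loop. One possible sketch, with the helper name `rounding_offset` invented here: build 10^dp by repeated multiplication and divide once at the end, so that only a single rounding occurs.

```c
/* Hypothetical replacement for 0.5*pow(10,-dp) in printx():
 * returns half a unit in the last displayed decimal place. */
double rounding_offset(int dp)
{
    double scale = 1.0;
    int i;

    if (dp < 0)
        dp = 0;

    /* Powers of ten up to 1e22 are exactly representable in an IEEE 754
     * double, so this loop introduces no error for the usual small dp. */
    for (i = 0; i < dp; i++)
        scale *= 10.0;

    return 0.5 / scale;     /* e.g. about 0.0005 when dp is 3 */
}
```

Dividing 1.0 by 10.0 repeatedly would also work, but each of those divisions can round, which is the same trap the original program fell into.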
 
Eric Sosman

[...]
I think I now have a vague understanding of the nature of the
problem, though not enough to fix it. My complete code is as follows
if anyone cares to further pick it apart.

#include<stdio.h>
#include<stdlib.h>

int main(int argc, char *argv[]) {
int tenDivs, oneDivs, i, neg_foo;
char str[sizeof(float)+1];

This is wrong. `sizeof(float)' is the number of bytes a
`float' occupies in memory, which has little to do with the number
of digits in the decimal representation of any particular `float'
value. On most systems, `sizeof(float)' will be 4, yet a `float'
might hold a value like 1E30 = 1000000000000000000000000000000
(approximately; see up-thread). You're going to have a hard time
packing thirty-two bytes into a five-byte array ...
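
A buffer for the integer digits has to be sized from the largest value a `float` can hold, not from its size in bytes. Here is a hedged sketch using the limits in `<float.h>`; the macro name `FLOAT_INT_STR_SIZE` and the function `int_part_bytes` are invented for this example.

```c
#include <float.h>
#include <stdio.h>

/* Room for an optional '-', every integer digit of the largest float
 * (FLT_MAX_10_EXP + 1 digits), and the terminating '\0'.
 * The name FLOAT_INT_STR_SIZE is made up for this sketch. */
#define FLOAT_INT_STR_SIZE (1 + (FLT_MAX_10_EXP + 1) + 1)

/* Returns the number of bytes, including the '\0', that the integer
 * part of value needs when written out in decimal. */
int int_part_bytes(float value)
{
    /* snprintf with a null buffer and size 0 only reports the length. */
    return snprintf(NULL, 0, "%.0f", value) + 1;
}
```

Declaring `char str[FLOAT_INT_STR_SIZE];` would then cover every finite `float` when only the integer part is written; fraction digits and a decimal point need additional room.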
float foo;

/* Make sure the user entered a number, and retrieve it's value.
*/

You mean "its" value.
switch (argc) {
case 1:
printf("Not enough arguments.\n");
exit(0);

An aside: The exit status of zero means "success." When your
program can't fulfill its mission, it's more user-friendly to
announce "failure," using the exit status EXIT_FAILURE.
break;
case 2:
foo = atof(argv[1]);

An aside: This will misbehave if the command-line argument is
"FORTY-TWO" or "$107.99", or any other string that doesn't look
like a `float'. In a serious program, use strtod() and inspect
the information it passes back to you.
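
A minimal sketch of what inspecting that information could look like. The wrapper name `parse_double` and its return convention are assumptions for illustration; `strtod()`, its end pointer and `errno` with `ERANGE` are standard C.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert the whole string s to a double.
 * Returns 1 and stores the result in *out on success, 0 on failure. */
int parse_double(const char *s, double *out)
{
    char *end;
    double value;

    errno = 0;
    value = strtod(s, &end);

    if (end == s)           /* nothing was converted: "FORTY-TWO", "$107.99" */
        return 0;
    if (*end != '\0')       /* trailing junk after the number: "12abc" */
        return 0;
    if (errno == ERANGE)    /* the value was reported as out of range */
        return 0;

    *out = value;
    return 1;
}
```

In the program above, the `case 2:` branch could call such a helper and exit with EXIT_FAILURE when it returns 0, instead of silently converting garbage to 0.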
break;
default:
printf("Too many arguments.\n");

An aside: You'll also get here if argc is zero, in which case
the message will be slightly misleading.
exit(0);
break;
}

tenDivs = 0;
oneDivs = 0;
neg_foo = 0;

/* if foo is a negative number, make it positive and take note */
if (foo< 0) {
foo = 0 - foo;
neg_foo = 1;
}

This could be tightened up a bit, for example:

neg_foo = foo < 0;
if (neg_foo)
foo = -foo;
while (foo>= 10) {
foo = foo/10.0;

This is still vulnerable to all the rounding issues you've been
told about. That prompts me to suspect you didn't read what you were
told, not with much attention. Go back and read it again.
tenDivs++;
}

i = 0;
while (tenDivs>= 0) {
while (foo>= 1) {
foo--;
oneDivs++;
}

You could dispense with the loop by relying on the fact that
float-to-integer conversion discards any fractional part:

oneDivs = foo;
foo -= oneDivs;

Of course, the value of `foo' is already only an approximation,
thanks to the (likely) inexactitude of division by tens.
str[i] = 48 + oneDivs;


Now I *know* you haven't been listening!
foo = foo * 10;
oneDivs = 0;
tenDivs--;
i++;
}

str[i] = '\0';

/* if foo was positive, we are done. */
if (neg_foo == 0)
printf("The value of str is: %s\n", str);

/* if foo was negative, adjust our value. */


Yeah, okay (except that str[] is probably too short, see above).
But wouldn't it have been *much* simpler to stuff the '-' into the
array beforehand, up at the top when you first discovered you were
going to need it? You could have deposited the '-' in str[0] and
set `i=1', or left str[0] alone and set `i=0', and then stuffed the
digits in str[i++] thereafter. No post-conversion worries!
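
The sign-first idea can be sketched on a plain integer, which keeps the example short. The function name `int_to_str` and its interface are assumptions for illustration, not part of the original program.

```c
#include <stddef.h>

/* Hypothetical helper: write value in decimal into str, sign first.
 * str must have room for the sign, the digits and the '\0'.
 * Returns the length of the resulting string. */
size_t int_to_str(int value, char *str)
{
    /* Work on the magnitude as unsigned so that INT_MIN is handled. */
    unsigned int mag = value < 0 ? 0u - (unsigned int)value
                                 : (unsigned int)value;
    unsigned int tmp = mag;
    size_t i = 0, end;

    if (value < 0)
        str[i++] = '-';     /* deposited up front, no shifting later */

    /* Find where the last digit goes, then fill in from the right. */
    do {
        i++;
        tmp /= 10;
    } while (tmp > 0);
    end = i;
    str[end] = '\0';

    do {
        str[--i] = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag > 0);

    return end;
}
```

Because the '-' is already in place before any digit is written, there is no need to move the finished string one position to the right afterwards.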
if (neg_foo == 1) {
while (i>= 0) {
str[i+1] = str[i];
i--;
}
str[0] = '-';


Why '-' instead of 45? :)
printf("The value of str is: %s\n", str);
}

return 0;
}

Finally, I see that you're content to represent a number like
27.63 as just "27" and 0.1234 as "0". That's perhaps good enough
for the exercise you've set yourself, but few people would term
it "satisfactory."
 
Eric Sosman

[...]
Also: the use of 48 is poor style. You should use '0', which
is still an integer constant (once it gets promoted) and
makes your intent more clear. It also works in EBCDIC.

Pedantry: There's no promotion involved: '0' is a constant
of type `int', right from the get-go.

The advice is good, though.
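
Both points fit in a few lines. This is a small sketch with invented helper names; it relies only on guarantees the C standard makes (the digits '0' through '9' are consecutive, and a character constant has type `int` in C).

```c
/* Hypothetical helper: map a value 0..9 to its digit character.
 * C guarantees '0'..'9' are consecutive in every execution character
 * set, so this works for ASCII and EBCDIC alike. */
char digit_char(int d)
{
    return (char)('0' + d);
}

/* In C (unlike C++), a character constant such as '0' has type int,
 * so this returns 1 when compiled as C. */
int char_constant_is_int_sized(void)
{
    return sizeof '0' == sizeof(int);
}
```

Writing `digit_char(oneDivs)` in place of `48 + oneDivs` says what is meant and does not depend on the character encoding.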
 
Mark Bluemel

I think I now have a vague understanding of the nature of the
problem,

Then you probably need to read the responses more carefully and perhaps
find the paper Keith Thompson referenced - "What Every Computer
Scientist Should Know About Floating-Point Arithmetic" (easily found via
Google).

though not enough to fix it.

What would you regard as a fix? Conversion of a string representation of
a real number expressed in base 10 (e.g. "1.5") to floating-point binary
is a lossy translation and the conversion back to string, no matter how
you do it, is not guaranteed to produce the original representation.

Think about it a bit - how big is the set of real numbers? How many of
these can accurately be reflected in a finite number of bits for the
exponent and mantissa of a floating point number? What has to happen to
all the others?
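
The two directions of that translation can be tried out directly. This is a minimal sketch assuming IEEE 754 doubles; the helper names are invented here, and 17 significant digits is the count generally needed to identify a double uniquely.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parse text as a double and format it back with
 * 17 significant digits. */
void reformat_decimal(const char *text, char *buf, size_t size)
{
    double value = strtod(text, NULL);
    snprintf(buf, size, "%.17g", value);
}

/* Returns 1 if value survives a trip through text unchanged. */
int survives_round_trip(double value)
{
    char buf[64];

    snprintf(buf, sizeof buf, "%.17g", value);
    return strtod(buf, NULL) == value;
}
```

The string "1.5" comes back as "1.5" because 1.5 happens to be exactly representable in binary, while "0.1" comes back as "0.10000000000000001". Going the other way, binary to text to binary, does recover the same double when enough digits are printed, so the loss is in the first step, not in the printing.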
 
