Rounding double

Gordon Burditt · Nov 24, 2007

Nowhere did the OP say that he wanted 100% accuracy!

The OP said (and I'm quoting this for the *third* time): "Does any body
know, how to round a double value with a specific number of digits after
the decimal points?"

Thus, rounding 0.33 to one decimal place should result in a result with
*one* decimal place, not a couple of dozen decimal places.

In other words, to round to one decimal place and get a result with
*one* decimal place, you round to the nearest multiple of 0.5. To
round to N decimal places, you round to the nearest multiple of
2.0**-N, where ** is an exponentiation operator.

Somehow I don't think this is what was meant.

Gordon Burditt · Nov 24, 2007

Quite - if you print 'em both to 19dp...

mark@thelinux clc_tests $ ./a.out 0.33
0.3300000000000000155: 0 decimals 0.0000000000000000000
0.3300000000000000155: 1 decimals 0.2999999999999999889
0.3300000000000000155: 2 decimals 0.3300000000000000155
0.3300000000000000155: 3 decimals 0.3300000000000000155

I especially like the second answer.

You still aren't printing enough digits by a long shot (hint: if
the last non-zero digit after the decimal point isn't 5, and the
value isn't an exact integer, it's not exactly representable in
binary floating point):

0.33 as double:
Before: 0.329999999999999960031971113494364544749259948730468750000000
Value: 0.330000000000000015543122344752191565930843353271484375000000
After: 0.330000000000000071054273576010018587112426757812500000000000

0.3 as double:
Before: 0.299999999999999933386618522490607574582099914550781250000000
Value: 0.299999999999999988897769753748434595763683319091796875000000
After: 0.300000000000000044408920985006261616945266723632812500000000

Your values do seem to match the closest representable value.

santosh · Nov 24, 2007

Mark McIntyre said:
jacob navia wrote:

(nothing of substance)

oh look. Another interminable thread in which Jacob insists he's
right, and everybody else, irrespective of pedigree, [ ... ]

?

<snip>

James Kuyper · Nov 24, 2007

Richard said:
James Kuyper said: ....

Indeed. That's why I said "almost valueless".

I still disagree, even with the 'almost'. Compilers that have a mode in
which they implement most features of C99 are now widely available.

Richard Heathfield · Nov 24, 2007

James Kuyper said:

I still disagree, even with the 'almost'. Compilers that have a mode in
which they implement most features of C99 are now widely available.

If a formal, verifiable list were available of which C99 features are
supported by *all* current mainstream hosted C implementations (including
the big iron compilers), then you'd have the makings of a point, because
it would then be possible to write portable C99 programs.

James Kuyper · Nov 24, 2007

Richard Heathfield wrote:
....

Like others here, I used gcc in non-conforming mode in an effort to compile

You deliberately chose the wrong non-conforming mode, when it was quite
clear what the correct non-conforming mode was.

the code. Anyone who claims he used gcc in *conforming* mode is mistaken,
since the code requires C99 features that are not available in any
conforming gcc mode. This is because gcc does not have a mode that
conforms to C99.

It has a mode which conforms to C99 pretty closely; well enough to make
this code work as intended. It is your free choice not to use that
mode for code which obviously requires it, simply because that mode is
not fully conforming. That doesn't qualify as a valid objection to
Jacob's code.

James Kuyper · Nov 24, 2007

Richard said:
jacob navia said:

So? It still works just fine with properly written, portable code.

For your specific definitions of "properly written" and "portable". It
fails other definitions, ones that allow for C99-specific features that
are correctly implemented by a wide variety of easily available compilers.

James Kuyper · Nov 24, 2007

Richard said:
James Kuyper said: ....

Then we are at the stage where the OP needs to clarify his requirements,
because it seems to me that the conventional interpretation that I and
some others here have used is diametrically opposite to your conventional
interpretation (or at least the conventional interpretation that you are
defending).

The convention I'm familiar with accepts that most floating point
operations are inexact, and therefore does not require wording that
explicitly addresses that inexactness. A norm of "best approximation
that can be calculated with reasonable efficiency" is considered to be
implicit unless there's an explicit statement establishing either
stronger or weaker accuracy requirements.

I'm not sure what your convention is; from this one example I would
guess that you assume that perfect accuracy is implied unless there are
explicit statements to the contrary. You apply this assumption even when
perfect accuracy is clearly impossible, thereby justifying your
assumption that the author was too stupid to be aware of the
impossibility. This might be the case, but a more plausible assumption
is that he was using the more normal convention.

These conventions are incompatible, but not diametrically opposed. The
diametrical opposite of your convention would be that perfect inaccuracy
is always considered to be implied unless there are explicit statements
to the contrary. The normal convention is intermediate between these two
extreme positions, making it harder to describe what its diametrical
opposite would be.

James Kuyper · Nov 24, 2007

Keith Thompson wrote:
....

Either (A) the OP really wants the exact result, and wasn't aware
that it's impossible, or (B) the OP wants an approximate result, and
wasn't aware that it's not particularly useful, or (C) the OP wants
an approximate result, and knows that it's actually useful to him
for some reason that none of us can figure out and he hasn't chosen
to share with us.

My guess is that (A) is the most likely scenario.

My guess would be (B).

Richard Heathfield · Nov 24, 2007

James Kuyper said:

Richard Heathfield wrote:
...

You deliberately chose the wrong non-conforming mode, when it was quite
clear what the correct non-conforming mode was.

Steady on there, James. "Deliberately"? That sounds a little heavy to me.
Perhaps you'd be so kind as to tell me what *you* think *my*
implementation's "correct" non-conforming mode is (setting aside, for the
time being, my view that "correct" and "non-conforming" contradict each
other). Once you've told me what options I should tell my implementation
to use, I'll happily use those options and report back to you.

It has a mode which conforms to C99 pretty closely; well enough to make
this code work as intended. It is your free choice not to use that
mode for code which obviously requires it, simply because that mode is
not fully conforming. That doesn't qualify as a valid objection to
Jacob's code.

You appear to have misunderstood my response. I was getting weird results
here, *knowing* that I'd invoked gcc in non-conforming mode, so I asked
other people to check the code out, to see whether the results could be
duplicated.

The portability of the code is a very minor issue, compared to the more
serious issue that it doesn't achieve the required objective even on
platforms where it runs as intended by its programmer.

Richard Heathfield · Nov 24, 2007

James Kuyper said:

The convention I'm familiar with accepts that most floating point
operations are inexact,

The convention I'm familiar with is that we believe what people say unless
or until we have reason to believe they are lying or mistaken. I believed
what the OP said. If you choose to believe that he was lying or mistaken,
that's entirely up to you.

Juha Nieminen · Nov 24, 2007

Richard said:
*value = (int)(*value * p + 0.5) / (double)p;

Using "int(value+.5)" is the wrong way to round because it works
incorrectly with negative values.
The correct way is "std::floor(value+.5)".

James Kuyper · Nov 24, 2007

Richard said:
James Kuyper said:

Steady on there, James. "Deliberately"? That sounds a little heavy to me.

You've made it clear that you're aware that the code depends upon C99
features, but as far as I can tell you've not yet attempted to compile
it with a compiler in a mode where that compiler supports those features
of C99. It's very hard for me to see that as something that you
did accidentally. This seems to be a deliberate expression of your
attitude that if full implementations of C99 are rare, there's no point
in making any use of any C99 features, no matter how widely available
compilers are which support those features.

Perhaps you'd be so kind as to tell me what *you* think *my*
implementation's "correct" non-conforming mode is (setting aside, for the
time being, my view that "correct" and "non-conforming" contradict each
other). Once you've told me what options I should tell my implementation
to use, I'll happily use those options and report back to you.

As far as I can tell from what you've said, you don't seem to possess a
compiler with a mode in which it supports the required features of C99.
Unless and until you're willing to install one which does, I don't think
there's anything I can tell you. I'm no compiler expert; if the one you
do choose to install is too obscure, I might still be unable to tell you
how to get it working with this code.

....

The portability of the code is a very minor issue, compared to the more
serious issue that it doesn't achieve the required objective even on
platforms where it runs as intended by its programmer.

Here I'm in agreement with you; Jacob's solution falls short of fully
implementing even the conventional interpretation of the OP's request,
much less your excessively strict interpretation. It matches the
conventional interpretation only for a range of values that is more
restricted than it needs to be. On the other hand, that range is
probably good enough for many of those rare (possibly non-existent)
situations where the OP's requested functionality is actually useful.
Personally, I expect that the negative values for the number of digits,
which it doesn't support, would be more likely to be useful than
positive numbers.

Joe Wright · Nov 24, 2007

Gordon said:
None of the floating-point numbers printed here can be represented
exactly in binary floating point, except zero. For debugging
purposes, I recommend a better output format, say %300.200f ,
enough to ensure that you can have enough digits to exactly represent
the number you get as a result (this is quite a bit more than
FLT_DIG, DBL_DIG, or LDBL_DIG, as appropriate to the type being
used). Or perhaps someone is using base-10 floating point.

I think you're asking too much here. Note..

00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
Exp = 1021 (-1)
111 11111111
Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
3.3000000000000002e-01

The first line is the 64-bit double, then I split it into exponent and
mantissa. The last line is the *printf format "%.16e" which will print
in decimal all the precision that a 64-bit double has.

Representing this value (or any double) wider than 17 decimal digits can
only yield nonsense.

James Kuyper · Nov 24, 2007

Richard said:
James Kuyper said: ....

The convention I'm familiar with is that we believe what people say unless
or until we have reason to believe they are lying or mistaken. I believed
what the OP said. If you choose to believe that he was lying or mistaken,
that's entirely up to you.

I believe that he was using English as it is normally used, with
infinitely many implicit assumptions. Failing to state assumptions that
are conventionally left unstated would make him neither mistaken nor a liar.

Richard Heathfield · Nov 24, 2007

Juha Nieminen said:

Using "int(value+.5)" is the wrong way to round because it works
incorrectly with negative values.

Please bear in mind that the above code was not mine. Earlier in that
article, I wrote:

"Elsethread, you were given this suggestion (suitably modified so that it
will actually compile, and with a driver added):"

and further on in that same article, I wrote:

"As you can see, it doesn't really round at all."

The correct way is "std::floor(value+.5)".

No, that doesn't work in C. So we adjust it to:

floor(value+.5)

which has the merit of compiling in C, and the disadvantage of being poor
style in C++.

But it still fails to round the value of a double to a specified number of
decimal places. On my system, floor(0.33 * 10.0 + 0.5) / 10.0 yields
0.299999999999999988898 - which is wrong in the first decimal place.

Richard Heathfield · Nov 24, 2007

James Kuyper said:

You've made it clear that you're aware that the code depends upon C99
features, but as far as I can tell you've not yet attempted to compile
it with a compiler in a mode where that compiler supports those features
of C99.

Right. I don't have such a compiler. So I did the best I could with the
features provided by gcc extensions. It is, of course, now clear from the
results I got that those gcc extensions were not compatible with the C99
features used by the code.

<snip>

Bart · Nov 24, 2007

Which is precisely the problem. If it is wanted for display then there
are better ways, if it is wanted for further calculations then it is
very important that the OP understand why it is not possible in general.

I see the argument is still going on.

If taken to its logical conclusion then:

double x = 0.1;

is not possible to do in general. In that case we can all give up.

The rounding problem has a solution which works within the limitations
of binary floating point, like many things.

Bart

#include <stdio.h>
#include <math.h>

int main(void)
{
    double x = 0.1;
    double y;

    /* how far is ten times the stored value from exactly 1.0? */
    y = (x * 10.0 - 1.0);

    printf("0.1*10 - 1.0 = %e\n", y);
    return 0;
}

Richard Heathfield · Nov 24, 2007

Bart said:

I see the argument is still going on.

If taken to its logical conclusion then:

double x = 0.1;

is not possible to do in general.

It's possible and legal to initialise x in this way. What we *can't* do
(and this is the trap that many programmers fall into) is now assume that
x stores a value that is exactly one-tenth.

In that case we can all give up.

No, there's no need to give up - we just have to be aware that we can't
always do with floating point representation the things we might like to
do, things that we *can* do with a textual representation. That doesn't
mean that floating point representation is useless. It just means that we
shouldn't expect it to do things that, by its very nature, it can't do.

Mathematicians have a similar problem, in that no sufficiently powerful
formal system can be both complete and consistent (both highly desirable
qualities) - but that doesn't stop mathematicians from using mathematics.
It just means they have to be careful *how* they use it. Similarly,
computer programmers need to be careful how they use floating point
representation.

<snip>

Gordon Burditt · Nov 24, 2007

pgp@medusa-s2:~/tmp$ gcc -std=c99 -pedantic -W -Wall -lm jn.c -ojn

I think you're asking too much here. Note..

No, I'm not. I want you to print out enough digits to get the
*EXACT* value of the result you actually got. This doesn't make
sense when you are interested in the value you are calculating, but
it does make sense when you are debugging floating-point rounding
problems. (You will never get an infinite repeating decimal taking
binary floating point values with finite mantissa bits and converting
them to decimal).

00111111 11010101 00011110 10111000 01010001 11101011 10000101 00011111
Exp = 1021 (-1)
111 11111111
Man = .10101 00011110 10111000 01010001 11101011 10000101 00011111
3.3000000000000002e-01

The first line is the 64-bit double, then I split it into exponent and
mantissa. The last line is the *printf format "%.16e" which will print
in decimal all the precision that a 64-bit double has.

But it's not enough to print the exact value you are getting. When
you are debugging rounding problems, why introduce *more* rounding
error that may obscure the problem you are trying to debug?

Representing this value (or any double) wider than 17 decimal digits can
only yield nonsense.

No, it's not nonsense. The value you *actually got* can be
represented exactly if you use enough digits. The value you should
have gotten in infinite-precision math, and taking into account the
accuracy of the inputs cannot be, and you have a point outside the
context of debugging rounding issues.
