Rounding double

Dik T. Winter · Nov 26, 2007

> I claim that this will deliver the best approximation to the
> rounding to n decimal places, and I have never claimed otherwise.

Does it? I think you need a whole lot of work to show that it is
the *best* approximation. You can claim that it gives a *good*
approximation.

> double roundto(double value, unsigned digits)
> {
>     long double v = value;
>     long double fv = fabs(value), p = powl(10.0L, digits);
>     if (fv > powl(10.0L, DBL_DIG) || digits > DBL_DIG)
>         return value;
>     return roundl(p * value) / p;
> }

In the last line there are two roundings; they can interact in such
a way that the final result is *not* the best approximation.

jacob navia · Nov 26, 2007

Ralf said:
It depends on the cleverness of such an implementation.
If such implementation does not also have the magic to forget the
extra declaration of atof when stdlib.h was not included, it
violates 7.1.3. The following program is strictly conforming.

#include <math.h>
static int atof = 4;
int main()
{
    return 0;
}

Ralf

Use lcc -ansic. In that case the declaration in math.h disappears.

James Kuyper · Nov 26, 2007

jacob said:
It is also in stdlib. It is in BOTH, and nothing is written about not
putting it in math.h

So, if I write the following strictly conforming code:

#include <stdlib.h>
int atof = 3;
int *pAtof(void) { return &atof;}

Will it compile correctly under your implementation?

7.1.3p1:

jacob navia · Nov 26, 2007

James said:
So, if I write the following strictly conforming code:

#include <stdlib.h>
int atof = 3;
int *pAtof(void) { return &atof;}

Will it compile correctly under your implementation?

7.1.3p1:

Yes, use lcc -ansic.

jacob navia · Nov 26, 2007

Dik said:
Does it? I think you need a whole lot of work to show that it is
the *best* approximation. You can claim that it gives a *good*
approximation.

In the last line there are two roundings, they can work in such
a way that the final result is *not* the best approximation.

Propose a better method then.

Dik T. Winter · Nov 26, 2007

>
> In the last line there are two roundings, they can work in such
> a way that the final result is *not* the best approximation.

Eh, actually there are even three roundings in that last line, of which
one is explicit.

James Kuyper · Nov 26, 2007

jacob said:
James Kuyper wrote: ....

That is the case. I changed the function to clean it up, and
its last version was:

#include <stdio.h>
#include <float.h>
#include <math.h>

double roundto(double value, unsigned digits)
{
    long double v = value;
    long double fv = fabs(value), p = powl(10.0L, digits);

    if (fv > powl(10.0L, DBL_DIG) || digits > DBL_DIG)
        return value;
    return roundl(p * value) / p;
}

That's better. You've still got potential for overflow on the
multiplication, but to be honest that's true of a lot of my code too.
However, I write scientific software where the range of possible values
is known and validated. Therefore, I only need to provide overflow
protection for those operations where overflow is an actual possibility.
As this is a utility program, it should prevent overflow for any pair of
arguments for which overflow is possible and preventable - it doesn't.

Also, providing support for negative values of 'digits' would be
trivial, and would significantly improve what little usefulness this
routine has (while creating a need to prevent denormalization at the
multiplication).

jacob navia · Nov 26, 2007

Richard said:
jacob navia said:

As you ought to know already, I don't see any point in trying to solve a
problem that is inherently impossible for reasons that I have already
explained.

We need to know why we're rounding. If we're dealing with, say, currency
(or some analogous system), the proper solution is to do calculations in
an integer unit of which all other currency units are a multiple (e.g. for
Sterling, use pennies; in the USA, use cents; in Europe, use Euros), and
to establish a protocol for dealing with calculations that don't fit into
this process (e.g. interest calculations). If we're dealing with
calculations that simply require a neatening off for display purposes, on
the other hand, then the proper solution is to round the text
representation, not the value itself.

I want to have the distance between point a and point b in meters,
and I have it in millimeters.
Or
I want to know how many euro cents I have with US$56.87 using
1.4655444 as exchange rate, or WHATEVER problem it is.

You say that the problem is impossible to solve EXACTLY.

I agree with that. My solution solves it inexactly, i.e. it
gives the (maybe) best approximation to the true value,
as usual with all floating point calculations.

That code is perfectly topical, of course - but it doesn't solve the
problem. Given this fact, the fact that it doesn't cater for those without
C99 compilers is of little consequence.

Who cares about those people?
There are many freely available C99 compilers. You for instance, you use
an obsolete version of gcc, and want to stay that way forever, frozen in
some distant past.

Your choice. I do not care about those people. I have told you how to
fix it and if you do not want (or you have an employer that forbids you
to upgrade your compiler and persist using the older buggy versions)
that's YOUR problem, not mine. I use standard C. Not some substandard
to please Mr Heathfield.

Data point: on my system, it gives very very very incorrect results (even
within the context that you've adopted: e.g. if I ask it to round 0.33 to
1dp I get -1.3543851449227162220), but then I don't have a C99 compiler,
merely a gcc implementation that provides non-C99-conforming extensions -
but clearly this is a separate issue, and one on which the opinions of
reasonable people are divided.

Yes. You were lying by omission, and it took some time for everyone to
realize that.

(For those who may well be thinking - and indeed have already expressed the
thought - that I should "get a C99 compiler then - or at least a compiler
that supports many C99 features", my position is this: Many professional C
programmers do not get to choose the implementation they are using.

Poor Mr Heathfield! You must have a terrible employer. I am thinking
of starting a petition to him to allow you to upgrade your gcc
installation, which is older than 1999.

In
many software environments, the decision about which compiler to use was
made long ago for reasons that do not count "having the latest C99 stuff"
as being particularly important when measured against more important
stability criteria.

Yeah. You have missed all the bug fixes of gcc for the last 7 years.

GREAT!

Kai-Uwe Bux · Nov 26, 2007

Actually, it is not possible to implement a rounding method that is
guaranteed to give the _best approximation_ in standard C nor in standard
C++. In order to actually do that, you would need some guarantees of the
underlying floating point arithmetic. Unless you have those guarantees
(e.g., that a/b is the best approximation to the quotient of a and b) you
will have a hard time to show that a rounding function yields the best
approximation to the result. The problem now is, of course, that neither C
nor C++ make such guarantees about floating point arithmetic.
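
C99 does offer an optional hook here: an implementation that defines __STDC_IEC_559__ claims conformance to Annex F (IEC 60559 arithmetic), and FLT_EVAL_METHOD in <float.h> reports whether expressions may be evaluated in a wider format. The following is a sketch of how a program might query both; the function names are hypothetical, and the macros only report what the implementation claims about itself.

```c
#include <float.h>

/* Returns 1 if the implementation claims IEC 60559 (IEEE 754) arithmetic
   by defining __STDC_IEC_559__, 0 otherwise. */
int claims_iec559(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}

/* Returns FLT_EVAL_METHOD where <float.h> provides it (C99), else -1,
   the value C99 itself uses for "indeterminable". */
int eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    return FLT_EVAL_METHOD;
#else
    return -1;
#endif
}
```

A portable rounding routine cannot rely on either result being favourable, which is the point made above.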

Propose a better method then.

Even if nobody should be able to provide a better method, that still would
not imply that the method you proposed yields best approximations to the
rounding results.

Best

Kai-Uwe Bux

jacob navia · Nov 26, 2007

Dik said:
Eh, actually there are even three roundings in that last line, of which
one is explicit.


Yes.
First one is explicit:

roundl(p*value)

Then you divide by the power of ten and round the result to the nearest
ulp. Then you convert the long double result into double and round THAT
to the nearest ulp, and return that result.

The idea is that all those roundings are done in HIGHER precision than
what double offers, and will NOT affect the result.

Do you have any objection to this supposition?

James Kuyper · Nov 26, 2007

Richard said:
jacob navia said:

As you ought to know already, I don't see any point in trying to solve a
problem that is inherently impossible for reasons that I have already
explained.

Are you asserting that the specification Jacob just described is
inherently impossible to implement? As far as I can see, the reasons
you've already explained do not apply to this specification.

I was under the impression that the only thing you could say against
this specification is that it doesn't match your own unconventionally
strict interpretation of the OP's specification.

Bart · Nov 26, 2007

We need to know why we're rounding. If we're dealing with, say, currency
(or some analogous system), the proper solution is to do calculations in
an integer unit of which all other currency units are a multiple (e.g. for
Sterling, use pennies; in the USA, use cents; in Europe, use Euros), and
to establish a protocol for dealing with calculations that don't fit into
this process (e.g. interest calculations). If we're dealing with
calculations that simply require a neatening off for display purposes, on
the other hand, then the proper solution is to round the text
representation, not the value itself.

Not many posting here seem to believe there are real and practical
reasons for rounding values to so many decimals (or rounding to a
nearest fraction, a related problem).

Currency seems the most understood, and storing values in floating
point as dollars/pounds/euros, and rounding intermediate calculations
to the nearest cent/penny (0.01) works perfectly well for a typical
shop or business invoice. For large banks adding up accounts for
millions of customers, government and so on, I'm sure they have their
specialist developers.

Another example is CAD (drawing tools) where input is inherently noisy
(23.423618182 mm) and the usual practice is to round ('snap') to the
nearest aesthetic value, depending on zoom factor, scale and so on,
giving 23, 23.5, 23.42 and so on. Otherwise you would get all sorts of
skewy lines.

There is still a noise factor present (the errors we've been
discussing) but you would need to zoom in by a factor of a billion to
see them. In typical printouts things look perfect. In fact it's
interesting to zoom in and see these errors come to life on the
screen.

Also often everything is stored as, say, millimetres, while the user
might be using inches, and rounding would need to be as inches (say
hundredths of an inch, which would be the nearest multiple of 0.254),
again an approximation but works well enough (this allows designs
created with different units to be combined).

Actually rounding for printing purposes is probably not done much
outside printf() and such functions. In fact, perhaps it is the very
fact that printf() rounds floating point numbers, and therefore shows
a value that is only an approximation, that gives rise to much
misunderstanding. Maybe it should indicate (with a trailing ? perhaps)
that the value printed is not quite right unless explicitly told to
round.
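
The effect is easy to observe by formatting the same double at two precisions. This is a sketch with a hypothetical helper name, format_decimals, using C99 snprintf(); how many digits the long form shows correctly depends on the C library.

```c
#include <stdio.h>

/* Writes value into buf with the requested number of decimals. */
int format_decimals(char *buf, size_t size, double value, int decimals)
{
    return snprintf(buf, size, "%.*f", decimals, value);
}
```

On a typical IEEE 754 system, 0.1 formatted with one decimal reads "0.1", while the same value formatted with twenty decimals shows non-zero trailing digits, because the stored double is not exactly one tenth.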

Bart

jacob navia · Nov 26, 2007

James said:
That's better. You've still got potential for overflow on the
multiplication, but to be honest that's true of a lot of my code too.
However, I write scientific software where the range of possible values
is known and validated. Therefore, I only need to provide overflow
protection for those operations where overflow is an actual possibility.
As this is a utility program, it should prevent overflow for any pair of
arguments for which overflow is possible and preventable - it doesn't.

I disagree.

The test fv > powl(10.0L, DBL_DIG) ensures that the absolute value
of "value" is less than 10 ^ 16. If "digits" is 15, the maximum
value of the multiplication can be 10 ^ 15 * 10 ^ 15 == 10 ^ 30,
a LOT less than the maximum value of double precision, DBL_MAX, which is
1.7976931348623157e+308. Note that the calculations are done in long
double precision, where LDBL_MAX is 1.18973149535723176505e+4932L, so
the overflow argument is even less valid there. On other systems
LDBL_MAX is even higher, since they use 128 bits and not 80 bits as the
80x86 does. For systems where long double is the same as double
precision, the value is well within range anyway.

Hence, the multiplication can't overflow.

If we accepted negative values of "digits", dividing 10 ^ 30 by 10 ^ -15
would give 10 ^ 45, still LESS than the value of DBL_MAX. Hence the
division can't overflow, and since my function does not accept negative
values, the division result will always be less than 10 ^ 30.

The rounding of long double to double can't overflow either since it is
always less than 10 ^ 308.
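
The same bound can be restated using only the <float.h> macros. This is a sketch with a hypothetical function name; on IEEE 754 doubles DBL_DIG is 15 and DBL_MAX_10_EXP is 308, so the check passes with a wide margin.

```c
#include <float.h>

/* 10^DBL_DIG * 10^DBL_DIG == 10^(2*DBL_DIG). DBL_MAX_10_EXP is the largest
   n such that 10^n is a finite double, so the product is in range whenever
   the exponent 2*DBL_DIG does not exceed DBL_MAX_10_EXP. */
int scaled_product_in_range(void)
{
    return 2 * DBL_DIG <= DBL_MAX_10_EXP;
}
```

Note the two different macros: DBL_DIG counts decimal digits of precision, while DBL_MAX_10_EXP bounds the decimal exponent.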

jacob navia · Nov 26, 2007

Kai-Uwe Bux said:
Actually, it is not possible to implement a rounding method that is
guaranteed to give the _best approximation_ in standard C nor in standard
C++. In order to actually do that, you would need some guarantees of the
underlying floating point arithmetic. Unless you have those guarantees
(e.g., that a/b is the best approximation to the quotient of a and b) you
will have a hard time to show that a rounding function yields the best
approximation to the result. The problem now is, of course, that neither C
nor C++ make such guarantees about floating point arithmetic.

Even if nobody should be able to provide a better method, that still would
not imply that the method you proposed yields best approximations to the
rounding results.

True. I weaken my claim to say that it is the best approximation
presented so far.

James Kuyper · Nov 26, 2007

James said:
jacob navia wrote: ....

So, if I write the following strictly conforming code:

#include <stdlib.h>

That should, of course, have been <math.h>. I must not have been fully
awake yet.

James Kuyper · Nov 26, 2007

jacob said:
James Kuyper wrote: ....

I disagree.

The test fv > powl(10.0L, DBL_DIG) ensures that the absolute value
of "value" is less than 10 ^ 16. If "digits" is 15, the maximum
value of the multiplication can be 10 ^ 15 * 10 ^ 15 == 10 ^ 30,

Sorry, for some reason I was confusing DBL_DIG with DBL_MAX_10_EXP. I
use neither macro frequently enough to have memorized which one is
which; I should have checked before I said anything.

James Kuyper · Nov 26, 2007

Bart wrote:
....

Not many posting here seem to believe there are real and practical
reasons for rounding values to so many decimals (or rounding to a
nearest fraction, a related problem).

Incorrect. What I believe is that the real and practical reasons tend to
fall into two categories:

a) Conversion of floating point numbers to digit strings, usually for
output.

b) Calculations that should, properly, be carried out in fixed-point
arithmetic. In the absence of direct language support for fixed-point,
it should be emulated by the programmer using, for instance, an integer
to represent 1000 times the actual value, if that value is to be stored
with 3 digits after the decimal place. All of the examples you gave
should fall into this second category.

There's probably at least one additional category, but I can't think of
any right now.
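
Category (b) might be emulated along the following lines. This is a minimal sketch, not code from the thread: the type name fixed3 and both function names are hypothetical, it assumes C99 (long long, snprintf), and it ignores overflow of the underlying integer.

```c
#include <stdio.h>

/* Hypothetical fixed-point type: the stored integer is 1000 times the value. */
typedef long long fixed3;

/* Builds a non-negative value from whole units and thousandths (0..999). */
fixed3 fixed3_make(long long whole, int thousandths)
{
    return whole * 1000 + thousandths;
}

/* Formats v with exactly three digits after the decimal point. */
int fixed3_format(char *buf, size_t size, fixed3 v)
{
    const char *sign = (v < 0) ? "-" : "";
    unsigned long long mag = (v < 0) ? 0ULL - (unsigned long long)v
                                     : (unsigned long long)v;

    return snprintf(buf, size, "%s%llu.%03llu", sign, mag / 1000, mag % 1000);
}
```

Addition and subtraction are then plain integer operations and are exact; only multiplication and division need an explicit rounding rule.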

James Kuyper · Nov 26, 2007

jacob navia wrote:
....

The idea is that all those roundings are done in HIGHER precision than
what double offers, and will NOT affect the result.

You're assuming that long double has greater precision than double.
That's not required. That's one reason why I prefer the approach that
uses sprintf() and sscanf().
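
One possible shape for a text-based approach is sketched below. It is an illustration only, not the code discussed earlier in the thread: the name roundto_text and the buffer size are assumptions, it uses C99 snprintf() for safety, and the quality of the result depends on how accurately the library converts between decimal text and double.

```c
#include <float.h>
#include <stdio.h>

/* Rounds value to the given number of decimals by formatting it as text
   and reading the text back. Returns value unchanged on any failure. */
double roundto_text(double value, unsigned digits)
{
    char buf[512];   /* assumed large enough for DBL_MAX printed with %f */
    double result;
    int n;

    if (digits > DBL_DIG)
        return value;
    n = snprintf(buf, sizeof buf, "%.*f", (int)digits, value);
    if (n < 0 || (size_t)n >= sizeof buf)
        return value;
    if (sscanf(buf, "%lf", &result) != 1)
        return value;
    return result;
}
```

This version never uses long double, so it does not depend on long double being wider than double.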

Dik T. Winter · Nov 26, 2007

>
> Propose a better method then.

Do you not understand what I complain about? It is your remark that
it gives the *best* approximation. This is a wrong claim. I think
that if you actually *want* to return the best approximation, it will
be a lot of work, and it is doubtful whether it will be useful at all.
So your *good* (not *best*) approximation may be optimal in the sense
of effort vs. result.

Bart · Nov 26, 2007

Bart wrote:

...

Incorrect.

As I said..

What I believe is that the real and practical reasons tend to
fall into two categories:

a) Conversion of floating point numbers to digit strings, usually for
output.

b) Calculations that should, properly, be carried out in fixed-point
arithmetic. ....

All of the examples you gave should fall into this second category.

But it isn't necessary. The examples were from actual code that worked
well.

If I invest $1000 at 5.75% for 5 years I will get
$1322.51887874443359375 at the end. If my interest calculating
function rounds that to 1322.51 (rounding down in this case) so that
the user of my function will see 1322.510000.. at most precision
settings he prints at, that seems perfectly acceptable.

But it seems this thread is less concerned about the 0.008878.. cents
he's not seeing, than about the million billionth of a cent that the
51 cents differs from exactly 51 cents.

Bart
