Trouble with integer floating point conversion

Mark Bluemel · Dec 13, 2007

Chris said:
It also notes, in passing, that setting the precision field in
the FPU control word does *not* affect the exponent range. This
means that the in-FPU computations are *still* not quite what
one might expect from an IEEE implementation.

I have the same note (with an example) at the
end of <http://web.torek.net/torek/c/numbers.html>.

As someone else noted else-thread, one can get "proper" IEEE floating
point on the x86 using gcc's "-ffloat-store" flag. This does,
however, slow down computation.

And still doesn't work for the OP's testcase :-

$ cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

int main (void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1-GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}

$ cc -ffloat-store gr.c -o gr

$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17

As I commented else-thread -ffloat-store doesn't appear to alter how
a floating point return value from a function (presumably in a register)
is handled. This is probably covered by the comment in the manual page
:-

... "a few programs rely on the precise definition of IEEE floating
point. Use -ffloat-store for such programs, _after modifying them to
store all pertinent intermediate computations into variables_."

Ben Bacarisse · Dec 13, 2007

Mark Bluemel said:
Chris Torek wrote:

And still doesn't work for the OP's testcase :-

$ cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

int main (void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1-GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}

$ cc -ffloat-store gr.c -o gr

$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17

A data point. It does here:

$ gcc -ffloat-store gr.c -o gr
$ ./gr
5.000000000000000000e-01; 0.000000000000000000e+00
$ gcc gr.c -o gr
$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17
$ gcc --version
gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
<snip>

Flash Gordon · Dec 13, 2007

jacob navia wrote, On 13/12/07 09:03:

This is typical of the "regulars". Talk to say nothing. That part
of the standard has not anything to say about the precision
of intermediate results (what we are talking about here)
but about how big the LIMITS of the floating point representation
are.

It also talks about precision. It even gives a way to test it.

To the subject of precision, the standard says:
<quote>
The accuracy of the floating-point operations (+, -, *, /) and of the
library functions in <math.h> and <complex.h> that return floating-point
results is implementation defined.

The implementation may state that the accuracy is unknown
<end quote>

To be precise, accuracy and precision are not the same.

i.e. IT CONFIRMS what I am saying. There aren't any guarantees at all.

If there isn't even an accepted standard that is enforced, how
can you guarantee anything.

Question Mr "flash"

If you are going to use the honorific it is "Mr Gordon". If you are
trying to imply I am anonymous just do a whois lookup on my domain (it
is obvious if you look that it is a personal domain). I will even give
you a link for it http://whois.domaintools.com/flash-gordon.me.uk

How could the standard guarantee ANYTHING about the precision of
floating point calculations when it doesn't even guarantee a common
standard?

Yes I am seriously saying that the absence of an enforced standard
makes any guarantee IMPOSSIBLE.

Please prove the contrary.

Easy, it provides information about precision, just not accuracy.

Importantly, it provides a mechanism to *test* the precision, so if your
method is appropriate you can apply it only if it is required.

How learned you are Mr Gordon....

Anything else?

Like all the "regulars", you speak with word games. If you have anything
else to propose please say it clearly. If not, say clearly
what "adverse side effects" would happen with my proposition.

That depends on what the library depends on and what it does. Some
algorithms become unstable if too high or too low a precision or
accuracy is used. There is no guarantee that the library, specifically
the maths function in it, don't use any such algorithms. For other
algorithms, the optimum value of various constants depends on the
precision and accuracy with which the calculations are done.

NOTHING is guaranteed by the standard. See above

Yes, see above.

This is stupid since the library can't change the way the program is
compiled. If you substitute the call to the user function by a call to
a library function the problem remains THE SAME. You fail to grasp what
the problem is.

You fail to grasp that a high quality extended precision maths library
is likely to have key parts implemented in assembler for common targets,
and by doing so it *can* guarantee precision and accuracy.

Note that GMP which one person suggested states on the from page, "with
highly optimized assembly code for the most common inner loops for a lot
of CPUs". It goes on to say in the manual, "All calculations are
performed to the precision of the destination variable. Each function is
defined to calculate with 'infinite precision' followed by a truncation
to the destination precision, but of course the work done is only what's
needed to determine a result under that definition." So it provides
precisely the guarantees you say are impossible. So GMP will give
guaranteed behaviour in a more portable manner than your suggestion,
although there are other costs to using GMP,

This is not necessary, you just change the precision that's all!

Unless it causes problems. Or some of the library functions change the
precision (e.g. to increase stability) and then set it to what the
implementation assumes it is normally set to.

This is an example of how the "regulars" spread nonsense just with the
only objective of "contradicting jacob"
No.

as they announced here
yesterday.

No one announced that.

Alok Singhal · Dec 14, 2007

And still doesn't work for the OP's testcase :-

$ cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

int main (void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1-GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}

$ cc -ffloat-store gr.c -o gr

$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17

This is OT, but the above happens because GRID_POINT(2) in the second call
is an 'unnamed temporary', and is not necessarily flushed to the memory by
gcc (even with -ffloat-store).

See http://arxiv.org/abs/cs/0701192 - "The pitfalls of verifying
floating-point computations".

-Alok

Chris Torek · Dec 14, 2007

And still doesn't work for the OP's testcase ... [snip code; but it

reappears below, modified slightly]

As I commented else-thread -ffloat-store doesn't appear to alter how
a floating point return value from a function (presumably in a register)
is handled.

This is allowed by the Standard -- you have to store the result
in a temporary variable, or use a cast, to trim off "extra"
precision.

This is probably covered by the comment in the manual page :-
... "a few programs rely on the precise definition of IEEE floating
point. Use -ffloat-store for such programs, _after modifying them to
store all pertinent intermediate computations into variables_."

If gcc fails to store the number through memory (on the x86, that
is; this is how -ffloat-store trims the excess precision and
exponents) on a cast, that would be a bug.

On my Linux box here, the bug is not showing up, with the same
compilation options you used, or indeed any others. Then again,
I probably have a different Linux, different version of gcc, and
different libraries. (In particular I think this Linux and/or
libc defaults C programs to double, not extended, precision in
the FPU control word.)

Here is the modified test-case; compile with -DCAST_AWAY_EXCESS to
test the effect of adding a cast.

% cat gr.c
#include <stdio.h>

double GRID_POINT(int i) {
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );
}

#ifdef USE_CAST
#define CAST_AWAY_EXCESS (double)
#else
#define CAST_AWAY_EXCESS /* nothing */
#endif

int main(void) {
double temp1 = GRID_POINT(2);
double temp2 = temp1 - CAST_AWAY_EXCESS GRID_POINT(2);

printf("%.18e; %.18e\n", temp1, temp2);

return 0;
}
% cc -o gr -ffloat-store gr.c
% ./gr
5.000000000000000000e-01; 0.000000000000000000e+00

Mark Bluemel · Dec 14, 2007

Ben said:
Mark Bluemel said:

Chris Torek wrote:
And still doesn't work for the OP's testcase :-

Click to expand...

[Snip]

A data point. It does here:

$ gcc -ffloat-store gr.c -o gr
$ ./gr
5.000000000000000000e-01; 0.000000000000000000e+00
$ gcc gr.c -o gr
$ ./gr
5.000000000000000000e-01; -2.775557561562891351e-17
$ gcc --version
gcc (GCC) 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)

Looks like I need a later gcc then

Flash Gordon · Dec 14, 2007

CBFalconer wrote, On 13/12/07 15:05:

Please describe how you implement infinite precision in a finite
number of bits.

That has nothing to do with James's comment. You do not need infinite
precision to get guaranteed accuracy. You do not need infinite precision
to get high precision.

jameskuyper · Dec 14, 2007

CBFalconer said:
Please describe how you implement infinite precision in a finite
number of bits.

Do you have any particular reason for posing such an absurd challenge?
It's not relevant to any point that I, or anyone else, has made so far
in this discussion. Maybe if you explain more what you're looking for,
I might be able to help.

Keith Thompson · Dec 15, 2007

CBFalconer said:
Please describe how you implement infinite precision in a finite
number of bits.

I think you misunderstood James's use of the phrase "inaccurate
value".

For example, given a precision of 5 digits after the decimal point,
3.14159 is an accurate value of pi (to within the stated precision);
3.12345 is an innaccurate value of pi. Infinite precision is not
required for what's being discussed.

Barry Schwarz · Dec 16, 2007

Hi all!

The following piece of code has (for me) completely unexpected behaviour.
(I compile it with gcc-Version 4.0.3)
Something goes wrong with the integer to float conversion.
Maybe somebody out there understands what happens.
Essentially, when I subtract the (double) function value GRID_POINT(2) from
a variable which has been assigned the same value before this gives a
non-zero result and I really do not understand why.

The program prints
5.000000000000000000e-01; -2.775557561562891351e-17;
0.000000000000000000e+00

And the comparison
if(temp1==GRID_POINT(2))
has negative outcome.

When I replace
((double)(i)) by 2.0
everything behaves as expected. So the output is
5.000000000000000000e-01; 0.000000000000000000e+00; 0.000000000000000000e+00

But: even if the integer to float conversion is inexact (which, I think,
should not be the case) something like
temp1 = GRID_POINT(2);
temp3 = temp1-GRID_POINT(2);
should still result in temp3==0.0, whatever function GRID_POINT does.

What do You think?

I think you should output the internal representation of the doubles
in hex so you can see the actual results of your computation, not what
printf thinks it should display.

Thank you!

---------------------------------------------------
#include <stdio.h>

double GRID_POINT(int i);

double GRID_POINT(int i)
{
return ( 0.1 + ( (80.1-0.1)/(400.0) )*((double)(i)) );

What purpose do you think the cast serves?

All the parentheses are superfluous except the pair around the
subtraction.

Do you really think 80.1 - .1 is exactly equal to 80.0?

Even if it is on your system, do you think 80.0/400.0 is exactly equal
to 0.2? Can any floating point value be exactly equal to 0.2 on your
system? (It is possible on mine but very few other people here have a
system with decimal floating point; most use binary.)

}

int main (void) {

double temp1, temp2, temp3;

temp1 = GRID_POINT(2);
temp2 = GRID_POINT(2);
temp3 = temp1-GRID_POINT(2);

printf("%.18e; %.18e; %.18e\n", temp1, temp3, temp1-temp2 );

if(temp1==GRID_POINT(2)){
printf("these two are equal\n");
}

return 0;
}

Remove del for email

Fibonacci	0	May 13, 2023
Java OpenJDK Floating Point Dare	3	Jan 17, 2023
Crossword	2	May 11, 2020
Crossword	14	May 13, 2020
Drawing missing in bitmap in a pure C win32 program	4	Jun 3, 2023
Floating point to integer casting	48	Oct 10, 2009
floating point	13	Jul 24, 2012
Java Problems (Really Need Help!)	2	Jan 27, 2014

Trouble with integer floating point conversion

Mark Bluemel

Ben Bacarisse

Flash Gordon

Alok Singhal

Chris Torek

Mark Bluemel

Flash Gordon

jameskuyper

Keith Thompson

Barry Schwarz

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads