Typecast long double->double seems to go wrong

Michael Mair · May 24, 2004

Hi there,

actually, I have posted the same question in g.g.help.
As there were no answers, I am still not sure whether this
is a bug or only something open to the compiler that is
seemingly inconsistent or whether my understanding of C
is not complete enough.
I would appreciate answers or pointers to answers very much.

Cheers,
Michael

---------original post to g.g.help-----------------------

for a C course I gave the students the task to compute
FLT_EPSILON, DBL_EPSILON and LDBL_EPSILON.
We are using gcc 3.2 and gcc 3.4 on Linux machines.

The following

eps_float = 1.0F;
while (1.0 + eps_float != 1.0)
    eps_float /= 2.0;

eps_float *= 2.0;

will give the students not FLT_EPSILON, but LDBL_EPSILON
in float precision, as the expressions in the condition for
while are evaluated in long double precision.
Now we apply the necessary type casts

eps_float = 1.0F;
while ((float)(1.0 + eps_float) != (float)1.0)
    eps_float /= 2.0;

eps_float *= 2.0;

and everything works fine.
Right. Up to now, everything is fine and completely to my
understanding. But if I try the same with doubles, I get
LDBL_EPSILON even with the cast. (Example provided below)

Is this a problem in gcc or am I missing some finer points
of C? I tried RTFM and STFW but probably looked in the wrong
places.
It would be great if someone could tell me what is going
wrong or point me in the right direction.

Cheers,
Michael

gcc -Wall -std=c99 -pedantic castepsilon.c
--------------------------castepsilon.c-------------------
#include <stdio.h>
#include <float.h>

int main (void)
{
    float eps_float = 1.0F;
    double eps_double_test = 1.0, eps_double = 1.0;
    long double eps_long_double = 1.0L;

    // find FLT_EPSILON
    while ((float)(1.0 + eps_float) != (float)1.0)
        eps_float /= 2.0;

    eps_float *= 2.0;

    // find DBL_EPSILON
    while ((double)(1.0 + eps_double_test) != (double)1.0)
        eps_double_test /= 2.0;

    eps_double_test *= 2.0;

    // alternative way
    for ( double test=1.0 + eps_double; test != (double)1.0;
          test=1.0 + eps_double)
        eps_double /= 2.0;

    eps_double *= 2.0;

    // find LDBL_EPSILON
    while ((long double)(1.0 + eps_long_double) != (long double)1.0)
        eps_long_double /= 2.0;

    eps_long_double *= 2.0;

    // Output
    printf("Epsilon is --");
    printf(" exact val\n");
    printf(" - for float : %8.7g -- %8.7g\n",
           eps_float, FLT_EPSILON);
    printf(" - for double (1) : %17.16g -- %17.16g\n",
           eps_double_test, DBL_EPSILON);
    printf(" - for double (2) : %17.16g -- %17.16g\n",
           eps_double, DBL_EPSILON);
    printf(" - for long double: %20.19Lg -- %20.19Lg\n\n",
           eps_long_double, LDBL_EPSILON);

    return(0);
}
-------------------------------------------------------------------
mairml@cip20:~/C/test> ./a.out
Epsilon is -- exact value --
- for float : 1.192093e-07 -- 1.192093e-07
- for double (1) : 1.084202172485504e-19 -- 2.220446049250313e-16
- for double (2) : 2.220446049250313e-16 -- 2.220446049250313e-16
- for long double: 1.084202172485504434e-19 -- 1.084202172485504434e-19
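
The post attributes the result to expressions being evaluated in long double precision. C99 exposes the compiler's own statement about this through the FLT_EVAL_METHOD macro in <float.h>. The following is only a sketch for inspecting that macro; the function name describe_eval_method is made up for illustration, and the thread does not establish whether the gcc versions mentioned here report the value faithfully.

```c
#include <float.h>

/* Hypothetical helper: translate the C99 macro FLT_EVAL_METHOD into a
   human-readable description of how floating-point expressions are
   evaluated according to the compiler. */
const char *describe_eval_method(void)
{
#ifdef FLT_EVAL_METHOD
    switch (FLT_EVAL_METHOD) {
    case 0:
        return "operations are evaluated in the range and precision of their type";
    case 1:
        return "float and double operations are evaluated as double";
    case 2:
        return "all operations are evaluated as long double";
    default:
        return "indeterminable or implementation-defined";
    }
#else
    return "FLT_EVAL_METHOD is not provided by this <float.h>";
#endif
}
```

Printing the returned string with puts() next to the computed epsilons makes it easier to see which evaluation mode a given set of compiler options claims to use.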

Tim Prince · May 24, 2004

You might check how the enquire.c program performs these tests, in order to
be less dependent on the interpretations of individual compilers, and
distinguish between precision of evaluation and precision of declared type.

I find that gcc-3.3.4 on Windows gives the results you quote for your
sample. If I add -march=pentium4 -mfpmath=sse, with or without -O, all
double casts do narrow the precision, but if I add -O without sse
or -ffloat-store, neither is narrowed. I think the only advice you will
find in the gcc references is to use either the sse or -ffloat-store
options.

I don't know of any CPU still in production which does not support fixed
53-bit precision, like SSE2, so the fact that gcc might be lacking in
support for such a feature on certain older CPUs is nearly moot. There may
have been optimization reasons for gcc behaving as it did on certain CPUs;
many programmers meant "at least double" when they cast to double, and
specific language features to support that never became widely accepted.

I grant that I might normally avoid the sse option on Pentium-M, but even on
that CPU it is the more efficient method to accomplish what you ask.

The gcc options which produce the behavior to which you object are quite
unlikely to be used on the x86-64 OS, and are contrary to the Windows-64
ABI, so I don't see anyone changing gcc at this late date.
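
Independent of compiler options, the "alternative way" loop in castepsilon.c already gave the expected DBL_EPSILON because the sum was assigned to a double variable before the comparison. A minimal sketch of that idea follows. The volatile qualifier is an addition not found in the thread; it is assumed here as a way to keep an optimizer from holding the sum in a wider register. The function names are made up for illustration.

```c
#include <float.h>

/* Hypothetical helper: find the smallest power of two eps for which
   1.0 + eps, once stored in a double object, still differs from 1.0. */
double eps_double_via_store(void)
{
    double eps = 1.0;
    /* Storing into a volatile double forces the sum to be rounded to
       the double format before it is compared. */
    volatile double sum = 1.0 + eps;

    while (sum != 1.0) {
        eps /= 2.0;
        sum = 1.0 + eps;
    }
    return eps * 2.0;   /* undo the last halving */
}

/* Same approach for float. */
float eps_float_via_store(void)
{
    float eps = 1.0F;
    volatile float sum = 1.0F + eps;

    while (sum != 1.0F) {
        eps /= 2.0F;
        sum = 1.0F + eps;
    }
    return eps * 2.0F;
}
```

On an IEEE 754 system the two functions should return DBL_EPSILON and FLT_EPSILON, whether the arithmetic itself is carried out in extended or in declared precision.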

Martin Dickopp · May 24, 2004

Michael Mair said:
for a C course I gave the students the task to compute
FLT_EPSILON, DBL_EPSILON and LDBL_EPSILON.

[...]
Now we apply the necessary type casts

eps_float = 1.0F;
while ((float)(1.0 + eps_float) != (float)1.0)
    eps_float /= 2.0;

eps_float *= 2.0;

and everything works fine.
Right. Up to now, everything is fine and completely to my
understanding. But if I try the same with doubles, I get
LDBL_EPSILON even with the cast. (Example provided below)

See <[email protected]>.

Martin

Tim Prince · May 24, 2004

I think the only advice you will
find in the gcc references is to use either the sse or -ffloat-store
options.

For older CPUs, you could set the x87 precision mode to 53 bits. Last I
looked, that was the procedure described for SPEC benchmarks on Athlon, on
the SuSE site.
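
One way to set the x87 precision mode on Linux is through glibc's <fpu_control.h>. The sketch below assumes glibc on an x86 target and compiles to a no-op elsewhere. It is not necessarily the procedure described on the SuSE site, and the function name set_x87_double_precision is made up for illustration.

```c
#include <stdlib.h>   /* any glibc header; makes __GLIBC__ visible */

#if defined(__GLIBC__) && (defined(__i386__) || defined(__x86_64__))
#include <fpu_control.h>
#define HAVE_X87_CW 1
#else
#define HAVE_X87_CW 0
#endif

/* Hypothetical helper: switch the x87 precision-control field to
   53-bit (double) precision.  Returns 1 if the control word was
   changed, 0 if this platform is not covered by the sketch. */
int set_x87_double_precision(void)
{
#if HAVE_X87_CW
    fpu_control_t cw;

    _FPU_GETCW(cw);                          /* read current control word */
    cw = (fpu_control_t)((cw & ~_FPU_EXTENDED) | _FPU_DOUBLE);
    _FPU_SETCW(cw);                          /* write it back             */
    return 1;
#else
    return 0;
#endif
}
```

The setting applies to all x87 arithmetic, including long double, so after calling it the long double loop from castepsilon.c would no longer be expected to reach LDBL_EPSILON. It narrows only the significand, not the exponent range.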

Michael Mair · May 24, 2004

Hi Tim,

thank you for your extensive response!

So, if I may sum it up as I understood it: The inconsistency is a
"problem" which most of the time is a feature as the compiler uses
the available precision for computations and comparisons on variables
kept in the registers. As double is the "natural" floating point
data type, it is in some sense also "natural" to use the "natural"
floating point precision of the FPU or whatever, neglecting the
expected values from the IEEE standard.

The gcc option -ffloat-store (which I managed to not find in the man
page even though it's there) switches off this "unwanted" behaviour,
discarding the excess precision.

The whole thing does not apply to floats as they already are heavily
restricted when compared against the abilities of modern
cpus/fpus/whatever, thus the cast works as intended.

So, the only "inconsistency" is the fact that the symbolic constant
DBL_EPSILON refers to the IEEE double instead of the actual double
data type when performing arithmetic operations and comparisons.
However, as soon as we get out of the register or have an explicit
assignment of the value to a double variable (or put it on the stack
as in the message Martin referred to), we automatically lose the excess
precision as we go back to 64 bits, so DBL_EPSILON is consistent in
this case and respect.

I hope that was not too convoluted...

Thank you for helping me understand

Best regards,
Michael
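
The point that DBL_EPSILON describes the stored double format, not the precision of intermediate registers, can be checked arithmetically. The C standard defines the epsilon constants as b^(1-p), where b is FLT_RADIX and p is the number of significand digits (DBL_MANT_DIG for double). A small sketch, with a function name made up for illustration:

```c
#include <float.h>

/* Hypothetical helper: build FLT_RADIX^(1 - DBL_MANT_DIG) by repeated
   division.  Each division by the radix is exact, so excess precision
   in intermediate results does not change the outcome. */
double eps_from_mant_dig(void)
{
    double eps = 1.0;
    int i;

    for (i = 1; i < DBL_MANT_DIG; i++)
        eps /= FLT_RADIX;

    return eps;
}
```

With a 53-bit significand this is 2^-52, which is the 2.220446049250313e-16 shown in the program output above.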
