parameterized numerical constants

John · Mar 24, 2010

Hi,

I'm writing code that I would like to compile in either single or
double precision, so I declare all my floating point variables with a
typedef such as

//typedef float FLOAT;
typedef double FLOAT;

However, for expressions like

typedef float FLOAT;

FLOAT a,b;

a = 1.0;
b = 2.0 + a;

I get pages and pages of warnings about possible loss of data in the
conversion from double to float. Is there any way to parameterize
numerical constants like 1.0 so that they can easily be converted? Or
is my only option to suppress the warning through compiler flags or
cast every numerical constant?

I'm thinking of something like Fortran where you can parameterize a
numerical constant as

1.0_pr

where you can define '_pr' to be your desired precision.

Thanks,
John

Victor Bazarov · Mar 24, 2010

Juha said:
Well, you can write FLOAT(1.0) to get rid of the warning.

Note, however, that if FLOAT is actually larger than a double (e.g. a
long double in most compilers/systems), initializing it with a literal
of type double will introduce a rounding error if the literal is not
fully representable as a double. For example, (long double)(0.1) will
have a rounding error of 11 bits in most Intel systems (because a double
has 53 mantissa bits while a long double has 64 mantissa bits, so 11
least significant bits will be lost when initializing the variable).

There's no easy solution to that problem. You could always use
FLOAT(0.1L), and that will work for all basic floating point types.
However, if you ever change FLOAT to be something larger (usually by
using a multiple-precision floating point library), the problem will
appear once again.

Along with the 'typedef', have all the constants defined using the
right type. Something like

// single-precision build
typedef float FLOAT;
const float one = 1.0f;
const float pointOne = 0.1f;
....
// double-precision build
typedef double FLOAT;
const double one = 1.0;
const double pointOne = 0.1;
....

That's an awful mess from the maintenance point of view, but you win
some, and you lose some, yes?
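
One way to reduce that maintenance cost is to write each constant once
and let the typedef decide its type. A minimal sketch, assuming a
hypothetical USE_SINGLE build flag and borrowing the FLOAT(0.1L) form
quoted above; the function names are made up for illustration:

```cpp
#include <cassert>

// Hypothetical build switch: compile with -DUSE_SINGLE for a float build.
#ifdef USE_SINGLE
typedef float FLOAT;
#else
typedef double FLOAT;
#endif

// Each constant is written once with an L suffix and converted explicitly,
// so the same line serves every choice of FLOAT.
const FLOAT one      = FLOAT(1.0L);
const FLOAT pointOne = FLOAT(0.1L);

// Arithmetic on FLOAT operands only; no double literal appears in the expression.
inline FLOAT scaled(FLOAT a) { return pointOne * a + one; }

inline void constants_example()
{
    assert(one == FLOAT(1));
    assert(scaled(FLOAT(0)) == one);
}
```

The explicit conversion is what should keep the compiler quiet; whether a
given compiler still warns at a particular warning level is worth
checking rather than assuming.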

V

Andrew Poelstra · Mar 24, 2010

I'm thinking of something like Fortran where you can parameterize a
numerical constant as

1.0_pr

where you can define '_pr' to be your desired precision.

You can write 1.0f in C. I'm pretty sure C++ is the same.

Paul Bibbings · Mar 25, 2010

Coincidentally, I've been encountering just the same thing in a project
that I'm currently working on. I haven't tried any such thing as yet,
but I did wonder about something like:

// file: float_type.h

#ifndef FLOAT_TYPE_H_
#define FLOAT_TYPE_H_

#define FLOAT(x) x##F

typedef float FLOAT_TYPE;

#endif /* FLOAT_TYPE_H_ */

// file: main.cc

#include "float_type.h"

int main()
{
    FLOAT_TYPE f = FLOAT(1.0);
}

so that:

02:28:30 Paul Bibbings@JIJOU
/cygdrive/d/CPPProjects/nano/float_type $gcc -E main.cc
# 1 "main.cc"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "main.cc"

# 1 "float_type.h" 1

typedef float FLOAT_TYPE;
# 4 "main.cc" 2

int main()
{
FLOAT_TYPE f = 1.0F;
}

and then, with:

// file: float_type.h

#ifndef FLOAT_TYPE_H_
#define FLOAT_TYPE_H_

#define FLOAT(x) x##L

typedef long double FLOAT_TYPE;

#endif /* FLOAT_TYPE_H_ */

you get:

02:32:26 Paul Bibbings@JIJOU
/cygdrive/d/CPPProjects/nano/float_type $gcc -E main.cc
# 1 "main.cc"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "main.cc"

# 1 "float_type.h" 1

typedef long double FLOAT_TYPE;
# 4 "main.cc" 2

int main()
{
FLOAT_TYPE f = 1.0L;
}
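
The two headers above could be folded into one that picks the suffix
from a build flag. This is only a sketch: FLOAT_SINGLE and
FLOAT_EXTENDED are hypothetical macro names, and the default double
case pastes nothing because an unsuffixed literal is already a double.

```cpp
// file: float_type.h (sketch; FLOAT_SINGLE / FLOAT_EXTENDED are hypothetical flags)
#ifndef FLOAT_TYPE_H_
#define FLOAT_TYPE_H_

#include <cassert>

#if defined(FLOAT_SINGLE)
typedef float FLOAT_TYPE;
#define FLOAT(x) x##F   // 1.0 -> 1.0F
#elif defined(FLOAT_EXTENDED)
typedef long double FLOAT_TYPE;
#define FLOAT(x) x##L   // 1.0 -> 1.0L
#else
typedef double FLOAT_TYPE;
#define FLOAT(x) x      // 1.0 stays a double literal
#endif

// The literal already has the target type, so no conversion takes place.
inline FLOAT_TYPE half_of(FLOAT_TYPE v) { return FLOAT(0.5) * v; }

inline void float_type_example()
{
    FLOAT_TYPE f = FLOAT(1.0);
    assert(sizeof(FLOAT(1.0)) == sizeof(FLOAT_TYPE));
    assert(half_of(f) == FLOAT(0.5));
}

#endif /* FLOAT_TYPE_H_ */
```

The argument has to be a floating literal with a decimal point or an
exponent: FLOAT(1) would paste to 1F, which is not a valid literal, and
an arbitrary expression cannot be pasted at all.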

Regards

Paul Bibbings

Peter A. Kerzum · Mar 25, 2010

Suppressing this particular warning is the most suitable solution in
this real-world situation, IMHO.

Alf P. Steinbach · Mar 25, 2010

* Juha Nieminen:

And when FLOAT changes to something else, what happens?

Is there any compiler that issues diagnostics on e.g.

double const d = 1.0f;

?

I haven't checked but I can't imagine that there is.

Thus, it should not be a problem. However, the OP's idea of compiling the same
code with different kinds of number representations sounds like a problem. A
design level problem.

Cheers,

- Alf

Alf P. Steinbach · Mar 25, 2010

* Juha Nieminen:

1.0 doesn't present any problem. However, what if the initial value is
0.1 instead? Are you aware of the *precision loss* that causes?

Relative to what? It's not as if 0.1 will be exact with any standard C++
floating point type; in practice all C++ floating point types are binary.
Counting on high precision for an ordinary floating point literal would
in most cases, I think, be asking for trouble.

The 29 least-significant bits (about 9 decimal digits) will be lost.
What you will be assigning to 'd' will not be even near 0.1 (from the
point of view of a double-precision floating point variable).

In practice it never is, depending on the definition of "near".

Where the OP needs the highest precision available, instead of

MyFloat const z = 0.1f;

he should then write

MyFloat const z( 0.1L );

But considering that the whole point is that MyFloat *can* be ordinary float the
code should generally work well with ordinary float precision for literals.

I can think of some exceptional cases where the highest precision is desirable,
like the value for pi (often available as M_PI, if I recall correctly, but
depends on the compiler). Then the construction above might be necessary.
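
To put a number on the disagreement, here is a small sketch that
measures how far a double initialized from 0.1f lands from one
initialized from 0.1. It assumes IEEE 754 float and double, and the
function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Absolute error left in a double that was initialized from a float literal.
// Assumes IEEE 754 binary32 float and binary64 double.
inline double float_literal_error()
{
    const double via_float = 0.1f; // rounded to 24 significant bits, then widened exactly
    const double direct    = 0.1;  // rounded once, to 53 significant bits
    return std::fabs(via_float - direct);
}

inline void float_literal_example()
{
    const double err = float_literal_error();
    // Roughly 1.5e-9, far above double's ~1e-17 resolution near 0.1.
    assert(err > 1e-9 && err < 2e-9);
}
```

Whether an error of that size matters depends on the computation, which
is the point being argued in this thread.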

Cheers,

- Alf

Alf P. Steinbach · Mar 25, 2010

* Juha Nieminen:

Why would you *purposefully* want to initialize the variable with
0.100000001490116 when what you want is 0.1?

The OP might want to do that to avoid warnings. After all, that was the stated
problem -- solved by just accepting the float precision for literals. Precision
down around 10^-8 is usually irrelevant, but as I noted, for e.g. the value of
pi one can write e.g. MyFloat( 3.141592653....L ).

Cheers,

- Alf

Alf P. Steinbach · Mar 25, 2010

* Juha Nieminen:

Well, that's the wrong solution to avoid the warning, and suggesting
it as a viable solution is irresponsible.

Not at all, IMO.

But since you're not offering any reasoning as to why you think that, the best I
can do, in addition to the reasoning I've already presented, is to point you
towards Pete Becker's post elsewhere in the thread; with that it's opinion (that
it might be a reasonable thing to do) against opinion (irresponsible), with 2
against 1.

OK, it's a fallacy to take such things to a vote, but really, I think charges such
as "irresponsible" should be accompanied by at least some modicum of reasoning.

Just because a compiler happens to not issue a warning about the
precision loss of converting "0.1" to float and then to double doesn't
mean it's the correct thing to do.

No, it means that for that compiler (which means just about every compiler) it's
a practical solution to the stated problem. Unless there are strong reasons not
to do it. There are also other practical solutions, but involving more typing.

Cheers,

- Alf

James Kanze · Mar 25, 2010

On Mar 24, John said:
I'm thinking of something like Fortran where you can
parameterize a numerical constant as

1.0_pr

where you can define '_pr' to be your desired precision.

How? I can't figure out how it could be made to work. How
would you define _pr to cause 1.0_pr to have type float, for
example?

The only ways I know to make 1.0 float are to write it 1.0f, or
to use some sort of explicit conversion operator. The first
requires token pasting, since the 'f' must be part of the token,
and token pasting requires that the 1.0 also be a parameter of
the macro. And all of the forms of explicit conversion I know
require text before the 1.0.

Perhaps you meant something along the lines of:

struct MyType {};   // empty tag type, used only to select the overload

#define _pr >> MyType()

float operator>>(double d, MyType)
{
    return static_cast<float>(d);
}

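This would work for 1.0 _pr in isolation (the space is needed, since
1.0_pr is lexed as a single preprocessing number and the macro would
not expand), but lead to surprising results in expressions, because
the priority of >> isn't very high (and I don't think that there are
any binary operators with a high enough priority).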
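
For later readers: C++11, standardized after this thread, added
user-defined literals, which give a syntax close to the Fortran form
without macros. A minimal sketch, assuming a C++11 or later compiler;
the FLOAT typedef and the _pr suffix are illustrative names, not part
of any library.

```cpp
#include <cassert>
#include <type_traits>

typedef float FLOAT; // hypothetical project typedef; change to double or long double

// Literal operator for the made-up suffix _pr. The compiler hands over the
// literal as a long double, and it is narrowed to FLOAT exactly once here.
constexpr FLOAT operator""_pr(long double x)
{
    return static_cast<FLOAT>(x);
}

static_assert(std::is_same<decltype(1.0_pr), FLOAT>::value,
              "a _pr literal has type FLOAT");

inline void pr_example()
{
    FLOAT a = 1.0_pr;
    FLOAT b = 2.0_pr + a; // both operands are FLOAT, nothing is implicitly narrowed
    assert(b == 3.0_pr);
}
```

Because 1.0_pr is a single literal token, the precedence problem of the
macro version does not arise. The value is still limited to long double
precision on the way in; for a wider multiple-precision type, the raw
form of the literal operator (taking a const char*) would be the place
to parse the digits directly.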

John · Mar 26, 2010

There's no easy solution to that problem. You could always use
FLOAT(0.1L), and that will work for all basic floating point types.
However, if you ever change FLOAT to be something larger (usually by
using a multiple-precision floating point library), the problem will
appear once again.

Hi,

This is probably the best solution I've seen in the thread (or else set
the compiler to ignore the warning).

Thanks

John · Mar 26, 2010

Alf said:
* Juha Nieminen:

The OP might want to do that to avoid warnings. After all, that was the
stated problem -- solved by just accepting the float precision for
literals. Precision down around 10^-8 is usually irrelevant, but as I
noted, for e.g. the value of pi one can write e.g.
MyFloat( 3.141592653....L ).

I'm doing heavy numerical work. The digits are important.

However, it's a large code and when I compile I get more than 10000
warnings, so I do want to avoid the warnings. However, I hate to turn
off warnings since they sometimes show real dangers, but those would be
lost in the noise at this point.

John · Mar 26, 2010

Andrew said:
You can write 1.0f in C. I'm pretty sure C++ is the same.

This is not the same as what Fortran is doing. Fortran can truly
parameterize the precision. For example, define a global variable

integer, parameter :: sg = kind(0.D0) ! this is double precision (A)
integer, parameter :: db = kind(0.0) ! this is single precision (B)
integer, parameter :: pr = db ! set the precision here

1.0_pr ! this becomes whatever precision was set above

By selecting either line (A) or (B) I can effectively change the
precision in the whole code. In C, this would be like going through the
code and replacing all your floating points from 1.0f to 1.0 (or
vice-versa).

John

John · Mar 26, 2010

James said:
How? I can't figure out how it could be made to work. How
would you define _pr to cause 1.0_pr to have type float, for
example?

I wasn't saying you can make it work in C/C++, but that you can do it in
Fortran. As I wrote elsewhere, in Fortran you can

integer, parameter :: sg = kind(0.D0) ! this is double precision (A)
integer, parameter :: db = kind(0.0) ! this is single precision (B)
integer, parameter :: pr = db ! set the precision here

real(pr) :: a = 1.0_pr ! this becomes whatever precision was set above

My question was whether there is a way to do it in C. Apparently there is not.
