Comparing fp types for equality

Edward Rutherford

This code for the comparison of fp types is taken from the C FAQ.
Any problems using it in a macro?

/* compare 2 doubles for equality */
#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

Do the same issues involved in comparing 2 fp types for equality
apply to comparing a float to zero? E.g. is if(x == 0.)
considered harmful?
 
Nick Keighley

Edward said:
This code for the comparison of fp types is taken from the C FAQ.
Any problems using it in a macro?

/* compare 2 doubles for equality */
#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

It evaluates 'a' twice. If the argument has side effects (say, x++ or
a function call), those happen twice per test, and the two evaluations
of 'a' may not even agree.
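One way around that (a sketch; the function name is mine, not from the
FAQ) is a function, which evaluates each argument exactly once:

#include <math.h>
#include <float.h>

/* same test as the macro, but each argument is evaluated once */
static int dbl_isequal(double a, double b)
{
    return fabs(a - b) <= DBL_EPSILON * fabs(a);
}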

Edward also asked:
Do the same issues involved in comparing 2 fp types for equality
apply to comparing a float to zero? E.g. is if(x == 0.)
considered harmful?

It might not give the answer you expect if x is very small but
non-zero. For example,

if (sin(0.0) == 0.0)

might not yield true (depending on how sin() is calculated).
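A variant where the effect is easy to observe (a sketch; M_PI is a
common extension, not standard C, and the exact value printed may vary
by implementation):

#include <stdio.h>
#include <math.h>

int main(void)
{
    /* M_PI only approximates pi, so sin(M_PI) is a tiny non-zero
       value (about 1.22e-16 with IEEE 754 doubles), and the
       equality test below is almost certainly false */
    printf("sin(M_PI) == 0.0 : %d\n", sin(M_PI) == 0.0);
    printf("sin(M_PI)        : %g\n", sin(M_PI));
    return 0;
}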
 
Tim Prince

Edward said:
This code for the comparison of fp types is taken from the C FAQ.
Any problems using it in a macro?

/* compare 2 doubles for equality */
#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

Do the same issues involved in comparing 2 fp types for equality
apply to comparing a float to zero? E.g. is if(x == 0.)
considered harmful?

You might consider whether if(fabs(x) < DBL_MIN) is what you intend in
the latter case. In these cases, your result may vary with compiler
options, e.g. abrupt vs. gradual underflow, x87 precision mode, ...
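Something like this, say (a sketch of that idea; the helper name is
mine):

#include <math.h>
#include <float.h>

/* treat any magnitude below the smallest normal double as "zero";
   whether subnormal values ever get here depends on compiler
   options such as abrupt (flush-to-zero) vs. gradual underflow */
static int is_effectively_zero(double x)
{
    return fabs(x) < DBL_MIN;
}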
 
James Kuyper

Edward said:
This code for the comparison of fp types is taken from the C FAQ.
Any problems using it in a macro?

/* compare 2 doubles for equality */
#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

You will often come across some very good advice that it's unwise to
compare floating point values for equality. The reason is that numbers
calculated in two different ways that, mathematically, should produce
exactly the same value (such as 1.0 and 3.0*(1.0/3.0)) usually do not
do so when calculated in floating point. That's mainly because of
floating point round-off: at best, a 64-bit floating point format can
represent exactly a maximum of 2^64 different numbers; real floating
point formats represent significantly fewer than that. The best you can
hope for from a floating point calculation is that the result will be
one of the two representable values closest to the mathematically
correct value; in many cases the best achievable result has several
times that amount of error. In pathological cases, the uncertainty can
be infinite.

The right way to deal with this problem is to compare values with a
comparison tolerance:

#define COMP_TOL(a, b, tol) (fabs((a)-(b)) < (tol))

Your macro is equivalent to setting tol to DBL_EPSILON*fabs(a). That is,
unfortunately, a value that is generally too small to avoid having
exactly the same problems as a direct equality comparison.

The right value to use for 'tol' depends upon how a and b were
calculated; there's a sophisticated science to the estimation of
uncertainties in calculated values, called "propagation of errors". What
I'm about to say is no more than a simplified version of one of the most
basic aspects of that science.

If there's any measurement uncertainty in either a or b, call it sigma_a
and sigma_b, respectively, and those uncertainties are uncorrelated with
each other, then the uncertainty in the difference between a and b is
sqrt(sigma_a*sigma_a + sigma_b*sigma_b). The C standard library
contains a function that simplifies that calculation: hypot(sigma_a,
sigma_b). If floating point round-off is the ONLY reason for uncertainty
in the value of a, then sigma_a can be approximated by multiplier *
DBL_EPSILON * fabs(a); the value of multiplier is never smaller than
1.0, and gets larger as the complexity of the calculations used to
determine the value of 'a' increases.
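As a sketch of how that might look in code (the function name and the
three-sigma factor are my own choices, not a standard recipe):

#include <math.h>

/* compare a and b given their estimated uncertainties; the factor
   of 3.0 ("three sigma") is one conventional choice of cut-off */
static int close_given_sigmas(double a, double b,
                              double sigma_a, double sigma_b)
{
    return fabs(a - b) <= 3.0 * hypot(sigma_a, sigma_b);
}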

Your macro is equivalent to implementing this concept with the
assumptions that sigma_b = 0, implying that b is absolutely accurate,
and that the uncertainty in the value of 'a' is due entirely to only a
single floating point roundoff. This is an incredibly unlikely case.

Edward also asked:
Do the same issues involved in comparing 2 fp types for equality
apply to comparing a float to zero? E.g. is if(x == 0.)
considered harmful?

The only value that compares exactly equal to 0 is 0 itself; if there's
any uncertainty in the value of x at all, such a comparison will almost
always turn out to be false, even when x is calculated from values that
mathematically should have produced a 0 (such as 1.0-3.0*(1.0/3.0)).
Only do a comparison to 0 without a tolerance if you're certain that x
is exactly equal to the value you're interested in comparing.
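A classic illustration (assuming IEEE 754 double arithmetic):

#include <stdio.h>

int main(void)
{
    double x = 0.1 + 0.2 - 0.3;     /* mathematically zero */
    /* with IEEE 754 doubles this prints x = 5.55112e-17, so the
       comparison x == 0.0 is false */
    printf("x = %g, (x == 0.0) = %d\n", x, x == 0.0);
    return 0;
}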
 
Rui Maciel

Edward said:
This code for the comparison of fp types is taken from the C FAQ.
Any problems using it in a macro?

/* compare 2 doubles for equality */
#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

Do the same issues involved in comparing 2 fp types for equality
apply to comparing a float to zero? E.g. is if(x == 0.)
considered harmful?

The macro normalizes the difference between scalar a and scalar b with
respect to scalar a. This may not be acceptable: in some borderline
cases it makes DBL_ISEQUAL(a,b) true while DBL_ISEQUAL(b,a) is false,
and such a binary relation is not an equality relation.
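One such case, for what it's worth (a sketch, assuming IEEE 754
doubles):

#include <stdio.h>
#include <math.h>
#include <float.h>

#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

int main(void)
{
    double a = 2.0;
    double b = 2.0 - 2.0*DBL_EPSILON;
    /* the tolerance scales with the first argument, so swapping
       the arguments changes the verdict */
    printf("DBL_ISEQUAL(a,b) = %d\n", DBL_ISEQUAL(a,b));  /* 1 */
    printf("DBL_ISEQUAL(b,a) = %d\n", DBL_ISEQUAL(b,a));  /* 0 */
    return 0;
}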

And to nit-pick even further: since the macro compares two scalars, it
should be named in the plural, DBL_AREEQUAL().


Rui Maciel
 
bert

Rui Maciel said:
The macro normalizes the difference between scalar a and scalar b with
respect to scalar a. This may not be acceptable: in some borderline
cases it makes DBL_ISEQUAL(a,b) true while DBL_ISEQUAL(b,a) is false,
and such a binary relation is not an equality relation.

And to nit-pick even further: since the macro compares two scalars, it
should be named in the plural, DBL_AREEQUAL().


Rui Maciel

So what about comparing with
DBL_EPSILON * (fabs(a) + fabs(b)) / 2,
adding parentheses as advisable?
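Written out with the parentheses in place, that might look like this
(a sketch; the name is mine):

#include <math.h>
#include <float.h>

/* symmetric tolerance (average magnitude); note each argument is
   still evaluated twice, so beware side effects */
#define DBL_ISNEAR(a,b) \
    (fabs((a)-(b)) <= DBL_EPSILON * (fabs((a)) + fabs((b))) / 2.0)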
 
Rui Maciel

bert said:
So what about comparing with
DBL_EPSILON * (fabs(a) + fabs(b)) / 2,
adding parentheses as advisable?

That would take care of the commutative property problem. The division by 2
is unnecessary, because DBL_EPSILON can be defined as delta/2 from the
start.

Nevertheless, in my opinion this solution is needlessly bloated. The
fabs() function is called several times without any added value. If the
plain old Euclidean norm is used, the following expression will do
quite well:

(a-b)*(a-b) < epsilon

This involves two subtractions, a multiplication and a comparison,
without a single call to fabs(). The compiler may optimize this quite a
bit, if we consider how the operators work. For example, GCC converts
that expression into 4 or 5 operations, including a couple of
operations intended to move data between registers. If the comparison
function is inlined then the compiler is able to drop 1 or 2
operations. This may not look like much, but if you need to employ that
norm in an iteration then the difference may be noticeable.
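As an inline function that might look like this (a sketch; note that
'epsilon' here is compared against a *squared* difference, so it is
not the same quantity as DBL_EPSILON):

/* squared-difference test; each argument is evaluated once */
static inline int dbl_isnear_sq(double a, double b, double epsilon)
{
    double d = a - b;
    return d * d < epsilon;
}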


Rui Maciel
 
BartC

Rui Maciel said:
That would take care of the commutative property problem. The division
by 2 is unnecessary, because DBL_EPSILON can be defined as delta/2 from
the start.

Nevertheless, in my opinion this solution is needlessly bloated. The
fabs() function is called several times without any added value. If the
plain old Euclidean norm is used, the following expression will do
quite well:

(a-b)*(a-b) < epsilon

This involves two subtractions, a multiplication and a comparison,
without a single call to fabs().

fabs() usually just involves clearing a single bit. I can't imagine
it's a big overhead.
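For what it's worth, a sketch of that bit-level view (assuming a
64-bit IEEE 754 double; memcpy is used to avoid aliasing problems):

#include <stdint.h>
#include <string.h>

static double fabs_by_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    u &= ~(UINT64_C(1) << 63);   /* clear the sign bit */
    memcpy(&x, &u, sizeof x);
    return x;
}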
 
Rui Maciel

BartC said:
fabs() usually just involves clearing a single bit. I can't imagine
it's a big overhead.

Each call to fabs() may not be computationally intensive on its own, but
it's enough to double the number of operations needed to perform such a
simple task.


Rui Maciel
 
Eric Sosman

Rui Maciel said:
Each call to fabs() may not be computationally intensive on its own, but
it's enough to double the number of operations needed to perform such a
simple task.

Whence came the numbers being compared? From the evaluation of a
Chebyshev series, or maybe the quotient of a pair of polynomial
approximations with iterative improvement, or (gasp!) an I/O device?

An old acquaintance used to speak of clearing the beach of bottle
caps and cigarette butts, so the sand would be nice and clean around
the whale carcasses. In other words, don't sweat the small stuff.
 
Kaz Kylheku

Rui Maciel said:
The macro normalizes the difference between scalar a and scalar b with
respect to scalar a. This may not be acceptable: in some borderline
cases it makes DBL_ISEQUAL(a,b) true while DBL_ISEQUAL(b,a) is false,
and such a binary relation is not an equality relation.

Even a symmetric and reflexive version would not be an equivalence
relation, because it is not transitive: DBL_ISEQUAL(A, B) and
DBL_ISEQUAL(B, C) do not imply DBL_ISEQUAL(A, C). In other words, the
relation fails to partition the domain into equivalence classes. I
would call this "isnear", not "isequal".
 
Rui Maciel

Eric said:
Whence came the numbers being compared? From the evaluation of a
Chebyshev series, or maybe the quotient of a pair of polynomial
approximations with iterative improvement, or (gasp!) an I/O device?

An old acquaintance used to speak of clearing the beach of bottle
caps and cigarette butts, so the sand would be nice and clean around
the whale carcasses. In other words, don't sweat the small stuff.

I see what you mean, and it's true that wasting time on small stuff is,
well, wasting time. Nevertheless, "small stuff" is relative: some stuff
may only be considered small when compared to other stuff which may not
be that small. As you've implied, once we compare a meek implementation
of the DBL_ISEQUAL() test to complete implementations of some
algorithms, it doesn't make much sense to waste time overanalyzing a
small function. Yet this implementation of DBL_ISEQUAL() was being
compared with an alternative implementation of the same function, which
means that, in this context, the small stuff may not be that small.

In this case, when compiling the code for an x86-64 platform, the
straightforward implementation of the Euclidean norm takes 5
instructions, while the fabs()-based implementation takes 19. Depending
on the algorithm, the effect of this might range from the irrelevant to
the clearly noticeable, but if we have a choice then we should
certainly choose the better option. After all, just because premature
optimization is a bad thing doesn't mean that we should intentionally
write or use bloated code.


Rui Maciel
 
Rui Maciel

Edward said:
This code for the comparison of fp types is taken from the C FAQ.
Any problems using it in a macro?

/* compare 2 doubles for equality */
#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

Do the same issues involved in comparing 2 fp types for equality
apply to comparing a float to zero? E.g. is if(x == 0.)
considered harmful?

I've just noticed a considerable problem with your test macro. If a is
zero, the tolerance DBL_EPSILON*fabs(a) collapses to zero, so
DBL_ISEQUAL(a,b) is true only if b is also exactly zero. This can cause
serious problems, such as infinite loops.
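A sketch of how that bites (a hypothetical iteration, written by me;
with IEEE 754 doubles the residual typically stalls near one ulp and
never becomes exactly zero, so the loop may never terminate):

#include <math.h>
#include <float.h>

#define DBL_ISEQUAL(a,b) (((fabs)(((a)-(b)))<=(DBL_EPSILON)*(fabs)((a))))

int main(void)
{
    double x = 1.0, residual = 1.0;
    /* Newton iteration for sqrt(2): with a == 0.0 the tolerance
       DBL_EPSILON*fabs(0.0) is zero, so the test demands an exactly
       zero residual that floating point may never deliver */
    while (!DBL_ISEQUAL(0.0, residual)) {
        x = 0.5 * (x + 2.0 / x);
        residual = fabs(x*x - 2.0);
    }
    return 0;
}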


Rui Maciel
 
