comparison on non-integer types

Pietro Cerutti

Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.

Now, I'm passing a program through splint[1] and it says:

Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.

l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double values.

Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

Thank you!

[1] http://www.splint.org/
 
Pietro Cerutti

Pietro said:
Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.

Now, I'm passing a program through splint[1] and it says:

Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.

l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double values.

Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

Thank you!

[1] http://www.splint.org/

P.S. just to show that I've done my own homework:
n1124: 6.2.5 Types
18 Integer and floating types are collectively called arithmetic types.

n1124: 6.5.9 Equality operators
1 equality-expression:
relational-expression
equality-expression == relational-expression
equality-expression != relational-expression
Constraints
2 One of the following shall hold:
— both operands have arithmetic type;

Thus, it should be fully legal to compare two double values with, e.g., ==.
 
Flash Gordon

Pietro Cerutti wrote, On 05/09/07 18:59:
Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.

Now, I'm passing a program through splint[1] and it says:

Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.

l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double values.

Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

It is best to understand the real issues and then make an appropriate
decision based on your particular situation.

The basic problem is that computers are finite. Therefore when working
with float or double you tend to get only approximate results. Exactly
what you are doing with them and the vagaries of your particular
implementation will determine how fast and in what direction those
errors grow. Then there is the question of whether it is better for your
equality test to return true for numbers that theoretically should be
different or false for numbers that theoretically should be equal.

I suggest you look at the section of the comp.lang.c FAQ at
http://c-faq.com/ that is all about floating point numbers. It is by no
means exhaustive, but it is a start.
 
Flash Gordon

Pietro Cerutti wrote, On 05/09/07 19:08:

P.S. just to show that I've done my own homework:

Always helps to show you have done some research, also helps to prevent
people pointing out what you already know.
n1124: 6.2.5 Types

Thus, it should be fully legal to compare two double values with, e.g., ==.

It is legal; splint is complaining because it is generally inadvisable.
Splint, and all the other lint tools, are mainly there to pick up on
things which whilst legal (in the sense of not requiring a diagnostic,
i.e. error or warning) are generally a bad idea.
 
Walter Roberson

Now, I'm passing a program through splint[1] and it says:
Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.
Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?


Splint is correct. If you have two floating point numbers, X and Y,
and one was not created by assigning the other to it
(e.g., Y = X; ), then the numbers will not necessarily compare
the same even if you use the same numeric code to compute them.

For example,

X = 1./7.; Y = 1./7.;

X and Y will not necessarily compare equal: even with this
simple example, X and Y could end up differing in the last bit.

And if you have more complex formulations such as

X = 9./10.; Y = 1./100. * 90.;

then more than just the last bit could end up differing.

Another example: Y = P + X - P; then X and Y will not necessarily
come out the same, not even if P is 0.
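
A minimal sketch of the P + X - P case, assuming IEEE 754 doubles. The helper name add_then_subtract and the value 1e30 are illustrative choices, not something from the post. It shows the most obvious way the result can differ from X: when P is much larger than X, X is rounded away while forming P + X.

```c
#include <stdio.h>

/* Hypothetical helper: computes P + X - P, evaluated left to right. */
double add_then_subtract(double x, double p)
{
    double sum = p + x;   /* x may be partly or wholly rounded away here */
    return sum - p;
}

/* Prints X next to the recomputed value for one large P. */
void show_absorption(void)
{
    double x = 1.0;
    double y = add_then_subtract(x, 1e30);
    printf("x = %g, p + x - p = %g, equal: %d\n", x, y, x == y);
}
```

With P = 1e30 the 1.0 is far below the spacing between adjacent doubles near P, so the sum is just P and the final result is 0, not X.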


If you -have- used a direct assignment,

Y = X;

then, if my memory is correct, Y "should" compare equal to X
according to the abstract machine. However,
there are compilers for x86 machines where the abstract machine
semantics can be violated for the last bit: X might be in an
80 bit floating point register, and if the compiler does not take
the time to round the calculation to 64 bits at each step
[for speed, or because the user wants higher accuracy], then
if Y is in memory (instead of in a CPU register), the storage
to Y rounds (or truncates) the value; the comparison of X == Y
would likely load Y into an 80-bit CPU register on x86, and if
the 80-bit extension of the rounded Y doesn't happen to compare
exactly equal to the 80-bit *un*rounded X, the comparison will fail.
I believe this behaviour violates C's "as if" rules, but it does
happen in practice.
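
One way to see whether an implementation may evaluate double expressions in a wider format is the C99 macro FLT_EVAL_METHOD from <float.h>. The sketch below only reports that macro; the function names are made up for illustration, and the macro does not tell you whether a particular compiler rounds correctly on assignment.

```c
#include <float.h>
#include <stdio.h>

/* Describes a FLT_EVAL_METHOD value as listed in C99 5.2.4.2.2. */
const char *eval_method_name(int method)
{
    switch (method) {
    case 0:  return "each operation evaluated in its own type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "all operations evaluated as long double";
    default: return "indeterminable or implementation-defined";
    }
}

/* Prints the evaluation method of the current implementation. */
void report_eval_method(void)
{
    int method = (int)FLT_EVAL_METHOD;
    printf("FLT_EVAL_METHOD = %d (%s)\n", method, eval_method_name(method));
}
```

A value of 2 is what one would typically expect from a compiler using the 80-bit x87 registers described above.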
 
Pietro Cerutti

Walter said:
Now, I'm passing a program through splint[1] and it says:
Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.
Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?


Splint is correct. If you have two floating point numbers, X and Y,
and one was not created by assigning the other to it
(e.g., Y = X; ), then the numbers will not necessarily compare
the same even if you use the same numeric code to compute them.

For example,

X = 1./7.; Y = 1./7.;

X and Y will not necessarily compare equal: even with this
simple example, X and Y could end up differing in the last bit.

And if you have more complex formulations such as

X = 9./10.; Y = 1./100. * 90.;

then more than just the last bit could end up differing.

Right, it actually happens:
cat test.c
#include <stdio.h>
#include <float.h>

#define DBL_CMP(x,y) (x-y < DBL_EPSILON)

int main(void)
{
double a, b;

a = 9./10;
b = 9./100*10;

printf("%lf and %lf are %sequal\n", a, b, (a == b) ? "" : "un");
printf("%lf and %lf are %sequal\n", a, b, (DBL_CMP(a,b)) ? " " : un");

return (0);
}
gcc -ggdb -W -Wall -std=c99 -pedantic -Wbad-function-cast -Wcast-align
-Wcast-qual -Wchar-subscripts -Winline -Wmissing-prototypes
-Wnested-externs -Wpointer-arith -Wredundant-decls -Wshadow
-Wstrict-prototypes -Wwrite-strings -o test test.c
0.900000 and 0.900000 are unequal
0.900000 and 0.900000 are equal

Given the above, is the standard to be considered right in saying that
== is defined for arithmetic (including double) values?

Thanks!
 
Mark McIntyre

Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.

Er, no.

All comparisons of floating point values are risky. If this isn't a
FAQ, I'll be surprised, but in any event the reason is simply that FP
isn't exact and you might have two numbers which mathematically ought
to be equal, but which won't be due to rounding and precision effects.
Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

In real world situations I've seen live systems fall over because
comparisons like

if (x >= 100.00)

failed when x was retrieved from a database, even when the database
admin tools told me the value stored was 100.0000000000.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
Mark McIntyre

0.900000 and 0.900000 are unequal
0.900000 and 0.900000 are equal

Given the above, is the standard to be considered right in saying that
== is defined for arithmetic (including double) values?

Defined and guaranteed are different.

The result is defined - its zero if the arguments are identical,
nonzero otherwise.

The result is however not guaranteed to be what you expect, based on
the laws of maths, because computers have finite precision.
 
Erik Trulsson

Pietro Cerutti said:
Pietro said:
Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.

Now, I'm passing a program through splint[1] and it says:

Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.

l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double values.

Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

Thank you!

[1] http://www.splint.org/

P.S. just to show that I've done my own homework:
n1124: 6.2.5 Types
18 Integer and floating types are collectively called arithmetic types.

n1124: 6.5.9 Equality operators
1 equality-expression:
relational-expression
equality-expression == relational-expression
equality-expression != relational-expression
Constraints
2 One of the following shall hold:
— both operands have arithmetic type;

Thus, it should be fully legal to compare two double values with, e.g., ==.

Yes, of course it is *legal* to compare floating point values for equality.
It is just a bad idea. It is very often the case that two doubles will not
compare equal, even if an inexperienced programmer thinks they should.

Take for example the following program:


#include <stdio.h>

int main(void)
{
    double a = 2.2;

    a = a - 2.1;

    if (a == 0.1)
        printf("Equal\n");
    else
        printf("Not equal\n");

    return 0;
}



Naively one would expect it to always print "Equal" when run, but that
is not necessarily the case. In fact using most compilers on most architectures
the program will probably print "Not equal".

The reason is that most numbers cannot be represented exactly in the computer, so
what is stored in a floating point variable will be an approximation of the real value.
Such errors can easily be compounded by using these approximations in calculations.

In the example above none of the values involved (2.2, 2.1, 0.1) can be represented exactly
using the most common floating-point formats.
When you subtract the internal representation of 2.1 from the representation of 2.2, you get
a new value which is slightly different from 0.1, and which is also slightly different from
the internal representation of 0.1.
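
To see the stored approximations directly, one can print them with enough digits. This is a sketch assuming IEEE 754 doubles (the most common format); the function names are invented for illustration.

```c
#include <stdio.h>

/* Returns 1 if 2.2 - 2.1 compares equal to 0.1, otherwise 0. */
int difference_equals_tenth(void)
{
    /* volatile makes the compiler store each value as a real double
       instead of folding it or keeping it in a wider register */
    volatile double a = 2.2;
    volatile double d = a - 2.1;
    return d == 0.1;
}

/* Prints the values with 17 significant digits, enough to tell
   any two distinct IEEE 754 doubles apart. */
void show_stored_values(void)
{
    volatile double a = 2.2;
    volatile double d = a - 2.1;
    printf("2.2       -> %.17g\n", a);
    printf("2.1       -> %.17g\n", 2.1);
    printf("2.2 - 2.1 -> %.17g\n", d);
    printf("0.1       -> %.17g\n", 0.1);
}
```

On such a system the last two printed lines differ in their trailing digits, which is exactly why the == test fails.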
 
Keith Thompson

Mark McIntyre said:
Defined and guaranteed are different.

The result is defined - its zero if the arguments are identical,
nonzero otherwise.

The result is 1 if the arguments are identical, 0 otherwise.
The result is however not guaranteed to be what you expect, based on
the laws of maths, because computers have finite precision.

Right.

Suppose you have two numbers that would be mathematically equal if the
computer operated on infinitely precise real numbers, but that are
unequal because of rounding errors. The real question is not so much
whether they're equal; it's *how should your program behave*? That's
something that only you can decide.

Note that comparing the difference to FLT_EPSILON or DBL_EPSILON, as
splint suggests, is not necessarily the right fix. For one thing,
FLT_EPSILON and DBL_EPSILON are
relevant only for numbers close to 1.0 or -1.0. For another,
accumulated roundoff errors can become arbitrarily large, depending on
how you implement your algorithm.
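
A small sketch of the scale problem, assuming IEEE 754 doubles. The function names are hypothetical, and the test shown is the splint suggestion with fabs added.

```c
#include <float.h>
#include <math.h>

/* The absolute test from the splint message, with fabs added. */
int abs_epsilon_equal(double a, double b)
{
    return fabs(a - b) < DBL_EPSILON;
}

/* Returns 1 if adding DBL_EPSILON to x leaves x unchanged. */
int epsilon_is_absorbed(double x)
{
    volatile double sum = x + DBL_EPSILON;
    return sum == x;
}
```

For example, abs_epsilon_equal(1e-20, 2e-20) reports "equal" although the values differ by a factor of two, while abs_epsilon_equal(1e10, 1e10 + 1e-5) reports "unequal" although the values agree to about 15 significant digits. Near 1e10, DBL_EPSILON is smaller than the gap between adjacent doubles, so the test is no more forgiving than ==.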

To summarize: "Math is hard".

See also David Goldberg's classic paper "What Every Computer Scientist
Should Know About Floating-Point Arithmetic" (Google it).
 
christian.bau

Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.

Now, I'm passing a program through splint[1] and it says:

Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.

l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double values.

Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

It is more complicated.

For a good floating-point implementation, comparing two floating point
numbers for equality is perfectly safe. The result will be 1 if they
are equal and 0 if they are not equal. And there are the special cases
that +0 == -0 and NaN != NaN.

That will tell you that two floating-point numbers are equal. That,
however, may or may not be what you wanted to know. For example, 1.0 /
3.0 + 2.0 / 3.0 == 1.0 and 1.0 + 1e-20 == 1.0 may or may not give the
result you expect. Just because you think two floating point numbers
should be the same/different, doesn't mean they are.

Now let's say you write a compiler, and you need to keep track of all
different floating point numbers used in a program. You had better compare
for equality, and don't do anything stupid with DBL_EPSILON and the
like. Or you want to sort an array of numbers: Are you sure your
sorting algorithm will work if a == b and b == c but a != c? As soon
as you go down the route of checking whether numbers are nearby
instead of equal, you just have to be very careful.
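
The a == b, b == c, a != c situation is easy to reproduce with any tolerance-based test. A minimal sketch, where close_enough and chain_breaks are hypothetical names and the tolerance is an absolute one chosen only for illustration:

```c
#include <math.h>

/* Hypothetical "nearby" test with an absolute tolerance tol. */
int close_enough(double a, double b, double tol)
{
    return fabs(a - b) <= tol;
}

/* Returns 1 when a ~ b and b ~ c hold but a ~ c does not,
   i.e. when the "equality" fails to be transitive. */
int chain_breaks(double a, double b, double c, double tol)
{
    return close_enough(a, b, tol)
        && close_enough(b, c, tol)
        && !close_enough(a, c, tol);
}
```

With a tolerance of 1.0, the values 0.0, 0.75 and 1.5 form such a chain: each neighbour is within tolerance, but the two ends are not.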
 
Mark McIntyre

(snip my idiocy)

The result is 1 if the arguments are identical, 0 otherwise.

D'oh! of course it is. Gak.
accumulated roundoff errors can become arbitrarily large, depending on
how you implement your algorithm.

mhm - involve logs and powers somewhere, and watch 'em multiply.

 
Keith Thompson

Mark McIntyre said:
On Wed, 05 Sep 2007 12:54:19 -0700, in comp.lang.c , Keith Thompson


mhm - involve logs and powers somewhere, and watch 'em multiply.

sin(1e100)
 
pete

Pietro said:
printf("%lf and %lf are %sequal\n", a, b, (DBL_CMP(a,b)) ? " " : un");

Try real hard to use copy and paste,
or something just like copy and paste,
instead of purporting to have posted a program that you ran.
 
pete

Pietro said:
Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.

Now, I'm passing a program through splint[1] and it says:

Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.

l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double values.

Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

I agree with christian.bau's statement: "It is more complicated.";
and also with Flash Gordon's
"It is best to understand the real issues
and then make an appropriate decision
based on your particular situation."

If you take a look at the math functions at
http://www.mindspring.com/~pfilandr/C/fs_math/fs_math.c
the !=, ==, >=, and > operators
are all used to compare double values,
but as far as I can tell, the usage is appropriate.

In the same file,
there are also examples where DBL_EPSILON needs to be, and is, used.
 
user923005

Pietro Cerutti wrote, On 05/09/07 18:59:




Hi group,
I always thought that applying a binary operator such as ==, !=, <= or
>= to two double or float values was an operation whose results were
well defined.
Now, I'm passing a program through splint[1] and it says:
Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared directly
using == or != primitive. This may produce unexpected results since
floating point representations are inexact. Instead, compare the
difference to FLT_EPSILON or DBL_EPSILON.
l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double values.
Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

It is best to understand the real issues and then make an appropriate
decision based on your particular situation.

The basic problem is that computers are finite. Therefore when working
with float or double you tend to get only approximate results. Exactly
what you are doing with them and the vagaries of your particular
implementation will determine how fast and in what direction those
errors grow. Then there is the question of whether it is better for your
equality test to return true for numbers that theoretically should be
different or false for numbers that theoretically should be equal.

I suggest you look at the section of the comp.lang.c FAQ at
http://c-faq.com/ that is all about floating point numbers. It is by no
means exhaustive, but it is a start.

I believe I have posted this before:

#include <float.h>
#include <math.h>

/* Three-way compare: 0 if within a relative DBL_EPSILON, else 1 or -1. */
int double_compare (double d1, double d2)
{
    if (d1 > d2) {
        if ((d1 - d2) < fabs (d1 * DBL_EPSILON))
            return 0;
        else
            return 1;
    }
    if (d1 < d2) {
        if ((d2 - d1) < fabs (d2 * DBL_EPSILON))
            return 0;
        else
            return -1;
    }
    return 0;
}

/* Same comparison for float, using FLT_EPSILON. */
int float_compare (float d1, float d2)
{
    if (d1 > d2) {
        if ((d1 - d2) < fabsf (d1 * FLT_EPSILON))
            return 0;
        else
            return 1;
    }
    if (d1 < d2) {
        if ((d2 - d1) < fabsf (d2 * FLT_EPSILON))
            return 0;
        else
            return -1;
    }
    return 0;
}
 
CBFalconer

Pietro said:
I always thought that applying a binary operator such as ==, !=,
<= or >= to two double or float values was an operation whose
results were well defined.

Now, I'm passing a program through splint[1] and it says:

Dangerous equality comparison involving double types:
l_lnk1->lnk_freq == l_lnk2->lnk_freq
Two real (float, double, or long double) values are compared
directly using == or != primitive. This may produce unexpected
results since floating point representations are inexact.
Instead, compare the difference to FLT_EPSILON or DBL_EPSILON.

l_lnk1->lnk_freq and l_lnk2->lnk_freq being defined as double
values.

Is it really better to write the comparison as
(l_lnk1->lnk_freq - l_lnk2->lnk_freq < DBL_EPSILON)
or is splint being over-pedantic in this situation?

Your expression is over-simplified. You have to take the absolute
value of the difference, and then you also have to scale either
that or DBL_EPSILON to the items compared.

Try to arrange your code to depend on <, >, <=, >= only.
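
A minimal sketch of those two corrections, assuming a caller-chosen relative tolerance. The names one_sided_test and nearly_equal are hypothetical, and scaling by the larger magnitude is one possible choice among several. The sketch does nothing special for values near zero, infinities or NaNs.

```c
#include <float.h>
#include <math.h>

/* The expression from the original question, for contrast:
   no absolute value, no scaling. */
int one_sided_test(double a, double b)
{
    return a - b < DBL_EPSILON;
}

/* Relative comparison: the absolute difference is compared with
   rel_tol scaled to the larger of the two magnitudes. */
int nearly_equal(double a, double b, double rel_tol)
{
    double diff  = fabs(a - b);
    double scale = fabs(a) > fabs(b) ? fabs(a) : fabs(b);
    return diff <= scale * rel_tol;
}
```

Without the absolute value, one_sided_test(1.0, 2.0) is true simply because the difference is negative. nearly_equal(1.0, 2.0, 4 * DBL_EPSILON) is false, while nearly_equal(1.0, 1.0 + DBL_EPSILON, 4 * DBL_EPSILON) is true. What tolerance is appropriate depends on how the operands were computed.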
 
pete

user923005 said:
I believe I have posted this before:

#include <float.h>
#include <math.h>

/* Three-way compare: 0 if within a relative DBL_EPSILON, else 1 or -1. */
int double_compare (double d1, double d2)
{
    if (d1 > d2) {
        if ((d1 - d2) < fabs (d1 * DBL_EPSILON))
            return 0;
        else
            return 1;
    }
    if (d1 < d2) {
        if ((d2 - d1) < fabs (d2 * DBL_EPSILON))
            return 0;
        else
            return -1;
    }
    return 0;
}

/* Same comparison for float, using FLT_EPSILON. */
int float_compare (float d1, float d2)
{
    if (d1 > d2) {
        if ((d1 - d2) < fabsf (d1 * FLT_EPSILON))
            return 0;
        else
            return 1;
    }
    if (d1 < d2) {
        if ((d2 - d1) < fabsf (d2 * FLT_EPSILON))
            return 0;
        else
            return -1;
    }
    return 0;
}

That way, it's possible to have a number
which compares equal to two other numbers,
which do not compare equal to each other.

There's no place for EPSILON's
if you're writing a qsort compar function for an array of floats.
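
For comparison, a sketch of a qsort comparison function that uses only exact comparisons, so the ordering it defines is consistent. The names compare_floats and sort_floats are made up for illustration, and the sketch assumes the array contains no NaNs.

```c
#include <stdlib.h>

/* Exact three-way comparison for qsort; assumes no NaNs in the array.
   Returns -1, 0 or 1 without subtracting the values. */
int compare_floats(const void *pa, const void *pb)
{
    float a = *(const float *)pa;
    float b = *(const float *)pb;
    return (a > b) - (a < b);
}

/* Sorts count floats in ascending order. */
void sort_floats(float *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_floats);
}
```

Because two elements compare equal here only when they really are equal, the comparison stays transitive, which qsort relies on.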
 
Thad Smith

Mark said:

The comparisons are well defined and, in my experience, yield
predictable results. The problem is not accounting for the variation
caused by rounding of intermediate results and approximations of
transcendental functions used to calculate the operands being compared.
In the past I used long double to successfully manipulate integers
that didn't fit into the largest native integer type (long).
In real world situations I've seen live systems fall over because
comparisons like

if (x >= 100.00)

failed when x was retrieved from a database, even when the database
admin tools told me the value stored was 100.0000000000.

Let's be careful out there...
 
