comparison on non-integer types

R

Richard

Mark McIntyre said:
Er, no.

All comparisons of floating point values are risky. If this isn't a
FAQ, I'll be surprised, but in any events the reason is simply that FP
isn't exact and you might have two numbers which mathematically ought
to be equal, but which won't be due to rounding and precision effects.

Then they are not "equal". Simple enough.
In real world situations I've seen live systems fall over because
comparisons like

if (x >= 100.00)

failed when x was retrieved from a database, even when the database
admin tools told me the value stored was 100.0000000000.

The it was the error of whoever bound the variable to the database value.

100.00 in the database should convert to "exact" 100.00 float.
 
A

Army1987

Try real hard to use copy and paste,
or something just like copy and paste,
instead of purporting to have posted a program that you ran.
What's wrong with that statement?
 
A

Army1987

sin(1e100)
The argument of the function is approximately equal to the
reciprocal of the probability that the function is ever called
with that argument for any reason other than making a point...
 
U

user923005

That way, it's possible to have a number
which compares equal to two other numbers,
which do not compare equal to each other.

The other way, it is possible to have two numbers which are the same
mathematically but which do not compare the same. Is that better?
There's no place for EPSILON's
if you're writing a qsort compar function for an array of floats.

I disagree. The above comparison functions will produce potentially a
different sort order. But not one that is less correct. And in
general, a comparision without a sphere of uncertainty is not a
correct comparison for floating point.

People tend to get surprised when (1/7)*7 is not equal to 1.
 
U

user923005

The other way, it is possible to have two numbers which are the same
mathematically but which do not compare the same. Is that better?


I disagree. The above comparison functions will produce potentially a
different sort order. But not one that is less correct. And in
general, a comparision without a sphere of uncertainty is not a
correct comparison for floating point.

People tend to get surprised when (1/7)*7 is not equal to 1.

See also the GSL function:
/gsl-1.9/sys/fcmp.c
for a similar approach (obviously also inspired by TAOCP Volume 2:
"Seminumerical Algorithms", pages 233-235)
 
P

pete

user923005 said:
The other way, it is possible to have two numbers which are the same
mathematically but which do not compare the same. Is that better?


I disagree. The above comparison functions will produce potentially a
different sort order. But not one that is less correct. And in
general, a comparision without a sphere of uncertainty is not a
correct comparison for floating point.

Those functions can produce a different sort order
depending on which algorithm the sorting function uses.
For a qsort compar function,
the ordering of the array must be completely
defined by the compar functions

N869
7.20.5 Searching and sorting utilities
[#4] When the same objects (consisting of size bytes,
irrespective of their current positions in the array) are
passed more than once to the comparison function, the
results shall be consistent with one another. That is, for
qsort they shall define a total ordering on the array, and
for bsearch the same object shall always compare the same
way with the key.
 
U

user923005

The other way, it is possible to have two numbers which are the same
mathematically but which do not compare the same. Is that better?
I disagree. The above comparison functions will produce potentially a
different sort order. But not one that is less correct. And in
general, a comparision without a sphere of uncertainty is not a
correct comparison for floating point.

Those functions can produce a different sort order
depending on which algorithm the sorting function uses.
For a qsort compar function,
the ordering of the array must be completely
defined by the compar functions

N869
7.20.5 Searching and sorting utilities
[#4] When the same objects (consisting of size bytes,
irrespective of their current positions in the array) are
passed more than once to the comparison function, the
results shall be consistent with one another. That is, for
qsort they shall define a total ordering on the array, and
for bsearch the same object shall always compare the same
way with the key.

Of course it is also possible that:

double x = get_user_input();
if (x == x) return 0;
return 1;

will return 1. If there is an interrupt between when the first copy
of x gets pushed into a floating point register and the second copy
gets pushed into a floating point register, and if the machine has
extended precision (e.g 80 bits for Intel CPUs) and is in a mode where
the extra precision can be used, then one of the instances may get
some bits trimmed off and the comparison will say unequal.

Using the normal relational operators does not reliably solve the
problem. That may be the reason that for OpenVMS floating point is
not even allowed in a key field by the operating system.
 
C

Charlie Gordon

user923005 said:
I believe I have posted this before:

#include <float.h>
#include <math.h>

int double_compare (double d1, double d2)
{
if (d1 > d2)
if ((d1 - d2) < fabs (d1 * DBL_EPSILON))
return 0;
else
return 1;
if (d1 < d2)
if ((d2 - d1) < fabs (d2 * DBL_EPSILON))
return 0;
else
return -1;
return 0;
}

Your code does not produce the expected behaviour around 0, especially if d2
< 0 and d1 == 0...
You should compute eps = fmax(fabs(d1), fabs(d2)) * DBL_EPSILON and use it
in the comparisons.
 
C

Charlie Gordon

user923005 said:
Of course it is also possible that:

double x = get_user_input();
if (x == x) return 0;
return 1;

will return 1. If there is an interrupt between when the first copy
of x gets pushed into a floating point register and the second copy
gets pushed into a floating point register, and if the machine has
extended precision (e.g 80 bits for Intel CPUs) and is in a mode where
the extra precision can be used, then one of the instances may get
some bits trimmed off and the comparison will say unequal.

That's an operating system bug, change operating systems.

On the other hand, if get_user_input() returns a NaN, the above code will
return 1.
Using the normal relational operators does not reliably solve the
problem. That may be the reason that for OpenVMS floating point is
not even allowed in a key field by the operating system.

Letting the OS handle database stuff was definitaely a design flaw.
 
U

user923005

"user923005" <[email protected]> a écrit dans le message de (e-mail address removed)...









Your code does not produce the expected behaviour around 0, especially if d2
< 0 and d1 == 0...
You should compute eps = fmax(fabs(d1), fabs(d2)) * DBL_EPSILON and use it
in the comparisons.

I don't understand what you mean. Can you show an example where it
goes wrong?

#include <stdio.h>
#include <float.h>
#include <math.h>


int double_compare(double d1, double d2)
{
if (d1 > d2)
if ((d1 - d2) < fabs(d1 * DBL_EPSILON))
return 0;
else
return 1;
if (d1 < d2)
if ((d2 - d1) < fabs(d2 * DBL_EPSILON))
return 0;
else
return -1;
return 0;
}

int main(void)
{
double d1,
d2;
int result;
d1 = -1.000000000001e-19;
d2 = 0;
result = double_compare(d1, d2);
if (result < 0)
printf("%g is less than %g\n", d1, d2);
if (result > 0)
printf("%g is more than %g\n", d1, d2);
if (result == 0)
printf("%g is equal to %g\n", d1, d2);
d1 = -1.000000000001e-15;
result = double_compare(d1, d2);
if (result < 0)
printf("%g is less than %g\n", d1, d2);
if (result > 0)
printf("%g is more than %g\n", d1, d2);
if (result == 0)
printf("%g is equal to %g\n", d1, d2);

return 0;
}
/*
-1e-019 is less than 0
-1e-015 is less than 0
*/
 
O

Old Wolf

What's wrong with that statement?

It has at least two errors. If the missing quotation
mark is supplied, it invokes undefined behaviour in
C90 because the 'l' modifier is not defined for the
%f printf format specifier.
 
C

Charlie Gordon

user923005 said:
The other way, it is possible to have two numbers which are the same
mathematically but which do not compare the same. Is that better?


I disagree. The above comparison functions will produce potentially a
different sort order. But not one that is less correct. And in
general, a comparision without a sphere of uncertainty is not a
correct comparison for floating point.

qsort is not guaranteed to work if the comparison function does not define a
full order on the whole set of values in the array. This function is
inappropriate for sorting an array. A simplistic function that performs
straight floating point comparisons is not appropriate either :

int compare_doubles(const double *pa, const double *pb) {
return (*pb < *pa) - (*pa < *pb);
}

it defines a complete order on all doubles except NaNs.
it does not either distinguish 0.0 and -0.0.

Designing a comparison function to property sort on floating point values is
non trivial. I wonder why none is provided in math.h.
 
U

user923005

"user923005" <[email protected]> a écrit dans le message de (e-mail address removed)...








qsort is not guaranteed to work if the comparison function does not define a
full order on the whole set of values in the array. This function is
inappropriate for sorting an array. A simplistic function that performs
straight floating point comparisons is not appropriate either :

int compare_doubles(const double *pa, const double *pb) {
return (*pb < *pa) - (*pa < *pb);

}

it defines a complete order on all doubles except NaNs.
it does not either distinguish 0.0 and -0.0.

Designing a comparison function to property sort on floating point values is
non trivial. I wonder why none is provided in math.h.

For *sorting* floating point, I use this trick from Terje Mathisen
(obviously, you have to write one of these for each and every distinct
floating point that you use):

#include <assert.h>

typedef unsigned long uint32;
#define SB_MASK32 0x80000000UL

#ifdef _MSC_VER
typedef unsigned __int64 uint64;
typedef __int64 sint64;
#define SB_MASK64 0x8000000000000000ui64
#else
typedef unsigned long long uint64;
typedef long long sint64;
#define SB_MASK64 0x8000000000000000ULL
#endif
extern uint32 float2key(float f);
uint64 double2key(double d);
uint32
float2key(float f)
{
uint32 sign,
mant,
mask;

assert(sizeof(float) == sizeof(uint32));
mant = *(uint32 *) & f; /* Load float as array of bits */
sign = mant & SB_MASK32; /* Isolate the leading sign bit */
mant ^= SB_MASK32; /* Invert the sign bit, making + > -
*/
mask = sign - (sign >> 31); /* Either 0 or 0x7fffffff */
mant ^= mask; /* Invert exp and mant if negative */
return mant;
}

uint64
double2key(double d)
{
uint64 sign,
mant,
mask;

assert(sizeof(double) == sizeof(uint64));
mant = *(uint64 *) & d; /* Load float as array of bits */
sign = mant & SB_MASK64; /* Isolate the leading sign bit */
mant ^= SB_MASK64; /* Invert the sign bit, making + > -
*/
mask = sign - (sign >> 63); /* Either 0 or 0x7fffffffffffffff */
mant ^= mask; /* Invert exp and mant if negative */
return mant;
}


But for comparison (e.g. in a database operation) I use the function
that I presented (actually, one rather similar to that)
 
K

Keith Thompson

Charlie Gordon said:
qsort is not guaranteed to work if the comparison function does not define a
full order on the whole set of values in the array. This function is
inappropriate for sorting an array. A simplistic function that performs
straight floating point comparisons is not appropriate either :

int compare_doubles(const double *pa, const double *pb) {
return (*pb < *pa) - (*pa < *pb);
}

For one thing, the comparison function must take 'const void*'
arguments, but that's easily fixed.
it defines a complete order on all doubles except NaNs.

That's ok if you happen to know that there are no NaNs in the array.
The standard requires the comparison function to define a total
ordering on the array elements, not necessarily on all elements of the
type. Sorting an array containing NaNs doesn't necessarily make sense
anyway.

If you do need to worry about NaNs, of course, then the comparison
function will have to deal with them. For example, you could
arbitrarily say that all NaNs are equal to each other, less than any
real value, and greater than -Infinity.
it does not either distinguish 0.0 and -0.0.

It may or may not need to, depending on what you're trying to do.
Designing a comparison function to property sort on floating point
values is non trivial. I wonder why none is provided in math.h.

Perhaps because there's no one clear way to deal with NaNs and with
-0.0 (though treating -0.0 as less than +0.0 wouldn't cause any harm
even if you think of them as equal).
 
C

Charlie Gordon

user923005 said:
For *sorting* floating point, I use this trick from Terje Mathisen
(obviously, you have to write one of these for each and every distinct
floating point that you use):

#include <assert.h>

typedef unsigned long uint32;
#define SB_MASK32 0x80000000UL

#ifdef _MSC_VER
typedef unsigned __int64 uint64;
typedef __int64 sint64;
#define SB_MASK64 0x8000000000000000ui64
#else
typedef unsigned long long uint64;
typedef long long sint64;
#define SB_MASK64 0x8000000000000000ULL
#endif
extern uint32 float2key(float f);
uint64 double2key(double d);
uint32
float2key(float f)
{
uint32 sign,
mant,
mask;

assert(sizeof(float) == sizeof(uint32));
mant = *(uint32 *) & f; /* Load float as array of bits */
sign = mant & SB_MASK32; /* Isolate the leading sign bit */
mant ^= SB_MASK32; /* Invert the sign bit, making + > -
*/
mask = sign - (sign >> 31); /* Either 0 or 0x7fffffff */
mant ^= mask; /* Invert exp and mant if negative */
return mant;
}

uint64
double2key(double d)
{
uint64 sign,
mant,
mask;

assert(sizeof(double) == sizeof(uint64));
mant = *(uint64 *) & d; /* Load float as array of bits */
sign = mant & SB_MASK64; /* Isolate the leading sign bit */
mant ^= SB_MASK64; /* Invert the sign bit, making + > -
*/
mask = sign - (sign >> 63); /* Either 0 or 0x7fffffffffffffff */
mant ^= mask; /* Invert exp and mant if negative */
return mant;
}


But for comparison (e.g. in a database operation) I use the function
that I presented (actually, one rather similar to that)

Do you then sort with a signed 32 and 64 bit int comparison ?
I doubt this approch solves the NaN issue.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top