sanity check - floating point comparison

Richard Herring

Phlip said:
RH said "perfectly well defined", not "perfectly equal".

I don't know if The Standard says two floats with the same bit pattern must
compare equal.

I think dan2online is the only poster who's raised the issue of bit
patterns.
I will not rely on that, and I can envision a CPU with an
arithmetic logic unit that optimizes such a comparison in some inscrutable
way that breaks equality. So I would prefer my compiler to produce fast
opcodes, not opcodes that hold my hand.
I would hope that we can rely on floating-point numbers being ordered
such that for any pair X, Y of "normal" floating values (i.e. excluding
things like NaN) exactly one of X<Y, X==Y, Y<X is true, and that these
relations are transitive, and that if X==Y, X-Y is zero.
 
dan2online

Richard said:
I think dan2online is the only poster who's raised the issue of bit
patterns.
Here is an off-topic discussion, but perhaps it is helpful.

When == is used to compare two floating-point numbers, the FP unit
compares them based on their bit patterns (sign, exponent and fraction
bits, per the IEEE 754 standard):
S EEEEEEEEEEE FFFFFFFFFFFF....FFFF
0 1        11 12                63
If any of these parts differs, the two floating-point numbers are not equal.
Computer hardware that supports double-precision floating point may also
employ 80-bit temporary registers for arithmetic/logic operations. In this
scenario, == is OK for comparing two floating-point numbers.
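
A quick way to check how == relates to bit patterns on a given platform is to look at the raw bits directly. The sketch below assumes a 64-bit IEEE 754 double; bitsOf and sameBits are names made up for illustration. Under IEEE 754 the two notions differ in two corner cases: +0.0 and -0.0 compare equal although their sign bits differ, and a NaN compares unequal to itself although its bits are identical.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: returns the raw 64-bit pattern of a double.
// Assumes double is a 64-bit IEEE 754 type.
inline std::uint64_t bitsOf(double d)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);   // well-defined way to inspect the bits
    return u;
}

// Hypothetical helper: true when a and b have identical bit patterns.
inline bool sameBits(double a, double b)
{
    return bitsOf(a) == bitsOf(b);
}
```

For ordinary finite, non-zero values sameBits(a, b) and a == b should agree; the zero and NaN cases are where they part ways.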

But if some embedded hardware or old computer has no hardware to
support floating point, or has no 80 bit temp registers,
arithmetic/logic operation will rely on software emulation. In this
scenario, == is questionable because of the potential precision loss.

That's why I mentioned the issue of bit pattern.

Practically, we can use the difference between two floating-point
numbers. Strictly speaking, nearEqual is a more precise name than isEqual
for a floating-point comparison.

If the bit patterns of two floating-point numbers are the same except for
the least significant bit, the two numbers can be considered equal in most
cases. I guess the original poster wanted to try it for this reason. The
method was used in the old days.

So epsilon is the least detectable difference between two floating-point
numbers near 1.0.
epsilon = 2^(-52) = 2.2204460492503130808472633361816e-16
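
The value above can be checked against the standard library. This is a small sketch assuming a 64-bit IEEE 754 double with the default round-to-nearest mode; machineEpsilon and changesOne are illustrative names, not anything from the thread.

```cpp
#include <cmath>
#include <limits>

// Machine epsilon for double: the gap between 1.0 and the next larger double.
inline double machineEpsilon()
{
    return std::numeric_limits<double>::epsilon();
}

// Hypothetical helper: true when adding delta to 1.0 yields a different double.
inline bool changesOne(double delta)
{
    // volatile forces the sum to be rounded to double before the comparison,
    // even on hardware that computes in wider registers.
    volatile double sum = 1.0 + delta;
    return sum != 1.0;
}
```

With these, machineEpsilon() equals std::ldexp(1.0, -52), adding epsilon to 1.0 changes it, and adding half of epsilon rounds back to 1.0.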

If the two floating-point numbers are not near 1, as in the example in the
original post, compute diff = a/b - 1. If diff < epsilon and
diff > -epsilon, then treat a and b as equal.
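
A relative comparison along these lines might look like the sketch below. It is not the code from the thread: instead of dividing a by b (which fails when b is zero) it scales the tolerance by the larger magnitude, and the name nearlyEqualRelative and the default tolerance of 4 epsilon are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical name: true when a and b differ by at most relTol,
// measured relative to the larger of the two magnitudes.
template <class T>
bool nearlyEqualRelative(T a, T b,
                         T relTol = 4 * std::numeric_limits<T>::epsilon())
{
    if (a == b)
        return true;                       // exact matches, including 0 == -0
    const T diff  = std::fabs(a - b);
    const T scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= relTol * scale;         // false if either value is NaN
}
```

Because the tolerance scales with the operands, 0.1 + 0.2 and 0.3 compare as nearly equal, while 1.0e-20 and 2.0e-20 do not, even though their absolute difference is tiny. A purely relative test never treats a non-zero value as close to zero, so values expected to be near zero usually need an additional absolute tolerance.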

The example in the original post is not correct! Richard Herring pointed
out the problem.
 
Jerry Coffin

(e-mail address removed) says...
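#include <limits>

// Returns true when a and b differ by no more than epsilon (an absolute tolerance).
template <class T>
inline bool isEqual( const T& a, const T& b,
                     const T epsilon = std::numeric_limits<T>::epsilon() )
{
    const T diff = a - b;
    return ( diff <= epsilon ) && ( diff >= -epsilon );
}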

[ ... ]
What makes me nervous though is the floating point comparison. After
all, numeric_limits is not defined on one platform (using gcc 2.96).
That said, I was opting to use iterators with - I think std::distance
but I'll end up doing floating point comparison anyway, in which case
my own version ( with my own comparator - isEqual) works best. Correct?

This question came up a while back in
comp.lang.c++.moderated. Here's what I had to say about
it there:

http://tinyurl.com/lvpl8

Here's my idea of a function that does a comparison in a
reasonable fashion:

http://tinyurl.com/qcmyd

Richard Herring does have a good point though -- it would
be better to rename the function to reflect the fact that
it tests for approximate equality rather than equality.
 
ma740988

Here's my idea of a function that does a comparison in a
reasonable fashion:

http://tinyurl.com/qcmyd
Jerry, I appreciate the link, and after playing with isEqual I think I've
got a handle on it.
I've got a follow-up question for you regarding the comparison in
question.

Given two files with contents akin to:
file1.txt
65.3433
43.9999
// more

file2.txt
65.3433
43.9999
// more

Now, with regard to your version of isEqual: how would I use said
version to compare the contents of the two files?
 
Jerry Coffin

(e-mail address removed) says...

[ ... ]
Given two files with contents akin to:

[ one floating point number per line ]
Now, with regard to your version of isEqual: how would I use said
version to compare the contents of the two files?

Well, I suppose you'd read an item from each file,
compare them, and react appropriately based on whether
they're nearly equal or not.
 
ma740988

Jerry said:
Well, I suppose you'd read an item from each file,
compare them, and react appropriately based on whether
they're nearly equal or not.
Actually what I was after was a way to improve my source to take
advantage of the function object in isEqual with an algorithm. With my
current approach I read both files into a vector<double> and then compare
them with a for loop. It works, so I won't fuss with it.

Thanks.
 
Marcus Kwok

Jerry Coffin said:
This question came up a while back in
comp.lang.c++.moderated. Here's what I had to say about
it there:

http://tinyurl.com/lvpl8

Here's my idea of a function that does a comparison in a
reasonable fashion:

http://tinyurl.com/qcmyd

Richard Herring does have a good point though -- it would
be better to rename the function to reflect the fact that
it tests for approximate equality rather than equality.

I was going through some old bookmarks and I found another article that
may be of interest:

Comparing Floating Point Numbers
by Bruce Dawson
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
 
Pete Becker

ma740988 said:
Actually what I was after was a way to improve my source to take
advantage of the function object in isEqual with an algorithm. With my
current approach I read both files into a vector<double> and then compare
them with a for loop. It works, so I won't fuss with it.

Read each line as text and compare them as text. No conversions. Much
better than doing possibly lossy conversions and then trying to guess
whether there was a big enough loss of information to be concerned about.
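
A line-by-line textual comparison could be sketched as follows. The helper names sameLines and sameLinesText are made up for illustration, and the sketch treats the files as equal only when every line matches character for character and both inputs end together.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: true when both streams yield exactly the same lines.
inline bool sameLines(std::istream& in1, std::istream& in2)
{
    std::string line1, line2;
    for (;;) {
        const bool got1 = static_cast<bool>(std::getline(in1, line1));
        const bool got2 = static_cast<bool>(std::getline(in2, line2));
        if (!got1 || !got2)
            return got1 == got2;   // equal only if both ended together
        if (line1 != line2)
            return false;          // first differing line settles it
    }
}

// Convenience wrapper for in-memory text, mainly useful for trying it out.
inline bool sameLinesText(const std::string& text1, const std::string& text2)
{
    std::istringstream in1(text1), in2(text2);
    return sameLines(in1, in2);
}
```

For real files, two std::ifstream objects can be passed to sameLines directly. Note that "65.3433" and "65.34330" are different as text even though they denote the same number, which is exactly the strictness this approach trades for avoiding conversions.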
 
ma740988

Read each line as text and compare them as text. No conversions. Much
better than doing possibly lossy conversions and then trying to guess
whether there was a big enough loss of information to be concerned about.
I think what you're alluding to is some getline approach?


Here's my approach in a nutshell.


#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

// Reads v.size() elements of raw binary data from the stream into v.
template<typename T>
void read_any_t( std::ifstream& i, std::vector<T>& v ) {
    if ( v.empty() )
        return;
    i.read( reinterpret_cast<char*>( &v[ 0 ] ),
            static_cast<std::streamsize>( v.size() * sizeof ( T ) ) );
}

unsigned int DEBUG ( 0 );
int main(int argc, char* argv[])
{
    std::vector< double > v1;
    std::vector< double > v2;
    // wild guess.. reserve enough to prevent relocation
    v1.reserve ( 0x100000 );
    v2.reserve ( 0x100000 );

    std::string file1 ( "file1.txt" );
    std::string file2 ( "file2.txt" );

    std::ifstream f1 ( file1.c_str() );
    std::ifstream f2 ( file2.c_str() );

    if ( !f1 || !f2 )
        return EXIT_FAILURE;

    // only problem i have with this approach
    // .. is determining how to check for failure .. if error here how
    // does one _know_?

    std::copy(std::istream_iterator<double>(f1),
              std::istream_iterator<double>(),
              std::back_inserter(v1));
    std::copy(std::istream_iterator<double>(f2),
              std::istream_iterator<double>(),
              std::back_inserter(v2));

    // doesn't fly .. i suspect binary expected.

    // read_any_t<double> ( f1, v1 );
    // read_any_t<double> ( f2, v2 );

    if ( DEBUG )
    {
        std::copy ( v1.begin(), v1.end(),
                    std::ostream_iterator< double > ( std::cout, "\n" ) );
        std::cout << std::endl;
        std::copy ( v2.begin(), v2.end(),
                    std::ostream_iterator< double > ( std::cout, "\n" ) );
    }
    typedef std::vector< double >::size_type size_type;
    size_type const sz1 = v1.size();
    size_type const sz2 = v2.size();

    if ( sz1 == sz2 )
    {
        typedef std::vector< double >::const_iterator const_iter;
        const_iter v1beg = v1.begin();
        const_iter v1end = v1.end();
        const_iter v2beg = v2.begin();

        // advance both iterators together so each pair lines up
        for ( ; v1beg != v1end; ++v1beg, ++v2beg )
        {
            // use Jerry's isEqual on *v1beg and *v2beg
        }
    }

    // finally can we replace that for loop with some algo of sorts..
    // more on this later
}

I'm using Jerry's isEqual - which is not shown in the above but you get
the point
 
Jerry Coffin

(e-mail address removed) says...
Actually what I was after was a way to improve my source to take
advantage of the function object in isEqual with an algorithm. With my
current approach I read both files into a vector<double> and then compare
them with a for loop. It works, so I won't fuss with it.

Ah, I see. It'll still depend on what sort of comparison
you're doing. For example, if you want to know how many
of them are equal, you could use something like
std::count_if. If you only want to know whether the files
are the same or not, you can stop reading at the first
pair that compares non-equal -- and if the files are big,
that may save quite a bit of time.
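
Both kinds of comparison can be sketched on two vectors that have already been read in. Everything named here is an assumption for illustration: approxEqual uses an arbitrary absolute tolerance of 1e-9, and a plain loop stands in for std::count_if, since std::count_if walks a single range rather than a pair of ranges. std::mismatch stops at the first pair that fails the predicate.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical predicate with an arbitrary absolute tolerance.
inline bool approxEqual(double a, double b)
{
    return std::fabs(a - b) <= 1e-9;
}

// Counts the positions at which the two vectors hold approximately equal
// values; only the common prefix is examined if the lengths differ.
inline std::size_t countApproxEqual(const std::vector<double>& a,
                                    const std::vector<double>& b)
{
    const std::size_t n = std::min(a.size(), b.size());
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (approxEqual(a[i], b[i]))
            ++count;
    return count;
}

// Returns the index of the first pair that is not approximately equal,
// or the common length if no such pair exists.
inline std::size_t firstMismatch(const std::vector<double>& a,
                                 const std::vector<double>& b)
{
    const std::ptrdiff_t n =
        static_cast<std::ptrdiff_t>(std::min(a.size(), b.size()));
    return static_cast<std::size_t>(
        std::mismatch(a.begin(), a.begin() + n, b.begin(), approxEqual).first
        - a.begin());
}
```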

As far as most algorithms care, you should be able to use
std::istream_iterators to work with the files directly,
rather than reading from the files into vectors, and then
applying the algorithm to those iterators.
 
Jerry Coffin

[ ... ]
Read each line as text and compare them as text. No conversions. Much
better than doing possibly lossy conversions and then trying to guess
whether there was a big enough loss of information to be concerned about.

That depends. If you want to know whether the values in
the files were precisely identical, textual comparison is
clearly the way to go -- and you'd usually be much better
off writing the values out in hexadecimal or binary rather
than converting them to decimal at all.

If, OTOH, you're doing something like regression testing,
and want to allow an optimized version of calculations,
as long as the results agree to (say) ten significant
decimal digits (with proper rounding), that's usually not
nearly as easy to do by textual comparison.

As an aside, are you honestly implying that if I read and
convert two identical pieces of text from the input file,
that the double values produced might not be identical?

Taking a value, converting to decimal, and then
converting back to a double (or float or long double) I'd
expect to usually produce a value slightly different from
the original before being written out. IOW, the process
has less than perfect accuracy. I'm a bit surprised,
however, at the implication that it might not be
repeatable...
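
The round trip through decimal text can be tried out with a small sketch like the one below. The helper names toText and fromText are made up for illustration, error handling is omitted, and the behaviour described assumes an IEEE 754 double and a standard library whose conversions round correctly.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Formats value with the given number of significant decimal digits.
inline std::string toText(double value, int digits)
{
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}

// Parses a double from decimal text (no error handling in this sketch).
inline double fromText(const std::string& text)
{
    std::istringstream in(text);
    double value = 0.0;
    in >> value;
    return value;
}
```

With the default six significant digits, 0.1 + 0.2 is written as "0.3" and reads back as a slightly different double. With std::numeric_limits<double>::max_digits10 digits (17 for an IEEE 754 double) the value should survive the round trip. Parsing the same text twice gives the same double either way, so the lossy step is the formatting, not the repeatability of the parse.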
 
Pete Becker

Jerry said:
That depends. If you want to know whether the values in
the files were precisely identical, textual comparison is
clearly the way to go

In the example files in the message that I replied to that was the case.
 
Jerry Coffin

(e-mail address removed) says...
I think what you're alluding to is some getline approach?


Here's my approach in a nutshell.

[ code elided ... ]

I'd consider using std::equal:

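std::istream_iterator<double> input1(stream1), end;
std::istream_iterator<double> input2(stream2);

// The four-iterator overload of std::equal (C++14) also returns false
// when one stream runs out of values before the other.
if (!std::equal(input1, end, input2, end, isApproxEqual))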
{
std::cout << "The streams are different.\n";
}
else
{
std::cout << "The streams are equal.\n";
}

Basically, we consider the streams equal if and only if
each pair of values we read compares approximately equal,
and we reach the end of both streams at the same time.
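
A self-contained version of this idea might look like the sketch below. It assumes a C++14 compiler for the four-iterator overload of std::equal, which checks both end conditions itself. The isApproxEqual shown here is only a stand-in with an arbitrary tolerance of 1e-9, not the function from the linked post, and streamsApproxEqual and textApproxEqual are illustrative names.

```cpp
#include <algorithm>
#include <cmath>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Stand-in predicate (an assumption, not the thread's isApproxEqual):
// tolerance of 1e-9, relative for large values and absolute below 1.0.
inline bool isApproxEqual(double a, double b)
{
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= 1e-9 * std::max(scale, 1.0);
}

// True when both streams yield the same number of values and every
// corresponding pair is approximately equal.
inline bool streamsApproxEqual(std::istream& stream1, std::istream& stream2)
{
    std::istream_iterator<double> input1(stream1), input2(stream2), end;
    return std::equal(input1, end, input2, end, isApproxEqual);
}

// Convenience wrapper for in-memory text, mainly useful for trying it out.
inline bool textApproxEqual(const std::string& text1, const std::string& text2)
{
    std::istringstream stream1(text1), stream2(text2);
    return streamsApproxEqual(stream1, stream2);
}
```

One caveat of reading through std::istream_iterator is that a token that fails to parse as a double ends the sequence just as end-of-file does, so malformed input would need a separate check on the stream state.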
 
