Float precision and float equality

Anton81

I'd like to do calculations with floats and at some point equality of
two numbers will be checked.
What is the best way to make sure that equality of floats will be
detected, where I assume that mismatches beyond a certain point are
due to truncation errors?
 
Mark Dickinson

I'd like to do calculations with floats and at some point equality of
two numbers will be checked.
What is the best way to make sure that equality of floats will be
detected, where I assume that mismatches beyond a certain point are
due to truncation errors?

Well, it depends a lot on the details of the application, but
a good general scheme is to allow both a fixed relative error
and an absolute error, and to assert that your two values are
'nearly equal' if they're within *either* the relative error *or*
the absolute error. Something like, for example:

def almostEqual(expected, actual, rel_err=1e-7, abs_err=1e-20):
    absolute_error = abs(actual - expected)
    return absolute_error <= max(abs_err, rel_err * abs(expected))

Then choose the values of rel_err and abs_err to suit your
application.
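
For illustration (these calls are my own, not part of Mark's post), with the
defaults above:

almostEqual(1.0, 1.0 + 1e-9)    # True: well within the relative tolerance
almostEqual(1e-30, 3e-30)       # True: relatively far apart, but inside abs_err
almostEqual(100.0, 100.1)       # False: outside both tolerances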

What sort of calculations are you doing?
 
Raymond Hettinger

I'd like to do calculations with floats and at some point equality of
two numbers will be checked.
What is the best way to make sure that equality of floats will be
detected, where I assume that mismatches beyond a certain point are
due to truncation errors?

Short answer: use round().
Less short answer: use Decimal with a high precision and then round()
or quantize().
Long answer: the amount of error depends on the calculation and the
scale of the inputs; some calculations potentially propagate tiny
errors to the point where they become large enough to overpower the
signal in your data (e.g. the Lorenz equations or some other chaotic
sequence).
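
A rough sketch of both suggestions (the examples are mine, not Raymond's;
Decimal and quantize() come from the standard decimal module):

round(0.1 + 0.2, 6) == round(0.3, 6)    # True, even though 0.1 + 0.2 != 0.3

from decimal import Decimal, getcontext
getcontext().prec = 50                   # work at high precision...
a = Decimal(1) / Decimal(3) * 3          # symbolically 1, computed as 0.999...9
b = Decimal(1)
a.quantize(Decimal("1e-20")) == b.quantize(Decimal("1e-20"))   # True: ...then compare at 20 places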


Raymond
 
Mark Dickinson

Short answer: use round().

Can you explain how this would work? I'm imagining a test
something like:

if round(x, 6) == round(y, 6): ...

but that still would end up missing some cases where x and y
are equal to within 1 ulp, which presumably isn't what's wanted.
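
A concrete pair showing the failure (my reconstruction, not the original
example from the post; math.nextafter needs Python 3.9 or later):

import math
x = 5e-07                     # the stored double is a hair below the true 0.0000005
y = math.nextafter(x, 1.0)    # the adjacent float, a hair above 0.0000005
round(x, 6) == round(y, 6)    # False (0.0 vs 1e-06), though x and y differ by 1 ulp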
 
Raymond Hettinger

Can you explain how this would work?  I'm imagining a test
something like:

if round(x, 6) == round(y, 6): ...

but that still would end up missing some cases where x and y
are equal to within 1 ulp, which presumably isn't what's wanted.

if not round(x - y, 6): ...
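
That does catch the 1-ulp pair from the earlier sketch, since the tiny
difference rounds away (again my illustration, reusing the x and y defined
above; note this amounts to an absolute tolerance of about 5e-7):

not round(x - y, 6)    # True: x and y are treated as equal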


Raymond
 
sturlamolden

I'd like to do calculations with floats and at some point equality of
two numbers will be checked.
What is the best way to make sure that equality of floats will be
detected, where I assume that mismatches beyond a certain point are
due to truncation errors?


isequal = lambda x,y : abs(x-y) < eps

where eps is the truncation error.
 
Raymond Hettinger

That's a dangerous suggestion.  It only works if x and y happen to be
roughly in the range of integers.

Right. Using abs(x-y) < eps is the way to go.


Raymond
 
dbd

> Right. Using abs(x-y) < eps is the way to go.
>
> Raymond

This only works when abs(x) and abs(y) are larger than eps, but not
too much larger.

Mark's suggestion is longer, but it works. The downside is it requires
you to think about the scale and accuracy of your application.

Dale B. Dalrymple
 
Anton81

I do some linear algebra and whenever the prefactor of a vector turns
out to be zero, I want to remove it.

I'd like to keep the system comfortable. So basically I should write a
new class for numbers that has its own __eq__ operator?
Is there an existing module for that?
 
r0g

dbd said:

> Right. Using abs(x-y) < eps is the way to go.
>
> Raymond

This only works when abs(x) and abs(y) are larger than eps, but not
too much larger.


Okay, I'm confused now... I thought them being larger was entirely the
point. At what point can they become too large? Isn't eps entirely
arbitrary anyway?


Mark's suggestion is longer, but it works. The downside is it requires
you to think about the scale and accuracy of your application.


Shouldn't one be doing that in any case??


Roger.
 
sturlamolden

Okay, I'm confused now... I thought them being larger was entirely the
point.

Yes. dbd got it wrong. If both are smaller than eps, the absolute
difference is smaller than eps, so they are considered equal.
 
Dave Angel

Anton81 said:
I do some linear algebra and whenever the prefactor of a vector turns
out to be zero, I want to remove it.

I'd like to keep the system comfortable. So basically I should write a
new class for numbers that has its own __eq__ operator?
Is there an existing module for that?

You have to define your own "comfortable." But if it's zero you're
checking for, then I most certainly wouldn't try to hide it inside a
"number class." Most such formulas go ballistic when you get near zero.

The definition of "close enough" is very context-dependent, and
shouldn't be hidden at too low a level. But your mileage may vary.

For example, in your case, you might want to check that the prefactor is
much smaller than the average (of the abs values) of the vector
elements. Enough orders of magnitude smaller, and you call it equal to
zero.
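
A sketch of that check (the function name and the cutoff are mine, purely
illustrative):

def prefactor_is_zero(prefactor, vector, orders=12):
    # treat the prefactor as zero when it is many orders of magnitude
    # smaller than the average magnitude of the vector's elements
    scale = sum(abs(v) for v in vector) / len(vector)
    return abs(prefactor) <= scale * 10.0 ** -orders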

DaveA
 
Carl Banks

I do some linear algebra and whenever the prefactor of a vector turns
out to be zero, I want to remove it.

I'd like to keep the system comfortable. So basically I should write a
new class for numbers that has its own __eq__ operator?
Is there an existing module for that?

I highly recommend against it; among other things it invalidates the
transitive property of equality:

"If a == b and b == c, then a == c."

It will also make the number non-hashable, and have several other
negative consequences. Plus, it's not something that's never
foolproof. What numbers are close enough to be considered "equal"
depends on the calculations.
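
To see the transitivity problem concretely, here is a deliberately bad
sketch (mine, and not something to actually use):

class Fuzzy(float):
    def __eq__(self, other):
        # 'equal' means within an absolute tolerance of 0.1
        return abs(self - float(other)) < 0.1
    __hash__ = None    # fuzzy equality also breaks hashing, as noted above

a, b, c = Fuzzy(1.00), Fuzzy(1.08), Fuzzy(1.16)
a == b and b == c    # True
a == c               # False: equality is no longer transitive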

(I remember once struggling in a homework assignment over seemingly
large discrepancies in a calculation I was doing, until I realized
that the actual numbers were on the scale of 10**11, and the
difference was around 10**1, so it really didn't matter.)



Carl Banks
 
TheSeeker

I highly recommend against it; among other things it invalidates the
transitive property of equality:

"If a == b and b == c, then a == c."

It will also make the number non-hashable, and have several other
negative consequences. What numbers are close enough to be considered "equal"
depends on the calculations.

(I remember once struggling in a homework assignment over seemingly
large discrepancies in a calculation I was doing, until i realized
that the actual numbers were on the scale of 10**11, and the
difference was around 10**1, so it really didn't matter.)

Carl Banks

Maybe it's the gin, but
"Plus, it's not something that's never foolproof.'

+1 QOTW

Cheers,
TheSeeker
 
David Cournapeau

Well, it depends a lot on the details of the application, but
a good general scheme is to allow both a fixed relative error
and an absolute error, and to assert that your two values are
'nearly equal' if they're within *either* the relative error *or*
the absolute error.  Something like, for example:

def almostEqual(expected, actual, rel_err=1e-7, abs_err=1e-20):
    absolute_error = abs(actual - expected)
    return absolute_error <= max(abs_err, rel_err * abs(expected))

If you can depend on IEEE 754 semantics, one relatively robust method
is to use the number of representable floats between two numbers. The
main advantage compared to the proposed methods is that it somewhat
automatically takes into account the amplitude of input numbers:

abs(x - y) <= N * spacing(max(abs(x), abs(y)))

where spacing(a) is the smallest number such that a + spacing(a) != a.
Whether a and b are small or big, the same value of N can be used, and
it tells you how close two numbers are in terms of internal
representation.

Upcoming numpy 1.4.0 has an implementation for spacing - implementing
your own for double is not difficult, though,
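
A minimal version of that idea (my sketch; math.ulp needs Python 3.9 or
later, and numpy.spacing is the numpy function mentioned above):

import math

def close_in_ulps(x, y, n=4):
    # treat x and y as equal when they are within n representable doubles
    # of each other, whatever their magnitude
    return abs(x - y) <= n * math.ulp(max(abs(x), abs(y)))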

cheers,

David
 
Dave Angel

Carl said:
I highly recommend against it; among other things it invalidates the
transitive property of equality:

"If a =b and b == c, then a == c."

It will also make the number non-hashable, and have several other
negative consequences. Plus, it's not something that's never
foolproof. What numbers are close enough to be considered "equal"
depends on the calculations.

(I remember once struggling in a homework assignment over seemingly
large discrepancies in a calculation I was doing, until I realized
that the actual numbers were on the scale of 10**11, and the
difference was around 10**1, so it really didn't matter.)



Carl Banks

A few decades ago I implemented the math package (microcode) under the
machine language for a proprietary processor (this was when a processor
took 5 boards of circuitry to implement). I started with floating point
add and subtract, and continued all the way through the trig, log, and
even random functions.

Anyway, a customer called asking whether a particular problem he had was
caused by his logic, or by errors in our math. He was calculating the
difference in height between an always-level table and a perfectly flat
table (between an arc of a great circle around the earth, and a flat
table that doesn't follow the curvature). In a couple of hundred feet of
table, the difference was measured in millionths of an inch, as I recall.

Anyway, it turned out his calculation was effectively subtracting

(8000 miles plus a little bit) - (8000 miles)

and if he calculated it three different ways, he got three different
results: one was off in about the 3rd place, while another was only half
the value. I was able to show him another way (through geometrical
transformations) to solve the problem that got the exact answer, or at
least to more digits than he could possibly measure. I think I recall
that the new solution also cancelled out the need for trig. Sometimes
the math package shouldn't hide the problem, but give it to you straight.

DaveA
 
dbd

Yes. dbd got it wrong. If both are smaller than eps, the absolute
difference is smaller than eps, so they are considered equal.

Small x,y failure case:
eps and even eps squared are representable as floats. If you have
samples of a sine wave with peak amplitude of one half eps, the
"abs(x-y) < eps" test would report all values on the sine wave as equal
to zero. This would not be correct.

Large x,y failure case:
If you have two calculation paths that symbolically should produce the
same value of size one over eps, valid floating point implementations
may differ by an lsb or more. A single lsb error would be 1, much
greater than the test allows as 'nearly equal' for floating point
comparison.

1.0 + eps is the smallest value greater than 1.0 that is distinguishable
from 1.0. Long chains of floating point calculations that would
symbolically be expected to produce a value of 1.0 may be expected to
produce errors of an eps or more due to the inexactness of floating
point representation. These errors should be allowed in floating point
equality comparison. The value of the minimum representable error will
scale as the floating point number varies. A constant comparison value
is not appropriate.
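
Both failure modes are easy to reproduce (my examples; eps here is
sys.float_info.epsilon, about 2.2e-16):

import sys
eps = sys.float_info.epsilon

# small values: opposite signs, relatively very different, yet 'equal'
abs(0.25 * eps - (-0.25 * eps)) < eps    # True

# large values: adjacent floats near 1/eps differ by 1.0, so they fail
x = 1.0 / eps          # 2**52; neighbouring doubles here are 1.0 apart
y = x + 1.0            # the very next representable double
abs(x - y) < eps       # False, although x and y differ by only 1 ulp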

Mark was right, DaveA's discussion explains a strategy to use.

Dale B. Dalrymple
 
Steven D'Aprano

(I remember once struggling in a homework assignment over seemingly
large discrepancies in a calculation I was doing, until I realized that
the actual numbers were on the scale of 10**11, and the difference was
around 10**1, so it really didn't matter.)

Well that depends on the accuracy of the calculations, surely? If the
calculations were accurate to one part in 10**20, then an error around
10**1 is about ten trillion times larger than acceptable.

*wink*
 
sturlamolden

If you have
samples of a sine wave with peak amplitude of one half eps, the
"abs(x-y) < eps" test would report all values on the sine wave as equal
to zero. This would not be correct.

You don't understand this at all, do you?

If you have a sine wave with an amplitude less than the truncation
error, it will always be approximately equal to zero.

Numerical maths is about approximations, not symbolic equalities.

1.0 + eps is the smallest value greater than 1.0, distinguishable from
1.0.

Which is the reason 0.5*eps*sin(x) is never distinguishable from 0.

A constant comparison value is not appropriate.

That requires domain-specific knowledge. Sometimes we look at a
significant number of digits; sometimes we look at a fixed number of
decimals; sometimes we look at abs(y/x). But there will always be a
truncation error of some sort, and differences less than that are never
significant.
 
Mark Dickinson

I do some linear algebra and whenever the prefactor of a vector turns
out to be zero, I want to remove it.

Hmm. Comparing against zero is something of a special case. So you'd
almost certainly be doing an 'if abs(x) < tol: ...' check, but the
question is what value to use for tol, and that (again) depends on
what you're doing. Perhaps 'tol' could be something like 'eps *
scale', where 'eps' is an indication of the size of relative error
you're prepared to admit (eps = 1e-12 might be reasonable; to allow
for rounding errors, it should be something comfortably larger than
the machine epsilon sys.float_info.epsilon, which is likely to be
around 2e-16 for a typical machine), and 'scale' is something closely
related to the scale of your problem: in your example, perhaps scale
could be the largest of all the prefactors you have, or some sort of
average of all the prefactors. There's really no one-size-fits-all
easy answer here.
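
Pulling that together for Anton's prefactors (a sketch with names of my own
choosing, not code from the thread):

def drop_negligible_prefactors(prefactors, rel_eps=1e-12):
    # treat a prefactor as zero when it is tiny relative to the largest one;
    # rel_eps should sit comfortably above sys.float_info.epsilon (~2.2e-16)
    scale = max(abs(p) for p in prefactors)
    return [p for p in prefactors if abs(p) > rel_eps * scale]
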
I'd like to keep the system comfortable. So basically I should write a
new class for numbers that has its own __eq__ operator?

That's probably not a good idea, for the reasons that Carl Banks
already enumerated.

Mark
 
