Float comparison

jameskuyper · May 5, 2009

Richard said:
....
An issue here is that if one is working with floating point
numbers one should, IMNSHO, think of them as representing ranges
of mathematical values, i.e., as an interval in the real number
line. At the same time a floating point number is also an exact
rational number. As such it is an exemplar of the range.

That sounds pretty much exactly like what Chuck said. Keith has been
saying that you don't have to think of them that way, that it's up to
the application whether or not it treats them in that fashion, so this
does not represent a common opinion that they could both agree with.

I have the impression he is on the "only a discrete value" side,
that his view is that that is what the formalism says.

He's been very clear about the distinction between how floating point
numbers are stored and how they are used, and has repeatedly indicated
that it's up to the application how they are used.

....

That said, I think your thoughts on spacing are off the mark.
Floating point numbers are a representation of the real line; as
such they are necessarily inexact. Integers are a representation
of, well, integers. As such they are exact. If you are using
integers to denote things that are not integers then spacing
comes into play, but that is a different matter.

If I use an integer to store a count of the number of pixels in an
image, it represents an exact quantity. On the other hand, if I use an
integer to store the year that an event occurred in, that integer
represents the entire range of time from the start of that year to the
end of that year (with the handling of the boundary points an issue
I'd have to deal with, if it matters).

You could say that the integers merely identify the ranges, rather
than representing them, but the same can be said of floating point
numbers.

In most of my code, I treat floating point numbers, not as
representing a range of values, but as containing the best
representable approximation to the real number that I would store, if
I could. On those occasions when I do want to represent a range of
values, the width of that range always depends upon how the number was
calculated, it can't be calculated just by multiplying the central
value by DBL_EPSILON, and is usually much larger than the range you
get by such a calculation.

Therefore, even when I am treating a number as the center of a range
of values, rather than the best approximation to an exact value, it
would almost never be the kind of range that CBF is talking about.

Phil Carmody · May 5, 2009

Richard Heathfield said:
Oddly enough, however, I think that for once Chuck is at least
/almost/ right. He's wrong about *_EPSILON, but I can understand
his point about the limitations of floating-point numbers. (I think
we all understand this, actually.) His problem is not so much in
what he's thinking as in the words he's using to think it.

I have to disagree. Chuck has the problem of going from
a premise that floating point numbers /can be used/ to
represent real numbers approximately to a conclusion
that /any/ use of floating point numbers /must be/ an
approximation to a real number.

Absolutely no choice of words could make that valid.

Phil

Flash Gordon · May 5, 2009

jameskuyper wrote:

I think I do understand what Keith is saying, and as I understand it,
what he means is that "from the perspective of the application", a
given floating point number can be treated either as representing a
range of values, or as representing only a single discrete value. It's
a matter of how the application is designed, not a matter of the
underlying hardware, nor the requirements of either the C or IEEE/IEC
standards, how they are to be interpreted. Exactly the same thing can
be said of integer types - the main difference is that the spacing of
representable integers is constant, whereas the spacing of
representable floating point numbers is not.

Which is pretty much the point I tried to make, including real-world
examples where integers *are* used for ranges and real-world examples
where floating point numbers (doubles in the application I am thinking
of) are used (correctly) as exact numbers.

You can only ever tell whether it is a range (and what that range is) or
an exact value by analysing the requirements, design and code.

CBFalconer · May 5, 2009

(e-mail address removed) (Gordon Burditt) wrote:

ah, if only most people did learn this...

Heisenberg's principle doesn't practically apply to planets
or other macroscopic objects.

Oh yes it does. The uncertainties involved are calculated the same
way, but the results are usually negligibly small.

Dik T. Winter · May 6, 2009

> On Mon, 4 May 2009 11:47:40 GMT, "Dik T. Winter"

>
> I am puzzled. I haven't seen anything in your text that matches
> the "I asked him how ..." nor do I see any reason for expecting
> that x and y would necessarily be in the range for x*y.

That was much earlier in the discussion. CBF started with saying that under
all circumstances a C floating-point variable represented a range bounded
by X_EPSILON. That is what is under discussion, and we see indeed that
even when x and y represent such a range, using interval analysis, after
z = x * y, z certainly does *not* represent such a range but a (possibly)
much larger range. The wording in my last sentence was indeed wrong, it
should have been:

> Falconer is treating range as {x-x*eps(x),x+x*eps(x)}, i.e., the
> epsilon is proportional error. From this definition he goes on
> to get eps(x*y) = eps(x) + eps(y) which is correct to the first
> order.

Right, and I do not contest that.

> It is quite clear that that is what he is doing. What,
> if anything, do you see as error in his analysis, given his
> definition?

That when you write: z = x * y, the range which z (according to CBF) represents
is no longer the range that the product would represent. And this means that
the view that a variable represents a range bounded that way is not the
proper view to take.

Rather than relying on the FP hardware to take care of the error-bounds, you
should view the stored values as exact and do serious error-analysis. And,
when you do that, you will find that von Neumann was right when he stated that
it would be impossible to find the solution to a system of linear equations
involving more than about 40 variables. It was Wilkinson who put that in the
proper perspective: "the result is the *exact* solution of a system that
differs from the original system within some specific bounds". And at
Karlsruhe, Uhlrich provided a system with which you would actually find a
solution within given bounds, assuming the original system was *exact*. But
that method is (in my opinion) much too time-consuming to be practical.

Dik T. Winter · May 6, 2009

>
> I used ex and ey for the original errors, and I calculated the
> exact error in x*y, which had terms in ex and ey and in ex*ey. I
> also stated that ex (and ey) were small compared to x (and y),
> which made the error term in ex*ey negligible. You simply ignored
> the equations, changed the nomenclature, and, as far as I can see,
> ignored the facts.

I did not ignore them, I did state they were correct! But when you store
x * y in a variable z, the range (which you stated) z does represent does
*not* cover the range of x * y.
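For reference, the algebra both sides accept can be written out as follows, using ex and ey from the quoted text and assuming x and y are nonzero. The symbol \varepsilon for a common bound on the relative errors is introduced here only for illustration.

```latex
\[
(x + e_x)(y + e_y) - xy = x\,e_y + y\,e_x + e_x e_y
\]
\[
\frac{(x + e_x)(y + e_y) - xy}{xy}
  = \frac{e_x}{x} + \frac{e_y}{y} + \frac{e_x}{x}\cdot\frac{e_y}{y}
\]
\[
\left|\frac{e_x}{x}\right| \le \varepsilon,\quad
\left|\frac{e_y}{y}\right| \le \varepsilon
\;\Longrightarrow\;
\left|\frac{(x + e_x)(y + e_y) - xy}{xy}\right| \le 2\varepsilon + \varepsilon^2
\]
```

To first order the product carries a relative bound of about twice \varepsilon, whereas a freshly stored z would be assigned only \varepsilon under the "range bounded by X_EPSILON" reading. That gap is the mismatch being pointed out.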

Dik T. Winter · May 6, 2009

> An issue here is that if one is working with floating point
> numbers one should, IMNSHO, think of them as representing ranges
> of mathematical values, i.e., as an interval in the real number
> line. At the same time a floating point number is also an exact
> rational number. As such it is an exemplar of the range.

That is not the way numerical analysts do work. Designing the algorithms,
the first assumption is that the input is exact.

Richard Bos · May 6, 2009

[ BTW, get a newsreader. Google Beta is broken worse again, recently. ]

I draw the analogy with chess. Once the same position has been
repeated a few times it's a draw. Applying this rule cuts out many
more-heat-than-light arguments.

Unfortunately for that analogy, the three repetitions rule in chess
doesn't _force_ a draw. If both of the players insist on playing on, the
game goes on as if nothing has happened. It's only a draw when either of
the players claims it, at the moment of the third repetition.

Richard

Keith Thompson · May 6, 2009

christian.bau said:
If you work under the assumption that a floating point number
represents an interval in the real number line, then there are some
questions which you have to answer: First, _which_ interval does a
floating point number represent? If I write "double x = 1.0;", which
interval exactly is represented?

[...]

And if I write this:

double x = 1.0;
if (x < 1.0) puts("Oops!");
if (x > 1.0) puts("Oops!");

do I have any grounds for complaint if the program prints "Oops!"? If
the stored value represents a range, I would think the answer would be
no.

CBFalconer · May 6, 2009

Richard said:
CBFalconer said:

There's your problem right there. You are confusing "value" and
"interpretation". The value that was stored is easy to discover,
simply by looking. What you meant when you stored it there is a
completely different matter.

Try thinking about it. For example, if I write:

float x = 1.0/3.0;

you are claiming the value stored is 1/3. But if you examine the
value stored you will not find that. You will find something in
the range x*(1+FLT_EPSILON) and x*(1-FLT_EPSILON). That range is
system dependent. We won't move that range very much if we
substitute the desired value, 1/3, for the value found in x.
Things may be worse, but they won't be better.

Phil Carmody · May 6, 2009

Dik T. Winter said:
Rather than relying on the FP hardware to take care of the error-bounds, you
should view the stored values as exact and do serious error-analysis. And,
when you do that, you will find that von Neumann was right when he stated that
it would be impossible to find the solution to a system of linear equations
involving more than about 40 variables.

Upon what is your "would" (4th line) conditional?

Phil

CBFalconer · May 6, 2009

Dik T. Winter said:
(e-mail address removed) writes:
.... snip ...

I did not ignore them, I did state they were correct! But when
you store x * y in a variable z, the range (which you stated) z
does represent does *not* cover the range of x * y.

If you are trying to say that the computation may have errors
larger, even much larger, than that intrinsic in the fp
representation, I agree entirely.

Keith Thompson · May 6, 2009

CBFalconer said:
Try thinking about it. For example, if I write:

float x = 1.0/3.0;

you are claiming the value stored is 1/3.

[...]

I don't believe anybody made such a claim. (It's possible only if
FLT_RADIX is a multiple of 3.)

Beej Jorgensen · May 6, 2009

Keith Thompson said:
And if I write this:

Let's make a few little changes here:

double x = 1.0 + Z;
if (x == 1.0 && Z != 0.0) puts("Ooops!");

I ran this and it printed "Ooops!"

1. x is exactly 1.0, because it was tested during the run.

2. However, there exists a representable nontrivial interval where 1.0
plus any number in that interval evaluates to 1.0. Put another way,
given the above run, you can tell me what x is, but you cannot tell
me what Z is. In this case, x is an inexact approximation of the
true mathematical value of 1.0+Z, where 1.0+Z falls in the interval
between 1.0 and C's next representable number.

Can you and CBF agree on these points (assuming I have them correct)? I
think this might be where the two arguments come together.

-Beej

CBFalconer · May 6, 2009

Keith said:
.... snip ...

And if I write this:

double x = 1.0;
if (x < 1.0) puts("Oops!");
if (x > 1.0) puts("Oops!");

do I have any grounds for complaint if the program prints
"Oops!"? If the stored value represents a range, I would
think the answer would be no.

x is a double. So is the expression (1.0). So they represent the
same range. However in:

double x = 1.0;
if (x < (2.1 - 1.1)) puts ("Oops!");
if (x > (2.1 - 1.1)) puts ("Oops!");

I don't think you have any complaint.

Keith Thompson · May 6, 2009

Beej Jorgensen said:
Let's make a few little changes here:

I ran this and it printed "Ooops!"

It would have been helpful if you had shown us the declaration and
initialization of Z.

Here we go:

#include <stdio.h>
#include <float.h>
int main(void) {
    double Z = DBL_EPSILON / 2.0;
    double x = 1.0 + Z;
    if (x == 1.0 && Z != 0.0) puts("Ooops!");
    return 0;
}

1. x is exactly 1.0, because it was tested during the run.

2. However, there exists a representable nontrivial interval where 1.0
plus any number in that interval evaluates to 1.0. Put another way,
given the above run, you can tell me what x is, but you cannot tell
me what Z is. In this case, x is an inexact approximation of the
true mathematical value of 1.0+Z, where 1.0+Z falls in the interval
between 1.0 and C's next representable number.

I don't think that's actually guaranteed. A conforming implementation
could legally have 1.0 + Z yield 1.0 + DBL_EPSILON for any
sufficiently small Z > 0.0. But yes, for a typical implementation the
above program will print "Ooops!", or at least a similar program can
be constructed that differs only in the value of Z. (And yes, it
prints "Ooops!" on my implementation.)

Can you and CBF agree on these points (assuming I have them correct)? I
think this might be where the two arguments come together.

I think we agree that the result of a floating-point calculation has
an implementation-defined error range. We disagree on the meaning of
a *stored* floating-point value.

I assert that, for example, a value stored in an object of type double
(ignoring NaNs and infinities) represents a single mathematical real
value, in accordance with the model of floating-point numbers
specified in C99 5.2.4.2.2p1-2. (Add p3 if you want to deal with
subnormals and other oddities.) That value could have been produced
by any of a number of means, and some of those methods (such as the
floating-point "+" operator) have their own error ranges -- but once
the value is stored, any information about how it was computed is
irrelevant. Calculations have error ranges; stored values do not.

CBFalconer · May 6, 2009

Beej said:
Let's make a few little changes here:

I ran this and it printed "Ooops!"

1. x is exactly 1.0, because it was tested during the run.

2. However, there exists a representable nontrivial interval
where 1.0 plus any number in that interval evaluates to 1.0.
Put another way, given the above run, you can tell me what x
is, but you cannot tell me what Z is. In this case, x is an
inexact approximation of the true mathematical value of
1.0+Z, where 1.0+Z falls in the interval between 1.0 and C's
next representable number.

Can you and CBF agree on these points (assuming I have them
correct)? I think this might be where the two arguments come
together.

As long as you agree that you are considering the actual program,
and not just the value found in the double object x.

CBFalconer · May 6, 2009

Keith said:
.... snip ...

irrelevant. Calculations have error ranges; stored values do not.

Oh? I have a nuclear counter and a sensor. Over a given period T
I detect 10,000 counts of events of more than 1 keV energy. I
compute the expected error in the counts as sqrt(count), or 100 and
store that in cterror. Now you claim that 'counts' is exact, and
represents the radiation strength at the time of testing? You are
telling me to ignore cterror? I haven't even considered the time
resolution of the detection system.

We can switch from/to everything is physics to/from everything is
math, or from/to everything is analog to/from everything is digital
at the drop of a hat (straw).

CBFalconer · May 6, 2009

Keith said:
CBFalconer said:

Try thinking about it. For example, if I write:

float x = 1.0/3.0;

you are claiming the value stored is 1/3.

[...]

I don't believe anybody made such a claim. (It's possible only if
FLT_RADIX is a multiple of 3.)

Then you can't claim you know what is stored there. How the
floating arithmetic is implemented is not specified by the C
standard. All you KNOW is that the value stored is within a range
(my word) of the value read out of x later. Maybe the float system
always stores the input value multiplied by three? In that case
the stored value will be exactly 1.0/3.0, stored as a value of
<i.oo>, just to separate the FP and normal numbers. I believe such
a system is legitimate in C. I could probably implement it.
Relax, I won't.

Beej Jorgensen · May 6, 2009

Keith Thompson said:
It would have been helpful if you had shown us the declaration and
initialization of Z.

But it wouldn't have clarified my point.

double Z = DBL_EPSILON / 2.0;

Sure. Or Z = DBL_MIN could also give the result. Or anything in
between, I think.

And, since I didn't make it clear in my previous post, I meant points 1
and 2 not to be generally and always true, but "possibly true and
certainly true in my test run with a (theoretically) conforming
compiler."

I think we agree that the result of a floating-point calculation has
an implementation-defined error range. We disagree on the meaning of
a *stored* floating-point value.

I'm going to speak out of turn here, but I think part of the issue is
how much of this error is presumed to exist in the result. And I don't
know that the standard really addresses this issue; it seems to be more
of a human-interpreted thing.

We agree that we can add small Zs to 1.0 and get a mathematically
incorrect result, and store that incorrect result exactly in x, but
then we might have two opinions:

A: "There's no error in x; it's exactly 1.0."
B: "There's some error in x; it's exactly 1.0."

Standard: "I got 1.0. I don't care what it means."

The standard makes it clear that when we add these small values to 1.0,
we're not necessarily going to move forward. It also makes clear that
1.0 can be represented exactly. And I agree with you that the standard
clearly spells out the model that defines a floating point number.

With all that in mind, consider:

A: "x means 1.0."
B: "x means everything around 1.0 within some unrepresentable margin
of error."

Does statement B violate the standard in some way? (This is a genuine
question. I'm looking, but not finding it, or anything that mentions
it.)

-Beej
