Float comparison

CBFalconer · May 8, 2009

James said:
Keith Thompson wrote:
.... snip ...

The fact that they are countable means that they can be arranged
in a sequence that allows them to be mapped one-to-one with the
integers. It's therefore meaningful, within the context of such
a mapping, to talk about "adjacent" rationals. However, there
are infinitely many different such sequences, and the meaning of
"adjacent" would be different in each one. One of the few things
all such sequences have in common is that they cannot be in order
by the values of the rational numbers. Those two facts make
utter nonsense of the conclusions CBFalconer reaches from the
countability of rationals.

Agreed. That comes from trying to find an argument to convince
people of the obvious. A very poor argument.

Keith Thompson · May 8, 2009

Mark McIntyre said:
On 07/05/09 22:36, Richard Heathfield wrote:
The value stored there is the result of dividing 1.0 by 3.0

Oh, I see - sophistry.

How is it sophistry? It seems to me to be a simple and clear
statement of fact.

I think you'll find that (barring weird compiler bugs), you are
guaranteed to have a value in the range suggested by CBF. Your comment
isn't incompatible with that, but the word "No" at the start, and the
capitalisation of "know" are misleading and somewhat
disingenuous. That's OK of course, the regulars know your argumentative
nature...

Yes, you know that the value stored is within some range. (CBF didn't
actually define *what* range, as far as I can tell.) But that's not
*all* you know. You also know that the value stored is the same as
the value read out of x later.

No, because examining the system will influence the value stored in
there. You will in fact only know the value stored in it at the
precise instant you look. As soon as you stop looking it might change
by 1e-22, and before you looked it might have been 1124574787 times
larger.

Sorry, just continuing your earlier philosophy 101...

No, that would violate C99 6.2.4p2:

[...]
An object exists, has a constant address, and retains its
last-stored value throughout its lifetime.
[...]

Keith Thompson · May 8, 2009

CBFalconer said:
Keith said:

... snip ...

I believe you have misunderstood what I wrote.

Let's suppose that, in accordance with your model, the stored double
value 1.0 represents the range of real numbers 1.0-EPS .. 1.0+EPS
(assuming for simplicity that the range is symmetric). Then given:
double x = 1.0;
double y = 1.0;
both x and y represent identical ranges. Each of these identical
ranges contains infinitely many real numbers. Some of the numbers
within x's range are less than some of the numbers within y's range;
for example, 1.0-EPS (which is in x's range) is less than 1.0+EPS
(which is in y's range). Nevertheless, in your model, (x < y) must
yield 0.

And just what do you mean by "the value in x (or y)"? Are you
implying that x has just a single value?

Incidentally, in my article there were 5 paragraphs following what you
quoted, full of questions directed to you in an attempt to understand
your conception of C's floating-point model. You have snipped those
paragraphs without so much as a "[...]". My assumption is that you
were unable to answer my questions because your floating-point model
is actually inconsistent. Either that or you didn't bother to read
them.

No, I try to keep messages short and simple, and dealing with one
aspect at a time. This is getting far too long.

So you chose one out of a number of points I made, completely
misunderstood it, and ignored the rest.

I spent a substantial amount of time writing that article. Apparently
it was wasted.

WRT the values in x (or y), those values are exact, but they do NOT
represent exactness.

I think this is the first time you've acknowledged that there *is* an
exact value. That single exact value *is* the value of the
floating-point value. Read the standard; that's essentially what it
says.

This is due to the fundamental nature of a
floating value, i.e. any exact value ALWAYS represents a range,
such as:

x * (1 - x_EPSILON) through x * (1 + x_EPSILON)

Why that specific range?

For x==1.0 (type double), that range includes the following real
numbers:

1.0-DBL_EPSILON
1.0-DBL_EPSILON/2.0
1.0
1.0+DBL_EPSILON

all of which are exactly representable in a typical radix-2
implementation. Even if you mean to exclude the endpoints, there is
still one additional exactly representable real number within the
range.

You claim to be discussing the fundamental model of floating-point
numbers. You can't reasonably ignore details like this.

and you know nothing better about it without detailed analysis of
the code that set the value.

So examining the value doesn't tell you anything about the value?
Nonsense.

This is one of the reasons using
equality relations in FP is unwise. x < y implies that ALL values
in the x range are less than ALL the values in the y range. The
structure of FP values is such that you can tell this immediately if
the value stored in x is less than the value stored in y.

When you write out the value in x to umpty-ump digits you are doing
nothing useful. That is simply one of the many possible values
that x can represent (I am mixing up the usage of value).

Can you express your idea *without* "mixing up the usage of value"?
If you can't explain it coherently, consider the possibility that
you're wrong.

If you start taking differences in values you will find that the
resultant errors can be overwhelmingly due to the original ranges,
which were not errors, but simply unknown details. Differences of
nearly equal quantities, etc. If you use products and quotients,
the resultant error will probably not become excessive. These
differences in resultant errors are what limits inversion of
matrices, etc.

If you take differences or use products and quotients, you're
performing floating-point operations. We all know that such
operations are inexact. The discussion is about stored values, not
operations.

You seem to be claiming that the inaccuracies in a floating-point
calculation (which we both agree exist) are somehow magically but
undetectably retained when the result of the calculation is stored.

They aren't.

Keith Thompson · May 8, 2009

CBFalconer said:
Exactly. All you know is that the content of x represents a value
in the range x*(1+FLT_EPSILON) through x*(1-FLT_EPSILON).

Yes! The stored float value represents *a* real value in the range.
It doesn't represent the range.

To find out *which* value you merely have to examine it.

(The range you've given is not correct in general, but we can leave
that aside.)

Keith Thompson · May 8, 2009

CBFalconer said:
Keith Thompson wrote: [...]

However floating-point types do have adjacent values. For example,
1.0 and 1.0+DBL_EPSILON are adjacent values of type double; there
are no double values between them.

True again. Now all I want is for people to keep that (and the
consequences) in mind.

That's exactly what I've been doing. You should try it.

A stored value of type double cannot represent the real value
1.0+DBL_EPSILON/2.0, any more than a stored value of type int can
represent the value 1.5.

The computation 1.0/3.0 may yield any value within some
implementation-defined range. When the result of the computation is
stored in a double object, any range information is discarded; only a
single exact value (not exactly one third, but *some* exact value) is
stored.

Keith Thompson · May 8, 2009

Joe Wright said:
Keith said:

CBFalconer said:

Keith Thompson wrote:
Try thinking about it. For example, if I write:

float x = 1.0/3.0;

you are claiming the value stored is 1/3.
[...]

I don't believe anybody made such a claim. (It's possible only if
FLT_RADIX is a multiple of 3.)
Then you can't claim you know what is stored there.

[...]

I can know what is stored in x by examining the value of x. On my
system, it's 0.3333333432674407958984375. On some other system, it
may be slightly different.

Given the above declaration, the C standard doesn't tell me exactly
*which* exact floating-point value is stored in x; that's
implementation-defined, and even the implementation's definition might
not be enough to determine it. But it does tell me that *some* exact
floating-point value is stored in x, since (barring trap
representations) that's the only thing that possibly *can* be stored
in x.

Richard - You print 1/3. with "%.25f" apparently.

Richard? Who's Richard?

0.3333333432674407958984375
3.33333343e-01

I use "%.8e" which gives all the precision available in a float. I
contend the extra 16 digits don't contribute to the float value and
may confuse the casual reader.

I think I used something like "%.32f" and manually deleted the
trailing zeros.

I'm aware that 0.3333333432674407958984375 uses more digits than are
necessary to uniquely identify the stored value, and I wouldn't use so
many digits in most contexts. But in the context of this discussion,
the point is that the stored value is an exact value.

The stored value is close to one third, but it isn't one third.

The stored value is close to 3.33333343e-01, but it isn't 3.33333343e-01 .

The stored value *is* 0.3333333432674407958984375 (on my system).

(Incidentally, "%.8e" isn't sufficiently precise if FLT_DIG is
unusually large, though that's not relevant for the examples we're
discussing here.)

CBFalconer · May 8, 2009

Keith said:
.... snip ...

If you take differences or use products and quotients, you're
performing floating-point operations. We all know that such
operations are inexact. The discussion is about stored values,
not operations.

You seem to be claiming that the inaccuracies in a floating-point
calculation (which we both agree exist) are somehow magically but
undetectably retained when the result of the calculation is stored.

That depends on the uses to which the value is put. However there
is ALWAYS the intrinsic uncertainty imposed by the 'range', i.e.
via the EPSILONs.

I thought I was getting somewhere in explaining the facts. No
matter what any document says, the physics and mathematics of the
situation take precedence. I seem to have failed again to put that
across.

CBFalconer · May 8, 2009

Keith said:
.... snip ...

Yes! The stored float value represents *a* real value in the
range. It doesn't represent the range.

However your *a* means any of the values in that range. There is
no reason to prefer one over another[1]. Thus 'the range' is a
shorthand means of specifying that.

[1] barring strict analysis of the definition coding.

CBFalconer · May 8, 2009

Keith said:
.... snip ...

A stored value of type double cannot represent the real value
1.0+DBL_EPSILON/2.0, any more than a stored value of type int
can represent the value 1.5.

So what? doubles are used to store reals, more or less. ints are
used to store integers. 1.0+DBL_EPSILON/2.0 is a real[1]. 1.5 is
NOT an integer.

[1] but not a storable real, in a double.

Keith Thompson · May 8, 2009

Joe Wright said:
float x = 0.3333333432674407958984375;
float y = 3.33333343e-01;

The stored values in x and y are identical.

(x == y) is 1

Agreed (given certain reasonable assumptions).

My point, however, is that the float value stored in x corresponds to
the mathematical real value 0.3333333432674407958984375 . The
mathematical value 0.333333343 (notation tweaked for consistency)
cannot even be represented as a float value, though of course a close
approximation of it can.

Since the topic of this discussion is exact stored values, an
approximation, even one that's sufficiently close for most purposes,
would not have been sufficient to make my point.

Sorry I called you Richard. No offense intended.

No problem.

Keith Thompson · May 8, 2009

CBFalconer said:
That depends on the uses to which the value is put. However there
is ALWAYS the intrinsic uncertainty imposed by the 'range', i.e.
via the EPSILONs.

I thought I was getting somewhere in explaining the facts. No
matter what any document says, the physics and mathematics of the
situation take precedence. I seem to have failed again to put that
across.

Physics has nothing to do with this. Again, I'm not talking about
physical measurements.

I'm talking about the C standard, which you have yet to cite in any
way that directly supports your claims.

Keith Thompson · May 8, 2009

CBFalconer said:
Keith said:

... snip ...

Yes! The stored float value represents *a* real value in the
range. It doesn't represent the range.

However your *a* means any of the values in that range. There is
no reason to prefer one over another[1]. Thus 'the range' is a
shorthand means of specifying that.

[1] barring strict analysis of the definition coding.

Please don't tell me what I mean.

I mean a single real value (that happens to be within some
range).

RTFS.

Keith Thompson · May 8, 2009

CBFalconer said:
Keith said:

... snip ...

A stored value of type double cannot represent the real value
1.0+DBL_EPSILON/2.0, any more than a stored value of type int
can represent the value 1.5.

So what? doubles are used to store reals, more or less. ints are
used to store integers. 1.0+DBL_EPSILON/2.0 is a real[1]. 1.5 is
NOT an integer.

[1] but not a storable real, in a double.

It's that "more or less" that bites you, isn't it? Your model makes
some sense if you ignore those pesky details where it falls apart.

ints can only store a small subset of the set of integers. No
integers outside that subset can be stored in an int object. It
happens that the set of int values completely covers a range of
mathematical integers; that's one difference between ints as a subset
of integers and doubles as a subset of reals.

doubles can only store a small subset of the set of reals. No value
that's not in that subset can be stored in a double object.

Have you read C99 5.2.4.2.2 paragraphs 1-2? (In PDF or hard copy;
plain text doesn't show the formula.) Do you understand what it
means?

Please do not post a followup to this article without quoting and
responding to the previous paragraph, starting with "Have you read".

Keith Thompson · May 8, 2009

Joe Wright said:
Boo. 'float x = 1/3.;' to assign a value to the float x. If I print
the value with %.25f I get your wide number. If I print with '%.9f' I
get my narrower one. There is no difference in the value of either
expression when stored in a float.

Yes, when stored in a float. I'm talking about the relationship
between stored float values and mathematical real values.

That 0.333333343 cannot even be
represented as a float value is wrong.

No, as I said, a close approximation of 0.333333343 can be stored as a
float value, but the exact mathematical value 0.333333343 (i.e., the
rational number 333333343 / 1000000000) cannot (unless FLT_RADIX is a
power of ten).

Again, my approximation is every bit as good as yours for float. In
fact mine is closer to 1/3. than yours.

Am I really being that unclear?

The point is that my "approximation" *isn't an approximation*. It is
the exact mathematical value represented by the floating-point value
stored in x. In most contexts, the distinction is probably
unimportant. In the context of this discussion, the distinction is
central to the point I've been trying to make.

To put it another way, the stored value 0.3333333432674407958984375 is
exactly equal to the rational number 11184811 / 33554432 where the
denominator is 2**25. In a binary-radix floating-point system, each
representable number (ignoring infinities and NaNs) is exactly equal
to some rational number where the denominator is a power of 2.

The approximation 0.333333343, though it's perfectly fine in most
contexts, doesn't demonstrate this.

Flash Gordon · May 9, 2009

Keith said:
CBFalconer said:

Keith Thompson wrote:
... snip ...

A stored value of type double cannot represent the real value
1.0+DBL_EPSILON/2.0, any more than a stored value of type int
can represent the value 1.5.

So what? doubles are used to store reals, more or less. ints are
used to store integers. 1.0+DBL_EPSILON/2.0 is a real[1]. 1.5 is
NOT an integer.

[1] but not a storable real, in a double.

It's that "more or less" that bites you, isn't it? Your model makes
some sense if you ignore those pesky details where it falls apart.

<snip>

In addition doubles *are* used to store exact integral values and
integer types *are* used to store approximations. Any claim that you
have stored a range when you have in fact stored the exact value that
you intended to store is clearly wrong and would make error analysis
impossible.

Have you read C99 5.2.4.2.2 paragraphs 1-2? (In PDF or hard copy;
plain text doesn't show the formula.) Do you understand what it
means?

The model tells you whether you can store the exact values you would
like to store (when that is what you want to do).

Please do not post a followup to this article without quoting and
responding to the previous paragraph, starting with "Have you read".

Yes, it would be helpful if CBF did not ignore most of the points being
made.

Phil Carmody · May 9, 2009

Mark McIntyre said:
On 07/05/09 22:36, Richard Heathfield wrote:

The value stored there is the result of dividing 1.0 by 3.0

Oh, I see - sophistry.

Nope, argumentation. Of the factually correct variety.

I think you'll find that (barring weird compiler bugs), you are
guaranteed to have a value in the range suggested by CBF.

Almost certainly not, as he's the least reliable source of
information on this topic that c.l.c currently has.

Your comment
isn't incompatible with that, but the word "No" at the start, and the
capitalisation of "know" are misleading and somewhat
disingenuous.

Richard's absolute ('No') is absolutely correct, and not misleading
in any way. If you think you've been misled, then that's because
you're on the wrong path, and attempting to head further down the
wrong path - Richard's trying to pull you onto the correct path.

Phil

Phil Carmody · May 9, 2009

Joe Wright said:
Boo. 'float x = 1/3.;' to assign a value to the float x. If I print
the value with %.25f I get your wide number. If I print with '%.9f' I
get my narrower one. There is no difference in the value of either
expression when stored in a float. That 0.333333343 cannot even be
represented as a float value is wrong.

Again, my approximation is every bit as good as yours for float. In
fact mine is closer to 1/3. than yours.

You are so completely missing the point, it's painful to watch.

The values of (normalised) floating point variables are defined to
be real numbers. Keith provided the exact value, as a real number,
in order to specify the variable's value. That was the only truly
correct thing that he could do.

All you did was provide an expression for a different real number
which when converted to floating point type results in the same value
as Keith's (given fair presumptions about the FPU in use). Your value
is no more special than an infinitude of others which also have that
property. Only Keith's had the unique property of exact equality in
the reals.

Phil

BartC · May 9, 2009

Richard Heathfield said:
CBFalconer said:

Um, the value that was stored was stored. This isn't hard.

You don't need that. If you want to know the value that was stored,
all you have to do is look at it.

How? That's not so easy with floating point values.

Ben Bacarisse · May 9, 2009

BartC said:
How? That's not so easy with floating point values.

The %a format works well for this. To get the exact value you also need
to know FLT_RADIX and FLT_MANT_DIG (or DBL_MANT_DIG).

Beej Jorgensen · May 9, 2009

CBFalconer said:

Right there is our fundamental disagreement. FP objects can only
store discrete values, with defined intervals between the available
values. Thus you can never tell which value in that range was
stored,

Is the interval actually defined for anything other than 1.0?

Richard Heathfield said:
Um, the value that was stored was stored. This isn't hard.

"You see this? This is this." CBF means the value that was attempted
to be stored, not the value that was stored.

I think the ultimate answer is in c99 6.3.1.4p2, 6.3.1.5p2, and
probably elsewhere:

# If the value being converted is in the range of values that can be
# represented but cannot be represented exactly, the result is either
# the nearest higher or nearest lower representable value, chosen in an
# implementation-defined manner.

The Standard, as such, is defining a relationship between an
unrepresentable subspace and its neighboring representable values:

    s       t
A|-----|B|-----|C

It says if a conversion ends up in unrepresentable subspace s, the value
stored is A or B. If a conversion ends up in subspace t, the value
stored cannot be A--it must be B or C.

So if I tell you the stored value is EXACTLY B, you really can't tell me
what mathematical value led to that result, but you can tell me with
certainty that it fell somewhere between A and C. This is my read on
what CBF is saying.

Yet the value ultimately stored is EXACTLY B. This is my read of what
everyone else is saying. I don't think CBF actually disagrees with
this.

What's really weird is that the argument persists.

-Beej
