Float comparison

CBFalconer

Keith said:
.... snip ...


To be clear, a and b are PRN objects, and A and B are real values,
yes? And those real values are not exactly representable as PRNs
(there are no PRN values whose corresponding real values exactly
match A and B). Therefore it is simply not possible to store A
in a, or B in b. Only a PRN value can be stored in a PRN object.

What you can do, of course, is store a *close approximation* of
a given real value in a PRN object. Thus we can store the PRN
value ap in the PRN object a, and the PRN value bp in the PRN
object b. Each of these PRN values has a unique corresponding
real value; call them apr and bpr if you like.

I think the introduction of PRNs has simply complicated the
argument. You are busy denying that fp values represent a range,
while I insist that they do. We also have a problem with what to
call an FP value - one option is the number in the FP object (which
is exact), and another is what numbers that thing can represent,
which is a range of values. We hang the 'value' stamp on both.

I am always considering the fp object in isolation - i.e. there is
no examination of the code that created a value to store in it.

I maintain that your PRN description omits some of the fundamentals
of fp descriptions, and that the result is unusable. This is tied
into the actual 'ranges' implied in fp values, in that no real
value can belong to more than one fp 'range'. An fp 'range' is the
range associated with an fp exact value.

I am clear in my own mind about what I am talking about. I may not
be expressing it in a suitable manner to impress others.

Consider one process of generating a fp object. One operation is
inputting integral values from a text file, such as stdin. Another
is assigning those results to fields, by examining spaces, decimal
points, etc. in the input. We pass that mess to some routine, that
creates the fp value. We may do this several times, and let fp
operations combine the fp values to form a final fp value. We
can't define all this exactly without a detailed description of the
fp system. When we are done, we have the final fp value. Now the
question is 'what does this represent'.

I maintain the best possible answer is that it represents some
value in the fp 'range' associated with that fp value. It may be
worse, but it can't be better known.

....
 
BartC

Ike Naar said:
Just out of curiosity I tried that on my desktop (x86/OpenBSD/gcc+glibc).
The floating-point type used is float.

fpav = 1.5
fpav is exactly representable as a float, with bit pattern 0x3fc00000

FLT_EPSILON = 0.00000011920928955078125

The range for fpav, the way you define it,
is fpav*(1-FLT_EPSILON) through fpav*(1+FLT_EPSILON)
is 1.5 * 0.99999988079071044921875 through 1.5 * 1.00000011920928955078125
is 1.499999821186065673828125 through 1.500000178813934326171875

This "range" has an unexpected feature:

prev(fpav), the previous representable float smaller than fpav,
has bit pattern 0x3fbfffff and value 1.49999988079071044921875

next(fpav), the next representable float larger than fpav,
has bit pattern 0x3fc00001 and value 1.50000011920928955078125

It seems this increment (EPSILON?) is the difference corresponding to
+0x00000001.

However this is going to be the same for all values 0x3f800000 to 0x3fffffff
(1.0 to just under 2.0).

When the value hits 2.0, then it will double to 0.000000238..., staying
constant until 4.0 then doubling again, and so on.

This means the 'range' of floating point values is modulated in a peculiar
way (increasing in steps while values increase linearly), and is not very
easy to work out.

Assuming that anyone would want to, floating point arithmetic has enough
problems already without worrying about this particular nonsense.
 
CBFalconer

Keith said:
.... snip ...


Yes, I am looking at the value of the fp object. Are you now
acknowledging that there is such a thing, and that it's a real
value, not a range?

If that will satisfy you, yes. But that value is useless, in that
it tells us nothing. What it does is give a name to the 'range'
represented by that fp object. Note that that 'range' includes
that fp value.
Your concept of "what the value represents" is not something
that exists in a C program or is defined by the C standard.


Right. To know what the value represents, all you have to do is
make something up.

No, all we have to do is think about it. The results of the
thoughts may or may not be useful in any particular case, but they
remain true. Claiming the dog's tail is a leg doesn't work.

....
 
CBFalconer

Keith said:
.... snip ...

Was that intended as a correction? Your statement is true, but
it's a very different statement from the one Flash made. Flash's
statement is also true.

Between two consecutive floating-point numbers, there are
infinitely many real values. Between two consecutive floating-
point numbers, there are no other floating-point numbers (by
definition of the word "consecutive").

I was simply trying to clear up the confused use of 'floating point
number'.
 
CBFalconer

Ike said:
Just out of curiosity I tried that on my desktop
(x86/OpenBSD/gcc+glibc). The floating-point type used is float.

fpav = 1.5
fpav is exactly representable as a float, with bit pattern 0x3fc00000

FLT_EPSILON = 0.00000011920928955078125

The range for fpav, the way you define it,
is fpav*(1-FLT_EPSILON) through fpav*(1+FLT_EPSILON)
is 1.5 * 0.99999988079071044921875 through 1.5 * 1.00000011920928955078125
is 1.499999821186065673828125 through 1.500000178813934326171875

If I didn't mention that I was not worrying about equalities in the
limit, I should have. That higher value for fpav range is actually
outside the range. The lower value is also outside the range. You
can see this by examining the hex representation. i.e. 1.5 is
0x3fc00000. The surrounding values are exactly 0x3fbfffff and
0x3fc00001. This hex usage eliminates the (possible) confusion
involved in translating to and from decimal.

I am guessing at the exact representation on your system from your
published hex values.
This "range" has an unexpected feature:

prev(fpav), the previous representable float smaller than fpav,
has bit pattern 0x3fbfffff and value 1.49999988079071044921875

next(fpav), the next representable float larger than fpav,
has bit pattern 0x3fc00001 and value 1.50000011920928955078125

That means that prev(fpav) and next(fpav) are well inside fpav's range!
fpav's range, as defined by you, contains three representable floats:
prev(fpav), fpav itself, and next(fpav).

This means that my definition was flawed. As you can see, the flaw
is not very large. You will find other confusions by looking at
the range of 1.0, because the hex representation will change as the
value becomes smaller than 1.0.
 
Keith Thompson

CBFalconer said:
I think the introduction of PRNs has simply complicated the
argument. You are busy denying that fp values represent a range,
while I insist that they do. We also have a problem with what to
call an FP value - one option is the number in the FP object (which
is exact), and another is what numbers that thing can represent,
which is a range of values. We hang the 'value' stamp on both.

The introduction of PRNs was intended to simplify the argument, by
demonstrating that it's perfectly possible to define a number-like
entity where each value represents a unique real number, not a range,
and that such an entity would be useful. If you refuse to accept that
possibility, there's nothing I can do about it.

And no, *we* don't hang the 'value' stamp on both the single value and
the range; you're the only one doing that.
I am always considering the fp object in isolation - i.e. there is
no examination of the code that created a value to store in it.

So am I, so we might as well stop mentioning the code that created the
value.
I maintain that your PRN description omits some of the fundamentals
of fp descriptions, and that the result is unusable. This is tied
into the actual 'ranges' implied in fp values, in that no real
value can belong to more than one fp 'range'. An fp 'range' is the
range associated with an fp exact value.

This is the claim you keep making. You seem to think it's obvious.
I say it's wrong.

The C standard agrees with me. C99 5.2.4.2.2p2 *clearly* says that a
floating-point number *is* a unique real number. How can you ignore
that?

And again, how are PRNs unusable? Please provide a concrete example.
(In your last attempt to do so, you claimed that ">" couldn't be
defined consistently for PRNs; you claimed this before and after I
demonstrated how it could.)
I am clear in my own mind about what I am talking about. I may not
be expressing it in a suitable manner to impress others.

I'd be impressed if you could base your claims on the C standard. You
obviously can't.
Consider one process of generating a fp object.
[snip]

No. My concern is only with the meaning of an fp value, regardless of
how it was generated.

If an int object i contains the value 42, it doesn't matter whether
it was generated via
int i = 42;
or
int i = 6 * 7;
or
int i = 429 / 10;
The meaning of the stored value 42 is independent of how it was
generated.

If a double object x contains the value 42.0, it doesn't matter how it
was generated. The meaning of 42.0 is independent of how it was
generated.

And that meaning isn't what you think it is.
When we are done, we have the final fp value. Now the
question is 'what does this represent'.

That question is answered by 5.2.4.2.2, in a manner that is simply
inconsistent with your claims.
I maintain the best possible answer is that it represents some
value in the fp 'range' associated with that fp value. It may be
worse, but it can't be better known.

A stored value of 1.0 means the real number 1.0. That's better (more
precise) than this range you keep insisting on.
 
CBFalconer

Keith said:
.... snip ...

If an int object i contains the value 42, it doesn't matter
whether it was generated via
int i = 42;
or
int i = 6 * 7;
or
int i = 429 / 10;
The meaning of the stored value 42 is independent of how it was
generated.

If a double object x contains the value 42.0, it doesn't matter
how it was generated. The meaning of 42.0 is independent of how
it was generated.

No, it isn't. To quote the standard again:

5.2.4.2.2:

... snip ...

[#10] The values given in the following list shall be
replaced by implementation-defined constant expressions with
(positive) values that are less than or equal to those
shown:

-- the difference between 1 and the least value greater
than 1 that is representable in the given floating
point type, b1-p

FLT_EPSILON 1E-5
DBL_EPSILON 1E-9
LDBL_EPSILON 1E-9

and your double object can be representing anything from
42*(1-DBL_EPSILON) through 42*(1+DBL_EPSILON). (typical). Without
detailed examination of the code you can't tell anything more. The
standard specifies the makeup of a fp value in terms of
significand, sign, and exponent. That forces all the 'ranges' to
consist of 'touching' real values, and ensures that no real value
can belong to more than one 'range'. That is the magic that
enables the sensible use of <, >, <=, >= conditions, but not ==.

Yes, I am creating another critical name, 'touching'.

Note that a 'range' consists of real values, and any interval of
real values contains an infinite number of values. There is only
one value of the fp object, so the odds of its accuracy are
negligible. This has nothing to do with the C standard, but with
mathematics.

There is continuous confusion here from confusing the value
represented by the fp object, and the number stored in the fp
object. They are probably different.

....
 
Keith Thompson

CBFalconer said:
Keith said:
... snip ...

If an int object i contains the value 42, it doesn't matter
whether it was generated via
int i = 42;
or
int i = 6 * 7;
or
int i = 429 / 10;
The meaning of the stored value 42 is independent of how it was
generated.

If a double object x contains the value 42.0, it doesn't matter
how it was generated. The meaning of 42.0 is independent of how
it was generated.

No, it isn't. To quote the standard again:

5.2.4.2.2:

... snip ...

[#10] The values given in the following list shall be
replaced by implementation-defined constant expressions with
(positive) values that are less than or equal to those
shown:

-- the difference between 1 and the least value greater
than 1 that is representable in the given floating
point type, b1-p

Incidentally, that "b1-p" is actually "b**(1-p)". The standard uses a
superscript to denote exponentiation. For some purposes, a plain-text
copy of (a draft of) the standard really isn't adequate.
FLT_EPSILON 1E-5
DBL_EPSILON 1E-9
LDBL_EPSILON 1E-9

Yes, yes, we all know what the *_EPSILON constants mean -- and it's
not what you seem to think they mean. FLT_EPSILON has a very specific
meaning, and you've just quoted it.
FLT_EPSILON == nextafter(1.0, 2.0) - 1.0
(See C99 7.12.11.3 for the nextafter() function.)

You've seized on these constants and assumed that they have some deep
significance in the meaning of all floating-point values. They really
aren't that fundamental. They're just the difference between 1.0 and
the smallest floating-point number greater than 1.0.

1.0 is representable in type double. 1.0+DBL_EPSILON is representable
in type double. There are no values of type double between those two
values.

double x = 1.0;
double y = 1.0 + DBL_EPSILON;

There really is a gap between the values of x and y, and x and y do
not have "ranges" that fill in that gap. An object of type double can
only represent one of a relatively few distinct values. Real values
between those distinct values are simply not covered. (Perhaps that's
the idea that you're unable to accept.)

In fact, the *_EPSILON values can be derived from the model of
floating-point numbers given in the first two paragraphs of 5.2.4.2.2,
along with the values of b, emin, emax, and p. If your idea about
"ranges" can be derived from the description of *_EPSILON, you should
be able to derive it from the model.

[...]
There is continuous confusion here from confusing the value
represented by the fp object, and the number stored in the fp
object. They are probably different.

They're *probably* different?

The "value represented" and the "number stored" are the same thing.
What you're referring to as the "number stored", if I understand you
correctly, does not exist. Given:
double frac = 1.0 / 3.0;
I think you're saying that the "number stored" is the real value
one-third. That value never exists during the compilation or
execution of the C program. The value that does exist is a rational
number, a multiple of a power of 2.0 (or, more generally, of
FLT_RADIX) that is a close approximation of one-third. And that is
the value represented by frac after it's been initialized, and it's
the value you'll see if you examine the object later.

There is no magical inaccessible memory of the real value one-third
existing in or near the program. It's just not there.

And if the "number stored" is something other than that, then what
exactly does C99 6.2.4p2 ("... and retains its last-stored value
throughout its lifetime.") mean?
 
Keith Thompson

William Pursell said:
There are many different reasonable interpretations of the
value stored. One reasonable interpretation is that the stored
value is a representative of an interval in the reals. However,
that is NOT the only reasonable interpretation, and it is
certainly not an interpretation that is mandated by the
C standard.

It is, in fact, an interpretation that is flatly contradicted by the C
standard.
 
Ike Naar

This means that my definition was flawed. As you can see, the flaw
is not very large.

In this particular case, your range was 1.5 times too wide.
For numbers slightly smaller than a power of 2, your ranges are almost
two times too wide.
You will find other confusions by looking at
the range of 1.0, because the hex representation will change as the
value becomes smaller than 1.0.

For me this whole "range" idea makes little sense.
Suppose you manage to provide a proper definition for the range,
what can you do with it that you can't do without?
 
Flash Gordon

Keith said:
There was nothing confused about it.

I agree completely with what Keith has stated in this sub-thread. I
believe I was entirely consistent about using real to refer to the set
in maths known as reals and floating point to refer to numbers in a
floating point number system (i.e. one where there is a finite limit on
the number of digits and a method of specifying the position of the
"point" (which could be a decimal point, a binary point, or a point in
any other number base) relative to the position of those digits
within a finite range of positions). I.e., I used "floating point number"
to refer to what is commonly called "floating point number" and "real"
for what is commonly called "real".

Keith's paraphrase of what I said is accurate. It only mentions the
special case (where I gave the general), but that special case is the
important one. Nothing important was lost in the rephrasing and the
terminology was unchanged.
 
Flash Gordon

CBFalconer wrote:

and your double object can be representing anything from
42*(1-DBL_EPSILON) through 42*(1+DBL_EPSILON). (typical). Without
detailed examination of the code you can't tell anything more. The
standard specifies the makeup of a fp value in terms of
significand, sign, and exponent. That forces all the 'ranges' to
consist of 'touching' real values, and ensures that no real value
can belong to more than one 'range'. That is the magic that
enables the sensible use of <, >, <=, >= conditions, but not ==.

Yes, I am creating another critical name, 'touching'.

Given two numbers x and y whose values are "touching ranges" with y
being the next greater range than x then in your model we have:

x represents xmin to xmax
y represents ymin to ymax
x < y
xmax equal to ymin (there can be no intervening real numbers since if
there were any there would be an infinite number)

Now suppose there is a sequence of characters which is an accurate
finite representation of the mathematical value of xmax (and therefore
also of ymin).

Now suppose we have a C fragment...

double var1 = mathematically_exact_representation_of_xmax_and_ymin;

Will x or y be stored in var1?
Remember that FLT_ROUNDS could be -1.

Now suppose we have some C code that mathematically should generate the
same value and stores it in var2. Which range will be stored there? How
can you tell without examining the rest of the code to see whether
fesetround() has been called?

As far as I can see this means that the ranges are not well defined.
Being not well defined your model cannot tell us whether var1==var2
should yield 0 or 1 since there is a real value that could be in either
or both ranges.
Note that a 'range' consists of real values, and any interval of
real values contains an infinite number of values. There is only
one value of the fp object, so the odds of its accuracy are
negligible. This has nothing to do with the C standard, but with
mathematics.

It is also inaccurate. There are lots of lines of C in the world of the
form:
double total = 0.0;
For each of those lines the odds of accuracy are as near to 100% as
makes no odds (there could be a bug in the compiler, so the probability
is not quite 100%).
There is also a lot of code which is carefully designed so that the
calculations always yield numbers which can be represented exactly
(doing integer arithmetic in a double to get extended range using the
fact that on the implementations of interest the range of integers
exactly representable in a double is greater than the range of long).
There is continuous confusion here from confusing the value
represented by the fp object, and the number stored in the fp
object. They are probably different.

The only person showing confusion between the number stored in an fp
object and anything else is you.

Oh, and I know that I have thought about it (sufficiently that I had an
interesting discussion else-thread about groups where points you have
not addressed were raised), and I'm sure Keith and most of the others
who have given good reasons for you being wrong have thought about it.
So your claim else-thread that what you are claiming is obvious "if you
think about it" (that might not be your exact words) is evidently wrong.
 
Spiros Bousbouras

They're *probably* different?

The "value represented" and the "number stored" are the same thing.
What you're referring to as the "number stored", if I understand you
correctly, does not exist. Given:
double frac = 1.0 / 3.0;
I think you're saying that the "number stored" is the real value
one-third.

I think he's saying the opposite, that the number represented
is 1/3 while the value stored is whatever you can read back.
 
CBFalconer

Keith said:
CBFalconer said:
Keith said:
... snip ...

If an int object i contains the value 42, it doesn't matter
whether it was generated via
int i = 42;
or
int i = 6 * 7;
or
int i = 429 / 10;
The meaning of the stored value 42 is independent of how it was
generated.

If a double object x contains the value 42.0, it doesn't matter
how it was generated. The meaning of 42.0 is independent of how
it was generated.

No, it isn't. To quote the standard again:

5.2.4.2.2:

... snip ...

[#10] The values given in the following list shall be
replaced by implementation-defined constant expressions with
(positive) values that are less than or equal to those
shown:

-- the difference between 1 and the least value greater
than 1 that is representable in the given floating
point type, b1-p

FLT_EPSILON 1E-5
DBL_EPSILON 1E-9
LDBL_EPSILON 1E-9

Yes, yes, we all know what the *_EPSILON constants mean -- and
it's not what you seem to think they mean. FLT_EPSILON has a
very specific meaning, and you've just quoted it.
FLT_EPSILON == nextafter(1.0, 2.0) - 1.0
(See C99 7.12.11.3 for the nextafter() function.)

No, you misunderstand the fundamentals. With the normal fp
implementation, the value 1+EPSILON will not be represented as
1.0. nextafter(1.0, 0.5) will be the first value less than 1.0 not
represented as 1.0. (I was not aware the nextafter function
existed - it makes explanations easier.) That 'range' that is
represented by 1.0 is the fundamental characteristic of the fp
value 1.0. A proper implementation of nextafter will handle the
funny results due to rounding policies, etc.
You've seized on these constants and assumed that they have some
deep significance in the meaning of all floating-point values.
They really aren't that fundamental. They're just the difference
between 1.0 and the smallest floating-point number greater than
1.0.

1.0 is representable in type double. 1.0+DBL_EPSILON is representable
in type double. There are no values of type double between those two
values.

double x = 1.0;
double y = 1.0 + DBL_EPSILON;

But you miss the fact that all values greater than 1.0 and less
than 1.0+EPSILON (I am dropping the DBL - it is just a typing
nuisance) are _represented_ by the value 1.0. If you go through
the same process to find the first value less than 1.0 that has a
separate value, you again have a portion of the 'range', i.e. that
portion that is less than 1.0. No matter what you do, using that
fp system, you cannot represent any real value in that 'range' as
anything other than 1.0. The necessary consequence is that when
you read 1.0 you don't know what it represents, other than
something in the 'range'.
There really is a gap between the values of x and y, and x and y do
not have "ranges" that fill in that gap. An object of type double can
only represent one of a relatively few distinct values. Real values
between those distinct values are simply not covered. (Perhaps that's
the idea that you're unable to accept.)

In fact, the *_EPSILON values can be derived from the model of
floating-point numbers given in the first two paragraphs of 5.2.4.2.2,
along with the values of b, emin, emax, and p. If your idea about
"ranges" can be derived from the description of *_EPSILON, you should
be able to derive it from the model.

And we can, easily. Look at the hex representation of 1.0. Find
the significand portion. Increment that by one least significant
bit. You have just formed nextafter(1.0, 2.0).
[...]
There is continuous confusion here from confusing the value
represented by the fp object, and the number stored in the fp
object. They are probably different.

They're *probably* different?

Yes, because there are an infinite number of reals in the 'range',
and only one object value. Ignoring the programming that set the
value, why should you prefer one result to another?
The "value represented" and the "number stored" are the same thing.
What you're referring to as the "number stored", if I understand you
correctly, does not exist. Given:
double frac = 1.0 / 3.0;
I think you're saying that the "number stored" is the real value
one-third. That value never exists during the compilation or
execution of the C program. The value that does exist is a rational
number, a multiple of a power of 2.0 (or, more generally, of
FLT_RADIX) that is a close approximation of one-third. And that is
the value represented by frac after it's been initialized, and it's
the value you'll see if you examine the object later.

No, you are imposing knowledge of the programming. I am looking
_solely_ at what is stored in the fp object. It can equally well
represent any value in the 'range'. Since there are an infinity of
such values, and only one so-called object value, the probability
that they are identical is extremely small.

For example, assume we are talking about the float type. We do
some calculations in double, and store the result in a float. I
know that the EPSILON for double is smaller than that for float (in
general). From the float I can deduce a range of possible values
for the original double. I cannot deduce the actual original
value. I need know nothing (such as DBL_EPSILON) about the double
to do this. I do need to know FLT_EPSILON.
 
CBFalconer

Flash said:
CBFalconer wrote:



Given two numbers x and y whose values are "touching ranges" with
y being the next greater range than x then in your model we have:

x represents xmin to xmax
y represents ymin to ymax
x < y
xmax equal to ymin (there can be no intervening real numbers since
if there were any there would be an infinite number)

However x represents >xmin to <xmax
and y represents >ymin to <ymax

eliminating the equal condition, and thus the annoying missing
reals.

....
 
CBFalconer

Flash said:
I agree completely with what Keith has stated in this sub-thread.
I believe I was entirely consistent about using real to refer to
the set in maths known as reals and floating point to refer to
numbers in a floating point number system (i.e. one where there
is a finite limit on the number of digits and a method of
specifying the position of the "point" (which could be a decimal
point, a binary point, or a point in any other number base)
relative to the position of those digits within a finite
range of positions). I.e., I used "floating point number" to refer
to what is commonly called "floating point number" and "real" for
what is commonly called "real".

Keith's paraphrase of what I said is accurate. It only mentions
the special case (where I gave the general), but that special
case is the important one. Nothing important was lost in the
rephrasing and the terminology was unchanged.

I am now completely confused about what this particular argument is
about. :) I see nothing objectionable in the quoted portion
above.
 
CBFalconer

Ike said:
In this particular case, your range was 1.5 times too wide. For
numbers slightly smaller than a power of 2, your ranges are
almost two times too wide.

Not so. Carefully reread the following paragraph in my reply.

---------------
| If I didn't mention that I was not worrying about equalities in
| the limit, I should have. That higher value for fpav range is
| actually outside the range. The lower value is also outside the
| range. You can see this by examining the hex representation. i.e.
| 1.5 is 0x3fc00000. The surrounding values are exactly 0x3fbfffff
| and 0x3fc00001. This hex usage eliminates the (possible)
| confusion involved in translating to and from decimal.
|
| I am guessing at the exact representation on your system from
| your published hex values.
---------------

If you look at things carefully you will see that the EPSILON
involved does not change for fp values from greater than 1.0 to
less than 2.0. The formulas stated will take care of this by the
way they round. You can also get some clues from the nextafter()
function, which I didn't realize existed.
For me this whole "range" idea makes little sense. Suppose you
manage to provide a proper definition for the range, what can you
do with it that you can't do without?

You can keep track of the accuracy of your answers, and when the fp
system is or is not adequate for the job.
 
