Non-strictly Weak Sorting using STL?


James Kanze

James said:
Andy Champ wrote:
[...]
The results of floating point arithmetic are by and large
implementation defined (e.g., there is no guarantee that
addition of two floats yields a best possible approximation
to the sum of the represented numbers), but one could argue
that the standard requires addition to yield the actual sum
if the sum is representable.
I don't think it does.
The standard says in [5.7/3]:
The result of the binary + operator is the sum of the operands.
...
So, _if_ the involved floats are rational numbers (i.e. not
NAN or something like that) _and_ their sum is a float, then I
don't see where the standard grants an exception. Surely,
[5/5] doesn't apply:
If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable
values for its type, the behavior is undefined, ...
since the hypotheses are designed to avoid that case.

I'm not sure. As far as I can tell, the definition of 'sum' and
'product' for a float is implementation defined. It's certainly
not the definition which applies for real numbers, since this
would make the first statement you quote unimplementable.

The C++ standard actually says very little about the required
representation: "The value representation of floating-point
types is implementation defined." And nothing, as far as I can
see, about the meaning of addition over this value
representation. The requirements on <limits> and <climits>
introduce additional requirements concerning the representation,
but not (as far as I can see) concerning what happens when you
add two values.

The C standard is a lot more specific, imposing a classical
floating point representation (more or less---the as if rule
still applies); it specifically states (§5.2.4.2.2/4 in
C99---although I don't have access to a copy of C90 here, I'm
pretty sure that this text is unchanged):

The accuracy of the floating point operations (+, -, *,
/) and of the library functions in <math.h> and
<complex.h> that return floating-point results is
implementation defined. The implementation may state
that the accuracy is unknown.

Note that this section of the C standard is explicitly included
by reference in C++, in §18.2.2/4. Not where you'd particularly
expect to find it :), but it is there.

Personally, I'd consider any differences between C and C++ with
regards to the behavior of the basic types a defect in the C++
standard.

I also think that we can generally count on more than what the
standard guarantees. Most people today are working on machines
with IEEE floats, and IEEE does define and require a lot more.
In particular, it defines exactly what the results should be for
all of the four basic operations, for all possible values
(at least as long as they do not involve overflow or
underflow---I think an implementation is allowed to make the
behavior there programmable).

There's also the fact that other languages (e.g. Fortran) have
stricter requirements. (I remember the first Unix machine I
worked on. The people writing the Fortran compiler implemented
the Fortran intrinsics using the C math library...and found that
they didn't pass the Fortran validation suites we had. Some of
the library functions only had about four decimal digits
precision.)
True, but that only applies when the sum is _not_
representable as a float. In that case, you usually want a
best approximation and the standard does not go there.

See above. C quite explicitly allows a large degree of leeway.
As far as compatibility with the standard is concerned, I
think that a floating point format where each value is
irregular (i.e., not a real number but NAN or the like) would
be compatible with the standard. However, in that case the
if-part of my statement is never satisfied.

I think you could make a pretty strong argument (based on the
minimum requirements of §18.2 and §5.2.4.2.2 in the C standard)
that a float must be capable of representing at least 150
million different numeric values: FLT_DIG must be at least 6
(one million different values for a given exponent), and the
difference between FLT_MAX_EXP and FLT_MIN_EXP implies at least
150 different values for the exponent.
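
For what it's worth, here is a quick sketch one could compile to see
what a given implementation actually provides (it merely reports the
implementation defined values from <cfloat>; it doesn't prove the
minimum requirements above):

    #include <cfloat>
    #include <iostream>

    int main()
    {
        // Report this implementation's float characteristics.  The
        // standard only guarantees minimums (e.g. FLT_DIG >= 6); the
        // actual values are implementation defined.
        std::cout << "FLT_DIG        = " << FLT_DIG        << '\n';
        std::cout << "FLT_MIN_EXP    = " << FLT_MIN_EXP    << '\n';
        std::cout << "FLT_MAX_EXP    = " << FLT_MAX_EXP    << '\n';
        std::cout << "FLT_MIN_10_EXP = " << FLT_MIN_10_EXP << '\n';
        std::cout << "FLT_MAX_10_EXP = " << FLT_MAX_10_EXP << '\n';
        return 0;
    }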
 

James Kanze

James Kanze wrote:
I did indeed mean that. And you know it.

Yes. I intentionally "misunderstood" to draw attention to the
fact that there are different arithmetics. It so happens that in
C and C++, the expression 1/3 is precisely defined to result in
exactly 0. And the value of 1.0/3.0 is implementation defined,
but it is highly unlikely that it will correspond to the
rational number 1/3. (According to the C standard, the results
of 1.0/3.0 may even be "unknown". I don't know of any
implementation which goes quite that far, however.)
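
For concreteness, the two divisions side by side (the floating point
result in the comment assumes a typical IEEE double implementation;
the standard leaves it implementation defined):

    #include <iostream>

    int main()
    {
        std::cout.precision(17);
        std::cout << 1 / 3     << '\n';  // integer division: exactly 0
        std::cout << 1.0 / 3.0 << '\n';  // e.g. 0.33333333333333331
        return 0;
    }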
To answer you and Kai:
Unlike numbers of infinite accuracy, two different ways of
calculating a sum that should give the same answer may give
different results.
Back to my original numbers: (1/3)*2 (no apology for omitting
the .0 in any of these) is not likely to be the same as
1-(1/3);

Exactly. That's because you're not dealing with real numbers,
or rational numbers, but floating point numbers or integers.
With different definitions for the four basic operators than
real numbers or rationals.
the former is likely to be 0.66666666 (or something
of that ilk), and the latter 0.66666667. Neither of these is
a precise representation of a division of the value 1 by the
value 3.

The precise result of 2/3 in integer arithmetic is 0. IEEE also
defines a precise result for IEEE float and IEEE double. The
results are precise. They're not the same as they would be for
rationals or reals, but that's exactly my point.
I could rework that example in binary, but I think you can
understand the principle of my point.
I also understand that these values will be consistent on any
given architecture.

Interestingly enough, the standard doesn't require it. And in
practice, on an Intel architecture, they may actually vary in
certain contexts.

FWIW: I think we're disagreeing partially on vocabulary, or
perhaps more on the philosophical issue of how to view the question.
It's certainly true that the most frequent use of floating point
is to represent values which would normally be thought of as
real numbers in the real world. My point is that floating point
arithmetic isn't real arithmetic; the two don't obey the same
rules. I'm pretty sure you agree with that. My point is also,
however, that floating point arithmetic does precisely obey a
set of rules; they may be different than those of real numbers,
but they are just as precise, in their own way, as those for
real numbers. And experts in the field use these rules to
obtain very good results---when their floating point result is
reconverted into a real number (by the person reading the output
of the program), that real number is very close to the actual
value. I think it very important, however, to stress that
floating point doesn't obey the rules of real arithmetic,
because thinking it does will lead to too many errors. And the
problem isn't just one of "accuracy"---in floating point
arithmetic (a+b)+c != a+(b+c), for example.
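
To make that concrete, a minimal sketch (assuming IEEE double
arithmetic, which the C++ standard does not require) in which the two
groupings give different answers:

    #include <iostream>

    int main()
    {
        double a = 1.0e16;
        double b = -1.0e16;
        double c = 1.0;

        // With IEEE doubles, (a + b) + c yields 1.0, but a + (b + c)
        // yields 0.0, because b + c rounds back to -1.0e16.
        std::cout << ((a + b) + c) << '\n';
        std::cout << (a + (b + c)) << '\n';
        return 0;
    }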
 

James Kanze

This is a bit of a philosophical conundrum, because it comes
down to the definitions of the words like "actual,"
"specific," "number," and "value."

Well, mathematicians have a definition for "number". Or at
least, I think they do---said definition must cover natural
numbers, rational numbers, real numbers, complex numbers, and a
lot of other numbers as well. Floating point numbers included.
An IEEE floating point representation has only so many bits.
Sometimes, there would be no additional information in those
bits, anyway. In this case, we can think of the fp number as
corresponding to a single point on the real number line. For
example, we may literally specify 0.0, or 0.5, with no loss in
precision.
Other times, meaningful digits are lost by the encoding into
IEEE fp representation. This effectively introduces noise.
[1] In this case, the fp number corresponds to a range of
values, and a contiguous region on the real number line. For
example, pi isn't really equal to
11.00100100001111110110101010001000100001011010001100001, but
is within some epsilon of that value.

There is no number pi in floating point arithmetic. Nor in
integral arithmetic, nor in rational arithmetic, for that
matter.
11.00100100001111110110101010001000100001011010001100001 (I'm
assuming you are using binary notation here---otherwise, your
reference to pi doesn't make sense) is an exact number, however.
It's also a number in the set of values over which IEEE floating
point arithmetic is defined (unlike pi).
As an engineering undergraduate, I was taught to represent
noise as a margin of error (MoE), usually represented as a
percent of the original value. Keeping track of the MoE in a
result is straightforward, if you know the MoE of your
inputs. Fortunately, IEEE specifies exactly the epsilon
(maximum amount of noise) introduced by fp encoding, so
precise calculations certainly are possible, in the sense that
the MoE is bounded and calculable. [2] The problem we now
have in industry is that some folks just do not seem willing
to keep track of accumulated noise values, so they just
blindly hope the noise is not more than some apparently large
value, e.g. 0.1%.

The problem is that you can't view floating point arithmetic as
real arithmetic with "noise". (The abstraction for arithmetic
with noise is interval arithmetic.) Real arithmetic has a
certain number of properties which don't hold for floating point
arithmetic. For anything but the simplest calculations, you
have to take those properties into account. Thus, for example,
an equation which converges in real arithmetic may not converge
in floating point arithmetic, but rather end up oscillating
between several values. On the other hand, experts in the field
do write equations which provably converge in floating point
arithmetic. Those equations are different from the ones used in
real arithmetic, however.
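
A much simpler example of the same kind of failure (not the
oscillation described above, but the same underlying issue): over the
reals, the loop below would terminate after exactly ten additions;
with binary floating point (IEEE double assumed) it never would, so
it has to be cut off artificially.

    #include <iostream>

    int main()
    {
        double x = 0.0;
        int n = 0;
        // 0.1 is not exactly representable in binary floating point,
        // so x never compares equal to 1.0; the bound keeps the
        // sketch finite.
        while (x != 1.0 && n < 20) {
            x += 0.1;
            ++n;
        }
        std::cout.precision(17);
        std::cout << "n = " << n << ", x = " << x << '\n';
        return 0;
    }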
Any two arbitrary fp numbers a and b, with positive epsilons e
and f respectively, really represent the ranges [a-e,a+e] and
[b-f,b+f].

That's interval arithmetic. That's not what a floating point
number means. A floating point number has an exact value;
there's no + or - about it.
If we perform an operation on them, the resulting region is
not hard to calculate; e.g., for multiplication, just
distribute the terms:
[a-e,a+e] * [b-f,b+f] = [ab-af-be-ef,ab+af+be+ef]
As you can see, the result is ab +- (af+be+ef).

I'm not sure that always holds for floating point
multiplication, and I'm certain that it doesn't hold for
floating point addition. (Assuming you're thinking in terms of
real values.) In fact, given two precise values a and b, in the
set of floating point numbers, a*b will result in a precise
value (supposing no overflow, of course) in the set of floating
point values. That value may not be the same as you would get
if you multiplied two real values, but that's exactly my point.
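
A small sketch of that point (assuming IEEE single and double
precision): the float product below is an exact, well defined float,
but it is not the real-number product of its two operands.  The
double computation happens to give that real product exactly here,
since the 48 bit product of two 24 bit significands fits in a double.

    #include <iostream>

    int main()
    {
        float a = 0.1f;   // an exact float value, though not exactly 1/10
        float b = 0.1f;

        float  p = a * b;                       // real product rounded to float
        double q = static_cast<double>(a) * b;  // the real product, exact in this case

        std::cout.precision(17);
        std::cout << "float  product: " << p << '\n';  // the two lines differ
        std::cout << "double product: " << q << '\n';
        return 0;
    }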
 

Kai-Uwe Bux

James said:
James said:
Andy Champ wrote:
[...]
The results of floating point arithmetic are by and large
implementation defined (e.g., there is no guarantee that
addition of two floats yields a best possible approximation
to the sum of the represented numbers), but one could argue
that the standard requires addition to yield the actual sum
if the sum is representable.
I don't think it does.
The standard says in [5.7/3]:
The result of the binary + operator is the sum of the operands.
...
So, _if_ the involved floats are rational numbers (i.e. not
NAN or something like that) _and_ their sum is a float, then I
don't see where the standard grants an exception. Surely,
[5/5] doesn't apply:
If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable
values for its type, the behavior is undefined, ...
since the hypotheses are designed to avoid that case.

I'm not sure. As far as I can tell, the definition of 'sum' and
'product' for a float is implementation defined. It's certainly
not the definition which applies for real numbers,

As far as I can see, the standard does not contain a definition of "sum". In
[5/5] it refers to some expressions whose value could be "mathematically
defined", which indicates that it imports the notion of "sum" from there
for those types whose values (in part) include mathematical entities.
since this would make the first statement you quote unimplementable.

Why? The sum of floats representing real numbers is a float representing
their sum provided that float exists (i.e., the sum is representable).
Otherwise [5/5] applies and the result is undefined. I don't see why this
kind of addition would be "unimplementable".

I guess the difference between our interpretations is the following: you think
of the values of floating point types as something where mathematical notions
do not apply so that "sum" has an implementation defined meaning. I think
of floating point types so that at least some values represent real numbers
so that "sum" has its ordinary mathematical meaning (at least in these
cases).

To illustrate the differences, let us first consider [5/5] if the
mathematical sum is not representable. According to your interpretation the
result is implementation defined, according to my interpretation, the
result is undefined. That is an outcome in favor of your interpretation.
However, I think, your interpretation cannot allow for floating point
overflow, i.e., the result of "+" will _always_ be implementation defined,
never undefined, since you interpret "sum" in such a way that [5/5] can
never apply.

Also, _if_ floating point values can never be treated like real numbers,
then it doesn't stop with operator+. You get the same for operator<, and
there wouldn't be a requirement at all that you have any kind of order.

The C++ standard actually says very little about the required
representation: "The value representation of floating-point
types is implementation defined." And nothing, as far as I can
see, about the meaning of addition over this value
representation. The requirements on <limits> and <climits>
introduce additional requirements concerning the representation,
but not (as far as I can see) concerning what happens when you
add two values.

True. Note that I only made a conditional statement: "_if_ two floats
represent real numbers whose sum is representable as a float, then ...". At
that stage of the interpretation, it could very well be that the hypothesis
is never satisfied.

However, I think, the requirements in <limits> and <climits> are phrased in
such a way that they imply that at least some values of floating point
types _are_ real numbers. I could be wrong here, but I think that is the
intent.
The C standard is a lot more specific, imposing a classical
floating point representation (more or less---the as if rule
still applies); it specifically states (§5.2.4.2.2/4 in
C99---although I don't have access to a copy of C90 here, I'm
pretty sure that this text is unchanged):

The accuracy of the floating point operations (+, -, *,
/) and of the library functions in <math.h> and
<complex.h> that return floating-point results is
implementation defined. The implementation may state
that the accuracy is unknown.

Note that this section of the C standard is explicitly included
by reference in C++, in §18.2.2/4. Not where you'd particularly
expect to find it :), but it is there.

Personally, I'd consider any differences between C and C++ with
regards to the behavior of the basic types a defect in the C++
standard.

That's a very good point. I will have to think about it.

Does it say anything about <, ==, ... ?


[snip]


Best

Kai-Uwe Bux
 

Kai-Uwe Bux

James said:
Well, mathematicians have a definition for "number". Or at
least, I think they do---said definition must cover natural
numbers, rational numbers, real numbers, complex numbers, and a
lot of other numbers as well. Floating point numbers included.

That's one reason why mathematicians don't :)

What mathematicians have is

(a) definitions for various algebraic structures, such as fields, domains,
rings, groups, ...

(b) constructions for the set of natural numbers, integers, rationals, real
numbers, complex numbers, ...

One should note that, taking these constructions literally, the natural
numbers are _not_ a subset of the integers, the integers are _not_ a subset
of the rationals, etc. So, to make intuition work, one also constructs 1-1
maps

naturals numbers --> integers
integers --> rationals
...

that _preserve the rules of arithmetic_, i.e., 0 (natural number) goes to 0
(integer) and addition is preserved.


Then mathematicians can note that the set of integers is a domain and the
sets of rationals, reals, and complex numbers are fields, ...

Also, there are different constructions, say, for the reals. And these
constructions yield different but canonically isomorphic fields.


A unified notion of "number" was tried and rejected. (BTW: There is a
hilarious piece by Frege: "Ueber die Zahlen des Herrn Schubert", which does
away with one such attempt.) That does not mean that such a notion cannot
be found, it just means that (a) no one has found it yet and (b) no one is
actively searching.


Best

Kai-Uwe Bux
 

Christopher Dearlove

Kai-Uwe Bux said:
Then mathematicians can note that the set of integers is a domain and the
sets of rationals, reals, and complex numbers are fields, ...

One difference between floating point numbers and real or rational
numbers is that floating point numbers do not form a field. Even aside
from overflow/underflow issues, the basic operations of addition and
multiplication are not associative, i.e. (a+b)+c is not always equal to
a+(b+c), and the distributive rule can fail.

That doesn't mean that a well-defined set of floating point numbers
(with NaNs, infinities, or whatever you need) doesn't form a useful
mathematical structure, it's just not one of the usual ones found in most
abstract algebra texts. For detailed numerical analysis you probably
need to know about how much the associative etc. rules can fail by.
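
For illustration, a sketch of the distributive rule failing (IEEE
doubles assumed; the particular values are just one example, but
similar cases exist on any binary floating point format):

    #include <iostream>

    int main()
    {
        double a = 100.0, b = 0.1, c = 0.2;

        std::cout.precision(17);
        // With IEEE doubles the two lines print different values
        // (roughly 30.000000000000004 versus 30).
        std::cout << (a * (b + c))   << '\n';
        std::cout << (a * b + a * c) << '\n';
        return 0;
    }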
 

Andrew Koenig

Actually, I do have a relevant point. He said "Specifically, I wish to
sort these objects by a double field unless they are equal". I wish to
point out that "equal" is not a reliable measure for floating point
numbers.

I am having trouble understanding what meaning you are giving to the word
"reliable" here in order to make this sentence true.
 
