Comparisons of incompatible types

J

John Nagle

Or a list that contains unhashable objects.

If you can't hash it, and it doesn't have some definition of
comparison associated with the object, you probably can't order
it properly, either.

"<" can't be some random function. For sorting to work,

a < b and b < c implies a < c

must hold.

John Nagle
 
S

Steven D'Aprano

As a sometime numerical person, I've been screaming at this from
the other side. The problem with comparing non-signalling NaNs is that
eventually, the program has to make a control flow decision, and it may
not make it correctly.

Then use signalling NANs. Nobody is suggesting that quiet NANs should be
compulsory, or are the solution for all problems. But they're a solution
for some problems, which is why people use them.

[...]
I personally think that comparing NaN with numbers or other
NaNs should raise an exception. There's no valid result for such
comparisons.

If NAN and 1 are unordered, then NAN is not less or equal to 1, nor is it
larger than 1. Hence both NAN <= 1 and NAN >= 1 are false. The problem
only comes when the caller mistakenly thinks that floats are real
numbers, and tries to reason about floats like they would reason about
real numbers.
 
G

geremy condra

  As a sometime numerical person, I've been screaming at this from
the other side.   The problem with comparing non-signalling NaNs is that
eventually, the program has to make a control flow decision, and it
may not make it correctly.

  I used to do dynamic simulation engines for animation.  I was
probably the first person to get ragdoll physics to work right,
back in 1996-1997.  In hard collisions, the program would get
floating point overflows, and I had to abort the interation, back
up, cut the time step down, and go forward again, until the time
step was small enough to allow stable integration.  This was
under Windows on x86, where it's possible, in a Windows-dependent
way, to catch signalling NaNs and turn the hardware exception into
a C++ exception.  If the computation just plowed ahead with
non-signalling NaNs, with a check at the end, it could go wrong
and produce bad results, because incorrect branches would be taken
and the final bogus results might not contain NaNs.

  I personally think that comparing NaN with numbers or other
NaNs should raise an exception.  There's no valid result for
such comparisons.

This, in big letters.

Geremy Condra
 
J

John Nagle

Here's an example where this issue produces invalid results in Python.
>>> NaN = float("nan")
>>> arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
>>> sorted(arr)
[0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0,
6.0, nan, 4.0, nan, 6.0, nan]

The sorted numerical values aren't in order. Note the 4.0 near the
end, after the 6.0. "sort" has failed because it assumes that
a < b and b < c implies a < c. But that's not a valid assumption here.

It's not good to break trichotomy.

John Nagle
 
S

Steven D'Aprano

Here's an example where this issue produces invalid results in
Python.
NaN = float("nan")
arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
sorted(arr)
[0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0,
6.0, nan, 4.0, nan, 6.0, nan]

The sorted numerical values aren't in order. Note the 4.0 near the end,
after the 6.0. "sort" has failed because it assumes that a < b and b <
c implies a < c. But that's not a valid assumption here.

It's not good to break trichotomy.

It's perfectly fine to break trichotomy. People do it all the time --
they prefer chicken to pizza, pizza to steak, but prefer steak to
chicken. (Modulo whatever foods you prefer. Or sports, or politicians, or
any one of many things which don't make up an equivalence relationship.)
Equivalence relationships make up only a tiny portion of relationships,
and it is both useful and conventional to use the same operators in both
situations.

What's not good is to assume trichotomy when it doesn't apply, but that's
no different from any other faulty assumption, e.g. a coder who assumes
multiplication is always commutative may be puzzled why his matrix
calculations are frequently wrong.

Python's sort assumes trichotomy, even when sorting floats. Perhaps it
shouldn't. But one way or another, that's an issue with sort, not with
the idea that there are data types where trichotomy doesn't apply. And
like it or not, floats are one of those data types, just as neither
commutativity nor associativity applies to floats -- even excluding NANs
and INFs.
 
T

Terry Reedy

Here's an example where this issue produces invalid results in
Python.
NaN = float("nan")
arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0,
3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
sorted(arr)
[0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0,
6.0, nan, 4.0, nan, 6.0, nan]

The sorted numerical values aren't in order. Note the 4.0 near the end,
after the 6.0. "sort" has failed because it assumes that a< b and b<
c implies a< c. But that's not a valid assumption here.

This is transitivity.

I believe that is that exactly one of <,=.> are true.
 
M

Mark Wooding

John Nagle said:
NaN = float("nan")
arr = [1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0, 1.0, 4.0, 3.0, 2.0, 5.0, NaN, 6.0, 3.0, NaN, 0.0]
sorted(arr)
[0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0, nan, 5.0, 6.0,
nan, 4.0, nan, 6.0, nan]

The sorted numerical values aren't in order.

Indeed. You failed to provide a valid ordering to `sorted'. By failing
to satisfy its precondition, you relieved it of its obligation to
satisfy its postcondition.
"sort" has failed because it assumes that a < b and b < c implies a <
c. But that's not a valid assumption here.

It's not good to break trichotomy.

You're confused. The property

a < b and b < c => a < c

is called `transitivity'. But the `float' ordering /is/ transitive.
Note that a < b implies that neither a nor b is a NaN.

Also, trichotomy is unnecessary for sorting, and Python's `sort' method
doesn't require it; it does require a total preorder, though. What
properties does a total preorder require?

1. Transitivity: a <= b and b <= c => a <= c

2. Totality: a <= b or b <= a

The above list sorting goes wrong because the `float' ordering isn't
total. In particular, x </= NaN and NaN </= x for all x (including x =
NaN!).

So, your last remark is in the right direction (though strict trichotomy
is unnecessary, as I've mentioned), but apparently only by accident
since the preceding discussion is completely wrong.

-- [mdw]
 
J

John Nagle

I believe that is that exactly one of <,=.> are true.

Not for NaNs.
False

That's IEEE 754 compliant. NaN is not equal to NaN.
That's by design. But it really messes up sorting.

Python "dict" types, however, treat "NaN" as a unique value,
because they're hash based on the underlying representation.

(The best coverage of this whole topic was the Apple Numerics Manual
for the original Mac. Apple hired the floating point expert from
Berkeley to get this right. Then Apple went from the M68xxx series
to the IBM PowerPC, and 80-bit floats to 64-bit floats, breaking all
the engineering applications, most of which were never ported to the
PowerPC.)

John Nagle
 
S

Steven D'Aprano

You're confused. The property

a < b and b < c => a < c

is called `transitivity'.

Yes, but I believe that John is referring to the trichotomy (like a
dichotomy, only there are three options instead of two) that exactly one
of these relations are true:

a < b
a == b
a > b

It is true for real numbers, but not for IEEE floats, since none of the
three are true if either a or b are a NAN. (Non-IEEE floats could do
anything...)


[...]
2. Totality: a <= b or b <= a

The above list sorting goes wrong because the `float' ordering isn't
total. In particular, x </= NaN and NaN </= x for all x (including x =
NaN!).

I believe this is equivalent to trichotomy.

So, your last remark is in the right direction (though strict trichotomy
is unnecessary, as I've mentioned), but apparently only by accident
since the preceding discussion is completely wrong.

"Completely" is surely a tad strong -- John might not be using the exact
same terminology as you, but he's clearly talking about the equivalent
concepts. He wants and expects all data types to either meet a total
order, or raise an exception to any of the < > <= and >= operators.
 
S

Steven D'Aprano

(The best coverage of this whole topic was the Apple Numerics Manual
for the original Mac. Apple hired the floating point expert from
Berkeley to get this right. Then Apple went from the M68xxx series to
the IBM PowerPC, and 80-bit floats to 64-bit floats, breaking all the
engineering applications, most of which were never ported to the
PowerPC.)

I second John's recommendation re the Apple Numerics Manual. Even if the
Apple specific stuff is obsolete, it's a great resource for understanding
floating point issues.
 
M

Mark Wooding

Steven D'Aprano said:
Yes, but I believe that John is referring to the trichotomy (like a
dichotomy,

Then why did he say `it assumes that a < b and b < c implies a < c'?
This assumption is transitivity, not trichotomy.
I believe this is equivalent to trichotomy.

No, it isn't. In particular, the definition of totality I gave above
allows a <= b and b <= a and a /= b. You gave a perfectly good
definition of trichotomy, which I have moved out of its original order:
exactly one of these relations are true:

a < b
a == b
a > b

A total preorder (as defined above) doesn't exhibit this property -- but
can be described in terms of a total order (which /does/ exhibit proper
trichotomy) applied to a set of equivalence classes.
"Completely" is surely a tad strong -- John might not be using the
exact same terminology as you, but he's clearly talking about the
equivalent concepts. He wants and expects all data types to either
meet a total order, or raise an exception to any of the < > <= and >=
operators.

No. He was hopelessly confused, describing the problem in terms of a
failure of transitivity (which doesn't, in fact, fail), but using the
word `trichotomy', apparently more by luck than judgement.

I don't want to insist on a total order. Efficient sorting requires a
total preorder, and that's all I want.

-- [mdw]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top