When is min(a, b) != min(b, a)?

Steven D'Aprano

> That doesn't follow. The problem is not that x < nan returns False
> because that is correct since x isn't smaller than nan.

Comparisons between things which are not comparable risk being terribly
misleading, and depend very much on how you define "less than" and
"greater than". If you insist that everything must have a boolean yes/no
answer ("Does the colour red have a better chance of becoming President
than a kick to the head?") then False is not an entirely unreasonable
result to return.

But if you consider that having "x is not smaller than y" be equivalent
to "x is greater than or equal to y" is more important than forcing a
boolean answer in the first place, then you need something to signal
Undefined, and an exception is the right solution unless you have a
multi-valued logic system (True, False, Maybe, Undefined, ...).

SANE (Standard Apple Numerics Environment) explicitly states that it
signals an exception when doing ordered comparisons against NaNs because
to return False would be misleading. Apple went on to use the same rule
in their PowerPC Numerics. That's straight out of the Zen: Practicality
Beats Purity.
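
As a minimal sketch of the question in the thread title (assuming CPython's built-in min(), which keeps its current candidate unless a later argument compares strictly less), the following shows how a quiet NaN makes the result depend on argument order. The variable names are only illustrative.

```python
import math

nan = float("nan")

# Every ordered comparison against a quiet NaN is False in Python,
# so "not (x < y)" does not let you conclude "x >= y".
less = 1.0 < nan
greater_equal = 1.0 >= nan

# min() only replaces its candidate when a later argument is strictly
# less, so the NaN survives or disappears depending on its position.
first = min(1.0, nan)    # nan < 1.0 is False, so 1.0 is kept
second = min(nan, 1.0)   # 1.0 < nan is False, so nan is kept

print(less, greater_equal)
print(first, second, math.isnan(second))
```

No exception is raised anywhere here; the NaN is silently kept or dropped.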
 
Steven D'Aprano

> I disagree with your last sentence. We are dealing with quiet NaNs
> which should not raise exceptions. x < nan is well defined in IEEE, it
> is false.

I question that. The IEEE standard states that comparisons involving NaNs
are unordered and should signal INVALID. What that means at the high
level of x < NAN (etc.) isn't clear.

I'm having a lot of trouble finding the canonical IEEE-754 standard, so
I'm forced to judge by implementations and third party accounts. For
example, this is what IBM says:

http://publib.boulder.ibm.com/infocenter/lnxpcomp/v8v101/index.jsp?topic=/com.ibm.xlf101l.doc/xlfopg/fpieee.htm

> The IEEE standard defines several exception conditions that can occur:
>
> ....
>
> INVALID
> Operations are performed on values for which the results are not defined.
> These include:
> Operations on signaling NaN values
> ...
> Comparisons involving NaN values
[end quote]


Note that *only* sNaNs signal invalid on arithmetic operations, but
*both* types of NaNs signal invalid on comparisons.

The same page also lists a table showing the result of such signals. I
won't reproduce the whole table, but it states that the INVALID signal
results in a NaN if exceptions are disabled, and no result (naturally) if
exceptions are enabled.

SANE (Standard Apple Numerics Environment) and Apple's PowerPC Numerics
also do the same. See for example:
http://developer.apple.com/documentation/mac/PPCNumerics/PPCNumerics-37.html

> ....when x or y is a NaN, x < y being false might tempt you to conclude
> that x >= y, so PowerPC Numerics signals invalid to help you avoid the
> pitfall.
[end quote]

Regardless of deep philosophical questions about truth, that's a great
example of Practicality Beats Purity. And some of us think that raising
an exception would not only be more practical, but also more pure as well.
 
Mark Dickinson

> I'm having a lot of trouble finding the canonical IEEE-754 standard, so
> I'm forced to judge by implementations and third party accounts. For
> example, this is what IBM says:

There's a recent draft of IEEE 754r (the upcoming revision to IEEE
754) at

http://www.validlab.com/754R/drafts/archive/2007-10-05.pdf

Sections 5.11 and 5.3.1 deal with comparison predicates and the maxnum/
minnum operations, respectively.

One of the problems here is that Python's == operator is used for two
different purposes: first, for numeric comparisons (where in some
sense it's semi-reasonable for NaN == NaN to return False, or raise an
exception), and second, for structural operations like checking set or
dict membership. Most of the time there's not too much conflict here,
but while I fully expect 2 == Decimal('2.0') to return True, it still
occasionally feels funny to me that I can't have Decimal("2.0") and
the integer 2 be different keys in a dict: certainly they're
numerically equal, but they're still fundamentally different objects,
dammit!

Any change to Python that made == and != checks involving NaNs raise
an exception would have to consider the consequences for set, dict,
list membership testing.
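
A small sketch of the two roles of == described above, using only the standard decimal module. The dictionary name `table` is made up for the illustration.

```python
from decimal import Decimal

two_int = 2
two_dec = Decimal("2.0")

# Role 1: numeric comparison. The values are numerically equal.
numerically_equal = (two_int == two_dec)

# Role 2: structural lookup. Equal objects must hash equal, so the
# two objects cannot be separate keys in one dict.
same_hash = hash(two_int) == hash(two_dec)

table = {}
table[two_int] = "int"
table[two_dec] = "decimal"   # replaces the value stored under 2

print(numerically_equal, same_hash)
print(len(table), table[2])
```

The second assignment overwrites the first entry rather than adding a new key, which is the behaviour being grumbled about.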

Mark


and if Python had separate operators for these two purposes it
wouldn't be Python any more.


Mark
 
Antoon Pardon

> Comparisons between things which are not comparable risk being terribly
> misleading, and depend very much on how you define "less than" and
> "greater than". If you insist that everything must have a boolean yes/no
> answer ("Does the colour red have a better chance of becoming President
> than a kick to the head?") then False is not an entirely unreasonable
> result to return.
>
> But if you consider that having "x is not smaller than y" be equivalent
> to "x is greater than or equal to y" is more important than forcing a
> boolean answer in the first place, then you need something to signal
> Undefined, and an exception is the right solution unless you have a
> multi-valued logic system (True, False, Maybe, Undefined, ...)

Why should we consider that? The world is full of partial orders. In
python we have sets and a partial order is perfectly mathematically
sound.

> SANE (Standard Apple Numerics Environment) explicitly states that it
> signals an exception when doing ordered comparisons against NaNs because
> to return False would be misleading. Apple went on to use the same rule
> in their PowerPC Numerics. That's straight out of the Zen: Practicality
> Beats Purity.

What is misleading and what is not depends on the context. One could
argue that if comparisons against NaN should signal an exception, then
the exception comes late and should already have been signaled at the
moment the NaN was produced, because producing a NaN is already
misleading. It gives the impression you have a meaningful result
until you later try to do something with it that is illegal for a
NaN but legal for a float.

If you allow NaNs as a result then you have extended the floats to
a mathematical set that is now only partially ordered. I see nothing
wrong in having the comparison operators give the result that belongs
to such a partial order.
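
To make the partial-order analogy concrete, here is a short sketch comparing Python sets (ordered by inclusion) with floats that include a NaN. It only illustrates that both cases answer False in both directions; it does not settle which behaviour is preferable.

```python
a = {1, 2}
b = {2, 3}

# Neither set contains the other, so both ordered comparisons are
# False. For sets, "<" means "is a proper subset of".
set_lt = a < b
set_ge = a >= b

nan = float("nan")

# A NaN is likewise incomparable with every float.
nan_lt = 1.0 < nan
nan_ge = 1.0 >= nan

print(set_lt, set_ge)
print(nan_lt, nan_ge)
```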
 
Pete Forman

Mark Dickinson said:
> Any change to Python that made == and != checks involving NaNs raise
> an exception would have to consider the consequences for set, dict,
> list membership testing.


> and if Python had separate operators for these two purposes it
> wouldn't be Python any more.

There are separate Python operators, "==" and "is".

The C99 standard, which Python defers to for its implementation, says
in 6.2.6.1.4: Two values (other than NaNs) with the same object
representation compare equal, but values that compare equal may have
different object representations.
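
The quoted C99 wording can be mirrored from Python as a rough sketch. Note that this inspects Python floats through the struct module rather than C objects, and `bits` is a hypothetical helper introduced only for this example.

```python
import struct


def bits(x):
    """Return the 8-byte big-endian IEEE 754 encoding of a float."""
    return struct.pack(">d", x)


pos_zero, neg_zero = 0.0, -0.0

# Equal values, different representations (the sign bit differs).
zeros_equal = pos_zero == neg_zero
zeros_same_bits = bits(pos_zero) == bits(neg_zero)

nan = float("nan")

# Same representation, yet the value does not compare equal to itself.
nan_same_bits = bits(nan) == bits(nan)
nan_equal = nan == nan

print(zeros_equal, zeros_same_bits)
print(nan_same_bits, nan_equal)
```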

In 7.12.13, the fmax and fmin functions treat NaNs as missing
arguments. Most other operations return NaN if an argument is NaN, or
for a domain error.

7.12.14 specifies comparison macros that are quiet versions of the
relational operators.

BTW floating-point exceptions in C and IEEE are not the same as
exceptions in higher level languages. The behavior of signalling NaNs
is not defined in C. Only quiet NaNs are returned from operations.
An invalid floating-point exception may well just set a status flag.
That may be tested after a set of calculations. With pipelining the
exact cause of the exception will be unknown.
 
Marc 'BlackJack' Rintsch

> Mark Dickinson said:
>> Any change to Python that made == and != checks involving NaNs raise
>> an exception would have to consider the consequences for set, dict,
>> list membership testing.
>> […]
>> and if Python had separate operators for these two purposes it
>> wouldn't be Python any more.
>
> There are separate Python operators, "==" and "is".

So what? ``==`` is used for both ``==`` and set/dict/list membership
testing and "nested comparison" of those structures. There is no ``is``
involved.

Ciao,
Marc 'BlackJack' Rintsch
 
Steven D'Aprano

> Why should we consider that?

It's a value judgement. Ask the Apple and IBM engineers and
mathematicians.

Personally, I think it is more useful to be able to assume that if x is
not less than y, it must be greater than or equal to y instead
("completeness"), than it is to have a guarantee that x < y will never
raise an exception.

Having said that, I think the opposite holds for sorting and calculating
the min() and max() of floats. Sorting should push the NaNs to one end of
the list (I don't care which) while min() and max() should ignore NaNs
and only raise an exception if all the arguments are NaNs.
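
The behaviour proposed in this paragraph could be sketched as follows. `nanmin`, `nanmax` and `sort_nans_last` are hypothetical helper names, not built-ins, and the choice to push NaNs to the end rather than the start is arbitrary.

```python
import math


def _without_nans(values):
    """Drop NaNs; raise if nothing is left to compare."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        raise ValueError("need at least one non-NaN value")
    return kept


def nanmin(values):
    """Minimum that ignores NaNs instead of depending on their position."""
    return min(_without_nans(values))


def nanmax(values):
    """Maximum that ignores NaNs instead of depending on their position."""
    return max(_without_nans(values))


def sort_nans_last(values):
    """Sort floats ascending with all NaNs pushed to the end."""
    # The key never compares a NaN: NaNs map to (True, 0.0),
    # ordinary values map to (False, value).
    return sorted(
        values,
        key=lambda v: (math.isnan(v), 0.0 if math.isnan(v) else v),
    )


nan = float("nan")
data = [3.0, nan, 1.0, nan, 2.0]

print(nanmin(data), nanmax(data))
print(sort_nans_last(data))
```

With these helpers the result no longer depends on where the NaNs sit in the input, and an all-NaN input fails loudly with ValueError.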


> The world is full of partial orders. In
> python we have sets and a partial order is perfectly mathematically
> sound.

Sure, we could define floats to have any sort of order we want. We could
define them to be ordered by their English spelling so that "five million
point three" would be less than "zero point one". But is it useful?

Putting aside sorting and max/min, what is the use-case for having
comparisons with NaN succeed? What benefit will it give you?
 
Robert Kern

Marc said:
>> Mark Dickinson said:
>>> Any change to Python that made == and != checks involving NaNs raise
>>> an exception would have to consider the consequences for set, dict,
>>> list membership testing.
>>> […]
>>> and if Python had separate operators for these two purposes it
>>> wouldn't be Python any more.
>>
>> There are separate Python operators, "==" and "is".
>
> So what? ``==`` is used for both ``==`` and set/dict/list membership
> testing and "nested comparison" of those structures. There is no ``is``
> involved.

Sure, there is. ``is`` is checked before ``==``. But that's just an
optimization for the usual case where "x is x" implies "x == x", which,
to bring this full circle, is not true for NaNs.


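In [48]: class A(object):
   ....:     def __init__(self, x):
   ....:         self.x = x
   ....:     def __eq__(self, other):
   ....:         # Trace every equality test so we can see when == is used.
   ....:         print('%r.__eq__(%r)' % (self, other))
   ....:         return self.x == other.x
   ....:     def __ne__(self, other):
   ....:         return not (self == other)
   ....:     def __hash__(self):
   ....:         # Trace every hash lookup made by set and dict.
   ....:         print('%r.__hash__()' % (self,))
   ....:         return hash(self.x)
   ....:     def __repr__(self):
   ....:         return 'A(%r)' % (self.x,)
   ....:
   ....: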

In [61]: a = A(1)

In [62]: b = A(2)

In [63]: c = A(1)

In [64]: a is c
Out[64]: False

In [65]: a == c
A(1).__eq__(A(1))
Out[65]: True

In [66]: a == b
A(1).__eq__(A(2))
Out[66]: False

In [67]: L = [a, b, c]

In [68]: a in L
Out[68]: True

In [69]: b in L
A(2).__eq__(A(1))
Out[69]: True

In [70]: c in L
A(1).__eq__(A(1))
Out[70]: True

In [71]: S = set([a, b, c])
A(1).__hash__()
A(2).__hash__()
A(1).__hash__()
A(1).__eq__(A(1))

In [72]: a in S
A(1).__hash__()
Out[72]: True

In [73]: b in S
A(2).__hash__()
Out[73]: True

In [74]: c in S
A(1).__hash__()
A(1).__eq__(A(1))
Out[74]: True

In [75]: D = {a: 1, b: 2, c: 1}
A(1).__hash__()
A(2).__hash__()
A(1).__hash__()
A(1).__eq__(A(1))

In [76]: D
Out[76]: A(2).__eq__(A(1))
{A(1): 1, A(2): 2}

In [77]: a in D
A(1).__hash__()
Out[77]: True

In [78]: b in D
A(2).__hash__()
Out[78]: True

In [79]: c in D
A(1).__hash__()
A(1).__eq__(A(1))
Out[79]: True
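
The same identity shortcut can be seen directly with NaNs. This is a sketch of CPython's behaviour for built-in containers, where membership succeeds if an element is the same object or compares equal. The variable names are illustrative.

```python
nan = float("nan")
other_nan = float("nan")   # a distinct NaN object

items = [nan]

# Found via the identity check, even though nan == nan is False.
same_object_found = nan in items

# A different NaN object fails both the identity and equality checks.
other_found = other_nan in items

equal_to_itself = nan == nan

# Sets and dicts apply the same identity shortcut after hashing.
in_set = nan in {nan}

print(same_object_found, other_found, equal_to_itself, in_set)
```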


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Antoon Pardon

> It's a value judgement. Ask the Apple and IBM engineers and
> mathematicians.
>
> Personally, I think it is more useful to be able to assume that if x is
> not less than y, it must be greater than or equal to y instead
> ("completeness"), than it is to have a guarantee that x < y will never
> raise an exception.
>
> Having said that, I think the opposite holds for sorting and calculating
> the min() and max() of floats. Sorting should push the NaNs to one end of
> the list (I don't care which) while min() and max() should ignore NaNs
> and only raise an exception if all the arguments are NaNs.

My personal preference would be that python would allow people the
choice, with the default being that any operation that resulted
in a non numeric result would throw an exception.

People who somehow made it clear they know how to work with inf, and
NaN results, would get silent NaN where no exceptions would be thrown.

Maybe an intermediate level like what the Apple and IBM engineers
use, could be useful too.

> Sure, we could define floats to have any sort of order we want. We could
> define them to be ordered by their English spelling so that "five million
> point three" would be less than "zero point one". But is it useful?
>
> Putting aside sorting and max/min, what is the use-case for having
> comparisons with NaN succeed? What benefit will it give you?

I guess the same benefit it gives to those that have operations with
NaN succeed. If 3 * NaN should succeed and all sorts of other stuff,
why suddenly make an exception for 3 < NaN?
 
Mark Dickinson

> My personal preference would be that python would allow people the
> choice, with the default being that any operation that resulted
> in a non numeric result would throw an exception.
>
> People who somehow made it clear they know how to work with inf, and
> NaN results, would get silent NaN where no exceptions would be thrown.

I also think this would be the ideal situation. Getting there would
require a lot of thought, planning, a PEP or two, and some hard work
and tricky coding to deal with all the different ways that the C
compiler, libm and hardware might try to mess things up. Right now
I don't have time for this :-( Anyone else?

> I guess the same benefit it gives to those that have operations with
> NaN succeed. If 3 * NaN should succeed and all sort of other stuff,
> why suddenly make an exception for 3 < NaN.

Right. This is especially true when the result of the comparison is
treated as a number (i.e. 0 or 1) rather than as a boolean,
and goes back into the calculation in some way. On the other hand,
there are certainly situations where you want a comparison with a
NaN to raise an exception. I guess this is why IEEE-754r provides
two sets of comparison operators: signaling and non-signaling.
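
Python's float type offers no signaling comparison, but the decimal module exposes both flavours, so it can serve as a rough illustration of the distinction. This sketch assumes the default context, in which InvalidOperation is trapped.

```python
from decimal import Decimal, InvalidOperation

nan = Decimal("NaN")
one = Decimal(1)

# Non-signaling comparison: a quiet NaN operand quietly yields NaN.
quiet_result = nan.compare(one)

# Signaling comparison: any NaN operand signals InvalidOperation,
# which the default context turns into a Python exception.
try:
    nan.compare_signal(one)
    signaled = False
except InvalidOperation:
    signaled = True

print(quiet_result, signaled)
```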

Mark
 
Mark Dickinson

> People who somehow made it clear they know how to work with inf, and
> NaN results, would get silent NaN where no exceptions would be thrown.

One other thing: there's an excellent starting point for considering
how things should work, in the form of the Decimal type. This does
exactly what you suggest: by default, users get Python exceptions
instead of NaNs and/or infinities, but the user can override the
defaults to switch off the various traps (Overflow, Invalid, etc.) and
work directly with NaNs and infinities if (s)he prefers.
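
A minimal sketch of that trap mechanism, using decimal.localcontext so the settings stay local to each block. The traps are set explicitly so the example does not depend on whatever context happens to be active.

```python
from decimal import Decimal, DivisionByZero, InvalidOperation, localcontext

# Traps enabled: an undefined result becomes a Python exception.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = True
    ctx.traps[DivisionByZero] = True
    try:
        Decimal(0) / Decimal(0)
        trapped = False
    except InvalidOperation:
        trapped = True

# Traps disabled: the same operations return NaN or Infinity and
# only record a flag that can be inspected afterwards.
with localcontext() as ctx:
    ctx.traps[InvalidOperation] = False
    ctx.traps[DivisionByZero] = False
    ctx.clear_flags()
    quiet_nan = Decimal(0) / Decimal(0)
    infinity = Decimal(1) / Decimal(0)
    invalid_flag = bool(ctx.flags[InvalidOperation])

print(trapped, quiet_nan, infinity, invalid_flag)
```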

Note also that decimal.py runs to over 5000 lines of (Python) code!
And that's without having to deal with the vagaries of the C-compiler/
library/OS/hardware, since everything's implemented directly in
Python. A surprisingly small amount of that code is actually 'real'
mathematics: most of it is devoted to dealing correctly with
infinities, NaNs, signed zeros, subnormals, `ideal' exponents, traps,
flags, etc.

Mark
 
Albert van der Horst

> Comparisons between things which are not comparable risk being terribly
> misleading, and depend very much on how you define "less than" and
> "greater than". If you insist that everything must have a boolean yes/no
> answer ("Does the colour red have a better chance of becoming President
> than a kick to the head?") then False is not an entirely unreasonable
> result to return.
>
> But if you consider that having "x is not smaller than y" be equivalent
> to "x is greater than or equal to y" is more important than forcing a
> boolean answer in the first place, then you need something to signal
> Undefined, and an exception is the right solution unless you have a
> multi-valued logic system (True, False, Maybe, Undefined, ...)
>
> SANE (Standard Apple Numerics Environment) explicitly states that it
> signals an exception when doing ordered comparisons against NaNs because
> to return False would be misleading. Apple went on to use the same rule
> in their PowerPC Numerics. That's straight out of the Zen: Practicality
> Beats Purity.

This is all the more so, keeping in mind that the original motivation for
NaNs is to avoid exceptions. In principle, to keep algorithms clean,
you would like to have exceptions as soon as you divide by zero or
overflow. This is extremely costly on high-efficiency floating point
hardware (you may have a pipeline stall to allow the point of
exception to be defined, even if the exception doesn't occur),
so IEEE allows the exception to propagate via a NaN and surface at
a convenient time, e.g. when the output of a matrix multiplication
is inspected.
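
A toy sketch of that deferred inspection, in plain Python. The NaN is produced with inf * 0.0 because Python raises ZeroDivisionError for float division by zero rather than returning inf or NaN. The vectors are made-up data.

```python
import math

inf = float("inf")

row = [1.0, inf, 2.0]
col = [3.0, 0.0, 4.0]

# inf * 0.0 quietly produces a NaN, which then propagates through
# the rest of the arithmetic without interrupting it.
dot = sum(r * c for r, c in zip(row, col))

# The problem is only detected when the result is inspected.
needs_attention = math.isnan(dot)

print(dot, needs_attention)
```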

Comparisons mark the moment that a decision is made.
Now masking the problem, by taking an invalid hence arbitrary
decision based on an invalid result, is insane. The NaN is
forever gone, something not allowed by IEEE only in code that

[A case can be made however that min and max just propagate a
NaN and don't throw an exception, yet. ]

So Python should throw. That is practicality *and* purity.

Groetjes Albert
 
