max(NaN,0) should be NaN

jsavard

Steven said:
> Why?

Well, because a NaN *could* be plus infinity, or a number too large to
be represented.

If one wants to *implement* NaNs *at all*, one's _reason_ for doing so
is because one wants computer arithmetic to produce accurate results -
rather than just plugging in the best representable number that fits,
and then giving a result that may not be valid.

So going to the trouble of bothering with NaNs, and then deciding that
treating them pessimistically is just too much bother in a few cases,
vitiates the entire enterprise!

John Savard
 
William Hughes

John Savard said:
> Well, because a NaN *could* be plus infinity, or a number too large to
> be represented.
>
> If one wants to *implement* NaNs *at all*, one's _reason_ for doing so
> is because one wants computer arithmetic to produce accurate results -
> rather than just plugging in the best representable number that fits,
> and then giving a result that may not be valid.
>
> So going to the trouble of bothering with NaNs, and then deciding that
> treating them pessimistically is just too much bother in a few cases,
> vitiates the entire enterprise!

And you join the list of people who are willing to state on the
basis of a few minutes thought and no research whatsoever, that the
people behind the IEEE 754 standard are idiots.


-William Hughes
 
Terje Mathisen

William said:
> And you join the list of people who are willing to state on the
> basis of a few minutes thought and no research whatsoever, that the
> people behind the IEEE 754 standard are idiots.

There is at least one good reason for the current standard behavior:

It maintains the maximum amount of information.

I.e. doing the opposite which would be to require max(...,NaN,...) to
always be NaN simply discards everything we know about the representable
numbers in the array, even in the case where the NaN simply means Not
Applicable, i.e. 'skip this value'.

What's needed in this case is a side channel which can provide the fact
that at least one of the inputs was NaN, since this is critical when
the NaN is a result of a series of previous calculations which have
blown up.
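
A minimal sketch of the side-channel idea, written in Python purely for illustration (the thread itself concerns Fortran, C99 and IEEE 754R). The name `max_with_nan_flag` and its tuple return value are hypothetical, not part of any standard or library: the reduction skips NaNs but also reports whether it saw one.

```python
import math

def max_with_nan_flag(values):
    """Return (largest non-NaN value or None, True if any input was NaN).

    Hypothetical helper: the boolean is the "side channel".
    """
    best = None
    saw_nan = False
    for v in values:
        if math.isnan(v):
            saw_nan = True          # remember the NaN instead of dropping it silently
        elif best is None or v > best:
            best = v
    return best, saw_nan

result, saw_nan = max_with_nan_flag([1.0, float("nan"), 3.0])
print(result, saw_nan)   # prints: 3.0 True
```

The caller still has to look at the flag, which is exactly the weakness raised later in the thread: a check that can be forgotten is not fail-safe.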

OTOH there is an equally good reason for requiring the opposite
behavior, i.e. max(...) is NaN if any input value is NaN: This would
obey 'the principle of least astonishment', i.e. in all other IEEE fp
operations any NaN input will propagate into the output to make sure
that this critical piece of information cannot be hidden.
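
To illustrate the contrast, here is a small Python sketch (`naive_max` is a hypothetical helper, not a standard routine). Arithmetic propagates a NaN, whereas a max built from a single comparison can lose it, and which operand survives depends on argument order because every ordered comparison with a NaN is false.

```python
import math

nan = float("nan")

def naive_max(a, b):
    # Comparison-based max: any ordered comparison involving NaN is False,
    # so the result depends on which side the NaN is on.
    return a if a > b else b

print(math.isnan(nan + 0.0))             # True: arithmetic propagates the NaN
print(naive_max(nan, 0.0))               # 0.0: the NaN is silently dropped
print(math.isnan(naive_max(0.0, nan)))   # True: same inputs, other order
```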

Terje
 
Jan Vorbrüggen

Terje said:
> What's needed in this case is a side channel which can provide the fact
> that at least one of the inputs was NaN, since this is critical when
> the NaN is a result of a series of previous calculations which have
> blown up.

I think this approach would only serve to confuse the issue.

Clearly, there are two current interpretations of the semantics of NaN:
one is that a NaN means some prior computation yielded a non-representable
and exceptional result, the other is that this data value should be ignored.
These different meanings imply different processing in certain situations,
but not in all: It appears to me that this applies in particular to all
reduction operators, with a correction to be applied wherever the count
of operands appears (e.g., for the average value reduction).

Given current facilities in Fortran, I think the reduction operators should
always obey the "exception" semantics - i.e., if ANY(ISNAN(input_data)) is
true, the result should be a NaN (perhaps define a new one?). My reason is
that the "ignore" semantics can be achieved easily by the user: instead of,
for example, writing

X = MAXVAL(input_data)

she can write

! ISNAN is a common vendor extension; IEEE_IS_NAN is the standard spelling
X = MAXVAL(input_data, MASK=.NOT. ISNAN(input_data))

Of course, if you define the operator semantics the other way, you can
get the exception semantics by writing

IF (ANY(ISNAN(input_data))) THEN
   X = some_NaN
ELSE
   X = MAXVAL(input_data)
END IF

so it perhaps boils down to frequency of use and to consistency of user
expectations.
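
For readers who want to experiment, the same two idioms can be sketched in Python. The function names are hypothetical, and returning NaN when every value has been filtered out is an assumption of this sketch rather than something proposed in the thread.

```python
import math

NAN = float("nan")

def max_ignore_nan(data):
    """'Ignore' semantics: reduce over the non-NaN values only."""
    kept = [x for x in data if not math.isnan(x)]
    # Assumption: with nothing left to reduce, report NaN.
    return max(kept) if kept else NAN

def max_exception_nan(data):
    """'Exception' semantics: any NaN in the input poisons the result."""
    if any(math.isnan(x) for x in data):
        return NAN
    return max(data)

sample = [1.0, NAN, 3.0]
print(max_ignore_nan(sample))                  # prints: 3.0
print(math.isnan(max_exception_nan(sample)))   # prints: True
```

Either function is a few lines on top of the other's building blocks, which is the point about it coming down to frequency of use and user expectations.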

The only other alternative would be to define some particular value - which
could be an IEEE 754 NaN - to signify "value not available", and to modify
the semantics of the reduction operators based on whether this value is
present or not. However, one still needs to define the behaviour of what to
do when _both_ an "exception" NaN _and_ an "ignore" NaN are present in the
input :cool:. My choice: the "exception" NaN takes precedence.

Jan
 
Nick Maclaren

|> > Steven G. Kargl wrote:
|> > > In article <[email protected]>,
|> > > (e-mail address removed) writes:
|> >
|> > > > This makes no sense, as the outcome of the operation is undefined and
|> > > > should be NaN.
|> > > > max(NaN,0.) = NaN
|> >
|> > > Why?
|> >
|> > Well, because a NaN *could* be plus infinity, or a number too large to
|> > be represented.
|> >
|> > If one wants to *implement* NaNs *at all*, one's _reason_ for doing so
|> > is because one wants computer arithmetic to produce accurate results -
|> > rather than just plugging in the best representable number that fits,
|> > and then giving a result that may not be valid.
|> >
|> > So going to the trouble of bothering with NaNs, and then deciding that
|> > treating them pessimistically is just too much bother in a few cases,
|> > vitiates the entire enterprise!
|>
|> And you join the list of people who are willing to state on the
|> basis of a few minutes thought and no research whatsoever, that the
|> people behind the IEEE 754 standard are idiots.

You are wrong on three counts:

max/min are not part of IEEE 754, and are not even in the appendix;
they are proposed as part of IEEE 754R.

He did not state that they were idiots - merely misguided - and you
have no evidence that he has done no research.

I am pretty sure that he knows of my analysis of the matter (and
document on it), where I do explain why he is correct and the IEEE 754R
people are wrong. And, in THIS respect, I believe that I have more
experience than any of the people on that group.

OK?


Regards,
Nick Maclaren.
 
Nick Maclaren

|>
|> There is at least one good reason for the current standard behavior:
|>
|> It maintains the maximum amount of information.
|>
|> I.e. doing the opposite which would be to require max(...,NaN,...) to
|> always be NaN simply discards everything we know about the representable
|> numbers in the array, even in the case where the NaN simply means Not
|> Applicable, i.e. 'skip this value'.

Sorry, Terje, but you have missed the point. By providing ways to lose
the NaN state quietly, you are rendering them useless as an error value.
50 years of experience shows that requiring programmers to check every
operation that might fail for possible failure simply does not work; if
checking isn't fail-safe, it is pointless.

As my document explains, there are numerous possible meanings of "not a
number", all of which imply different semantics. In particular, allowing
a missing value indication is one of the oldest and best, but is NOT what
IEEE 754 does and is NOT what IEEE 754R is proposing. The arguments
used for this behaviour by IEEE 754R have been copied from C99 and are
both false and known to be false.


Regards,
Nick Maclaren.
 
Nick Maclaren

|>
|> The only other alternative would be to define some particular value - which
|> could be an IEEE 754 NaN - to signify "value not available", and to modify
|> the semantics of the reduction operators based on whether this value is
|> present or not. However, one still needs to define the behaviour of what to
|> do when _both_ an "exception" NaN _and_ an "ignore" NaN are present in the
|> input :cool:. My choice: the "exception" NaN takes precedence.

Correct. And, as I have pointed out to both C99 and IEEE 754R, but have
been ignored on purely dogmatic grounds, there are NO uses where ONLY the
"missing value" semantics are wanted, IEEE 754 doesn't support them in the
first place, and the max/min operations aren't the most important anyway.
There ARE uses where only the "error value" semantics are wanted.

In statistics, which is one of the claimed uses, missing value semantics
are wanted ONLY for specific reduction operations, and the ranking of
importance is:

    Counting (i.e. number of non-NaNs)
    Summation
    Derivative operations (mean, variance etc.)  |
    Max/min                                      |  In some order
    Multiplication                               |
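
A minimal Python sketch of the first three reductions under missing-value semantics, assuming (for this sketch only) that a NaN stands in for a missing value. The names `nan_count`, `nan_sum` and `nan_mean` are hypothetical. Note that the mean divides by the corrected count, not by the length of the input.

```python
import math

def nan_count(data):
    """Number of non-NaN values in a sequence."""
    return sum(1 for x in data if not math.isnan(x))

def nan_sum(data):
    """Sum over the non-NaN values (0.0 if there are none)."""
    return math.fsum(x for x in data if not math.isnan(x))

def nan_mean(data):
    """Mean over the non-NaN values; the divisor is the corrected count."""
    n = nan_count(data)
    return nan_sum(data) / n if n else float("nan")

sample = [2.0, float("nan"), 4.0]
print(nan_count(sample), nan_sum(sample), nan_mean(sample))  # prints: 2 6.0 3.0
```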


Regards,
Nick Maclaren.
 
Jan Vorbrüggen

Nick said:
> Counting (i.e. number of non-NaNs)
> Summation
> Derivative operations (mean, variance etc.) |
> Max/min | In some order

Those are all reductions.

> Multiplication |

Do you mean scaling? If so, you probably want to cover any functional
transform of the values.

Jan
 
Nick Maclaren

|> > Counting (i.e. number of non-NaNs)
|> > Summation
|> > Derivative operations (mean, variance etc.) |
|> > Max/min | In some order
|>
|> Those are all reductions.

Precisely.

|> > Multiplication |
|>
|> Do you mean scaling? If so, you probably want to cover any functional
|> transform of the values.

Sorry, I meant product. I.e. reduction by multiplication.


The basic rules with both error and missing values are:

If any operand is erroneous, or the result is mathematically undefined,
the result is erroneous. Missing/0.0 is erroneous, for example.

Otherwise, if it is a reduction, only the non-missing values are used.

If it is not a reduction and any operand is missing, the result is
missing.

In practice, "a reduction" isn't just operations that are reductions, but
specific calls to reduction functions that allow for missing values.
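
The three rules can be sketched in Python with two distinct markers. Everything named here (`Marker`, `MISSING`, `ERROR`, `divide`, `total`) is hypothetical, and treating every division by zero as "mathematically undefined" is a simplifying assumption of the sketch.

```python
class Marker:
    """Hypothetical non-numeric value with a readable name."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

MISSING = Marker("MISSING")   # value not available
ERROR = Marker("ERROR")       # result of a failed computation

def divide(a, b):
    """A non-reduction operation."""
    if a is ERROR or b is ERROR:
        return ERROR              # rule 1: an operand is erroneous
    if b is not MISSING and b == 0.0:
        return ERROR              # rule 1: undefined result, so Missing/0.0 is erroneous
    if a is MISSING or b is MISSING:
        return MISSING            # rule 3: not a reduction, an operand is missing
    return a / b

def total(values):
    """A reduction that explicitly allows for missing values."""
    if any(v is ERROR for v in values):
        return ERROR              # rule 1 takes precedence over missing values
    return sum(v for v in values if v is not MISSING)   # rule 2

print(divide(MISSING, 0.0), divide(MISSING, 2.0))   # prints: ERROR MISSING
print(total([1.0, MISSING, 2.0]))                   # prints: 3.0
print(total([1.0, ERROR, MISSING]))                 # prints: ERROR
```

Because `total` checks for `ERROR` first, an erroneous value wins over a missing one when both are present in the input.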


Regards,
Nick Maclaren.
 
Terje Mathisen

Nick said:
|>
|> There is at least one good reason for the current standard behavior:
|>
|> It maintains the maximum amount of information.
|>
|> I.e. doing the opposite which would be to require max(...,NaN,...) to
|> always be NaN simply discards everything we know about the representable
|> numbers in the array, even in the case where the NaN simply means Not
|> Applicable, i.e. 'skip this value'.

> Sorry, Terje, but you have missed the point. By providing ways to lose
> the NaN state quietly, you are rendering them useless as an error value.

Nick!

You always complain about people who quote you selectively! In this case
you have snipped the paragraph where I agree with you, specifically that
dropping NaN information is a _very_ surprising behavior.

I.e. we're in violent agreement, I was just trying to think of a
possible use for the defined (IMHO broken) specification.

Terje
 
Nick Maclaren

|>
|> You always complain about people who quote you selectively! In this case
|> you have snipped the paragraph where I agree with you, specifically that
|> dropping NaN information is a _very_ surprising behavior.

Mea culpa. I apologise.

I did actually read what you said, and misunderstood it.


Regards,
Nick Maclaren.
 
William Hughes

Nick said:
|> > Steven G. Kargl wrote:
|> > > In article <[email protected]>,
|> > > (e-mail address removed) writes:
|> >
|> > > > This makes no sense, as the outcome of the operation is undefined and
|> > > > should be NaN.
|> > > > max(NaN,0.) = NaN
|> >
|> > > Why?
|> >
|> > Well, because a NaN *could* be plus infinity, or a number too large to
|> > be represented.
|> >
|> > If one wants to *implement* NaNs *at all*, one's _reason_ for doing so
|> > is because one wants computer arithmetic to produce accurate results -
|> > rather than just plugging in the best representable number that fits,
|> > and then giving a result that may not be valid.
|> >
|> > So going to the trouble of bothering with NaNs, and then deciding that
|> > treating them pessimistically is just too much bother in a few cases,
|> > vitiates the entire enterprise!
|>
|> And you join the list of people who are willing to state on the
|> basis of a few minutes thought and no research whatsoever, that the
|> people behind the IEEE 754 standard are idiots.

> You are wrong on three counts:
>
> max/min are not part of IEEE 754, and are not even in the appendix;
> they are proposed as part of IEEE 754R.

A quibble.

> He did not state that they were idiots - merely misguided

More quibbling. He said that the proposed approach
"vitiates the entire enterprise". This is a lot stronger than
"misguided".

> - and you
> have no evidence that he has done no research.

Since he has no idea of why NaNs are used
and is incorrect about the reasons behind
the proposed behaviour of max(NAN,0.), I concluded he
has done no research.

> I am pretty sure that he knows of my analysis of the matter (and
> document on it),

I see no evidence of this other than that he agrees in conclusion.
How do you account for his lack of knowledge if he is familiar
with your analysis?

> where I do explain why he is correct and the IEEE 754R
> people are wrong. And, in THIS respect, I believe that I have more
> experience than any of the people on that group.
>
> OK?

Beside the point. Your opinion is obviously informed and I have
not claimed otherwise. However, I strongly suspect that the
opinions of other IEEE 754R people are also informed.


-William Hughes
 
Herman Rubin

William Hughes said:
> And you join the list of people who are willing to state on the
> basis of a few minutes thought and no research whatsoever, that the
> people behind the IEEE 754 standard are idiots.

One can see the problems that something can cause almost
immediately, and without doing any real "research".

I would say that the ones who produced that standard did
not fully examine the adverse consequences that their
actions could cause, even when they were pointed out to
them.

This applies to other standards as well; those who think
they can provide "what is necessary or appropriate" are
only fooling themselves, and often harming others.
 
William Hughes

Herman said:
> One can see the problems that something can cause almost
> immediately, and without doing any real "research".

Yes, but one may not be able to see problems that a change
would cause without doing research. In this case it is clear that
setting max(NAN,0.)=0. will cause problems. It is less clear why
setting max(NAN,0.)=NAN will cause problems. It is not at all
clear which solution should be preferred. John Savard seems to
have based his conclusion only on the fact that setting
max(NAN,0.)=0. will cause problems.

> I would say that the ones who produced that standard did
> not fully examine the adverse consequences that their
> actions could cause, even when they were pointed out to
> them.

Perhaps, and perhaps not. Since we have little information
about the consequences of setting max(NAN,0.)=0 (and John
Savard appears to have none) how do you justify this statement?

> This applies to other standards as well; those who think
> they can provide "what is necessary or appropriate" are
> only fooling themselves, and often harming others.

I agree, but you have set up a straw man. Yes, those
who think they can provide "what is necessary or appropriate"
are only fooling themselves, but no, I don't automatically
put people who create standards in this category.

Normally, there is no "what is necessary or appropriate"
to be provided; tradeoffs and compromises must be made.
Despite this, standards are very useful. Problems arise when
people believe that no compromises are necessary.
They then note of some standard that it
causes some specific problem and conclude that
the people who created the standard were incompetent because
they did not notice this.

This is very different from knowing the background of the
compromise and deciding that the wrong choice was made
(e.g. your complaint about speed being unduly emphasized at
the expense of accuracy).


-William Hughes
 
glen herrmannsfeldt

(snip)
> Clearly, there are two current interpretations of the semantics of NaN:
> one is that a NaN means some prior computation yielded a non-representable
> and exceptional result, the other is that this data value should be ignored.

Note that the S and R languages used in statistics have both NA and NaN.
NA for data that should be ignored, usually data that didn't exist in
the input data set, such as someone not answering a survey question.

As an interpreted language it is fairly easy to do, though I believe
they use an IEEE NaN value with different values in the low order bits.
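
As a rough illustration of tagging a NaN through its low-order bits, here is a Python sketch. The tag value and all function names are hypothetical and chosen only for this example. It assumes IEEE 754 binary64 doubles and a platform that preserves NaN bit patterns when a value is packed and unpacked; whether a payload survives arithmetic is not guaranteed.

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000   # exponent all ones, quiet bit set
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF     # the 51 bits below the quiet bit
NA_TAG = 1                            # hypothetical tag meaning "not available"

def make_tagged_nan(tag):
    """Build a quiet NaN whose low-order bits carry `tag`."""
    bits = QUIET_NAN_BITS | (tag & PAYLOAD_MASK)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def payload(x):
    """Return the low-order payload bits of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & PAYLOAD_MASK

def is_na(x):
    """True only for a NaN carrying the NA tag."""
    return math.isnan(x) and payload(x) == NA_TAG

na = make_tagged_nan(NA_TAG)
print(math.isnan(na), is_na(na), is_na(float("nan")))
```

Ordinary code sees `na` as just another NaN; only code that inspects the payload can tell "not available" apart from "computation failed".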

It might be an interesting feature to add to other languages and/or
hardware.

-- glen
 
Terje Mathisen

Nick said:
|>
|> You always complain about people who quote you selectively! In this case
|> you have snipped the paragraph where I agree with you, specifically that
|> dropping NaN information is a _very_ surprising behavior.

> Mea culpa. I apologise.
>
> I did actually read what you said, and misunderstood it.

OK, no problem.

OTOH, having a standard which requires silent removal of NaNs _is_ a
problem. :-(

Terje
 
Ken Hagan

William said:
> In this case it is clear that setting max(NAN,0.)=0. will cause
> problems. It is less clear why setting max(NAN,0.)=NAN will cause
> problems. It is not at all clear which solution should be preferred.

As I understand Nick's point, the problem is the conflation of two
meanings for NaN, so it wouldn't be at all surprising to me if there
were *no* definite right answer for max(NaN,0).

Now, I probably have less experience and knowledge in this area than
anyone who has so far contributed to this thread, but if I may be
indulged a little, what is wrong with...

max(QNaN,0) = 0
max(SNaN,0) = SNaN
 
Nick Maclaren

|>
|> OTOH, having a standard which requires silent removal of NaNs _is_ a
|> problem. :-(

I quite agree. C99 Annex F - just say "no".


Regards,
Nick Maclaren.
 
Nick Maclaren

|>
|> As I understand Nick's point, the problem is the conflation of two
|> meanings for NaN, so it wouldn't be at all surprising to me if there
|> were *no* definite right answer for max(NaN,0).

There IS a definite right answer, using the meaning of NaN that is implied
by IEEE 754, and it is NaN. To get the other answer, you need a meaning
of NaNs that is not currently in IEEE 754.

|> Now, I probably have less experience and knowledge in this area than
|> anyone who has so far contributed to this thread, but if I may be
|> indulged a little, what is wrong with...
|>
|> max(QNaN,0) = 0
|> max(SNaN,0) = SNaN

BAD idea. Sorry. Firstly, IEEE 754 requires max(SNaN,0) to raise the
invalid exception, secondly, that would imply that QNaN+0.0 = 0.0 and,
thirdly, the only languages that 'support' IEEE 754 use only QNaNs.


Regards,
Nick Maclaren.
 
jacko

hi

Had the inspiration when doing data bases a while back, that as well as
null, void is also needed.

null=unknown quantity
void=no quantity

max(null,0)=null
max(void,0)=0

NaN appears like a null so max(nan,0)=nan ;-) curry shop ahoy!!

NaD would be Not a Datum

cheers.

jacko
 
