Testing for very small doubles with DBL_EPSILON and _isnan()


Dik T. Winter

> On Tue, 9 Jun 2009 12:40:40 GMT, "Dik T. Winter" <[email protected]>
> wrote: ....
>
> But that is not what the "C" standard says, just for starters.

You can derive it for the C standard when you know the properties of IEEE.
The C standard states that it is the difference between 1.0 and the next
higher floating-point number. Using the 53-bit mantissa of IEEE double,
we see that for such a processor DBL_EPSILON must be 2^(-52).
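
For anyone who wants to check this on their own implementation, here is a
minimal C sketch (it assumes an IEEE 754 double, i.e. DBL_MANT_DIG == 53):

#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void)
{
    /* The next representable double above 1.0. */
    double next = nextafter(1.0, 2.0);

    printf("DBL_EPSILON        = %.17g\n", DBL_EPSILON);
    printf("nextafter(1,2) - 1 = %.17g\n", next - 1.0);
    printf("2^(-52)            = %.17g\n", ldexp(1.0, -52));
    /* On an IEEE 754 system all three lines print the same value. */
    return 0;
}
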
> BTW
> are you quoting from the 1985 IEEE-754 or some other standard?

I am not quoting from any IEEE standard. The IEEE standard does not define
something called "DBL_EPSILON".
 

JosephKK

> The standard has not changed at all, but the 8087 (and 80187) is not standard
> compliant (it was based on a preliminary version of it).

True that the 8087 was not quite compliant. However, the Cyrix
mimicked those errors, but not the Weitek.

> As far as I know,
> Cyrix and Weitek were standard compliant. The 80287 was the first standard
> compliant Intel FP processor.

Somewhere somebody said that storing the X87 80-bit temporary values
required a conversion that was time costly; it was not. Moreover, you
could store the 80-bit register contents as well. This was also true
of the 68881 series FPUs.

> Do you think I do not know that standard?

I would not be surprised if you had studied it recently.

I checked its current form dated 2008, and they incorporated decimal
floating point from IEEE-854. Among other things, this made many more
software implementations compliant, and the hardware of IBM S/390
compliant.
Ah yes, the implementation value may be different from the "C"
standard limit value. Sloppy on my part.
 

JosephKK

My question was about what your expectations were; I can't get the
answer to that question by writing a test program; not unless that test
program is hooked up to mind-reading hardware.

It is still embedded in the quoted text above. "a NaN and an
overflow indication"
 

JosephKK

Citation, please? And don't quote the upper limit of 1E-9 again; 2^-52
easily satisfies that limit. If the smallest representable value greater
than 1.0 is 1.0+pow(2,-52), then the C standard says that DBL_EPSILON
must be pow(2,-52).

While some implementations may have that value, the "C" standard sets
a maximum requirement of 1E-9. I just got reminded of the difference.
 

JosephKK

> You can derive it for the C standard when you know the properties of IEEE.
> The C standard states that it is the difference between 1.0 and the next
> higher floating-point number. Using the 53-bit mantissa of IEEE double,
> we see that for such a processor DBL_EPSILON must be 2^(-52).

I don't think that accurately reflects the bit encoding, they steal
the MSB, remember?

> I am not quoting from any IEEE standard. The IEEE standard does not define
> something called "DBL_EPSILON".

True. 754 defined other things that reflect the mathematics more
accurately.
 

Ben Bacarisse

JosephKK said:
I don't think that accurately reflects the bit encoding, they steal
the MSB, remember?

It does accurately reflect the bit encoding. IEEE double has 53 bits
of precision (encoded using only 52 bits, of course). The C standard
dictates what DBL_EPSILON must then be: 2^-52. It gives an expression
just in case the words are not enough: b^(1-p) where p is the
precision (53) and b the base (2).
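
To see that expression in code, a quick sketch using nothing but the
<float.h> macros the standard itself names:

#include <stdio.h>
#include <float.h>
#include <math.h>

int main(void)
{
    /* C's definition of DBL_EPSILON: b^(1-p), where b is FLT_RADIX and
       p is DBL_MANT_DIG (53 for IEEE double, giving 2^-52). */
    double eps = pow((double)FLT_RADIX, 1 - DBL_MANT_DIG);

    printf("b^(1-p)     = %.17g\n", eps);
    printf("DBL_EPSILON = %.17g\n", DBL_EPSILON);
    return 0;
}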

<snip>
 

Ben Bacarisse

JosephKK said:
While some implementations may have that value, the "C" standard sets
a maximum requirement of 1E-9. I just got reminded of the
difference.

James knows this difference. He talks about it in the text you quote.
His post includes both the upper limit and a typical value. He also
explains exactly which implementations will have this value because
the last statement is conditional.
 

Keith Thompson

JosephKK said:
I don't think that accurately reflects the bit encoding, they steal
the MSB, remember?

True. 754 defined other things that reflect the mathematics more
accurately.

What exactly is inaccurate about the C standard's definition of
DBL_EPSILON?
 

James Kuyper

JosephKK said:
It is still embedded in the quoted text above. "a NaN and an
overflow indication"

Since that phrase occurs in the quoted material only as part of a
question about whether you still expected those results, an actual
answer was in fact requested; an actual answer to my question could not
be inferred from anything you'd previously said in response. I was
expecting that you were addressing the more general case, where a could
be negative, and not the actual case being discussed, where a is
actually positive.

OK, so the fact that 'a' is positive, and that pow(a,n) is therefore not
allowed to return a NaN, has not had any impact on your expectations for
the behavior of pow(a,n). Odd. Not a good way to think while writing C
or C++ code - but to each his own.
 

Dik T. Winter

> On Wed, 10 Jun 2009 10:29:44 GMT, "Dik T. Winter" <[email protected]>
> wrote: ....
>
> True that the 8087 was not quite compliant.

Yes, Intel wanted the chip design to be complete, although the standard was
not yet finalized. So the 8087 did have only one infinity.
>
> Somewhere somebody said that storing the X87 80-bit temporary values
> required a conversion that was time costly; it was not. Moreover, you
> could store the 80-bit register contents as well. This was also true
> of the 68881 series FPUs.

What is the relevance?
>
> I would not be surprised if you had studied it recently.

Well, I have not. The last time I really studied it was some 20 years ago.
>
> Ah yes, the implementation value may be different from the "C"
> standard limit value. Sloppy on my part.

Rather, on most implementations the implementation value *must* be different
from the "C" standard limit value. So in IEEE DBL_EPSILON *must* be 2^(-52).

I do not think there are implementations of C around that get it wrong,
although I know it is sometimes hard for compiler writers to get such things
right. When Ada compilers started arriving I wrote a program that verified
all floating-point attributes that the language defined. This program was
distributed widely and to my surprise, there was *no* implementation that had
everything correct. Even on our Data General MV4000 the compiler said that
the machine-radix was 2 while it actually was 16.
 

Dik T. Winter

> On Wed, 10 Jun 2009 10:32:38 GMT, "Dik T. Winter" <[email protected]>
> wrote: ....
>
> I don't think that accurately reflects the bit encoding, they steal
> the MSB, remember?

No, they do not steal it, they hide it, but it is still there. The idea of
the hidden bit comes from the VAX. (And the idea of gradual underflow comes
from the Electrologica X8.)
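
To make the hidden bit concrete, here is a small sketch (it assumes the
usual 64-bit IEEE layout, where copying a double into a uint64_t exposes
its stored bit pattern):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <float.h>

static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);   /* reinterpret the stored 64-bit pattern */
    return u;
}

int main(void)
{
    /* 1.0 is stored as sign 0, biased exponent 0x3FF and 52 fraction
       bits of 0; the leading significand bit is implied, not stored. */
    printf("1.0               -> %016llx\n",
           (unsigned long long)bits_of(1.0));
    /* Adding DBL_EPSILON flips only the lowest stored fraction bit. */
    printf("1.0 + DBL_EPSILON -> %016llx\n",
           (unsigned long long)bits_of(1.0 + DBL_EPSILON));
    return 0;
}
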
>
> True. 754 defined other things that reflect the mathematics more
> accurately.

In what way is DBL_EPSILON not mathematically accurately defined? Actually
the definition is very useful in numerical analysis.
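
One illustration of that usefulness, and of the thread's subject line: a
relative closeness test scaled by DBL_EPSILON rather than by a fixed
absolute cutoff. A rough sketch only; the factor of 4 is an arbitrary
tolerance chosen for the example:

#include <stdio.h>
#include <math.h>
#include <float.h>

/* Nonzero if x and y agree to within a few units in the last place of
   the larger magnitude; the factor 4 is an arbitrary example tolerance. */
static int nearly_equal(double x, double y)
{
    double scale = fmax(fabs(x), fabs(y));
    return fabs(x - y) <= 4.0 * DBL_EPSILON * scale;
}

int main(void)
{
    double a = 0.1 + 0.2;   /* not exactly 0.3 in binary floating point */
    printf("0.1 + 0.2 == 0.3 exactly?   %d\n", a == 0.3);
    printf("0.1 + 0.2 nearly equal 0.3? %d\n", nearly_equal(a, 0.3));
    return 0;
}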
 

JosephKK

Since that phrase occurs in the quoted material only as part of a
question about whether you still expected those results, an actual
answer was in fact requested; an actual answer to my question could not
be inferred from anything you'd previously said in response. I was
expecting that you were addressing the more general case, where a could
be negative, and not the actual case being discussed, where a is
actually positive.

OK, so the fact that 'a' is positive, and that pow(a,n) is therefore not
allowed to return a NaN, has not had any impact on your expectations for
the behavior of pow(a,n). Odd. Not a good way to think while writing C
or C++ code - but to each his own.

Should you carefully read it all again, you would see there was at least
one pow(deltaX, n) involved, and deltaX could have been negative. It is
a separate issue from the computation of r2.
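
For what it's worth, here is a tiny sketch of the two cases being argued
about, using a made-up deltaX value: a negative base with a non-integer
exponent is a domain error and typically yields a NaN, which isnan()
(or _isnan() on older MSVC) will flag, while a positive base cannot
produce one:

#include <stdio.h>
#include <math.h>

int main(void)
{
    double deltaX = -0.5;                  /* hypothetical negative difference */
    double bad  = pow(deltaX, 2.5);        /* negative base, non-integer
                                              exponent: domain error, NaN */
    double good = pow(fabs(deltaX), 2.5);  /* positive base: never a NaN */

    printf("pow(-0.5, 2.5) is NaN? %d\n", isnan(bad)  ? 1 : 0);
    printf("pow( 0.5, 2.5) is NaN? %d\n", isnan(good) ? 1 : 0);
    return 0;
}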
 

James Kuyper

JosephKK said:
As much as I don't like to argue with you, I expect a NaN and an
overflow indication (which may be going unhandled). But then I am
coming from the hardware side.

In a much more recent message, you explained your response as follows:
> Should you carefully read it all again, you would see there was at least
> one pow(deltaX, n) involved, and deltaX could have been negative. It is
> a separate issue from the computation of r2.

The above comments by Olumide and Dik were about a^b, where a is a
placeholder representing an arbitrary expression with a small positive
value. You're disagreeing with an assertion that generation of a NaN
implies that 'a has become negative'. How does the possibility that
deltaX might be negative justify disagreeing with that assertion? If you
were agreeing with him, I would not have expected to see words like "I
don't like to argue with you"; instead, I would have expected something
more like
 
