Cosine algo, was Re: sqrt algo

D

Dik T. Winter

> Precision being a parameter to IEEE(), and "rounding mode" being
> irrelevant, as far as I can tell --- is there any case in which a
> programmer would want a cosine implementation that rounded down, or toward
> positive infinity, or whatever, instead of simply taking the closest
> representable value?

Interval arithmetic.
 
D

Dik T. Winter

> In article
> <[email protected]>,
....
> There would be a remote possibility that a floating point number x is so
> close to (n + 1/2) pi that the absolute value of cos (x) is less than
> DBL_MIN; I am sure that someone has tested this for the IEEE formats
> this is not the case.

[On to some mathematics...]

It is fairly easy to estimate if you know the tools. In the double
representation of (n + 1/2) pi, the order of magnitude of the result is
equal to the order of magnitude of the absolute error in the argument, if
they are close enough.

As a double is actually a rational number, we are asking for the closeness
of a rational number to (n + 1/2) * pi. Or alternatively (using that
a double is numerator/denominator), the closeness of a rational number
to pi (put the value of 2n+1 in the denominator and 2 in the numerator).
The new denominator is at most on the order of 2^54.

So how close can we get at pi with a rational with a denominator of at
most 2^54? It is known that the closeness is approximately (1/d)^2,
where d is the denominator. So we can get as close as about 2^(-108).
There is quite some distance from 2^(-1022) == DBL_MIN.
 
G

Googmeister

Dik said:
...
There would be a remote possibility that a floating point number x is so
close to (n + 1/2) pi that the absolute value of cos (x) is less than
DBL_MIN; I am sure that someone has tested this for the IEEE formats
this is not the case.

[On to some mathematics...]

It is fairly easy to estimate if you know the tools. In the double
representation of (n + 1/2) pi, the order of magnitude of the result is
equal to the order of magnitude of the absolute error in the argument, if
they are close enough.

As a double is actually a rational number, we are asking for the closeness
of a rational number to (n + 1/2) * pi. Or alternatively (using that
a double is numerator/denominator), the closeness of a rational number
to pi (put the value of 2n+1 in the denominator and 2 in the numerator).
The new denominator is at most on the order of 2^54.

However, many rational numbers with small numerator and denominator
are representable using IEEE, including 1/3 and 1/10.
So how close can we get at pi with a rational with a denominator of at
most 2^54?

You can compute this exactly using the Stern-Brocot tree, but
I'm not sure how this is relevant.
It is known that the closeness is approximately (1/d)^2,
where d is the denominator. So we can get as close as about 2^(-108).
There is quite some distance from 2^(-1022) == DBL_MIN.

Floating point numbers are not evenly spaced, so I don't see how
this last statement is relevant to the discussion.
 
D

Dik T. Winter

> Dik T. Winter wrote: .... ....
> However, many rational numbers with small numerator and denominator
> are representable using IEEE, including 1/3 and 1/10.

You mean not of course. But that is irrelevant.
>
>
> You can compute this exactly using the Stern-Brocot tree, but
> I'm not sure how this is relevant.

See above. When you are close enough to (n + 1/2) pi the true cosine is
approximately the error in the approximation of (n + 1/2) pi. For that
to be small you need a double that is very close to the true value. Or,
as I showed, you must have a rational number that is very close to the
true value of pi. The wish is that the rounded value of the true cosine
is 0.0, requires that the absolute value of the true cosine is less than
DBL_MIN, which in turn means that the rounded argument and the true
value of (n + 1/2).pi is approximately DBL_MIN. The question is, are
there doubles close enough to that. Or (approximately the same), are
there rationals with a denominator bounded by 2^54 close enough to pi?
If not, the rounded value of the cosine function will never be 0.0.
Stern-Brocot trees are overkill. Continued fractions work much better
because they provide (in some sense) best rational approximations.
>
> Floating point numbers are not evenly spaced, so I don't see how
> this last statement is relevant to the discussion.

See above. This means that the cosine function should never deliver 0.0.
 
P

P.J. Plauger

You mean not of course. But that is irrelevant.

See above. When you are close enough to (n + 1/2) pi the true cosine is
approximately the error in the approximation of (n + 1/2) pi. For that
to be small you need a double that is very close to the true value. Or,
as I showed, you must have a rational number that is very close to the
true value of pi. The wish is that the rounded value of the true cosine
is 0.0, requires that the absolute value of the true cosine is less than
DBL_MIN,

Uh, no. There are *lots* of representable values between DBL_MIN and zero.
which in turn means that the rounded argument and the true
value of (n + 1/2).pi is approximately DBL_MIN.

No again.
The question is, are
there doubles close enough to that. Or (approximately the same), are
there rationals with a denominator bounded by 2^54 close enough to pi?
If not, the rounded value of the cosine function will never be 0.0.
Stern-Brocot trees are overkill. Continued fractions work much better
because they provide (in some sense) best rational approximations.


See above. This means that the cosine function should never deliver 0.0.

It is extremely *unlikely* that a representable argument x will be
close enough to an actual multiple of pi/2 that the nearest internal
representation of cos(x) will be zero, but it is *possible*. And,
as I showed in an earlier posting, there are several cases where
cos(n*pi/2) is smaller in magnitude than *DBL_MIN.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
G

Googmeister

P.J. Plauger said:
Uh, no. There are *lots* of representable values between DBL_MIN and zero.
And,
as I showed in an earlier posting, there are several cases where
cos(n*pi/2) is smaller in magnitude than *DBL_MIN.

Perhaps you are confusing DBL_MIN with machine epsilon. In IEEE, by
definition
it is the smallest positive nonzero value of type double, 2^(-1074). I
think
this takes into account denormalized numbers.

PS - Dik T. Winter, thanks for the clarification. I definitely
misunderstood
what you were trying to say the first time.
 
M

Michael Mair

Googmeister said:
Perhaps you are confusing DBL_MIN with machine epsilon.
Unlikely.

In IEEE, by definition
it is the smallest positive nonzero value of type double, 2^(-1074).

No, it is not:
C99, 5.2.4.2.2#11 ("minimum normalized positive floating-point number")
I
think
this takes into account denormalized numbers.

Nope.
Be careful not to confuse C floating point types with IEEE floating
point types.


Cheers
Michael
 
P

P.J. Plauger

Perhaps you are confusing DBL_MIN with machine epsilon. In IEEE, by
definition
it is the smallest positive nonzero value of type double, 2^(-1074). I
think
this takes into account denormalized numbers.

Yes, I was, thanks. But there are *still* lots of representable
values between DBL_MIN and zero.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
G

Googmeister

Michael said:
No, it is not:
C99, 5.2.4.2.2#11 ("minimum normalized positive floating-point number")

Why are you referring to the C standard? Perhaps in C, DBL_MIN is
2^-1022,
but I was clearly talking about IEEE.
Nope.
Be careful not to confuse C floating point types with IEEE floating
point types.

I was never talking about C. Perhaps my mistake was using "DBL_MIN"
to describe the minimum positive IEEE number; apparently it has a
different meaning in C.
 
C

Christian Bau

"Dik T. Winter" <[email protected]> said:
...
There would be a remote possibility that a floating point number x is so
close to (n + 1/2) pi that the absolute value of cos (x) is less than
DBL_MIN; I am sure that someone has tested this for the IEEE formats
this is not the case.

[On to some mathematics...]

It is fairly easy to estimate if you know the tools. In the double
representation of (n + 1/2) pi, the order of magnitude of the result is
equal to the order of magnitude of the absolute error in the argument, if
they are close enough.

As a double is actually a rational number, we are asking for the closeness
of a rational number to (n + 1/2) * pi. Or alternatively (using that
a double is numerator/denominator), the closeness of a rational number
to pi (put the value of 2n+1 in the denominator and 2 in the numerator).
The new denominator is at most on the order of 2^54.

So how close can we get at pi with a rational with a denominator of at
most 2^54? It is known that the closeness is approximately (1/d)^2,
where d is the denominator. So we can get as close as about 2^(-108).
There is quite some distance from 2^(-1022) == DBL_MIN.

I would have argued like this: With 64 bit doubles, there are roughly
2^62 double values >= 1 (half of the doubles are negative, so we don't
care for the cosine function, and roughly another quarter is less than
1). Each of these is at most pi/2 away from (n + 1/2) * pi. We could
expect one of those values within (pi/2) / 2^62 of (n + 1/2) pi.
 
C

Christian Bau

"P.J. Plauger said:
It is extremely *unlikely* that a representable argument x will be
close enough to an actual multiple of pi/2 that the nearest internal
representation of cos(x) will be zero, but it is *possible*. And,
as I showed in an earlier posting, there are several cases where
cos(n*pi/2) is smaller in magnitude than *DBL_MIN.

No, you showed cases where the cosine is less than DBL_EPSILON, not
DBL_MIN.
 
C

Christian Bau

"P.J. Plauger said:
Yes, I was, thanks. But there are *still* lots of representable
values between DBL_MIN and zero.

.... because DBL_MIN is the smallest _normalised_ number, and
denormalised numbers can be still smaller.
 
K

Keith Thompson

Googmeister said:
Michael Mair wrote: [...]
No, it is not:
C99, 5.2.4.2.2#11 ("minimum normalized positive floating-point number")

Why are you referring to the C standard? Perhaps in C, DBL_MIN is
2^-1022,
but I was clearly talking about IEEE.

Probably because this discussion is cross-posted to comp.lang.c, where
the C standard is pretty much all we talk about. (He may not have
noticed the cross-post to comp.programming.)

[...]
I was never talking about C. Perhaps my mistake was using "DBL_MIN"
to describe the minimum positive IEEE number; apparently it has a
different meaning in C.

C's <limits.h> header defines DBL_MIN as the "minimum normalized
positive floating-point number" of type double; it's required to be
less than or equal to 1E-37.
 
J

Joe Wright

Keith said:
Googmeister said:
Michael Mair wrote:
[...]
No, it is not:
C99, 5.2.4.2.2#11 ("minimum normalized positive floating-point number")

Why are you referring to the C standard? Perhaps in C, DBL_MIN is
2^-1022,
but I was clearly talking about IEEE.


Probably because this discussion is cross-posted to comp.lang.c, where
the C standard is pretty much all we talk about. (He may not have
noticed the cross-post to comp.programming.)

[...]
I was never talking about C. Perhaps my mistake was using "DBL_MIN"
to describe the minimum positive IEEE number; apparently it has a
different meaning in C.


C's <limits.h> header defines DBL_MIN as the "minimum normalized
positive floating-point number" of type double; it's required to be
less than or equal to 1E-37.

DBL_MIN is defined in float.h at my house. Printing it "%.16e" gives..

2.2250738585072014e-308

...considerably smaller than 1e-37. It is FLT_MIN with "%.8e" which is..

1.17549435e-38
 
D

Dik T. Winter

> news:[email protected]... ....
>
> Uh, no. There are *lots* of representable values between DBL_MIN and zero.

You are right, I had forgotten some numbers.
>
> It is extremely *unlikely* that a representable argument x will be
> close enough to an actual multiple of pi/2 that the nearest internal
> representation of cos(x) will be zero, but it is *possible*.

Anyhow, here the proof that it can not be close enough to zero, and so
is not possible (in the remainder ^ is exponentiation):

Set: d_n = (n + 1/2) * pi rounded to 53-bit double precision. d_n is a
rational, we have to determine the denominator. If d_n has k_n binary
digits (removing trailing zero digits) after the binary point, the
denominator is 2^k_n. Rewrite (exact mathematics) as 2*d_n/(2n + 1) ~ pi,
The rational on the left has a slightly larger denominator. You can
verify that the denominator is always smaller than 2^54. When k = 0,
the value is about 1.5, so there are at most 52 bits after the binary
point. Increasing n decreases the number of bits after the binary
point, multiplying it by 2n + 1 again increases it, but not enough
to get more than 54 bits (possibly not even more than 53, but that
does not matter).

Now go to the continued fraction expansion of pi. The 30-th
convergent is: 66627445592888887/21208174623389167, where the
denominator is larger than 2^54. The error is larger than
2e-34 (absolute error). A property of convergents is that there
are *no* rational numbers with smaller denominater that are closer
to the number to approximate. As the denominator of 2*d_n/(2n+1) is
less than 21208174623389167, we have |2*d_n/(2n+1) - pi| > 2e-34. Or
|d_n - (n + 1/2) * pi| > (2n + 1) * 1e-34. And so:
cos((n + 1/2) * pi) > (2n + 1) * 1e-34, which is, eh, much larger
than DBL_MIN.

The bound is not sharp, but it is sufficient to show that the cosine
should never deliver 0.0. A similar reasoning will show the same
with long double, and actually will show the same with almost all
finite precision floating point systems. It will only not be true
if either there is a coefficient in the continued fraction expansion
that is extremely large compared to the denominator of the convergent.
So it might be true on some 3-digit decimal f-p system, but none
else (the first 20,000,000 coefficients contain only one relatively
large coefficient, it is used in the approximation 355/133).
Another possibility is when the range of exponents is "too small"
compared to the size of the mantissa.
> And,
> as I showed in an earlier posting, there are several cases where
> cos(n*pi/2) is smaller in magnitude than *DBL_MIN.

You did show examples with DBL_EPSILON. But I said that cos((n + 1/2).pi)
is in the order of magnitude of DBL_EPSILON. Not that it is larger.

With respect to a comparison with DBL_EPSILON, I would say that the
likelyhood that it is larger than the cosine is reasonable, but that
depends on the distribution of the 1 and 0 bits in the binary
expansion of (n + 1/2) * pi, which is essentially "random". And it
is only possible for small values of n. If (for instance) IEEE had
chosen a 50 bit mantissa for double, the value of cos(pi/2) would
have been the same as you have gotten with standard double but that
is because there is a run of 4 zero bits in the binary expansion of
pi/2 from bit 51 to bit 54. (Runs of identical bits make the
approximation smaller than normal.) In general, when printed in
decimal, the exponent of cos((n + 1/2).pi) should be *at most*
two less than the exponent of the associated _MIN. The first
occurrence of two might be with a floating point system with a
mantissa of 107 bits. (There is an occurrence of a row of 7 zero bits
for pi/2 at that place.)
 
D

Dik T. Winter

> In article <[email protected]>, (e-mail address removed) says... ....
>
> When I enter Cos(Pi/2) the calculator responds with 0. Surely that is
> remarkably precise?

If and only if the pi you enter is used exactly by your calculator, which
is only possible if it is doing symbolic manipulation. If the pi your
calculator uses is not the exact value of pi, it shows that the
implementation of the cosine function is imprecise. It may work out
perfect in this formula, but will give suboptimal results for other
arguments.
 
K

Keith Thompson

Joe Wright said:
Keith Thompson wrote: [...]
C's <limits.h> header defines DBL_MIN as the "minimum normalized
positive floating-point number" of type double; it's required to be
less than or equal to 1E-37.

DBL_MIN is defined in float.h at my house.

Yes, float.h, thanks for catching my error.
Printing it "%.16e" gives..

2.2250738585072014e-308

..considerably smaller than 1e-37. It is FLT_MIN with "%.8e" which is..

1.17549435e-38

I said the standard requires DBL_MIN to be less than or equal to
1E-37. 2.2250738585072014e-308 certainly qualifies.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,581
Members
45,057
Latest member
KetoBeezACVGummies

Latest Threads

Top