sqrtl(), etc. for Cygwin

glen herrmannsfeldt · Nov 8, 2012

(snip, I wrote)

Mathematically, ((r + x/r)/2) and (x/r + (r - x/r)/2) are equivalent. Is
there a numerical reason to prefer one over the other? (Obviously, if
there are no such concerns, one might as well use the simpler one.)

In base 2 floating point, either one works. The final divide just
changes the exponent. In any other base, though, the simpler one can
lose precision.

In two digit decimal, (32+34)/2 is 33, but (64+68)/2 is 65.

64+68 is 13e1 and 13e1/2 is 65.

It is used for IBM base 16 floating point and, will be needed
for the now part of the IEEE standard, decimal float.

-- glen

Phil Carmody · Nov 12, 2012

Philip Lantz said:
Mathematically, ((r + x/r)/2) and (x/r + (r - x/r)/2) are equivalent. Is
there a numerical reason to prefer one over the other? (Obviously, if
there are no such concerns, one might as well use the simpler one.)

Given that r and x/r are very close (on the assumption that there's a good
initial estimate, which holds in this case), the (r - x/r) expression is a
case of catastrophic loss of precision, which is normally something you
want to avoid like the plague. Gut feel would make me prefer the original.

Phil

glen herrmannsfeldt · Nov 12, 2012

Phil Carmody said:
(snip)

(snip, then I wrote)

(snip)

Given that r and x/r are very close (on the assumption that there's a good
initial estimate, which holds in this case), the (r - x/r) expression is a
case of catastrophic loss of precision, which is normally something you
want to avoid like the plague. Gut feel would make me prefer the original.

That is certainly true in a large number of algorithms, but you should
look more carefully at this one.

Yes the difference can have a large error relative to its (possibly
small) magnitude, but the absolute error is the same as the two
operands if the exponents are the same.

Note that the more cancellation of low digits, means that you are
closer to the correct square root, and so is better, not worse!

Unless the sum is kept with extra precision, in non-binary (base 2)
floating point, a low digit can be lost in the sum that can't be
restored in the divide by two.

It is the lower (relative) precision of the difference that allows for
no loss on the divide by two.

-- glen

Phil Carmody · Nov 13, 2012

glen herrmannsfeldt said:
(snip, then I wrote)

That is certainly true in a large number of algorithms, but you should
look more carefully at this one.

Yes the difference can have a large error relative to its (possibly
small) magnitude, but the absolute error is the same as the two
operands if the exponents are the same.

Note that the more cancellation of low digits, means that you are
closer to the correct square root, and so is better, not worse!

Agreed. Hence my "normally".

Unless the sum is kept with extra precision, in non-binary (base 2)
floating point, a low digit can be lost in the sum that can't be
restored in the divide by two.

Anyone not using binary FP deserves all ills that come to them!

It is the lower (relative) precision of the difference that allows for
no loss on the divide by two.

The thing I like about the sum is its symmetry.
With the half-the-difference method, there are two ways of approaching the
middle.
(r + (x/r-r))/2
(x/r + (r-x/r))/2
I don't see why one should chose one rather the other. And in particular
why you should have chosen the one with the longer expression.

Phil

glen herrmannsfeldt · Nov 13, 2012

(snip, I wrote)

(snip)
(snip)

Anyone not using binary FP deserves all ills that come to them!

The thing I like about the sum is its symmetry.
With the half-the-difference method, there are two ways of approaching the
middle.

(r + (x/r-r))/2
(x/r + (r-x/r))/2

I don't see why one should chose one rather the other. And in particular
why you should have chosen the one with the longer expression.

Once the values are in registers, it doesn't make so much difference,
but yes, I probably would do the former, to avoid the possibility of
doing the divide twice.

I first learned about this from the OS/360 routines written in S/360
Assembler. IBM S/360 uses base 16 floating point. All the techniques
to avoid precision loss were well studied at the time, but maybe
now forgotten. Now, IEEE has standardized decimal float, so all
these will be brought back out again.

-- glen

Cujo 1.4 review	3	May 6, 2010
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	15	Apr 1, 2006
FAQ update (roundup of pending requests - for comment)	6	Jan 7, 2004
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	1	Feb 1, 2004
python-dev Summary for 2004-08-01 through 2004-08-15	17	Aug 24, 2004
PEP 350: Codetags	20	Sep 26, 2005

sqrtl(), etc. for Cygwin

glen herrmannsfeldt

Phil Carmody

glen herrmannsfeldt

Phil Carmody

glen herrmannsfeldt

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads