higher precision doubles


Jan Burse

Dear All

IEEE allows calculations to be carried out internally with additional bits.
Is it possible to have these operations available? Like an add
with this higher precision? How could we store such a
result? Would there be a wrapper like Double?

Bye
 

Jan Burse

Patricia said:
to do so. Even without strictfp, the mantissa sizes and therefore the
precision are fixed.
I was thinking about double extended, where the precision does
not stay fixed but is increased.

http://de.wikipedia.org/wiki/IEEE_754#Zahlenformate_und_andere_Festlegungen_des_IEEE-754-Standards

Double extended has a minimum of 79 bits, of which 63 bits are mantissa
and 15 bits are exponent.

I was wondering whether I can take advantage of an AMD64, since
it has 80-bit IEEE floating-point registers:

http://people.freebsd.org/~lstewart/references/amd64.pdf
Paragraph 1.2
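
For reference, a plain Java double has a fixed layout of 1 sign bit, 11
exponent bits and 52 mantissa bits; the 80-bit extended format discussed
here widens that to 15 and 64. A small illustrative sketch (not from any
of the posts) that extracts those fields:

public class DoubleBits {
    public static void main(String[] args) {
        long bits = Double.doubleToLongBits(Math.PI);
        long sign     = (bits >>> 63) & 0x1L;
        long exponent = (bits >>> 52) & 0x7FFL;      // 11 bits, biased by 1023
        long mantissa = bits & 0xFFFFFFFFFFFFFL;     // 52 bits
        System.out.printf("sign=%d exponent=%d mantissa=0x%013X%n",
                sign, exponent, mantissa);
    }
}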
 

BGB

Jan Burse said:
I was thinking about double extended, where the precision does
not stay fixed but is increased.

http://de.wikipedia.org/wiki/IEEE_754#Zahlenformate_und_andere_Festlegungen_des_IEEE-754-Standards

Double extended has a minimum of 79 bits, of which 63 bits are mantissa
and 15 bits are exponent.

I was wondering whether I can take advantage of an AMD64, since
it has 80-bit IEEE floating-point registers:

http://people.freebsd.org/~lstewart/references/amd64.pdf
Paragraph 1.2


note: all major x86 chips have 80 bit FPU registers (this is not new).

however, no, one can't directly use them from Java AFAIK, but would
instead likely need to use JNI calls into C land or similar; even then,
it would depend some on the C compiler (many C compilers use an 80-bit
"long double" type, whereas MSVC treats "long double" as an alias for
"double").

also possible is, of course, to use BigDecimal or similar (as others
have noted), or implement one's own higher-precision floating-point
numbers in software (say, implementing 128-bit or more floating-point
numbers).

or implement a float256 format, with a 31-bit exponent and 224 bit mantissa.
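
As a sketch of the "implement it in software" route, here is a minimal
double-double fragment (an illustration, not a full library, and a
different trade-off than the 80-bit extended format): a value is kept as
an unevaluated sum of two doubles, giving roughly 106 bits of significand.

public class DoubleDouble {
    final double hi, lo;

    DoubleDouble(double hi, double lo) { this.hi = hi; this.lo = lo; }

    // Knuth's two-sum: returns a + b exactly as a hi/lo pair
    static DoubleDouble twoSum(double a, double b) {
        double s = a + b;
        double bb = s - a;
        double err = (a - (s - bb)) + (b - bb);
        return new DoubleDouble(s, err);
    }

    public static void main(String[] args) {
        DoubleDouble r = twoSum(1.0, 1e-20);
        System.out.println("plain double: " + (1.0 + 1e-20));   // prints 1.0, the 1e-20 is lost
        System.out.println("hi = " + r.hi + ", lo = " + r.lo);  // lo recovers the 1e-20
    }
}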

yes, granted, implementing larger floating-point values is kind of a
pain, and performance would be worse than with any natively-supported
formats.


or such...
 

Jan Burse

Stefan said:

Motivating example, in Go we have:

func main() {
    x := math.Sin(2*math.Pi)
    fmt.Printf("x = %.30f, is zero = %v\n", x, x == 0)
}

x = 0.000000000000000000000000000000, is zero = true

In Java we have:


public static double zero1() {
    return Math.sin(2*Math.PI);
}

public static double zero2() {
    return StrictMath.sin(2*StrictMath.PI);
}

public static void main(String[] args) {
    System.out.println("zero1="+zero1());
    System.out.println("zero2="+zero2());
}

zero1=-2.4492935982947064E-16
zero2=-2.4492935982947064E-16
 

markspace

Motivating example, in Go we have: ....
In Java we have: ....
zero1=-2.4492935982947064E-16
zero2=-2.4492935982947064E-16


I honestly don't find this very motivating. Zero is what you make of it.
In engineering I learned that 6 digits are sufficient for almost any
practical task. 30 is way overkill.

As far as printing is concerned, don't get confused by the funny
numbers. It's just a matter of understanding and selecting the correct
format.

Counter motivating example:

public class FpPrint {

    public static void main( String[] args )
    {
        System.out.printf( "%6.6f\n", Math.sin( Math.PI * 2 ) );
        System.out.printf( "%6.6f\n", StrictMath.sin( StrictMath.PI * 2 ) );
    }
}

run:
-0.000000
-0.000000
BUILD SUCCESSFUL (total time: 0 seconds)


Here we understand what significant digits mean, and we print the
correct and desired number of digits. We're good.

There's room for many, many accumulated errors between 0 and 2 x 10^-16.
Just because the default Java printer doesn't print a "0" doesn't mean
we should panic. That's really all you're demonstrating here: the default
printing behavior of System.out.println.
 

Jan Burse

Hi

According to ISO LIA it is desired NOT to print eps or a multiple
of eps as zero. This doesn't bother me at all; it rather shows
the high quality of the Java float-writing algorithm.

So if you add x == 0 to your solution, you would get:
-0.000000, false
-0.000000, false

And BTW the negative sign in front of the zero anyway indicates
that some rounding was going on, and not a true zero was returned
by the computation. And you could try x == 0 || x == -0, if this
is possible, and you would still get false.
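
For what it's worth, a quick check (an illustrative snippet, not from the
original post) confirms both points: 0.0 == -0.0 is already true in Java,
so the extra test adds nothing, and the sin result still compares unequal
to zero:

public class ZeroCheck {
    public static void main(String[] args) {
        double x = StrictMath.sin(2 * StrictMath.PI);
        System.out.println(0.0 == -0.0);             // true: the -0 test adds nothing
        System.out.println(x == 0.0);                // false: x is about -2.45e-16
        System.out.println(x == 0.0 || x == -0.0);   // still false
    }
}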

Bye
 

Jan Burse

BGB said:
also possible is, of course, to use BigDecimal or similar (as others
have noted), or implement one's own higher-precision floating point

BigDecimal Ops with MathContext would probably do, although I
have never tried it. But I would need to do my own DECIMAL80,
since there are only DECIMAL64 and DECIMAL128. And it would not
cover any exponent restriction, only a mantissa restriction.
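
A rough sketch of what such a "DECIMAL80" could look like (a guess for
illustration, not checked against any standard): a 64-bit binary mantissa
corresponds to about 19 decimal digits (64 * log10(2) is roughly 19.27),
so a custom MathContext captures the precision, though, as noted above,
not the exponent range:

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class Decimal80Sketch {
    public static void main(String[] args) {
        // roughly the precision of a 64-bit binary mantissa
        MathContext dec80 = new MathContext(19, RoundingMode.HALF_EVEN);

        BigDecimal a = BigDecimal.ONE;
        BigDecimal b = new BigDecimal("1E-18");
        System.out.println(a.add(b, dec80));                 // 1.000000000000000001 (19 digits kept)
        System.out.println(a.add(b, MathContext.DECIMAL64)); // 1.000000000000000 (rounded away at 16 digits)
    }
}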

But wait, would BigDecimal really do? I am asking here for
pi and sin. But when I look at BigDecimal I only see some
basic arithmetic and some number theory.

So where is the trigonometry package for BigDecimal?

Bye

http://download.oracle.com/javase/1.5.0/docs/api/java/math/BigDecimal.html
http://download.oracle.com/javase/1.5.0/docs/api/java/math/MathContext.html
 

Eric Sosman

Stefan said:

Motivating example, in Go we have:

func main() {
    x := math.Sin(2*math.Pi)
    fmt.Printf("x = %.30f, is zero = %v\n", x, x == 0)
}

x = 0.000000000000000000000000000000, is zero = true

In Java we have:


public static double zero1() {
    return Math.sin(2*Math.PI);
}

public static double zero2() {
    return StrictMath.sin(2*StrictMath.PI);
}

public static void main(String[] args) {
    System.out.println("zero1="+zero1());
    System.out.println("zero2="+zero2());
}

zero1=-2.4492935982947064E-16
zero2=-2.4492935982947064E-16

(Shrug.) Since {Mm}ath.P{Ii} is inexact in both languages --
and in any language whatsoever that uses base-N floating-point
with any precision you care to name -- the "correct" answer from
Go is more a matter of coincidence than of anything important.
 

markspace

So if you add x == 0 to your solution, you would get:


"x == 0" is kind of a known rube-ism. You can't compare floating point
directly with any scalar, it just doesn't work. You have to implement
some sort of range check. It is a bit of a bummer that Java doesn't
provide such a method by default, but it's also not hard to implement.


public class FpPrint {

    public static void main( String[] args )
    {
        System.out.printf( "%6.6f\n", Math.sin( Math.PI * 2 ) );
        System.out.printf( "%6.6f\n", StrictMath.sin( StrictMath.PI * 2 ) );
        DoubleComparator c = new DoubleComparator( 0.000001 );
        System.out.println( c.compare(
                StrictMath.sin( StrictMath.PI * 2 ), 0.0 ) );
    }
}

class DoubleComparator {

    private final double constraint;

    public DoubleComparator( double constraint )
    {
        this.constraint = constraint;
    }

    public boolean compare( double d1, double d2 ) {
        return Math.abs(d1-d2) < constraint;
    }
}
 

Jan Burse

Patricia said:
In this case the difference is most likely in the reduction step - if it
had produced 0, the sin result would have been 0.

I guess 2*pi is "exact" in the sense that the operation * itself
has no loss, since we only need to adjust the exponent.

I forgot to mention that we also get a non-zero value for sin(pi)
in Java, and not only for sin(2*pi).

The easiest reduction is sin(x) = cos(x - pi/2), but this reduction
does not seem to be used:

double x=Math.PI;
System.out.println("pi="+x+", sin(pi)="+Math.sin(x));
x=Math.PI-Math.PI/2;
System.out.println("pi-pi/2="+x+", cos(pi-pi/2)="+Math.cos(x));
x=Math.PI/2;
System.out.println("pi/2="+x+", cos(pi/2)="+Math.cos(x));

Gives:
pi=3.141592653589793, sin(pi)=1.2246467991473532E-16
pi-pi/2=1.5707963267948966, cos(pi-pi/2)=6.123233995736766E-17
pi/2=1.5707963267948966, cos(pi/2)=6.123233995736766E-17

So it seems that the error in the PI representation is the
only reason not to return zero. So we have sin(x) with
x approximating pi. The error is:

sin(x) - sin(pi) = sin(pi + (x-pi)) - 0
                 = -sin(x - pi)
                 ~ -(x - pi)
                 = pi - x

So I guess the PI representation is below the real pi, since
we get a positive result for sin(x). Let's check:

Java PI:
3.141592653589793
real pi:
3.141592653589793238462643383...
Difference:
0.000000000000000238462643383...

Well, the above idea does not work either; we only get:

x=0.000000000000000238462643383;
System.out.println("PI-pi="+x+", sin(PI-pi)="+Math.sin(x));

PI-pi=2.38462643383E-16, sin(PI-pi)=2.38462643383E-16

But not 1.2246467991473532E-16. But we can try a fully exact decimal
expansion of the approximate machine PI. Using the double to
BigDecimal conversion we get (a thing I like most about the
BigDecimal package):

System.out.println("PI="+new BigDecimal(Math.PI));

PI=3.141592653589793115997963468544185161590576171875

Therefore:

Java PI:
3.141592653589793115997963468...
real pi:
3.141592653589793238462643383...
Difference:
0.000000000000000122464679915...

Everything fits perfectly.
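
The same check can be done in code; a small verification sketch (the
50-digit pi literal is taken from the Wikipedia article cited below):

import java.math.BigDecimal;
import java.math.MathContext;

public class PiError {
    public static void main(String[] args) {
        BigDecimal realPi = new BigDecimal(
                "3.14159265358979323846264338327950288419716939937510");
        BigDecimal machinePi = new BigDecimal(Math.PI); // exact decimal expansion of the double

        System.out.println("pi - PI = " + realPi.subtract(machinePi, MathContext.DECIMAL64));
        System.out.println("sin(PI) = " + Math.sin(Math.PI));
    }
}

Both lines print essentially the same 1.2246...E-16 value.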

Question is why Go gets zero?

Bye

Small-angle Approximation taken from here:
http://en.wikipedia.org/wiki/Small-angle_approximation

Additional PI Digits taken from here:
http://en.wikipedia.org/wiki/Pi
 

Jan Burse

markspace said:
"x == 0" is kind of a known rube-ism.

Then take x == 0.0. That is what I want to
know, and not |x| < eps (that is not
the issue of this post).

Actually, I trust the runtime to make an
exact widening of 0 to 0.0. Or, when the
compiler already does it for me, I trust
it as well.

Bye
 

BGB

Jan Burse said:
BigDecimal Ops with MathContext would probably do, although I
have never tried it. But I would need to do my own DECIMAL80,
since there are only DECIMAL64 and DECIMAL128. And it would not
cover any exponent restriction, only a mantissa restriction.

But wait, would BigDecimal really do? I am asking here for
pi and sin. But when I look at BigDecimal I only see some
basic arithmetic and some number theory.

So where is the trigonometry package for BigDecimal?

I am not aware of any (but Java is not my main language, FWIW), but it
is possible to write these oneself.

for example, see here:
http://en.wikipedia.org/wiki/Taylor_series
and:
http://en.wikipedia.org/wiki/Trigonometric_function

I had actually done this when implementing some of my own math functions
(mostly in the case of implementing quaternions and also a float128
type), however, this was in C.
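
Along those lines, a minimal BigDecimal Taylor-series sketch (illustrative
only, not checked against any standard; it assumes the argument is already
reduced to a small range, since no argument reduction is done here):

import java.math.BigDecimal;
import java.math.MathContext;

public class BigSin {

    // sin(x) = x - x^3/3! + x^5/5! - ...
    static BigDecimal sin(BigDecimal x, MathContext mc) {
        BigDecimal term = x;                       // current term x^(2k+1)/(2k+1)!
        BigDecimal sum = x;
        BigDecimal x2 = x.multiply(x, mc);
        BigDecimal eps = BigDecimal.ONE.movePointLeft(mc.getPrecision() + 2);
        for (int k = 1; k <= 100; k++) {
            // next term = -term * x^2 / ((2k)(2k+1))
            term = term.multiply(x2, mc)
                       .divide(BigDecimal.valueOf(2L * k * (2L * k + 1)), mc)
                       .negate();
            sum = sum.add(term, mc);
            if (term.abs().compareTo(eps) < 0) {
                break;                             // term no longer affects the result
            }
        }
        return sum.round(mc);
    }

    public static void main(String[] args) {
        MathContext mc = new MathContext(30);
        System.out.println(sin(new BigDecimal("0.5"), mc)); // 30-digit result
        System.out.println(Math.sin(0.5));                  // double result for comparison
    }
}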
 

Jan Burse

Patricia said:
What is your general strategy for dealing with rounding error in your

The same strategy that ISO LIA follows. You have
the mathematical definition of sin, pi, etc. And then
you have the machine objects SIN, PI, etc.

And then you have some requirements between these two
things. So each machine real x maps directly to a
mathematical real, whereas a mathematical real maps to
its nearest machine real if there is a unique one; if
there are two such machine reals, i.e. if the mathematical
real is exactly between two machine reals, then further
rules might be postulated.

So SIN, the machine function, should ideally work like the
mathematical sin, and then, before returning its result,
pick a machine real as described above. That's all I
expect from any trigonometric package. And it is
demonstrable (I guess, I didn't verify) that the ideal
can be turned into practice. Modern FPUs need to do
exactly this if they comply with some ISO
LIA standard.

So the requirements for sin are very high. One cannot
just go and set SIN(PI)=0. That would not pass any
ISO LIA compliance test.
 

Jan Burse

Patricia said:
However, I am curious about why you care about exactness in this

People might be interested in reliable calculation of
sin(x)/x and the like, since sin(x)/x is the Fourier
transform of the rectangle function. But I must admit
that I am not working on some pressing stuff concerning
sin; the question arose more out of curiosity.

http://en.wikipedia.org/wiki/Sinc_function

Bye
 

Jan Burse

Patricia said:
I was actually asking you what you were doing in your own program that
requires Math.sin(2*Math.PI) to be exactly 0, and how that requirement
fits with a general strategy for handling rounding error in your program.

The relevant standards for this are the Java Language Specification, for
the rounding rules, and the Java API, for the definition of Math.sin.
The IEEE 754 standard is useful background.

I could equally well have asked for lower-precision floats,
things like tiny floats packed into 16 or 8 bits. They
are useful for graphics processing.

In both cases, lower precision or higher precision floats,
the JLS is not relevant since JLS does not support these.
Also some packages or ways to access them might have totally
different names from java.lang.Math.

Lower or higher precisions might be covered by IEEE 754,
but there are also a couple of other standards around,
like IEEE 854 which is closer to BigDecimal, since the exponent
is base 10.

I gave Math.sin(2*Math.PI) only as an example of what I
eventually want to do with the higher precision floats.
But since I do not have the higher precision floats, I
showed how the myHighPrecPackage.sin(2*myHighPrecPackage.PI)
would work with normal precision.

So pointing me to the JLS is like going in circles, only confirming
that Java has only float and double, except for the possibility
that normal arithmetic is done with higher precision when
the strictfp modifier is not used. But this doesn't help for
trigonometric functions.

Bye
 

Jan Burse

Jan said:
Patricia Shanahan wrote:

I gave Math.sin(2*Math.PI) only as an example of what I
eventually want to do with the higher precision floats.
But since I do not have the higher precision floats, I
showed how the myHighPrecPackage.sin(2*myHighPrecPackage.PI)
would work with normal precision.

Well, I would probably need another test case to see whether
strictfp influences Math.sin, since I guess Math.PI is
always the same, with or without strictfp. Will try
something..
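
A minimal test along those lines might look like this (a sketch only; note
that strictfp can only affect the 2*Math.PI multiplication written in the
method, not the internals of Math.sin itself):

public class StrictfpTest {

    static strictfp double strictSin() {
        return Math.sin(2 * Math.PI);
    }

    static double defaultSin() {
        return Math.sin(2 * Math.PI);
    }

    public static void main(String[] args) {
        System.out.println("strictfp: " + strictSin());
        System.out.println("default:  " + defaultSin());
    }
}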
 

Jan Burse

Patricia said:
One can often make some particular combination of calculations give a
more precise answer by some variations in the arithmetic. However, this
does not give much indication of the real requirements.

Do calculations involving trig functions of integer multiples of pi need
to be exact in some application, even if trig functions of other angles
have normal rounding behavior? If so, why? Or do you need trig functions
in general to be more precise?

I guess you are looking too far. The higher precision is just a
parameter of the representation. Viewed from afar, we have
the requirement that float fits into 32 bits and double into
64 bits. In my post of 06.08.2011 13:03 I was writing:
I was wondering whether I can take advantage of an AMD64, since
it has 80-bit IEEE floating-point registers:

http://people.freebsd.org/~lstewart/references/amd64.pdf
Paragraph 1.2

So there is no need to show you any application. I am just
interested in the 80 bits: whether I CAN work with them or
not. I am not interested in the question whether I SHOULD
work with them. In my post of 06.08.2011 00:20, I was asking
this as follows:
IEEE allows calculations to be carried out internally with additional bits.
Is it possible to have these operations available? Like an add
with this higher precision? How could we store such a
result? Would there be a wrapper like Double?

We can also dig a little bit further into what the representation
parameters are. In particular we have:

              Mantissa   Exponent
double 64bit:   52 bit     11 bit
double 80bit:   64 bit     15 bit

So here I can restate my question:

I would like, for whatever reason, to work with 80-bit
floats as defined above in Java. I am interested
in the full set of arithmetic functions, I/O and
trigonometric functions. How could I do that?

Best Regards
 
