float vs double : speed on P4 machine

Charles Fox · Jul 6, 2004

I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using float or doubles for my calculations? I
know my P4 is a 32-bit processor, but I don't know if the JVM will end
up putting the 32 bits of the floats into the 32 bits of my processor,
or if it does something else. In C the 64-bit double would take much
longer to run on a 32-bit processor -- is this still the case with the
virtual machine?

Liz · Jul 6, 2004

It's very simple to write a loop and test it yourself.
Let us know your results.
For me, there was hardly any difference.
I have a Celeron chip.

Lordy · Jul 6, 2004

Charles Fox wrote:

I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using float or doubles for my calculations?

Also, depending on the required precision, you may be able to use integers and
scale the calculations. Maybe. Dunno if it's worth it these days.
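
A minimal sketch of what scaling with integers could look like, assuming a fixed scale of three decimal digits. The class name, the SCALE constant and the helper methods are all hypothetical, made up for illustration, and whether this beats plain floating point would still need measuring.

```java
public class FixedPointSketch {
    // Hypothetical scale: three decimal digits of fraction (1.234 is stored as 1234).
    static final long SCALE = 1000;

    // Convert a real value to its scaled integer form, rounding to nearest.
    static long toFixed(double value) {
        return Math.round(value * SCALE);
    }

    // The raw product carries SCALE twice, so divide once to renormalise.
    static long multiply(long a, long b) {
        return (a * b) / SCALE;
    }

    // Pre-multiply the dividend so the quotient keeps its fractional digits.
    // Integer division truncates, so the last digit is not rounded.
    static long divide(long a, long b) {
        return (a * SCALE) / b;
    }

    // Convert back to a double for display.
    static double toDouble(long fixed) {
        return (double) fixed / SCALE;
    }

    public static void main(String[] args) {
        long a = toFixed(2.5);
        long b = toFixed(1.2);
        System.out.println("product:  " + toDouble(multiply(a, b)));
        System.out.println("quotient: " + toDouble(divide(a, b)));
    }
}
```

The catch is that range and precision are now fixed by the chosen scale, and intermediate products can overflow a long if the values get large.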

Lordy

Roedy Green · Jul 6, 2004

I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using float or doubles for my calculations? I
know my P4 is a 32-bit processor, but I don't know if the JVM will end
up putting the 32 bits of the floats into the 32 bits of my processor,
or if it does something else. In C the 64-bit double would take much
longer to run on a 32-bit processor -- is this still the case with the
virtual machine?

The floating-point engine in a Pentium is 80 bits. Java chops that
back to 64 for doubles and 32 for floats.

The calculations themselves are done at 80 bits.

You would think it would take slightly longer to convert from internal
form to float than to double, since it is more different. Yet on the
original Pentium the FLD (load) is 1 clock for float and double and 3
clocks for extended. FST (store) is 2 clocks for float and double and
3 for extended. You would have to get the Intel manuals for a given
chip to find out for the more recent chips. So on that chip, at least,
loading and storing a float costs the same as a double.

On the other hand, you can pack twice as many floats into RAM or
SRAM/on-chip cache as doubles. If you have more than a handful of them,
this effect will likely prevail.
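
As a rough illustration of the packing argument, here is a small sketch that computes the payload size of a float[] and a double[] of the same length. The class and method names are hypothetical, the element count is an arbitrary assumption, and the figures ignore the JVM's per-array header. It uses Float.BYTES and Double.BYTES, which need Java 8 or later.

```java
public class FootprintSketch {
    // Payload bytes for n float elements, ignoring the array header.
    static long floatBytes(int n) {
        return (long) n * Float.BYTES;
    }

    // Payload bytes for n double elements, ignoring the array header.
    static long doubleBytes(int n) {
        return (long) n * Double.BYTES;
    }

    public static void main(String[] args) {
        int n = 1_000_000; // hypothetical working-set size
        System.out.println("float[]  payload: " + floatBytes(n) + " bytes");
        System.out.println("double[] payload: " + doubleBytes(n) + " bytes");
    }
}
```

Whether the float version fits in cache where the double version does not depends on the cache size of the particular chip.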

It is pretty easy to write your code then try it both ways after you
are done.

John C. Bollinger · Jul 6, 2004

Charles said:
I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using float or doubles for my calculations? I
know my P4 is a 32-bit processor, but I don't know if the JVM will end
up putting the 32 bits of the floats into the 32 bits of my processor,
or if it does something else. In C the 64-bit double would take much
longer to run on a 32-bit processor -- is this still the case with the
virtual machine?

The best advice about optimization is almost invariably to test the
performance first, then adjust as necessary.

Floating-point math on x86 is performed in an FPU that uses 80-bit
operands and creates 80-bit results. This is true whether you start
with a 32-bit or a 64-bit value. In general, depending on whether
intermediate results can be held in the FPU instead of transferred back
to memory, you might find that there is no significant difference in
speed between 32-bit and 64-bit operands. If you do have to transfer
intermediate results then it takes longer to transfer a 64-bit value
than it does to transfer a 32-bit value on a 32-bit machine.

There are other factors as well. For instance, you can hold twice as
many 32-bit values in the CPU's cache; if your application holds a lot
of data or intermediate values in memory then this could be a
consideration. Of course, if the code is not structured so as to be
able to make efficient use of the cache then that in itself will
probably cause a bigger performance impact than 32-bitness vs. 64-bitness.

Don't take these for an exhaustive enumeration of the issues -- you can
find much more comprehensive analyses. But don't overlook my very first
comment: if your code is not running fast enough then you need to
determine _why_ by observation and measurement.
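
One way to observe the memory side of this is to time the same traversal over a float[] and a double[] of equal length. The sketch below is only that, a sketch: the class name, the array size and the number of rounds are assumptions chosen for illustration, and the numbers it prints will vary with the JVM, the chip and whatever else is running.

```java
import java.util.Arrays;

public class CacheSketch {
    // Sum every element of a float array.
    static float sum(float[] data) {
        float total = 0f;
        for (float v : data) {
            total += v;
        }
        return total;
    }

    // Sum every element of a double array.
    static double sum(double[] data) {
        double total = 0.0;
        for (double v : data) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1 << 20; // hypothetical size; vary it to move in and out of cache
        float[] floats = new float[n];
        double[] doubles = new double[n];
        Arrays.fill(floats, 1f);
        Arrays.fill(doubles, 1.0);

        // Repeat a few rounds so the JIT has a chance to compile the loops.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            float floatSum = sum(floats);
            long t1 = System.nanoTime();
            double doubleSum = sum(doubles);
            long t2 = System.nanoTime();
            System.out.printf("round %d: float %d us (sum %.1f), double %d us (sum %.1f)%n",
                    round, (t1 - t0) / 1000, floatSum, (t2 - t1) / 1000, doubleSum);
        }
    }
}
```

Printing the sums keeps the results live, so the JIT cannot discard the loops as dead code.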

John Bollinger

Liz · Jul 6, 2004

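<snip>
// 100 million float divisions (declarations and timing code snipped)
for (long i = 0; i < 100000000; i++) {
    float1 = (float) i / 17.0f;
}
// 100 million double divisions
for (long i = 0; i < 100000000; i++) {
    double1 = (double) i / 17.0;
}
<snip>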

Starting FloatDoubleSpeedTest...
Float Time: 5358
Double Time: 6129
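
For anyone who wants to rerun this, here is a self-contained sketch along the same lines. It is not the snipped original: the class name, the warm-up pass, the number of rounds and the accumulate-into-a-sink trick are all assumptions added so the JIT cannot throw the loops away, which also means the loop body does one extra addition per iteration.

```java
public class FloatDoubleSpeedSketch {
    // Sinks keep the results live so the loops are not removed as dead code.
    static float floatSink;
    static double doubleSink;

    // Run the float division loop and return the elapsed time in milliseconds.
    static long timeFloat(long iterations) {
        long start = System.nanoTime();
        float acc = 0f;
        for (long i = 0; i < iterations; i++) {
            acc += (float) i / 17.0f;
        }
        floatSink = acc;
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Run the double division loop and return the elapsed time in milliseconds.
    static long timeDouble(long iterations) {
        long start = System.nanoTime();
        double acc = 0.0;
        for (long i = 0; i < iterations; i++) {
            acc += (double) i / 17.0;
        }
        doubleSink = acc;
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long iterations = 100_000_000L; // same count as the loops above

        // Warm-up pass so both loops are compiled before measuring.
        timeFloat(iterations);
        timeDouble(iterations);

        for (int round = 0; round < 3; round++) {
            System.out.println("Float Time:  " + timeFloat(iterations) + " ms");
            System.out.println("Double Time: " + timeDouble(iterations) + " ms");
        }
    }
}
```

A single loop like this mostly measures the divide instruction, so it says little about code that is limited by memory traffic.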

Thomas Weidenfeller · Jul 7, 2004

Charles said:
I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using float or doubles for my calculations?

Implement a typical example, measure it. For your environment, for your
algorithms.

/Thomas

Charles Fox · Jul 7, 2004

Thanks for the demo, Liz

On my P4 (running a bunch of other stuff
at the same time) I get:

float: 6718
double: 7954

An interesting point was mentioned about fitting more in the cache -
I hadn't thought of that.

I didn't realise the FP unit was 80-bit -- so why aren't languages
using 80-bit FP numbers instead of 32 and 64?

Does the 32-bit/64-bit processor make any difference to big numerical
calculations then, since most such calculations will be done with FP?

Grant Wagner · Jul 7, 2004

Charles said:
Thanks for the demo, Liz

On my P4 (running a bunch of other stuff
at the same time) I get:

float: 6718
double: 7954

An interesting point was mentioned about fitting more in the cache -
I hadn't thought of that.

I didn't realise the FP unit was 80-bit -- so why aren't languages
using 80-bit FP numbers instead of 32 and 64?

If languages used 80-bit FP values, wouldn't the FPU have to be 96
(or more) bits, so as not to lose precision when doing calculations on FP
values? Aren't the "extra" bits in the FPU there so that more precision than
is needed for the final result is retained during calculations? Then the
final result can be rounded (truncated?) and returned as a 32- or 64-bit
value?

Roedy Green · Jul 7, 2004

I didn't realise the FP unit was 80-bit -- so why aren't languages
using 80-bit FP numbers instead of 32 and 64?

Because of the way the FP units were designed for use in assembler. You
would load the FP stack up with values, compute away in 80-bit, then
store the results when you were done in 32 or 64.

The more you could avoid storing, the more accurate your results. This
was too loosey-goosey for Java, so they effectively insist on a store
after every operation to get consistent results no matter how the code
is compiled.
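
To get a feel for how much rounding after every operation can matter, here is a small sketch that adds 0.1 repeatedly in float and in double and compares both totals with the exact answer. The class name and the iteration count are assumptions for illustration. It shows the float-versus-double gap, not the 80-bit extended format, which Java does not expose.

```java
public class PrecisionSketch {
    // Add 0.1f to a float accumulator n times; every add rounds to 32 bits.
    static float sumFloat(int n) {
        float total = 0f;
        for (int i = 0; i < n; i++) {
            total += 0.1f;
        }
        return total;
    }

    // Add 0.1 to a double accumulator n times; every add rounds to 64 bits.
    static double sumDouble(int n) {
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            total += 0.1;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10_000_000; // hypothetical iteration count
        System.out.println("exact:  " + (n / 10));
        System.out.println("float:  " + sumFloat(n));
        System.out.println("double: " + sumDouble(n));
    }
}
```

The float total drifts much further from the exact value than the double total, which is the accuracy cost to weigh against any speed or cache benefit.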

jme · Sep 13, 2006

Java float versus double benchmark on Athlon 64

I don't think that we can deduce real performance characteristics using a simple benchmark like this. But I tried this benchmark anyway. Here are the results I observed on a 2 GHz AMD Athlon 64:

JDK 6 beta2, compiler: Eclipse 3.2

100 % = 2479.84 ms using the float version
112 % = 2777.76 ms using the double version
102 % = 2521.11 ms using the double version with 17.0d instead of 17.0

JDK 6 beta2, compiler: Sun

100 % = 2358.75 ms using the float version
108 % = 2555.78 ms using the double version
108 % = 2547.22 ms using the double version with 17.0d instead of 17.0

StdDev double: about 7 ms
StdDev float: about 31 ms

---
http://vienna.in-a-nutshell.net/category/en/en_java/
