float vs double: speed on P4 machine


Charles Fox

I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using floats or doubles for my calculations? I
know my P4 is a 32-bit processor, but I don't know if the JVM will end
up putting the 32 bits of a float into the 32 bits of my processor,
or if it does something else. In C, a 64-bit double would take much
longer to run on a 32-bit processor -- is this still the case with the
virtual machine?
 

Liz

It's very simple to write a loop and test it yourself.
Let us know your results.
For me, there was hardly any difference.
I have a Celeron chip.
 

Lordy

Charles Fox wrote:
I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using floats or doubles for my calculations?

Also, depending on the required precision, you may be able to use integers and
scale the calculations -- maybe. Dunno if it's worth it these days.
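
The integer-scaling idea above can be sketched roughly as follows. This is only an illustration: the class name `FixedPointDemo`, the scale factor of 1000, and the helper methods are hypothetical and do not come from the thread.

```java
// Hypothetical sketch of scaled-integer (fixed-point) arithmetic.
final class FixedPointDemo {
    // Assumed scale: three decimal digits of fraction.
    static final long SCALE = 1000;

    // Convert a real value to its scaled integer form, rounding to nearest.
    static long toFixed(double value) {
        return Math.round(value * SCALE);
    }

    // Sums of scaled values are already correctly scaled.
    static long add(long a, long b) {
        return a + b;
    }

    // A product carries the scale twice, so divide once to rescale.
    static long multiply(long a, long b) {
        return (a * b) / SCALE;
    }

    // Convert back to a double for display.
    static double toDouble(long fixed) {
        return (double) fixed / SCALE;
    }

    public static void main(String[] args) {
        long price = toFixed(2.5);    // stored as 2500
        long quantity = toFixed(4.0); // stored as 4000
        long total = multiply(price, quantity);
        System.out.println(toDouble(total));
    }
}
```

Whether this beats floating point on a given chip is exactly the kind of thing that would have to be measured; the sketch also ignores overflow in `multiply`.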

Lordy
 

Roedy Green


The floating-point engine in a Pentium is 80 bits wide. Java chops that
back to 64 bits for doubles and 32 for floats.

The calculations themselves are done at 80 bits.

You would think it would take slightly longer to convert from the internal
form to float than to double, since float differs more from it. Yet on the
original Pentium, FLD (load) is 1 clock for float and double and 3
clocks for extended. FST (store) is 2 clocks for float and double and
3 for extended. You would have to get the Intel manuals for a given
chip to find the figures for more recent chips. So, at least on the
original Pentium, it makes no difference.

On the other hand, you can pack twice as many floats as doubles into RAM or
SRAM/on-chip cache. If you have more than a handful of them,
this effect will likely prevail.

It is pretty easy to write your code then try it both ways after you
are done.
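
To put rough numbers on the packing argument, here is a small sketch. The class name `FootprintDemo` and the 512 KiB cache size are assumptions chosen for illustration, not figures from the thread, and the arithmetic ignores the JVM's per-array header.

```java
// Hypothetical sketch: raw storage needed by float vs. double data.
final class FootprintDemo {
    // Bytes of element storage for n floats (4 bytes each).
    static long floatBytes(int n) {
        return (long) n * Float.BYTES;
    }

    // Bytes of element storage for n doubles (8 bytes each).
    static long doubleBytes(int n) {
        return (long) n * Double.BYTES;
    }

    // Number of floats that fit in a cache of the given size.
    static long floatsThatFit(long cacheBytes) {
        return cacheBytes / Float.BYTES;
    }

    // Number of doubles that fit in a cache of the given size.
    static long doublesThatFit(long cacheBytes) {
        return cacheBytes / Double.BYTES;
    }

    public static void main(String[] args) {
        long cache = 512L * 1024; // assumed cache size, illustration only
        System.out.println("floats that fit:  " + floatsThatFit(cache));
        System.out.println("doubles that fit: " + doublesThatFit(cache));
    }
}
```

`Float.BYTES` and `Double.BYTES` require Java 8 or later; on older JVMs the constants 4 and 8 can be written directly.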
 

John C. Bollinger

Charles said:
I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using floats or doubles for my calculations? I
know my P4 is a 32-bit processor, but I don't know if the JVM will end
up putting the 32 bits of a float into the 32 bits of my processor,
or if it does something else. In C, a 64-bit double would take much
longer to run on a 32-bit processor -- is this still the case with the
virtual machine?

The best advice about optimization is almost invariably to test the
performance first, then adjust as necessary.

Floating-point math on x86 is performed in an FPU that uses 80-bit
operands and creates 80-bit results. This is true whether you start
with a 32-bit or a 64-bit value. In general, depending on whether
intermediate results can be held in the FPU instead of transferred back
to memory, you might find that there is no significant difference in
speed between 32-bit and 64-bit operands. If you do have to transfer
intermediate results then it takes longer to transfer a 64-bit value
than it does to transfer a 32-bit value on a 32-bit machine.

There are other factors as well. For instance, you can hold twice as
many 32-bit values in the CPU's cache; if your application holds a lot
of data or intermediate values in memory then this could be a
consideration. Of course, if the code is not structured so as to be
able to make efficient use of the cache then that in itself will
probably cause a bigger performance impact than 32-bitness vs. 64-bitness.

Don't take these for an exhaustive enumeration of the issues -- you can
find much more comprehensive analyses. But don't overlook my very first
comment: if your code is not running fast enough then you need to
determine _why_ by observation and measurement.


John Bollinger
(e-mail address removed)
 

Liz

John C. Bollinger said:
The best advice about optimization is almost invariably to test the
performance first, then adjust as necessary.


<snip>
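// float1 and double1 are presumably declared in the snipped code
for (long i = 0; i < 100000000; i++) {
    float1 = (float) i / 17.0f;   // single-precision divide
}
for (long i = 0; i < 100000000; i++) {
    double1 = (double) i / 17.0;  // double-precision divide
}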
<snip>

Starting FloatDoubleSpeedTest...
Float Time: 5358
Double Time: 6129
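
Since the timing code around the loops was snipped, here is one way a complete test along these lines might look. It is a reconstruction under assumptions, not Liz's actual program: the accumulation into a running sum (so the JIT is less likely to discard the loop body), the warm-up pass, and the use of `System.nanoTime()` are all additions.

```java
// Hypothetical reconstruction of a float-vs-double timing test.
final class FloatDoubleSpeedTest {
    static final long ITERATIONS = 100_000_000L;

    // Sums i / 17 in single precision; the sum keeps the loop observable.
    static float floatLoop(long n) {
        float sum = 0f;
        for (long i = 0; i < n; i++) {
            sum += (float) i / 17.0f;
        }
        return sum;
    }

    // Same loop in double precision.
    static double doubleLoop(long n) {
        double sum = 0.0;
        for (long i = 0; i < n; i++) {
            sum += (double) i / 17.0;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("Starting FloatDoubleSpeedTest...");

        // Short warm-up so both loops are JIT-compiled before timing.
        floatLoop(1_000_000L);
        doubleLoop(1_000_000L);

        long t0 = System.nanoTime();
        float f = floatLoop(ITERATIONS);
        long t1 = System.nanoTime();
        double d = doubleLoop(ITERATIONS);
        long t2 = System.nanoTime();

        // Print the sums too, so the results are actually used.
        System.out.println("Float Time: " + (t1 - t0) / 1_000_000 + " ms (sum " + f + ")");
        System.out.println("Double Time: " + (t2 - t1) / 1_000_000 + " ms (sum " + d + ")");
    }
}
```

A micro-benchmark like this measures little more than a divide in a tight loop, so its numbers say nothing about cache effects on real data.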
 

Thomas Weidenfeller

Charles said:
I'm writing some real-time numerical code which needs to be pretty
nippy. Should I be using float or doubles for my calculations?

Implement a typical example, measure it. For your environment, for your
algorithms.

/Thomas
 

Charles Fox

Thanks for the demo, Liz :) On my P4 (running a bunch of other stuff
at the same time) I get:

float: 6718
double: 7954

An interesting point was mentioned about fitting more in the cache --
I hadn't thought of that.

I didn't realise the FP unit was 80-bit -- so why aren't languages
using 80-bit FP numbers instead of 32 and 64?

Does a 32-bit vs. 64-bit processor make any difference to big numerical
calculations then, since most such calculations will be done in FP?
 

Grant Wagner

Charles said:

I didn't realise the FP unit was 80-bit -- so why aren't languages
using 80-bit FP numbers instead of 32 and 64?

If languages used 80-bit FP values, wouldn't the FPU have to be 96
(or more) bits wide, so as not to lose precision when doing calculations on FP
values? Aren't the "extra" bits in the FPU there so that more precision than is
needed for the final result is retained during calculations? Then the
final result can be rounded (truncated?) and returned as a 32- or 64-bit
value?
 

Roedy Green

I didn't realise the FP unit was 80-bit -- so why aren't languages
using 80-bit FP numbers instead of 32 and 64?

Because of the way the FP units were designed to be used from assembler. You
would load the FP stack up with values, compute away in 80-bit, then
store the results in 32 or 64 bits when you were done.

The more you could avoid storing, the more accurate your results. This
was too loosey-goosey for Java, so it effectively insists on a store
after every operation to get consistent results no matter how the code
was compiled.
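
The effect of rounding every result to the declared width can be seen in a small sketch. The class name `RoundingDemo` and the chosen values are illustrative assumptions; the point is only that a float sum is rounded to 24 significant bits, while the same sum in double keeps the extra bit.

```java
// Hypothetical sketch: each result is rounded to the width of its type.
final class RoundingDemo {
    // The sum is rounded to float precision before it is returned.
    static float addAsFloat(float a, float b) {
        return a + b;
    }

    // The sum is rounded to double precision before it is returned.
    static double addAsDouble(double a, double b) {
        return a + b;
    }

    public static void main(String[] args) {
        float big = 16_777_216f; // 2^24, the edge of float's integer precision

        // In float, 2^24 + 1 is not representable and rounds back to 2^24.
        System.out.println(addAsFloat(big, 1f) == big);   // prints true

        // In double, 2^24 + 1 is exact, so the sum differs from 2^24.
        System.out.println(addAsDouble(big, 1.0) == big); // prints false
    }
}
```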
 

jme

java float versus double benchmark on Athlon 64

I don't think we can deduce real performance characteristics from a simple benchmark like this, but I tried it anyway. Here are the results I observed on a 2 GHz AMD Athlon 64:

JDK 6 beta2, compiler: eclipse 3.2

100% = 2479.84 ms using the float version
112% = 2777.76 ms using the double version
102% = 2521.11 ms using the double version with 17.0d instead of 17.0


JDK 6 beta2, compiler: sun

100% = 2358.75 ms using the float version
108% = 2555.78 ms using the double version
108% = 2547.22 ms using the double version with 17.0d instead of 17.0

StdDev double: about 7 ms
StdDev float: about 31 ms

---
http://vienna.in-a-nutshell.net/category/en/en_java/
 
