Anyone have a 1.5 server vs client vm comparison?

Scott Ellsworth · Feb 26, 2005

Hi, all.

I am looking for a very simple class with different performance
characteristics under 1.5.0 server than under 1.5.0 client. I am hoping
for perhaps a factor of two in speed, but I will take what I can get.

I have a few such microbenchmarks that varied under 1.4.x, but Sun seems
to have fixed up the 1.5 client vm, at least for these.

(Why, you ask? Because someone here contended that there are no
differences between the two now. I wanted to measure it myself.)

Scott

Kevin McMurtrie · Feb 26, 2005

Scott Ellsworth said:
Hi, all.

I am looking for a very simple class with different performance
characteristics under 1.5.0 server than under 1.5.0 client. I am hoping
for perhaps a factor of two in speed, but I will take what I can get.

I have a few such microbenchmarks that varied under 1.4.x, but Sun seems
to have fixed up the 1.5 client vm, at least for these.

(Why, you ask? Because someone here contended that there are no
differences between the two now. I wanted to measure it myself.)

Scott

As far as I've seen, the only _performance_ difference is during HotSpot
compilation. The final performance is the same. Which is better
depends on whether your code has hot spots or a broad distribution of
activity. You can play with these:

-XX:CICompilerCount=n
Sets the maximum concurrent compilations. My extensive testing shows
that 1.4.2 and 1.5.0 HotSpot aren't thread safe so increasing this
number rapidly increases the risk of a HotSpot crash. Java 1.5.0 is
especially prone to crashing on transitions from high CPU load to low
CPU load because it queues up methods to be compiled during idle time.

-XX:CompileThreshold=n
Controls how many times a method is executed before compiling or
re-optimizing. Don't set this too low or you'll waste lots of memory
and CPU time compiling startup code, static initializers, and exception
handlers. Even 50000 isn't too high for very large apps.

-XX:ReservedCodeCacheSize=size
Heap size for compiled code. It's one of Sun's many brain-dead heaps in
that it can't size itself correctly. Do any of them work reliably?

-XX:+PrintCompilation
Prints handy information about HotSpot compilation.

The client/server switch tweaks some other parameter defaults too. I
forget where that page is of all the secret Sun JVM options.
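When comparing runs, it can help to have the program itself report which VM it is running on and which options it was started with. The following is only a minimal sketch: the class name VmInfo is made up for illustration, and it relies on the standard java.vm.name system property and RuntimeMXBean.getInputArguments() (both present since Java 5). The exact VM name string varies by vendor and version, so treat any "Client"/"Server" substring as a hint rather than a guarantee.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of the thread's benchmark code.
class VmInfo {
    /** Name of the running VM; on HotSpot it typically mentions "Client" or "Server". */
    static String vmName() {
        return System.getProperty("java.vm.name");
    }

    /** Options passed to the JVM itself (e.g. -server, -XX:+PrintCompilation), not program arguments. */
    static List<String> vmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("VM name: " + vmName());
        System.out.println("VM args: " + vmArguments());
    }
}
```

Calling these two methods at the top of a benchmark's main() makes it harder to mix up which log came from which configuration.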

Chris Uppal · Feb 26, 2005

Scott said:
I am looking for a very simple class with different performance
characteristics under 1.5.0 server than under 1.5.0 client. I am hoping
for perhaps a factor of two in speed, but I will take what I can get.

This thread:

http://groups.google.co.uk/[email protected]&rnum=1

has a fairly good example of the optimisation in the server JVM being
significantly better than that in the client; even to the extent that hand
optimisation in the code submitted to the client can't claw back the
difference.

I'm not sure whether I still have the code I used for measuring. If you want,
I'll try to find it, but you can probably re-create it from the original
example in that thread.

-- chris

Scott Ellsworth · Feb 28, 2005

[request]

Thanks, Chris and Kevin. I am busily playing with the sample code, and
with the command line flags. I find I say smarter things if I have done
some experimentation on my own.

Scott

Scott Ellsworth · Mar 2, 2005

Chris Uppal said:
This thread:

http://groups.google.co.uk/[email protected]&rnum=1

has a fairly good example of the optimisation in the server JVM being
significantly better than that in the client; even to the extent that hand
optimisation in the code submitted to the client can't claw back the
difference.

I created code based on some of the changes, and found that 1.5 had very
different performance characteristics on one of our quad processor
2.8GHz linux machines.

# /mnt/java/jdk15/bin/java -cp . -XX:+PrintCompilation -server MatrixTestDirty
creating matrices a and b
1 MatrixTestDirty::main (330 bytes)
2 java.lang.Math::random (16 bytes)
3* sun.misc.Unsafe::compareAndSwapLong (0 bytes)
1% MatrixTestDirty::main @ 57 (330 bytes)
multiplying them
Var[123] = -535
Result: 46511
# /mnt/java/jdk15/bin/java -cp . -XX:+PrintCompilation -server MatrixTestDirty
creating matrices a and b
1 MatrixTestDirty::main (330 bytes)
2 java.lang.Math::random (16 bytes)
3* sun.misc.Unsafe::compareAndSwapLong (0 bytes)
1% MatrixTestDirty::main @ 57 (330 bytes)
multiplying them
Var[123] = 173
Result: 47814
# /mnt/java/jdk15/bin/java -cp . -XX:+PrintCompilation -client MatrixTestDirty
1 b java.lang.String::hashCode (60 bytes)
creating matrices a and b
2 b java.util.Random::next (47 bytes)
3 b java.lang.Math::random (16 bytes)
1% b MatrixTestDirty::main @ 57 (330 bytes)
multiplying them
Var[123] = -35 4 !b sun.nio.cs.UTF_8$Encoder::encodeArrayLoop (698 bytes)

Result: 46789
# /mnt/java/jdk15/bin/java -cp . -XX:+PrintCompilation -client MatrixTestDirty
1 b java.lang.String::hashCode (60 bytes)
creating matrices a and b
2 b java.util.Random::next (47 bytes)
3 b java.lang.Math::random (16 bytes)
1% b MatrixTestDirty::main @ 57 (330 bytes)
multiplying them
Var[123] = -96 4 !b sun.nio.cs.UTF_8$Encoder::encodeArrayLoop (698 bytes)

Result: 46550

In other words, it appears that the server and client vms took almost
the same time on this particular problem.

Code below.

Scott

public class MatrixTestDirty
{
    private static final int N = 800;
    private static final int M = 800;
    private static int a [] = new int [N*M];
    private static int b [] = new int [N*M];
    private static int c [] = new int [N*M];

    public static void main (String args [])
    {
        System.out.println ("creating matrices a and b");
        long startMillis=System.currentTimeMillis();

        /* Initialize the two matrices. */
        for (int i = 0; i < N; ++i) {
            for (int j = 0; j < M; ++j) {
                a[i*M+j] = (int) ((Math.random () - 0.5) * 10);
                b[i*M+j] = (int) ((Math.random () - 0.5) * 10);
            }
        }

        /* Multiply the matrices. */

        System.out.println ("multiplying them");

        for (int i = 0; i < N; ++i) {
            for (int j = 0; j < M; ++j) {
                c[i*M+j] = 0;
            }
        }

        for (int i = 0; i < N; ++i) {
            for (int j = 0; j < M; ++j) {
                for (int k = 0; k < M; ++k) {
                    c[i*M+j] += a[i*M+k] * b[k*M+j];
                }
            }
        }
        long endMillis=System.currentTimeMillis();
        long total=endMillis-startMillis;
        System.out.println("An element: "+c[25*M+25]);
        System.out.println("Result: "+total);
    }
}

Chris Uppal · Mar 7, 2005

Scott said:
I created code based on some of the changes, and found that 1.5 had very
different performance characteristics on one of our quad processor
2.8GHz linux machines.

I suspect that all you are seeing is that, because of the way you have written
the benchmark, the multiply routine is never properly optimised. I don't know
whether the server JVM can do on-stack replacement to update running code with
a faster version, but it seems clear that in this case it does not do so. So
what's happening is that the JVM is all fired up and eager to optimise the
inner loop, as soon as the loops have finished, but your program then exits, so
it doesn't bother...

I tried the same code on a 1.4GHz WinXP laptop, and -- like you -- saw no
important difference between -client and -server, but when I recoded it to pull
the matrix multiply out into a separate method, /and/ called that several
times, the optimisation kicked in quite nicely.

Here is the output from a -client run (with some trivia deleted for clarity):

==========================================
[java -cp . -XX:+PrintCompilation -client MatrixTestDirty2]
1 b java.lang.String::charAt (33 bytes)
2 b java.lang.Math::max (11 bytes)
creating matrices a and b
3 b java.util.Random::next (47 bytes)
4 b java.lang.Math::random (16 bytes)
1% b MatrixTestDirty2::main @ 19 (182 bytes)
multiplying them
2% b MatrixTestDirty2::multiply @ 11 (125 bytes)
Result: 10965
5 b MatrixTestDirty2::multiply (125 bytes)
Result: 10886
Result: 10956
Result: 10966
Result: 10885
Result: 10956
Result: 10956
Result: 10956
Result: 10875
Result: 10966
==========================================

which stabilises at around 11 seconds. Notice how multiply() gets compiled
twice, and that the second time only happens /after/ it has returned.

(BTW, I have no idea why this laptop is able to perform the benchmark so much
faster than your machine -- a factor of 4 seems very odd to me...).

Now, running -server:

==========================================
[java -cp . -XX:+PrintCompilation -server MatrixTestDirty2]
creating matrices a and b
1 MatrixTestDirty2::main (182 bytes)
2 java.lang.Math::random (16 bytes)
3* sun.misc.Unsafe::compareAndSwapLong (0 bytes)
1% MatrixTestDirty2::main @ 19 (182 bytes)
multiplying them
2% MatrixTestDirty2::multiply @ 11 (125 bytes)
Result: 8272
4 MatrixTestDirty2::multiply (125 bytes)
Result: 5538
Result: 5608
Result: 5528
Result: 5588
Result: 5528
Result: 5508
Result: 5608
Result: 5518
Result: 5508
==========================================

which stabilises at around half the execution time compared to -client. Notice
how the execution time plummets after the second compilation of multiply().
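To put a number on "around half", the timings printed after the second compilation of multiply() in the two listings above can be averaged and compared. This is just a sketch of that arithmetic: the class and field names are made up, and the arrays hold the nine post-recompilation "Result" values copied from each listing (the first, pre-recompilation value of each run is left out).

```java
// Hypothetical helper that averages the steady-state timings quoted above.
class SteadyStateSpeedup {
    // Milliseconds per run after multiply() was recompiled, -client listing.
    static final long[] CLIENT_MS = {
        10886, 10956, 10966, 10885, 10956, 10956, 10956, 10875, 10966
    };
    // Milliseconds per run after multiply() was recompiled, -server listing.
    static final long[] SERVER_MS = {
        5538, 5608, 5528, 5588, 5528, 5508, 5608, 5518, 5508
    };

    /** Arithmetic mean of the given timings. */
    static double mean(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    /** How many times longer the client runs took than the server runs, on average. */
    static double clientOverServer() {
        return mean(CLIENT_MS) / mean(SERVER_MS);
    }

    public static void main(String[] args) {
        System.out.println("client mean ms: " + mean(CLIENT_MS));
        System.out.println("server mean ms: " + mean(SERVER_MS));
        System.out.println("client/server:  " + clientOverServer());
    }
}
```

On these figures the ratio comes out a little under 2, which is consistent with the "around half the execution time" reading.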

I'll append the code for completeness, though it's only a trivial modification
to your own code.

--- chris

===========================================
public class MatrixTestDirty2
{
    private static final int N = 800;
    private static final int M = 800;

    private static int a[] = new int[N * M];
    private static int b[] = new int[N * M];
    private static int c[] = new int[N * M];

    public static void main(String args[])
    {
        System.out.println("creating matrices a and b");

        /* Initialize the two matrices. */
        for (int i = 0; i < N; ++i)
        {
            for (int j = 0; j < M; ++j)
            {
                a[i * M + j] = (int)((Math.random() - 0.5) * 10);
                b[i * M + j] = (int)((Math.random() - 0.5) * 10);
            }
        }

        /* Multiply the matrices. */
        System.out.println("multiplying them");
        for (int i = 0; i < 10; i++)
        {
            long startMillis = System.currentTimeMillis();
            multiply();
            long endMillis = System.currentTimeMillis();
            long total = endMillis - startMillis;
            System.out.println("An element: " + c[25 * M + 25]);
            System.out.println("Result: " + total);
        }
    }

    private static void multiply()
    {
        for (int i = 0; i < N; ++i)
        {
            for (int j = 0; j < M; ++j)
            {
                c[i * M + j] = 0;
            }
        }

        for (int i = 0; i < N; ++i)
        {
            for (int j = 0; j < M; ++j)
            {
                for (int k = 0; k < M; ++k)
                {
                    c[i * M + j] += a[i * M + k] * b[k * M + j];
                }
            }
        }
    }
}
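
The same idea (put the hot code in its own method, call it several times, and only trust the later timings) can be pulled out into a small reusable harness. What follows is a rough sketch under assumptions, not the code used for the measurements above: the WarmupTimer class, its method names, the run counts, and the 200x200 matrix size are all made up for illustration, and how many warm-up runs are actually needed depends on the VM and its flags.

```java
// Hypothetical warm-up harness; names and run counts are illustrative only.
class WarmupTimer {
    /** Runs task warmupRuns times untimed, then timedRuns times, returning elapsed ms for each timed run. */
    static long[] time(Runnable task, int warmupRuns, int timedRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // gives the JIT a chance to compile the method before measuring
        }
        long[] millis = new long[timedRuns];
        for (int i = 0; i < timedRuns; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1000000L;
        }
        return millis;
    }

    /** Same flat-array n-by-n multiply as in the thread, with the arrays passed in. */
    static void multiply(int[] a, int[] b, int[] c, int n) {
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < n; ++j) {
                int sum = 0;
                for (int k = 0; k < n; ++k) {
                    sum += a[i * n + k] * b[k * n + j];
                }
                c[i * n + j] = sum;
            }
        }
    }

    public static void main(String[] args) {
        final int n = 200; // smaller than the thread's 800 so the sketch runs quickly
        final int[] a = new int[n * n];
        final int[] b = new int[n * n];
        final int[] c = new int[n * n];
        for (int i = 0; i < n * n; ++i) {
            a[i] = (int) ((Math.random() - 0.5) * 10);
            b[i] = (int) ((Math.random() - 0.5) * 10);
        }
        long[] results = time(new Runnable() {
            public void run() {
                multiply(a, b, c, n);
            }
        }, 3, 5);
        for (int i = 0; i < results.length; i++) {
            System.out.println("Result: " + results[i]);
        }
        System.out.println("An element: " + c[25 * n + 25]);
    }
}
```

Printing one element of the result, as the original code does, keeps the computed values observable to the program so the work cannot be treated as unused.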
