CPUs, thread pools, and wasted time.

Ken

I'm trying to build an application that will use multiple cores
efficiently, but I'm running into some difficulties. I've written a
simple small example to demonstrate the problem and attached it below.
When I run this program on my dual core computer I expect that both cores
should be actively running wasteTime() and my CPU utilization should be
nearly 100% on both, but this isn't the result I'm getting. A single CPU
is maxed out and the other one is sitting idle. This isn't what I want
for obvious reasons.

Can anyone explain to me why this is happening and what to do about it?

I know that one can fiddle with the amount of work done by each thread and
get it to behave correctly, but I don't know why it isn't behaving
correctly without this tweaking of parameters.

Thanks.

---
/*
 * Main.java
 *
 * Created on Sep 27, 2007, 4:57:59 AM
 *
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */

package threadtester;

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/**
 *
 * @author kt
 */
public class Main {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        while (true) {
            int threads = 2;

            ExecutorService executorService =
                    Executors.newFixedThreadPool(threads);

            Future[] futures = new Future[10000];
            for (int index = 0; index < 100; index++) {
                final int indexFinal = index;
                futures[index] = executorService.submit(new Callable() {
                    public Object call() throws Exception {
                        return new Double(wasteTime());
                    }
                });
            }

            executorService.shutdown();
            boolean tryAgain = false;
            do {
                try {
                    tryAgain = false;
                    executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
                } catch (InterruptedException ex) {
                    tryAgain = true;
                }
            } while (tryAgain);
        }
    }


    public static double wasteTime() {
        double sum = 0.0;
        for (int index = 0; index < 100; index++) {
            sum += Math.cos(index / 1000.0);
        }
        return sum;
    }

}
 
Daniel Dyer

Ken said:
I'm trying to build an application that will use multiple cores
efficiently, but I'm running into some difficulties. I've written a
simple small example to demonstrate the problem and attached it below.
When I run this program on my dual core computer I expect that both cores
should be actively running wasteTime() and my CPU utilization should be
nearly 100% on both, but this isn't the result I'm getting. A single CPU
is maxed out and the other one is sitting idle. This isn't what I want
for obvious reasons.

Can anyone explain to me why this is happening and what to do about it?

Your code works as you would expect on my Core Duo. Both cores were fully
utilised. What kind of processor are you using?

Dan.
 
Joshua Cranmer

Ken said:
I'm trying to build an application that will use multiple cores
efficiently, but I'm running into some difficulties. I've written a
simple small example to demonstrate the problem and attached it below.
When I run this program on my dual core computer I expect that both cores
should be actively running wasteTime() and my CPU utilization should be
nearly 100% on both, but this isn't the result I'm getting. A single CPU
is maxed out and the other one is sitting idle. This isn't what I want
for obvious reasons.

Thread scheduling is handled at OS-level. It is impossible to bind the
Java threads to separate CPUs using only native code; I do not know
whether it is even possible to bind threads to CPUs below the process
level.

Ken said:
Can anyone explain to me why this is happening and what to do about it?

I know that one can fiddle with the amount of work done by each thread and
get it to behave correctly, but I don't know why it isn't behaving
correctly without this tweaking of parameters.

Thanks.

public static double wasteTime() {
    double sum = 0.0;
    for (int index = 0; index < 100; index++) {
        sum += Math.cos(index / 1000.0);
    }
    return sum;
}

This is hardly going to waste a fair amount of time. Assuming that
computing the cosine takes as many as 20 FLOPs, this code would be on
the order of ~2000 FLOPs. Try computing some eigenvalues or solving
simple ordinary differential equations to REALLY waste time.

I am merely hypothesizing here, but it could be that your wasteTime() is
so computationally light that the OS decides it is only worthwhile
to use one core.
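
One way to test this hypothesis is to make each task run for milliseconds rather than microseconds and watch whether the second core picks up load. The following is only a minimal sketch: the class name HeavierWork and the iterations parameter are illustrative assumptions, not part of Ken's program.

```java
// Sketch only: HeavierWork and the iterations parameter are hypothetical,
// introduced to make the per-task workload adjustable.
class HeavierWork {

    /** Sums cosines over the given number of iterations; larger counts mean longer tasks. */
    static double wasteTime(int iterations) {
        double sum = 0.0;
        for (int index = 0; index < iterations; index++) {
            sum += Math.cos(index / 1000.0);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Time a single, much larger task than the original 100-iteration loop.
        long start = System.nanoTime();
        double result = wasteTime(10000000);
        long elapsedMillis = (System.nanoTime() - start) / 1000000;
        System.out.println("sum=" + result + ", elapsed=" + elapsedMillis + " ms");
    }
}
```

If a single call takes only a few microseconds, thread pool overhead is likely to dominate the measurement; raising the iteration count shifts the balance toward the worker threads.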
 
Ken

Daniel said:
Your code works as you would expect on my Core Duo. Both cores were fully
utilised. What kind of processor are you using?

Dan.

A T5250 at 1.5 GHz. Are you using a standard Sun JVM?
 
Ken

Joshua said:
Thread scheduling is handled at OS-level. It is impossible to bind the
Java threads to separate CPUs using only native code; I do not know
whether it is even possible to bind threads to CPUs below the process
level.

I understand this, but can I really rest assured that the JVM will make
the right choice here? It doesn't seem to me that it is. My program will
take nearly twice as long to run given this choice by the JVM. Is there
any way to give the JVM hints about how to handle this correctly? I know
that for Sun machines there is -XX:+UseBoundThreads. Is there something
similar for Linux? Windows? Other platforms?
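
The standard Java API has no call for pinning a thread to a core, but it can at least report how many processors the JVM believes it may use. A small diagnostic sketch follows; the class name ProcessorCheck and the poolSize() helper are illustrative assumptions. If it prints 1 on a dual-core machine, the process has probably been restricted to a single CPU from outside the JVM.

```java
// Sketch only: ProcessorCheck and poolSize() are hypothetical names.
class ProcessorCheck {

    /** Returns a pool size matching the processors visible to the JVM (at least 1). */
    static int poolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        System.out.println("Processors visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Suggested fixed pool size: " + poolSize());
    }
}
```

Sizing the pool from this value instead of hard-coding 2 keeps the example honest about what the JVM actually sees.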
 
Daniel Dyer

Ken said:
A T5250 at 1.5 GHz. Are you using a standard Sun JVM?

No. Apple 1.5.0_07 on OS X. I had to change the reference to
TimeUnit.DAYS to get it to compile (still no Java 6 on the Mac), but other
than that it was unmodified.
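
For anyone else compiling on Java 5: TimeUnit.DAYS, HOURS and MINUTES were added in Java 6, so the wait has to be written with one of the older units. The post does not say which replacement was used, so the following is only one possible sketch, with the class name Java5Await and the 60-second polling interval chosen as assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch only: Java5Await and the 60-second interval are hypothetical choices.
class Java5Await {

    /** Shuts the pool down and blocks until every submitted task has finished. */
    static void shutdownAndWait(ExecutorService pool) {
        pool.shutdown();
        boolean interrupted = false;
        while (!pool.isTerminated()) {
            try {
                // TimeUnit.SECONDS exists in Java 5; TimeUnit.DAYS does not.
                pool.awaitTermination(60, TimeUnit.SECONDS);
            } catch (InterruptedException ex) {
                interrupted = true; // remember the interrupt and keep waiting
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // restore the interrupt status
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(new Runnable() {
            public void run() {
                System.out.println("task ran");
            }
        });
        shutdownAndWait(pool);
        System.out.println("terminated=" + pool.isTerminated());
    }
}
```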

Dan.
 
Zig

Ken said:
I'm trying to build an application that will use multiple cores
efficiently, but I'm running into some difficulties. I've written a
simple small example to demonstrate the problem and attached it below.
When I run this program on my dual core computer I expect that both cores
should be actively running wasteTime() and my CPU utilization should be
nearly 100% on both, but this isn't the result I'm getting. A single CPU
is maxed out and the other one is sitting idle. This isn't what I want
for obvious reasons.

A guess:

Your main() method is an infinite loop of allocations, in each iteration
allocating

1 ExecutorService
1 Future[]
100 Callables
100 Futures
+ all the internal objects associated

Meanwhile, your wasteTime() method is just spinning through some mundane cosines.

It's entirely possible that the thread running main() is pegging your CPU
with memory allocations and spending a full order of magnitude more time
than both threads running wasteTime(). You can run a profiler to confirm
this.
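
Short of a profiler, a rough check is to create the pool once, give each task far more work, and time each batch. This is only a sketch of that idea: the class name PoolReuse, the runBatch() helper, and the task and iteration counts are assumptions for illustration, not taken from Ken's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: PoolReuse, runBatch() and the chosen counts are hypothetical.
class PoolReuse {

    /** Same cosine loop as the original, with an adjustable amount of work. */
    static double wasteTime(int iterations) {
        double sum = 0.0;
        for (int index = 0; index < iterations; index++) {
            sum += Math.cos(index / 1000.0);
        }
        return sum;
    }

    /** Submits taskCount tasks to an existing pool and returns the sum of their results. */
    static double runBatch(ExecutorService pool, int taskCount, final int iterations)
            throws Exception {
        List<Callable<Double>> tasks = new ArrayList<Callable<Double>>();
        for (int i = 0; i < taskCount; i++) {
            tasks.add(new Callable<Double>() {
                public Double call() {
                    return wasteTime(iterations);
                }
            });
        }
        double total = 0.0;
        // invokeAll blocks until every task in the batch has completed.
        for (Future<Double> future : pool.invokeAll(tasks)) {
            total += future.get();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        // The pool is created once and reused, so main() allocates far less per batch.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            for (int round = 0; round < 3; round++) {
                long start = System.nanoTime();
                double total = runBatch(pool, 100, 200000);
                long elapsedMillis = (System.nanoTime() - start) / 1000000;
                System.out.println("round " + round + ": total=" + total
                        + ", elapsed=" + elapsedMillis + " ms");
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

If both cores become busy with this version, the allocation-heavy main() loop was the likely culprit; if one core still sits idle, the affinity route is worth pursuing.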

If that theory doesn't pan out, on Windows: Task Manager / Process
Explorer can be used to set the processor affinity for your java.exe
process. I'm sure google will turn up the command to do the same on Linux.

HTH,

-Zig
 
Lew

Zig said:
If that theory doesn't pan out, on Windows: Task Manager / Process
Explorer can be used to set the processor affinity for your java.exe
process. I'm sure google will turn up the command to do the same on Linux.

I find this question (determining processor affinity for JVM threads) very
interesting and relevant, as just about every general-purpose computer being
made anymore is multi-processor. I spent a good bit of time googling around,
and even the most Java-esque writers spoke only of OS utilities for setting
processor affinity. I found nothing yet on doing it via the JVM. Well, not
quite nothing: there are a few places that hint that ThreadLocal variables will
improve processor locality of references.

Setting affinity for the EXE probably won't help distribute different threads
among different processors.

At Ye Big Organization where I work, there's a whole lot of IBM WebSphere, and
they coerce each JVM to a particular processor and load-balance among the JVMs.
I don't think they divide threads, though.

Even if you could split threads, every time those threads share data or
synchronize or whatever they need to do in coordination you'd have a non-local
memory reference somewhere. Putting a whole JVM on a core would obviate that,
at least somewhat. You'd have to balance that against the ability to
parallelize your algorithm, hoping that the parallelization ("p13n"?) will
improve things more than inter-processor memory access will hurt them.

Much of the wisdom I encountered about multi-processor Java involved using
parallel garbage collection (GC). Since GC can impose a major drain on
performance, this may be a very useful tip.

Daniel said:
Apple 1.5.0_07 on OS X.

Perhaps either OS X or that particular JVM has its own set of rules about
processor affinity.
 
