Ideal number of threads for CPU intensive task

  • Thread starter Kenneth P. Turvey
  • Start date
K

Kenneth P. Turvey

When running a CPU intensive task (that is divisible into threads) there
is typically a system dependent ideal number of threads to use depending
on the system load and the number of CPUs and the architecture. There
doesn't seem to be an implementation independent way to get a hint from
the JVM about what this number might be. Is there a good way to do this?

If there isn't a good first bet is to use a number of threads equal to
the number of processors the system provides. Is there a system
independent way to do this?

Thanks.
 
E

E11

My hunch is that if it is a CPU-INTENSIVE task, then a good first bet
for the optimal number of threads would be the number of CPUs in the
system.

I may be very wrong, but my perception is that having more threads than
CPU only enhances performance when non-CPU-intensive tasks are
involved, e.g. user-interaction or blocking-I/O. In those cases, the
CPU can context-switch to a thread doing processing, while the other
thread is waiting for user input or I/O.

In the case of a CPU-intensive task, CPU utilisation should already be
high (> 80%?), and introducing more threads would probably only degrade
performance due to context switching overhead, and the possibility of
thread-starvation.



Regards,
Edwin
 
R

Roedy Green

When running a CPU intensive task (that is divisible into threads) there
is typically a system dependent ideal number of threads to use depending
on the system load and the number of CPUs and the architecture. There
doesn't seem to be an implementation independent way to get a hint from
the JVM about what this number might be. Is there a good way to do this?

that is one instance of a category of problems I call "tweakables".

See http://mindprod.com/jgloss/tweakable.html
 
C

Chris Uppal

Kenneth said:
If there isn't a good first bet is to use a number of threads equal to
the number of processors the system provides. Is there a system
independent way to do this?

Even if there was one, I'm not sure that using it would be better than letting
the user set that parameter (a properties file, command-line argument, setting
in the GUI, etc) since:

a) Not every "multiple CPU" machine actually /has/ multiple usable CPUs -- if
it's a hyperthreaded machine then there's a reasonable chance (IMO) that using
two threads would not achieve a speedup.

b) If the machine is a genuine multiple CPU box, then the user (or someone) has
paid big money for it, and they may well have other uses for at least some of
those CPUs.

It would be nice to have a CPU count to use as a default, of course. So, after
all that ;-) you might want to look at
java.lang.Runtime.availableProcessors()...

-- chris
 
R

Roland

When running a CPU intensive task (that is divisible into threads) there
is typically a system dependent ideal number of threads to use depending
on the system load and the number of CPUs and the architecture. There
doesn't seem to be an implementation independent way to get a hint from
the JVM about what this number might be. Is there a good way to do this?

If there isn't a good first bet is to use a number of threads equal to
the number of processors the system provides. Is there a system
independent way to do this?

Thanks.

<http://java.sun.com/j2se/1.5.0/docs...ingSystemMXBean.html#getAvailableProcessors()>

import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class OSInfo {
public static void main(String[] args) {

OperatingSystemMXBean osBean = ManagementFactory
.getOperatingSystemMXBean();

String osArch = osBean.getArch();
System.out.println(osArch);

int numOfProcessors = osBean.getAvailableProcessors();
System.out.println(numOfProcessors);
}
}


--
Regards,

Roland de Ruiter
` ___ ___
`/__/ w_/ /__/
/ \ /_/ / \
 
K

Kenneth P. Turvey

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

OK, it seems there are two ways to get the number of processors, one in
Runtime and the other in OperatingSystemMXBean, but there is now way to
get a hint on how many threads are optimal. It's too bad. That's what
you really want from the operating system, not how many CPUs it has.

Anyway, I'll go with that. Thanks.

- --
Kenneth P. Turvey <[email protected]>
http://kt.squeakydolphin.com (not much there yet)
Jabber IM: (e-mail address removed)
Phone: (314) 255-2199
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDRX2Xi2ZgbrTULjoRAtvoAKCwEKKHatKfQp9c9CV54vzp/ngr0gCfUYXs
a7uwbBWp0C0yEILosm7J8SI=
=eyZ+
-----END PGP SIGNATURE-----
 
E

Eric Sosman

Kenneth P. Turvey wrote On 10/06/05 15:40,:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

OK, it seems there are two ways to get the number of processors, one in
Runtime and the other in OperatingSystemMXBean, but there is now way to
get a hint on how many threads are optimal. It's too bad. That's what
you really want from the operating system, not how many CPUs it has.

The processor count can be misleading on systems where
some kind of resource management is running -- just because
a CPU is present and idle doesn't mean the system will
necessarily let you have it. There's also the issue of
dynamic reconfiguration, where CPUs can be added to and
removed from "domains" or "partitions" while programs are
running. Still, CPU count is a reasonable default, provided
you offer a way for the user to override it with his superior
knowledge.

The Holy Grail would be some kind of self-adjusting
scheme. For long-running CPU-bound programs such a scheme
might actually work (I confess I haven't tried it). The
idea would be to start a small number of threads, let them
run for a while, and measure the amount of progress they make
per unit time. Then add a few more threads, let the whole
bunch run a little longer, and measure again. If the rate
of progress has improved in approximate proportion to the
increase in thread count, adding those threads was a step in
the right direction and you can try adding a few more. If
progress has improved less than expected (e.g., going from
ten to eleven threads gave only a 2% boost), or if the progress
actually slowed down, you're running too many threads and
should get rid of a few of them.

Of course, not every CPU-bound program is easy to organize
in this manner. The individual "atoms" of work may be too
coarse-grained to permit easy "progress" measurement, the
overall task may not run long enough for the process to converge
on a good thread count, the problem structure may require that
the number of threads be known in advance, and so on. Still,
for some classes of program I think the approach might work.
If you decide to try it, I'd like to hear about your experiences.
 
K

Kenneth P. Turvey

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, 06 Oct 2005 16:08:35 -0400, Eric Sosman wrote:

[Snip]
The Holy Grail would be some kind of self-adjusting
scheme. For long-running CPU-bound programs such a scheme
might actually work (I confess I haven't tried it). The
idea would be to start a small number of threads, let them
run for a while, and measure the amount of progress they make
per unit time. Then add a few more threads, let the whole
bunch run a little longer, and measure again. If the rate
of progress has improved in approximate proportion to the
increase in thread count, adding those threads was a step in
the right direction and you can try adding a few more. If
progress has improved less than expected (e.g., going from
ten to eleven threads gave only a 2% boost), or if the progress
actually slowed down, you're running too many threads and
should get rid of a few of them.

This would really get complicated when you start throwing in
other jobs running and I/O and everything else. I guess the
way to handle all that would be to look only at the big
picture without delving into the details too much.

[Snip]
If you decide to try it, I'd like to hear about your experiences.

Unfortunately, or fortunately depending on how you look at it, at
this time I have more than enough entertaining little projects
that produce little or negative amounts of money. :)

- --
Kenneth P. Turvey <[email protected]>
http://kt.squeakydolphin.com (not much there yet)
Jabber IM: (e-mail address removed)
Phone: (314) 255-2199
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFDRg/Di2ZgbrTULjoRAorEAJ0TrFEt93oEG8mOU9HHtLp03eDq3ACeOpMq
8DF5lzNE3eDu03kvJe9GA9c=
=A9aP
-----END PGP SIGNATURE-----
 
E

Eric Sosman

Kenneth P. Turvey wrote On 10/07/05 02:03,:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, 06 Oct 2005 16:08:35 -0400, Eric Sosman wrote:

[Snip]
The Holy Grail would be some kind of self-adjusting
scheme. For long-running CPU-bound programs such a scheme
might actually work (I confess I haven't tried it). The
idea would be to start a small number of threads, let them
run for a while, and measure the amount of progress they make
per unit time. Then add a few more threads, let the whole
bunch run a little longer, and measure again. If the rate
of progress has improved in approximate proportion to the
increase in thread count, adding those threads was a step in
the right direction and you can try adding a few more. If
progress has improved less than expected (e.g., going from
ten to eleven threads gave only a 2% boost), or if the progress
actually slowed down, you're running too many threads and
should get rid of a few of them.


This would really get complicated when you start throwing in
other jobs running and I/O and everything else. I guess the
way to handle all that would be to look only at the big
picture without delving into the details too much.

That's why I specified "long-running CPU-bound programs."
Other kinds of programs often have other "goodness" criteria
than simple throughput -- in a server, for example, latency
may also be important. Also, a server usually faces a varying
load from its clients; if they all go to lunch and the server's
throughput drops to zero, it does not mean that the server is
running the wrong number of threads!

A CPU-bound program faces a constant "demand" (some do,
anyhow), so it doesn't confront the issue of variable client
load. Also, latency and throughput are merely inverses of
each other, so optimizing either optimizes both. That makes
the situation a whole lot simpler, perhaps even tractable.
[Snip]
If you decide to try it, I'd like to hear about your experiences.


Unfortunately, or fortunately depending on how you look at it, at
this time I have more than enough entertaining little projects
that produce little or negative amounts of money. :)

You lose money on every project, but you make it up
in volume? ;-)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,537
Members
45,020
Latest member
GenesisGai

Latest Threads

Top