Increase WinXP/JRE CPU usage?

Discussion in 'Java' started by Steve Brecher, Nov 13, 2006.

  1. I have a compute-intensive program with a simple console user interface.
    While the program is running (number crunching), WinXP's Task Manager's CPU
    usage for it never goes above 50%. I'd like to use the other half of my CPU
    :) I've tried, via a separate SetPriority utility, setting the java (JVM)
    process priority to 256 (max; real time) and all of its threads' priorities
    to 15 (max). This causes the JVM's Base Prio entry in Task Manager to
    become Real Time -- but CPU usage remains at 50%.

    The code that is running does no I/O, only calculation.
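
    A minimal in-process sketch of that priority experiment (hypothetical
    class name; Java exposes only thread priorities 1..10, which the JVM
    maps onto a narrow band of Windows priorities, so this is weaker than
    an external tool that sets the process class):

        public class MaxPriority {
            public static void main(String[] args) {
                // MAX_PRIORITY is 10; on Windows the JVM maps it onto a
                // narrow band of thread priorities, so this cannot reach
                // the Real Time process class a SetPriority tool can set.
                Thread.currentThread().setPriority(Thread.MAX_PRIORITY);

                // ... number crunching continues on this thread ...
            }
        }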

    --
    For mail, please use my surname where indicated:
    (Steve Brecher)
    Steve Brecher, Nov 13, 2006
    #1

  2. Steve Brecher wrote on 14.11.2006 00:47:
    > I have a compute-intensive program with a simple console user interface.
    > While the program is running (number crunching), WinXP's Task Manager's CPU
    > usage for it never goes above 50%. I'd like to use the other half of my CPU
    > :) I've tried, via a separate SetPriority utility, setting the java (JVM)
    > process priority to 256 (max; real time) and all of its threads' priorities
    > to 15 (max). This causes the JVM's Base Prio entry in Task Manager to
    > become Real Time -- but CPU usage remains at 50%.
    >
    > The code that is running does no I/O, only calculation.
    >

    Do you happen to have a dual processor/dual core computer? If so, then I suspect
    your calculation is running in a single thread only which will not make use of
    the second processor, and thus your overall CPU load will not exceed 50%.

    Thomas
    Thomas Kellerer, Nov 14, 2006
    #2

  3. Steve Brecher wrote:
    > I have a compute-intensive program with a simple console user interface.
    > While the program is running (number crunching), WinXP's Task Manager's CPU
    > usage for it never goes above 50%. I'd like to use the other half of my CPU
    > :) I've tried, via a separate SetPriority utility, setting the java (JVM)
    > process priority to 256 (max; real time) and all of its threads' priorities
    > to 15 (max). This causes the JVM's Base Prio entry in Task Manager to
    > become Real Time -- but CPU usage remains at 50%.
    >
    > The code that is running does no I/O, only calculation.
    >


    Is there any possibility that you have a dual processor, possibly two
    cores in one chip?

    Utilization freezing at close to 50%, even at very high priority, for a
    compute intensive job is typical of running a single threaded
    application on a dual processor.

    If that is what is going on, you should be able to run two copies of the
    job (if it does not use too much memory) at the same time almost as fast
    as one copy. If so, look at parallelizing the compute-bound portion of
    the job.

    What dominates the computation? Some algorithms are easier to
    parallelize than others.

    Patricia
    Patricia Shanahan, Nov 14, 2006
    #3
  4. Thomas Kellerer <> wrote:
    > Steve Brecher wrote on 14.11.2006 00:47:
    >> I have a compute-intensive program with a simple console user
    >> interface. While the program is running (number crunching), WinXP's
    >> Task Manager's CPU usage for it never goes above 50%. ...
    >>

    > Do you happen to have a dual processor/dual core computer? If so,
    > then I suspect your calculation is running in a single thread only
    > which will not make use of the second processor, and thus your overall
    > CPU load will not exceed 50%.


    (Thanks also to Patricia, who responded similarly.)

    It's a Pentium 4 (3.4 GHz), vintage early 2004. If it's dual, I never knew it!
    Might it be?

    The calculation is definitely single-thread.

    --
    For mail, please use my surname where indicated:
    (Steve Brecher)
    Steve Brecher, Nov 14, 2006
    #4
  5. "Steve Brecher" <see.signature@end> wrote in message
    news:...
    >I have a compute-intensive program with a simple console user interface.
    >While the program is running (number crunching), WinXP's Task Manager's CPU
    >usage for it never goes above 50%. I'd like to use the other half of my
    >CPU :) I've tried, via a separate SetPriority utility, setting the java
    >(JVM) process priority to 256 (max; real time) and all of its threads'
    >priorities to 15 (max). This causes the JVM's Base Prio entry in Task
    >Manager to become Real Time -- but CPU usage remains at 50%.
    >
    > The code that is running does no I/O, only calculation.


    Do you by chance have a dual core system?

    --
    LTP

    :)
    Luc The Perverse, Nov 14, 2006
    #5
  6. Steve Brecher wrote:
    > Thomas Kellerer <> wrote:
    >> Steve Brecher wrote on 14.11.2006 00:47:
    >>> I have a compute-intensive program with a simple console user
    >>> interface. While the program is running (number crunching), WinXP's
    >>> Task Manager's CPU usage for it never goes above 50%. ...
    >>>

    >> Do you happen to have a dual processor/dual core computer? If so,
    >> then I suspect your calculation is running in a single thread only
    >> which will not make use of the second processor, and thus your overall
    >> CPU load will not exceed 50%.

    >
    > (Thanks also to Patricia, who responded similarly.)
    >
    > It's a Pentium 4 (3.4 GHz), vintage early 2004. If it's dual, I never knew it!
    > Might it be?
    >
    > The calculation is definitely single-thread.


    No, it is single core.

    BUT it has hyperthreading.

    Which in WinXP Task Manager looks like 2 CPUs!

    And to utilize HT you still need to multithread.
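
    A quick way to see this from inside the JVM, a minimal sketch using
    the standard Runtime API:

        public class CpuCount {
            public static void main(String[] args) {
                // Logical processors as the JVM sees them: a hyperthreaded
                // P4 reports 2 even though it has one physical core.
                int n = Runtime.getRuntime().availableProcessors();
                System.out.println("Logical processors: " + n);
            }
        }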

    Arne
    Arne Vajhøj, Nov 14, 2006
    #6
  7. "Patricia Shanahan" <> wrote in message
    news:YT76h.5963$...
    > Steve Brecher wrote:
    >> I have a compute-intensive program with a simple console user interface.
    >> While the program is running (number crunching), WinXP's Task Manager's
    >> CPU usage for it never goes above 50%. I'd like to use the other half of
    >> my CPU :) I've tried, via a separate SetPriority utility, setting the
    >> java (JVM) process priority to 256 (max; real time) and all of its
    >> threads' priorities to 15 (max). This causes the JVM's Base Prio entry
    >> in Task Manager to become Real Time -- but CPU usage remains at 50%.
    >>
    >> The code that is running does no I/O, only calculation.
    >>

    >
    > Is there any possibility that you have a dual processor, possibly two
    > cores in one chip?
    >
    > Utilization freezing at close to 50%, even at very high priority, for a
    > compute intensive job is typical of running a single threaded
    > application on a dual processor.
    >
    > If that is what is going on, you should be able to run two copies of the
    > job (if it does not use too much memory) at the same time almost as fast
    > as one copy. If so, look at parallelizing the compute-bound portion of
    > the job.
    >
    > What dominates the computation? Some algorithms are easier to
    > parallelize than others.


    Utilizing multiple processors/cores to do tasks which seem to be iterative
    (I'm sure there is probably a more formal/correct way to say this) is a very
    active and fun area of computer science right now!

    --
    LTP

    :)
    Luc The Perverse, Nov 14, 2006
    #7
  8. Patricia Shanahan <> wrote:
    > Is there any possibility that you have a dual processor, possibly two
    > cores in one chip?


    It seems I do, virtually speaking -- Pentium 4, apparently with
    Hyper-Threading Technology (thanks to Arne Vajhøj in
    <45591a39$0$49200$>).

    > Utilization freezing at close to 50%, even at very high priority, for
    > a compute intensive job is typical of running a single threaded
    > application on a dual processor.
    >
    > If that is what is going on, you should be able to run two copies of
    > the job (if it does not use too much memory) at the same time almost
    > as fast as one copy. If so, look at parallelizing the compute-bound
    > portion of the job.
    >
    > What dominates the computation? Some algorithms are easier to
    > parallelize than others.


    It's nested loops enumerating cases; there's a computation for each case,
    i.e., inside the innermost loop, and the computation results are
    accumulated.

    It should be possible to dual-thread it, e.g., one thread doing the odd
    cases, so to speak, and the other the even ones. I could synchronize access
    to the accumulation structures, or perhaps have two of them. For generality
    maybe I can even N-thread it. I'll have to think about this...
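
    A minimal sketch of that odd/even split, assuming the cases are
    independent and the accumulation is a simple sum (class, method, and
    case count here are hypothetical placeholders):

        public class OddEvenSplit {

            // Hypothetical stand-in for the real per-case computation.
            static double computeCase(int caseIndex) {
                return caseIndex * 0.5;
            }

            public static void main(String[] args) throws InterruptedException {
                final int cases = 1000000;               // hypothetical case count
                final double[] partial = new double[2];  // one accumulator per thread

                Thread[] workers = new Thread[2];
                for (int t = 0; t < 2; t++) {
                    final int parity = t; // thread 0 takes even cases, thread 1 odd
                    workers[t] = new Thread(new Runnable() {
                        public void run() {
                            double sum = 0.0;
                            for (int i = parity; i < cases; i += 2) {
                                sum += computeCase(i);
                            }
                            partial[parity] = sum; // distinct slots: no locking needed
                        }
                    });
                    workers[t].start();
                }
                for (Thread w : workers) {
                    w.join(); // wait for both halves
                }
                System.out.println("total = " + (partial[0] + partial[1]));
            }
        }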

    --
    For mail, please use my surname where indicated:
    (Steve Brecher)
    Steve Brecher, Nov 14, 2006
    #8
  9. Steve Brecher wrote:
    ....
    > It's nested loops enumerating cases; there's a computation for each case,
    > i.e., inside the innermost loop, and the computation results are
    > accumulated.
    >
    > It should be possible to dual-thread it, e.g., one thread doing the odd
    > cases, so to speak, and the other the even ones. I could synchronize access
    > to the accumulation structures, or perhaps have two of them. For generality
    > maybe I can even N-thread it. I'll have to think about this...
    >


    Given trends in computer architecture, I suggest N-threading it while
    you are about it. When you go to replace that computer, you may find
    yourself getting something with multiple cores, each multi-threaded.

    For reduction problems (problems that take a long vector and produce a
    single answer, such as adding things up), it is generally better, if
    permitted by the problem, to have an accumulator for each thread, and
    only add them at the end. The less synchronization in the middle of the
    problem, the faster it will go.

    Consider organizing the work so that each thread operates on a
    contiguous chunk of data, in case they get assigned to separate
    processors with their own caches.

    However, I would go for simplicity, within the at-least-dual requirement.
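
    Adapting the odd/even sketch above to this advice: N threads over
    contiguous chunks, one accumulator per thread, combined only at the
    end (again a sketch with hypothetical names):

        public class ChunkedSum {

            // Hypothetical stand-in for the real per-case computation.
            static double computeCase(int caseIndex) {
                return caseIndex * 0.5;
            }

            public static void main(String[] args) throws InterruptedException {
                final int cases = 1000000;
                final int n = Runtime.getRuntime().availableProcessors();
                final double[] partial = new double[n]; // accumulator per thread
                Thread[] workers = new Thread[n];

                for (int t = 0; t < n; t++) {
                    final int id = t;
                    // Contiguous ranges rather than interleaved ones, so each
                    // thread's data stays together if the threads land on
                    // processors with their own caches.
                    final int lo = id * (cases / n);
                    final int hi = (id == n - 1) ? cases : lo + cases / n;
                    workers[t] = new Thread(new Runnable() {
                        public void run() {
                            double sum = 0.0;
                            for (int i = lo; i < hi; i++) {
                                sum += computeCase(i);
                            }
                            partial[id] = sum; // no synchronization inside the loop
                        }
                    });
                    workers[t].start();
                }
                for (Thread w : workers) {
                    w.join();
                }
                double total = 0.0;
                for (double p : partial) total += p; // combine once at the end
                System.out.println("total = " + total);
            }
        }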

    Patricia
    Patricia Shanahan, Nov 14, 2006
    #9
  10. Arne Vajhøj wrote:

    > No, it is single core.
    >
    > BUT it has hyperthreading.
    >
    > Which in WinXP Task Manager looks like 2 CPUs!
    >
    > And to utilize HT you still need to multithread.


    But don't assume that making the application use the other "cpu" will
    necessarily speed anything up. HT is (for most purposes) better regarded as a
    cheap marketing gimmick than a valid technology.

    Or -- to put it another way -- the CPU usage reported by TaskManager is
    misleading. It suggests that 50% of your available horse-power is
    unused. My bet would be that it's more like 5% -- if not actually zero.

    -- chris
    Chris Uppal, Nov 14, 2006
    #10
  11. Chris Uppal wrote:
    > Arne Vajhøj wrote:
    >
    >> No, it is single core.
    >>
    >> BUT it has hyperthreading.
    >>
    >> Which in WinXP Task Manager looks like 2 CPUs!
    >>
    >> And to utilize HT you still need to multithread.

    >
    > But don't assume that making the application use the other "cpu" will
    > necessarily speed anything up. HT is (for most purposes) better regarded as a
    > cheap marketing gimmick than a valid technology.
    >
    > Or -- to put it another way -- the CPU usage reported by TaskManager is
    > misleading. It suggests that 50% of your available horse-power is
    > unused. My bet would be that it's more like 5% -- if not actually zero.


    Here's a suggestion for a cheap test:

    1. Add, if the application does not already contain it, some performance
    statistics collection keeping track of how much elapsed time it takes to
    do a given quantity of the compute intensive work.

    2. Run one copy of the application. Record the statistics.

    3. Run two copies of the application, simultaneously. Record the statistics.

    If it is likely to benefit from multi-threading, the total work rate
    will be significantly higher with two copies than with one. If it is the
    sort of case Chris is talking about, each copy will run at slightly
    better than half the speed of the single copy.

    This test automatically takes into account questions such as how much
    time a thread of your application spends waiting for memory, which can
    affect how much you gain from hyperthreading.
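
    Step 1 can be as small as a pair of timestamps around the measured
    work; a minimal sketch (the crunch() body is a hypothetical
    placeholder for the real compute loop):

        public class TimedRun {

            // Hypothetical stand-in for the compute-intensive work.
            static double crunch() {
                double x = 0.0;
                for (int i = 0; i < 100000000; i++) x += i * 0.5;
                return x;
            }

            public static void main(String[] args) {
                long start = System.currentTimeMillis();
                double result = crunch();
                long elapsed = System.currentTimeMillis() - start;
                // Compare this figure for one copy vs. two copies run together.
                System.out.println("result " + result + ", elapsed ms: " + elapsed);
            }
        }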

    Patricia
    Patricia Shanahan, Nov 14, 2006
    #11
  12. "Chris Uppal" <-THIS.org> wrote in message
    news:4559bf95$0$632$...
    > Arne Vajhøj wrote:
    >
    >> No, it is single core.
    >>
    >> BUT it has hyperthreading.
    >>
    >> Which in WinXP Task Manager looks like 2 CPUs!
    >>
    >> And to utilize HT you still need to multithread.

    >
    > But don't assume that making the application use the other "cpu" will
    > necessarily speed anything up. HT is (for most purposes) better regarded
    > as a
    > cheap marketing gimmick than a valid technology.
    >
    > Or -- to put it another way -- the CPU usage reported by TaskManager is
    > misleading. It suggests that 50% of your available horse-power is
    > unused. My bet would be that it's more like 5% -- if not actually zero.


    Hey, I heard people were getting upwards of 5% increases in... things

    --
    LTP

    :)
    Luc The Perverse, Nov 14, 2006
    #12
  13. Patricia Shanahan <> wrote:
    > Chris Uppal wrote:


    in an article not as yet presented by my news server :( hence quoted
    indirectly...

    >>[...]
    >> But don't assume that making the application use the other "cpu" will
    >> necessarily speed anything up. HT is (for most purposes) better
    >> regarded as a cheap marketing gimmick than a valid technology.


    OK. Actually, I am using my P4 system only for development. The real
    target is some hardware yet to be acquired which will undoubtedly be
    dual-core. The current project is a self-tutorial; it's my first Java and
    Eclipse IDE experience; it's a port from C.

    >> Or -- to put it another way -- the CPU usage reported by TaskManager
    >> is misleading. It suggests that 50% of your available horse-power is
    >> unused. My bet would be that it's more like 5% -- if not actually
    >> zero.


    I'm curious about why that would be, but as implied above it's rather idle
    curiosity.

    [now quoting Patricia]
    > Here's a suggestion for a cheap test:
    >
    > 1. Add, if the application does not already contain it, some
    > performance statistics collection keeping track of how much elapsed
    > time it takes to do a given quantity of the compute intensive work.
    >
    > 2. Run one copy of the application. Record the statistics.


    Already done.

    > 3. Run two copies of the application, simultaneously. Record the
    > statistics.


    How important is (almost) exact simultaneity? Would starting one manually
    via console, then another be sufficient? This would mean a delay of several
    seconds; the run time is 3+ minutes. If not, it would be reasonably easy to
    multi-thread the code if the total workload didn't have to be
    apportioned among the threads.

    > If it is likely to benefit from multi-threading, the total work rate
    > will be significantly higher with two copies than with one. If it is
    > the sort of case Chris is talking about, each copy will run at
    > slightly better than half the speed of the single copy.
    >
    > This test automatically takes into account questions such as how much
    > time a thread of your application spends waiting for memory, which can
    > affect how much you gain from hyperthreading.


    Would there be reasons other than data caching that each copy would run at
    better than half the speed of a single copy?

    --
    For mail, please use my surname where indicated:
    (Steve Brecher)
    Steve Brecher, Nov 14, 2006
    #13
  14. Steve Brecher wrote:
    > Patricia Shanahan <> wrote:
    >> Chris Uppal wrote:

    >
    > in an article not as yet presented by my news server :( hence quoted
    > indirectly...
    >
    >>> [...]
    >>> But don't assume that making the application use the other "cpu" will
    >>> necessarily speed anything up. HT is (for most purposes) better
    >>> regarded as a cheap marketing gimmick than a valid technology.

    >
    > OK. Actually, I am using my P4 system only for development. The real
    > target is some hardware yet to be acquired which will undoubtedly be
    > dual-core. The current project is a self-tutorial; it's my first Java and
    > Eclipse IDE experience; it's a port from C.


    In that case, you should probably use the opportunity to practice
    parallelizing the job, so that you know how to take advantage of a
    dual-core processor.

    >
    >>> Or -- to put it another way -- the CPU usage reported by TaskManager
    >>> is misleading. It suggests that 50% of your available horse-power is
    >>> unused. My bet would be that it's more like 5% -- if not actually
    >>> zero.

    >
    > I'm curious about why that would be, but as implied above it's rather idle
    > curiosity.
    >
    > [now quoting Patricia]
    >> Here's a suggestion for a cheap test:
    >>
    >> 1. Add, if the application does not already contain it, some
    >> performance statistics collection keeping track of how much elapsed
    >> time it takes to do a given quantity of the compute intensive work.
    >>
    >> 2. Run one copy of the application. Record the statistics.

    >
    > Already done.
    >
    >> 3. Run two copies of the application, simultaneously. Record the
    >> statistics.

    >
    > How important is (almost) exact simultaneity? Would starting one manually
    > via console, then another be sufficient? This would mean a delay of several
    > seconds; the run time is 3+ minutes. If not, it would be reasonably easy to
    > multi-thread the code if the the total workload didn't have to be
    > apportioned among the threads.


    I would think that would be close enough. We are trying to tell the
    difference between a throughput change that would justify programming
    effort and a few percentage point change.

    >> If it is likely to benefit from multi-threading, the total work rate
    >> will be significantly higher with two copies than with one. If it is
    >> the sort of case Chris is talking about, each copy will run at
    >> slightly better than half the speed of the single copy.
    >>
    >> This test automatically takes into account questions such as how much
    >> time a thread of your application spends waiting for memory, which can
    >> affect how much you gain from hyperthreading.

    >
    > Would there be reasons other than data caching that each copy would run at
    > better than half the speed of a single copy?
    >


    Very few jobs really use ALL the cycles when they are "running" on a
    processor, so giving a hyperthreaded processor a second job should
    produce some increase in total throughput.

    Patricia
    Patricia Shanahan, Nov 14, 2006
    #14
  15. Patricia Shanahan <> wrote:
    > Steve Brecher wrote:
    >> Patricia Shanahan <> wrote:

    ....
    >>> Here's a suggestion for a cheap test:
    >>>
    >>> 1. Add, if the application does not already contain it, some
    >>> performance statistics collection keeping track of how much elapsed
    >>> time it takes to do a given quantity of the compute intensive work.
    >>>
    >>> 2. Run one copy of the application. Record the statistics.

    >>
    >> Already done.
    >>
    >>> 3. Run two copies of the application, simultaneously. Record the
    >>> statistics.

    >>
    >> How important is (almost) exact simultaneity? Would starting one
    >> manually via console, then another be sufficient? This would mean a
    >> delay of several seconds; the run time is 3+ minutes. If not, it
    >> would be reasonably easy to multi-thread the code if the total
    >> workload didn't have to be apportioned among the threads.

    >
    > I would think that would be close enough. We are trying to tell the
    > difference between a throughput change that would justify programming
    > effort and a few percentage point change.


    The single-job calculation takes about 185 sec. Running two of them
    manually took 456 and 461 sec., with Task Manager reporting 49-50% CPU usage
    for each. So the average of the two was about 2.5x the single-job time!

    --
    For mail, please use my surname where indicated:
    (Steve Brecher)
    Steve Brecher, Nov 15, 2006
    #15
  16. Chris Uppal wrote:
    > Arne Vajhøj wrote:
    >> No, it is single core.
    >>
    >> BUT it has hyperthreading.
    >>
    >> Which in WinXP Task Manager looks like 2 CPUs!
    >>
    >> And to utilize HT you still need to multithread.

    >
    > But don't assume that making the application use the other "cpu" will
    > necessarily speed anything up. HT is (for most purposes) better regarded as a
    > cheap marketing gimmick than a valid technology.
    >
    > Or -- to put it another way -- the CPU usage reported by TaskManager is
    > misleading. It suggests that 50% of your available horse-power is
    > unused. My bet would be that it's more like 5% -- if not actually zero.


    Intel claims HT = 1.3 CPUs.

    I have seen code that does show the +30%.

    Arne
    Arne Vajhøj, Nov 15, 2006
    #16
  17. Steve Brecher wrote:
    > Patricia Shanahan <> wrote:
    >> Steve Brecher wrote:
    >>> Patricia Shanahan <> wrote:

    > ...
    >>>> Here's a suggestion for a cheap test:
    >>>>
    >>>> 1. Add, if the application does not already contain it, some
    >>>> performance statistics collection keeping track of how much elapsed
    >>>> time it takes to do a given quantity of the compute intensive work.
    >>>>
    >>>> 2. Run one copy of the application. Record the statistics.
    >>> Already done.
    >>>
    >>>> 3. Run two copies of the application, simultaneously. Record the
    >>>> statistics.
    >>> How important is (almost) exact simultaneity? Would starting one
    >>> manually via console, then another be sufficient? This would mean a
    >>> delay of several seconds; the run time is 3+ minutes. If not, it
    >>> would be reasonably easy to multi-thread the code if the total
    >>> workload didn't have to be apportioned among the threads.

    >> I would think that would be close enough. We are trying to tell the
    >> difference between a throughput change that would justify programming
    >> effort and a few percentage point change.

    >
    > The single-job calculation takes about 185 sec. Running two of them
    > manually took 456 and 461 sec., with Task Manager reporting 49-50% CPU usage
    > for each. So the average of the two was about 2.5x the single-job time!
    >


    I would look at it in terms of throughput. A single thread does one job
    per 185 seconds. Running two gives about 2 jobs per 460 seconds = 1 job per
    230 seconds.

    The two job throughput is 230/185 = 1.24 times the single thread
    throughput, a 24% gain.

    You need to decide whether that is enough to justify the work of
    parallelizing the job. However, I think you said you will be moving to
    dual core, and the trend seems to be towards multiprocessing, so it
    might be worth the investment even for a relatively small gain.

    Patricia
    Patricia Shanahan, Nov 15, 2006
    #17
  18. Patricia Shanahan wrote:

    > I would look at it in terms of throughput. A single thread does one job
    > per 185 seconds. Running two gives about 2 jobs per 460 seconds = 1 job per
    > 230 seconds.
    >
    > The two job throughput is 230/185 = 1.24 times the single thread
    > throughput, a 24% gain.


    Loss ;-)

    -- chris
    Chris Uppal, Nov 15, 2006
    #18
  19. Steve Brecher wrote:

    > > > Or -- to put it another way -- the CPU usage reported by TaskManager
    > > > is misleading. It suggests that 50% of your available horse-power is
    > > > unused. My bet would be that it's more like 5% -- if not actually
    > > > zero.

    >
    > I'm curious about why that would be, but as implied above it's rather idle
    > curiosity.


    Well, the generally reported figure is in that ball-park.

    As for explaining it, I should first warn you that I'm not especially
    knowledgeable about hardware/chip design, and I'm also relying on a (possibly
    faulty) memory, so take all of the following with the usual pinch of salt, and
    verify (or refute) it for yourself before depending on it.

    That said, my understanding is that, although the Intel HT stuff duplicates
    enough registers to allow two independent execution streams, it does /not/
    duplicate the ALUs, or the instruction decode pipeline. So the actual
    processing power is shared between the two threads, or the two "CPU"s running
    them. That means that the HT architecture only provides a benefit when one
    thread is stalled on a cache read, or otherwise has nothing in its instruction
    pipeline, /and/ the other thread /does/ have all the data and decoded
    instructions necessary to proceed. Since the two threads are competing for the
    cache space (and in any case most programs spend a lot of time stalled one way
    or another) that doesn't happen very often.

    There /are/ programs which benefit usefully from HT, but the general experience
    seems to be that they are not common. The ideal case (I think) would be when
    the two threads were executing the same (fairly small) section of code and (not
    too big) section of data (so the instruction pipeline and cache would serve
    both as well as the same circuitry could serve either one); and the mix of data
    accesses is such that the interval between stalls for a cache refill is
    approximately equal to the time taken for a cache refill. The less the actual
    mix of instructions seen by each CPU resembles that case, the more the whole
    system will degrade towards acting like one CPU time-sliced between the two
    threads.

    Note that, in the worst case, the cache behaviour of the two threads executing
    at the same time may be worse than it would be if the same two threads were
    time-sliced at coarse intervals by the OS but had the whole of the cache
    available to each thread at a time.

    -- chris
    Chris Uppal, Nov 15, 2006
    #19
  20. Arne Vajhøj wrote:

    [me:]
    > > Or -- to put it another way -- the CPU usage reported by TaskManager is
    > > misleading. It suggests that 50% of your available horse-power is
    > > unused. My bet would be that it's more like 5% -- if not actually zero.

    >
    > Intel claims HT = 1.3 CPUs.


    "Yeah, right"

    ;-)


    > I have seen code that does show the +30%.


    Undoubtedly such code does exist. I'm only claiming that it's not the norm.

    -- chris
    Chris Uppal, Nov 15, 2006
    #20
