How to wait for multiple threads to finish

Discussion in 'Java' started by Gary J, Aug 26, 2004.

  1. Gary J

    Gary J Guest

    I have a java program that spawns a lot of threads to process a very large
    tree data model. How do I force a "parent" thread to wait until all of its
    "child" threads are finished.

    Here's a totally contrived example. TreeCounter simply counts the elements
    in a tree of Elements:

    public class TreeCounter
    {
    private int count;

    void countChildren(Element element)
    {
    count++;

    List children = element.getChildren();
    for (Iterator iter = children.iterator(); iter.hasNext();)
    {
    Element child = iter.next();

    new Thread()
    {

    public void run()
    {
    countChildren(child)
    }

    }.start();

    }

    // wait for all threads created by addChild() to finish before returning

    // print out results
    System.out.println(count + "elements processed");
    }

    }

    This code was written just for this post so there may be some errors but you
    get the idea.

    In the above code, each incarnation of countChildren() can create multiple
    threads.

    How can I ensure that I don't get to the println() statement before all the
    threads have finished? I know I can make threads run one at a time by
    making a method "synchronized" but here I really want to make sure they all
    run at the same time;

    Thanks
    Gary J, Aug 26, 2004
    #1
    1. Advertising

  2. Gary J

    Frank Guest

    Gary J wrote:
    > I have a java program that spawns a lot of threads to process a very large
    > tree data model. How do I force a "parent" thread to wait until all of its
    > "child" threads are finished.
    >
    > How can I ensure that I don't get to the println() statement before all the
    > threads have finished? I know I can make threads run one at a time by
    > making a method "synchronized" but here I really want to make sure they all
    > run at the same time;
    >
    > Thanks


    Didn't read your example, but here we go;

    One way to solve this would be to keep a count of the running threads.
    Increase the counter for each thread created and have the threads
    decrease it again just before leaving the run() method. Once the counter
    hits zero again, the main thread can continue.

    A few implementation hints:
    1. You need to synchronize access to the counter.
    2. Make the counter volatile, it's going to be accessed by multiple threads.
    3. To prevent unneccessary polling from the main thread, have it wait()
    on the lock object and make the worker threads notify() before leaving
    the synchronized method.

    -Frank
    Frank, Aug 26, 2004
    #2
    1. Advertising

  3. Gary J

    Paul Lutus Guest

    Gary J wrote:

    > I have a java program that spawns a lot of threads to process a very large
    > tree data model. How do I force a "parent" thread to wait until all of
    > its "child" threads are finished.



    1. Create and start the threads. Each thread increments a synchronized
    global thread counter.

    2. Have each thread signal that is is finished by decrementing the global
    counter.

    3. Put the main process on a timer that periodically tests the counter, and
    idles the rest of the time. When the count gets to zero, take the next
    action.

    --
    Paul Lutus
    http://www.arachnoid.com
    Paul Lutus, Aug 26, 2004
    #3
  4. Gary J wrote:

    > I have a java program that spawns a lot of threads to process a very large
    > tree data model.


    Is it running on a computer with a lot of processors? otherwise, using many
    threads is just a waste of CPU time.
    Michael Borgwardt, Aug 26, 2004
    #4
  5. Gary J

    Chris Uppal Guest

    Gary J wrote:

    > I have a java program that spawns a lot of threads to process a very large
    > tree data model. How do I force a "parent" thread to wait until all of
    > its "child" threads are finished.


    Thread.join() is designed for exactly this. Simply put each thread you create
    into some sort of Collection (an ArrayList for example) and then use a loop to
    call join() on each in turn. If the "joinee" has finished then join() will
    return immediately, if not then it will wait until it has finished. By the end
    of the loop you can be sure that all of them have finished.

    -- chris
    Chris Uppal, Aug 26, 2004
    #5
  6. Gary J

    Gary J Guest

    I thought of using a counter but it seemed too simple a solution. I was a
    little unsure what to make volatile. I made counter volatile but its inside
    a synchronized method so does it matter?

    For those interested Here's what I implemented:

    class ThreadLock
    {
    volatile int counter;

    synchronized void add()
    {
    System.out.println(counter + " threads.");
    counter++;
    }

    synchronized void del()
    {
    counter--;
    System.out.println(counter + " threads.");
    notify();
    }

    synchronized int getCount()
    {
    return counter;
    }
    }

    in the Main Method I declare:

    private ThreadLock threads = new ThreadLock();

    and the threads are created like this:

    threads.add();
    new Thread()
    {
    public void run()
    {
    doStuff()
    threads.del();
    }
    }.start();


    The main method checks for all the threads to be completed like this:

    while (threads.getCount() > 0)
    try
    {
    threads.wait();
    } catch(Exception e)
    {}
    System.out.println("Worker threads complete.");


    Is it possible to put this last code inside the ThreadLock class? In other
    words:
    class ThreadLock
    {
    ....

    void resume()
    {
    while (getCount() > 0)
    try
    {
    wait();
    } catch(Exception e)
    {}
    }
    }

    then the main nethod would just call:
    counter.resume();

    thanks




    "Frank" <> wrote in message news:412e19b6$...
    Gary J wrote:
    > I have a java program that spawns a lot of threads to process a very large
    > tree data model. How do I force a "parent" thread to wait until all of

    its
    > "child" threads are finished.
    >
    > How can I ensure that I don't get to the println() statement before all

    the
    > threads have finished? I know I can make threads run one at a time by
    > making a method "synchronized" but here I really want to make sure they

    all
    > run at the same time;
    >
    > Thanks


    Didn't read your example, but here we go;

    One way to solve this would be to keep a count of the running threads.
    Increase the counter for each thread created and have the threads
    decrease it again just before leaving the run() method. Once the counter
    hits zero again, the main thread can continue.

    A few implementation hints:
    1. You need to synchronize access to the counter.
    2. Make the counter volatile, it's going to be accessed by multiple threads.
    3. To prevent unneccessary polling from the main thread, have it wait()
    on the lock object and make the worker threads notify() before leaving
    the synchronized method.

    -Frank
    Gary J, Aug 26, 2004
    #6
  7. Gary J

    Gary J Guest

    In this case it is indeed running on a multi-processor computer. Running
    without multiple threads takes about 120% longer than running with them.
    The more threads, the faster it runs. There's probably a point of
    diminishing returns of course.


    "Michael Borgwardt" <> wrote in message
    news:...
    Gary J wrote:

    > I have a java program that spawns a lot of threads to process a very large
    > tree data model.


    Is it running on a computer with a lot of processors? otherwise, using many
    threads is just a waste of CPU time.
    Gary J, Aug 26, 2004
    #7
  8. Gary J

    Gary J Guest

    You're absolutely correct.

    When I initially saw the join() method I thought my problems were over but I
    discovered a possible problem with that approach.

    The problem is that in traversing a tree threads are created and finished
    the time which means the contents of the Collection would constantly change.
    The main method would have to constantly recheck and join to all the threads
    in the list on every pass until there were no more threads to join to. At
    that point I'm really doing the same thing as checking a counter but with
    more work :)

    Of course I could be mistaken and haven't tried this one out yet. There's
    probably a flaw in my logic.


    "Chris Uppal" <-THIS.org> wrote in message
    news:...
    Gary J wrote:

    > I have a java program that spawns a lot of threads to process a very large
    > tree data model. How do I force a "parent" thread to wait until all of
    > its "child" threads are finished.


    Thread.join() is designed for exactly this. Simply put each thread you
    create
    into some sort of Collection (an ArrayList for example) and then use a loop
    to
    call join() on each in turn. If the "joinee" has finished then join() will
    return immediately, if not then it will wait until it has finished. By the
    end
    of the loop you can be sure that all of them have finished.

    -- chris
    Gary J, Aug 26, 2004
    #8
  9. Gary J

    anon Guest

    Gary J wrote:
    > In this case it is indeed running on a multi-processor computer. Running
    > without multiple threads takes about 120% longer than running with them.
    > The more threads, the faster it runs. There's probably a point of
    > diminishing returns of course.
    >

    Absolutely. The rule of thumb I've always heard is no more
    active/running threads _per_process_ than 2x the number of available CPUs.

    YMMV, but I've been more-or-less obeying that limit for years, and it
    does seem to hold true for *most* situations.
    anon, Aug 27, 2004
    #9
  10. Michael Borgwardt wrote:
    > Gary J wrote:
    >
    >> I have a java program that spawns a lot of threads to process a very
    >> large
    >> tree data model.

    >
    >
    > Is it running on a computer with a lot of processors? otherwise, using many
    > threads is just a waste of CPU time.


    Not necessarily. Consider a case when each thread is waiting for some inputs
    that would arrive at some indeterminate time. Processing in parallel instead of
    serially would allow the inputs that arrived to be processed while other threads
    (probably started first) are still waiting.

    BK
    Babu Kalakrishnan, Aug 27, 2004
    #10
  11. Babu Kalakrishnan wrote:
    >> Is it running on a computer with a lot of processors? otherwise, using
    >> many threads is just a waste of CPU time.


    > Not necessarily. Consider a case when each thread is waiting for some inputs
    > that would arrive at some indeterminate time. Processing in parallel instead of
    > serially would allow the inputs that arrived to be processed while other threads
    > (probably started first) are still waiting.


    True, but it seems pretty clear that this is not such a case.
    Michael Borgwardt, Aug 27, 2004
    #11
  12. Gary J wrote:
    > In this case it is indeed running on a multi-processor computer. Running
    > without multiple threads takes about 120% longer than running with them.
    > The more threads, the faster it runs. There's probably a point of
    > diminishing returns of course.


    If the problem is purely CPU-bound, then this point should be reached exactly
    when the number of threads equals the number of processors, if not earlier
    (if the threads have to synchronize a lot, see Amdahl's law).
    Michael Borgwardt, Aug 27, 2004
    #12
  13. Gary J

    Chris Uppal Guest

    Michael Borgwardt wrote:

    > Is it running on a computer with a lot of processors? otherwise, using
    > many threads is just a waste of CPU time.


    Just for interest: this is not necessarily true even for compute-bound
    algorithms.

    Consider the case where there are two techniques for solving a given problem,
    one is slow but is guaranteed to find the correct answer, the other is 10 times
    faster but only gets the right answer about half the time (and the other cases
    are easily detectable /after/ they've failed but are hard to recognise in
    advance). In that case, attempting to solve the problem using two threads (one
    for each technique) will reduce the average time to produce an answer to about
    65%, plus the overhead of the extra thread and thread switching. Obviously
    this can be generalised; for instance the analysis is essentially the same if
    you have different techniques that all always work, but are fast in different
    cases.

    That's just for academic interest -- I don't think I've ever encountered such a
    situation in a real application.

    -- chris
    Chris Uppal, Aug 27, 2004
    #13
  14. Gary J

    Chris Uppal Guest

    Gary J wrote:

    > The problem is that in traversing a tree threads are created and finished
    > the time which means the contents of the Collection would constantly
    > change.


    Hmm, then that is actually a different problem -- you are not trying to wait
    for threads to finish, you are trying to detect when <something else> has
    /stopped/ creating new threads.

    There are loads of ways of doing that. If I had that kind of system, then my
    first idea would be to handle it by introducing the simple convention that any
    thread that started another thread had to join() with it before dying itself.

    If that doesn't work for you, then the kinds of counting solutions that have
    been discussed earlier are an option. I've rather lost track of where the rest
    of the thread has got to, are you happy that you now know how to do that ?

    BTW, do take the warnings about not expecting magic from threads seriously --
    it sounds to me as if you may be overdoing the threading thing (unless this is
    some sort of academic exercise).

    -- chris
    Chris Uppal, Aug 27, 2004
    #14
  15. Gary J wrote:

    > You're absolutely correct.
    >
    > When I initially saw the join() method I thought my problems were over but I
    > discovered a possible problem with that approach.
    >
    > The problem is that in traversing a tree threads are created and finished
    > the time which means the contents of the Collection would constantly change.
    > The main method would have to constantly recheck and join to all the threads
    > in the list on every pass until there were no more threads to join to. At
    > that point I'm really doing the same thing as checking a counter but with
    > more work :)
    >
    > Of course I could be mistaken and haven't tried this one out yet. There's
    > probably a flaw in my logic.


    Apply the join() technique in each thread that starts threads. Each
    thread will have its own collection of the Threads it started, which
    will not change once that thread stops creating new Threads and starts
    join()ing the ones it created.


    John Bollinger
    John C. Bollinger, Aug 27, 2004
    #15
  16. Gary J

    thoff Guest

    Concurrency is way of structuring programs. It's not really related
    to the number of CPUs. Some languages like java make this costly.
    Other languages like erlang do not.

    Michael Borgwardt wrote:
    > Gary J wrote:
    >
    >> I have a java program that spawns a lot of threads to process a very
    >> large
    >> tree data model.

    >
    >
    > Is it running on a computer with a lot of processors? otherwise, using many
    > threads is just a waste of CPU time.
    thoff, Aug 29, 2004
    #16
  17. Gary J

    xarax Guest

    "thoff" <> wrote in message
    news:...
    > Michael Borgwardt wrote:
    > > Gary J wrote:
    > >
    > >> I have a java program that spawns a lot of threads to process a very
    > >> large
    > >> tree data model.

    > >
    > >
    > > Is it running on a computer with a lot of processors? otherwise, using many
    > > threads is just a waste of CPU time.

    > Concurrency is way of structuring programs. It's not really related
    > to the number of CPUs. Some languages like java make this costly.
    > Other languages like erlang do not.
    >


    Even on uni-processors, if the threads are using
    I/O, then concurrency can help tremendously. Those
    threads are "working" will get service while the
    other threads that are "waiting" on I/O won't impact
    the application through-put.

    Note that I/O also includes virtual paging on
    most systems. However, there is little that a
    Java application can do to isolate its virtual
    storage usage (thread local storage comes to
    mind, but most folks rarely use that).
    xarax, Aug 29, 2004
    #17
  18. Gary J

    Chris Uppal Guest

    thoff wrote:

    > Concurrency is way of structuring programs. It's not really related
    > to the number of CPUs. Some languages like java make this costly.
    > Other languages like erlang do not.


    It's not really a language issue, as such. It'd be perfectly possible to
    create a Java implementation (or JVM) where threads were as lightweight as in
    Erlang.

    The implementors have a choice in this matter. At one extreme they can
    implement Java threads as real OS-level threads, which on most OSes is going to
    mean that threads are heavyweight things. That has the considerable advantage
    that Java programs are then using the same implementation of concurrency as any
    other code (e.g. system calls only block the one thread, the OS can maps
    threads to CPUs, threaded external libraries work as expected, ...). At the
    other extreme the implementors can go for very lightweight threads; that would
    lose the advantages of "real" OS threads, but enable (for instance) an
    Erlang-style of programming with many small threads (and be a /good deal/
    easier to implement, too).

    To some extent the decision will reflect what programmers are actually /doing/
    with threads, and that in turn will depend on programmers' expectations of how
    threads are implemented. The Java programming community seems to have settled
    on an expectation that threads are heavyweight, and so it makes sense for JVM
    designers to optimise for the resulting patterns of usage.

    (In any case, I'm not convinced that the Erlang way of programming with
    micro-threads would work so well without Erlang's "share nothing" semantics, or
    something similar.)

    (Of course, all this assumes that the OS's threads are not themselves
    lightweight, and it also assumes that a hybrid model with both lightweight and
    heavyweight flavours of threads would be infeasible.)

    -- chris
    Chris Uppal, Aug 30, 2004
    #18
  19. Gary J

    thoff Guest

    Chris Uppal wrote:
    > (In any case, I'm not convinced that the Erlang way of programming with
    > micro-threads would work so well without Erlang's "share nothing" semantics, or
    > something similar.)


    That's why it's a program language issue. With a share nothing
    model and a global heap for efficient message sends, mere mortals can
    make concurrent programs work. Not something you can say about java.
    I could make an actor model and i could say everyone don't touch stuff
    in other actors, but it would never work as people will touch
    stuff, even assuming that threads could be made efficient enough to
    handle high levels of concurrency.
    thoff, Aug 30, 2004
    #19
  20. Gary J

    Chris Uppal Guest

    thoff wrote:

    > > (In any case, I'm not convinced that the Erlang way of programming with
    > > micro-threads would work so well without Erlang's "share nothing"
    > > semantics, or something similar.)

    >
    > That's why it's a program language issue. With a share nothing
    > model and a global heap for efficient message sends, mere mortals can
    > make concurrent programs work. Not something you can say about java.


    Fair point, I think.

    -- chris
    Chris Uppal, Aug 31, 2004
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. mike
    Replies:
    6
    Views:
    843
    John C. Bollinger
    Aug 9, 2004
  2. Replies:
    8
    Views:
    4,528
  3. Uros
    Replies:
    2
    Views:
    310
  4. Christopher Dancy
    Replies:
    4
    Views:
    105
    Roger Pack
    Apr 16, 2010
  5. Zhidian Du
    Replies:
    2
    Views:
    98
    David Efflandt
    Feb 21, 2004
Loading...

Share This Page