perl threading; ->join; best method?

Discussion in 'Perl Misc' started by robert.waters, Nov 9, 2006.

  1. Hi,

    I have a function that does a repetitive task, each task identified by
    a unique index; I would like this task to be shared amongst a number of
    threads. My question is, in what manner should I call 'join' in order
    to be sure that each thread has completed it's task before the end of
    the program?
    My current logic is this:
    --------------------------------------------------------------------
    use LWP::UserAgent;
    use HTTP::Request;
    use threads;
    use threads::shared;

    my @threads;
    my $threadcnt = 20;

    my $unique_index : shared = 1;
    $unique_index = 1;
    my $top_index = 1000;

    # CREATE ALL THE THREADS
    for (my $i=0;$i<$threadcnt;$i++) {

    $threads[$i] = threads->new('workerFunction', $i);

    }

    # JOIN ALL THREADS
    foreach $thread (threads->list) {

    $thread->join;

    }

    print "done.\n";


    sub workerFunction {

    my $threadindex = shift;

    print STDERR "thread $threadindex starting...";

    my $ua = LWP::UserAgent->new;

    while ($unique_index < $top_index) {

    my $url = "http://something.com/index.php?tid=".$unique_index++;

    my $req = HTTP::Request->new(GET => $url);
    my $resp = $ua->request($req);

    print $resp->content;


    }

    }

    -----------------------------------------
    Basically, each thread shares a variable representing the index (and
    increments it each time it is used).
    After all the threads have been created, I iterate through the thread
    list, calling join on each one in order. This iteration seems hackish
    to me though; can this really be the right way to go about waiting for
    the threads to finish? If thread 6 finishes first (which is definitely
    possible), I am still waiting for threads 1 through 5 to finish before
    I ever even check to see if thread 6 completes.
    This technique, while working fine enough, just does not seem to be
    "The Right Way" to do it; I would assume that the process of waiting
    for threads to finish should be asynchronous.

    Any suggestions?

    Thank you-
    Robert
    robert.waters, Nov 9, 2006
    #1
    1. Advertising

  2. robert.waters

    Guest

    "robert.waters" <> wrote:
    > Hi,
    >
    > I have a function that does a repetitive task, each task identified by
    > a unique index; I would like this task to be shared amongst a number of
    > threads. My question is, in what manner should I call 'join' in order
    > to be sure that each thread has completed it's task before the end of
    > the program?
    > My current logic is this:
    > --------------------------------------------------------------------
    > use LWP::UserAgent;
    > use HTTP::Request;
    > use threads;
    > use threads::shared;
    >
    > my @threads;
    > my $threadcnt = 20;
    >
    > my $unique_index : shared = 1;
    > $unique_index = 1;
    > my $top_index = 1000;
    >
    > # CREATE ALL THE THREADS
    > for (my $i=0;$i<$threadcnt;$i++) {
    >
    > $threads[$i] = threads->new('workerFunction', $i);
    >
    > }
    >
    > # JOIN ALL THREADS
    > foreach $thread (threads->list) {
    >
    > $thread->join;
    >
    > }
    >
    > print "done.\n";
    >
    > sub workerFunction {
    >
    > my $threadindex = shift;
    >
    > print STDERR "thread $threadindex starting...";
    >
    > my $ua = LWP::UserAgent->new;
    >
    > while ($unique_index < $top_index) {
    >
    > my $url =
    > "http://something.com/index.php?tid=".$unique_index++;


    This doesn't seem safe to me. unique_index could change between the
    while and the url construction. Also, if two threads create the url
    at the same time, they might both get the same url, and the next url would
    get skipped. You should lock the variable.

    while (1) {
    {
    lock($unique_index);
    my $private = $unique_index++;
    } # release the lock;
    last if $unique_index >= $top_index

    or you could use Thread::Queue instead of all of this stuff.
    ....

    > Basically, each thread shares a variable representing the index (and
    > increments it each time it is used).
    > After all the threads have been created, I iterate through the thread
    > list, calling join on each one in order. This iteration seems hackish
    > to me though; can this really be the right way to go about waiting for
    > the threads to finish?



    Yep.

    > If thread 6 finishes first (which is definitely
    > possible), I am still waiting for threads 1 through 5 to finish before
    > I ever even check to see if thread 6 completes.


    So what?

    > This technique, while working fine enough, just does not seem to be
    > "The Right Way" to do it; I would assume that the process of waiting
    > for threads to finish should be asynchronous.


    You could use is_joinable in threads.pm newer than 1.34, or for older
    threads.pm then use Thread::State. But in the given situation, I don't
    see how this would be an improvement. Making the change would be more
    hackish then just leaving it as it is.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Nov 9, 2006
    #2
    1. Advertising

  3. robert.waters

    zentara Guest

    On 9 Nov 2006 11:30:03 -0800, "robert.waters" <>
    wrote:

    >I have a function that does a repetitive task, each task identified by
    >a unique index; I would like this task to be shared amongst a number of
    >threads. My question is, in what manner should I call 'join' in order
    >to be sure that each thread has completed it's task before the end of
    >the program?
    >My current logic is this:


    Just remember that in order for a thread to be joinable, it must finish,
    or somehow return from it's code block.

    With LWP, you will need to set a timeout on the connection, or use an
    alarm block, to detect LWP connections which hang. Otherwise, those
    hanging threads will NEVER be joinable, and you will stay in your
    join loop forever.

    If you kill the main thread, you will get the error " a thread exited
    while x threads running".



    --
    I'm not really a human, but I play one on earth.
    http://zentara.net/japh.html
    zentara, Nov 10, 2006
    #3
  4. robert.waters

    jdhedden Guest

    Robert Waters wrote:
    > # JOIN ALL THREADS
    > foreach $thread (threads->list) {
    > $thread->join;
    > }
    >
    > My question is: In what manner should I call 'join' in
    > order to be sure that each thread has completed it's task
    > before the end of the program?


    The method you have above is perfectly correct. It can be
    compacted a bit using:

    $_->join() foreach threads->list();

    > This technique, while working fine enough, just does not
    > seem to be "The Right Way" to do it; I would assume that
    > the process of waiting for threads to finish should be
    > asynchronous.


    If you upgrade to the lastest version of 'threads' off of
    CPAN, you can do this asynchronously;

    while (threads->list()) {
    $_->join() foreach threads->list(threads::joinable);
    sleep(1);
    }

    Hope this helps.
    jdhedden, Nov 10, 2006
    #4
  5. [A complimentary Cc of this posting was sent to
    jdhedden
    <>], who wrote in article <>:
    > while (threads->list()) {
    > $_->join() foreach threads->list(threads::joinable);
    > sleep(1);
    > }


    Badly designed API; with a better API it should have been something like

    1 while threads->join_a_thread();

    Hope this helps,
    Ilya
    Ilya Zakharevich, Nov 11, 2006
    #5
  6. robert.waters

    jdhedden Guest

    Jerry D. Hedden wrote:
    > while (threads->list()) {
    > $_->join() foreach threads->list(threads::joinable);
    > sleep(1);
    > }


    Ilya Zakharevich wrote:
    > Badly designed API; with a better API it should have been something like
    > 1 while threads->join_a_thread();


    The problem I see with this is that it provides no mechanism for
    identifying which thread was joined.
    jdhedden, Nov 11, 2006
    #6
  7. [A complimentary Cc of this posting was sent to
    jdhedden
    <>], who wrote in article <>:
    > Ilya Zakharevich wrote:
    > > Badly designed API; with a better API it should have been something like
    > > 1 while threads->join_a_thread();

    >
    > The problem I see with this is that it provides no mechanism for
    > identifying which thread was joined.


    Same as for the initial code. If you need this info, just inspect the
    return value of join_a_thread(). I would think that something as

    if (my ($thread_id_was, @retVal) = threads->join_a_thread()) {
    } else { # Nothing joinable
    }

    would be sufficient...

    Hope this helps,
    Ilya

    P.S. IMO, Any API that encourages using sleep(1) should go back to
    the design board...
    Ilya Zakharevich, Nov 12, 2006
    #7
  8. robert.waters

    jdhedden Guest

    Ilya Zakharevich wrote:
    > Badly designed API; with a better API it should have been
    > something like
    > 1 while threads->join_a_thread();


    Jerry D. Hedden remarked:
    > The problem I see with this is that it provides no
    > mechanism for identifying which thread was joined.


    Ilya Zakharevich replied:
    > Same as for the initial code. If you need this info, just
    > inspect the return value of join_a_thread(). I would
    > think that something as
    >
    > if (my ($thread_id_was, @retVal) = threads->join_a_thread()) {
    > } else { # Nothing joinable
    > }
    >
    > would be sufficient...


    This is not consistent with your original idea in the
    'while' loop. You need to block until at least one thread
    is joined. Otherwise, you end up in a fast loop. However,
    your if-else statement above returns if no thread is
    joinable.

    > P.S. IMO, Any API that encourages using sleep(1) should
    > go back to the design board...


    I don't disagree that using a sleep call is inefficient, and
    if you read my original post you'll see that I stated that:

    $_->join() foreach threads->list();

    was just fine for the poster's purposes. My example of:

    while (threads->list()) {
    $_->join() foreach threads->list(threads::joinable);
    sleep(1);
    }

    was not meant to be some sort of improvement. It's not. I
    presented it just to illustrate some of the newer 'threads'
    module capabilities. The presenter asked about asynchronous
    joins. My example does that, but it is not optimal for the
    original problem. (Again, it was notional, not definitive.)

    I know that the blocking/non-blocking issue for
    'join_a_thread' could be worked out using a flag argument,
    and I know that using a 'sleep' to prevent a fast loop for
    non-blocking calls is inefficient. My point is that your
    ideas are only half-baked.

    If you really think you can come up with a better API, then
    instead of just complaining, you ought to do the work of
    writing up the code, the docs and the test cases, and then
    either submit the result as a patch to
    , or if it is incompatible with the
    existing API, generate a new module and post it to CPAN.
    jdhedden, Nov 12, 2006
    #8
  9. [A complimentary Cc of this posting was sent to
    jdhedden
    <>], who wrote in article <>:
    > > something like
    > > 1 while threads->join_a_thread();


    > > Same as for the initial code. If you need this info, just
    > > inspect the return value of join_a_thread(). I would
    > > think that something as
    > >
    > > if (my ($thread_id_was, @retVal) = threads->join_a_thread()) {
    > > } else { # Nothing joinable
    > > }
    > >
    > > would be sufficient...


    > This is not consistent with your original idea in the
    > 'while' loop.


    Why do you think so?

    > You need to block until at least one thread
    > is joined. Otherwise, you end up in a fast loop. However,
    > your if-else statement above returns if no thread is
    > joinable.


    I think you do not understand what I meant by "joinable". It is not
    what is finished; it is what is not "fire-and-forget".

    > if you read my original post you'll see that I stated that:
    >
    > $_->join() foreach threads->list();


    This assumes that threads->list() returns only joinable threads, and
    that they all return in a suitable order.

    > My point is that your ideas are only half-baked.


    Thanks. But your ideas about what my ideas are are wrong.

    > If you really think you can come up with a better API, then
    > instead of just complaining, you ought to do the work of
    > writing up the code, the docs and the test cases, and then


    Thanks for teaching me what I ought to do. This is really appreciated.

    Yours,
    Ilya
    Ilya Zakharevich, Nov 12, 2006
    #9
  10. robert.waters

    Guest

    Ilya Zakharevich <> wrote:
    >
    > > if you read my original post you'll see that I stated that:
    > >
    > > $_->join() foreach threads->list();

    >
    > This assumes that threads->list() returns only joinable threads,


    Isn't that exactly what it does?

    > and
    > that they all return in a suitable order.


    From the original post, there was no order dependent code. So any
    order is suitable.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
    , Nov 12, 2006
    #10
  11. [A complimentary Cc of this posting was sent to

    <>], who wrote in article <20061112182050.498$>:
    > > > $_->join() foreach threads->list();


    > > This assumes that threads->list() returns only joinable threads,


    > Isn't that exactly what it does?


    You mean *now*? Maybe... I do not care much, since last time I
    checked, Perl threading API was pretty lame in this respect. E.g., is
    there a way to specify that "I do not care to ever join this thread"
    at thread start? This could shave some overhead...

    > > and
    > > that they all return in a suitable order.


    > From the original post, there was no order dependent code. So any
    > order is suitable.


    For the OP, yes. But I was discussing an API, which should better
    work in a more general case too.

    Ilya
    Ilya Zakharevich, Nov 13, 2006
    #11
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Alan Silver
    Replies:
    0
    Views:
    871
    Alan Silver
    Jun 5, 2006
  2. Peter Hansen

    Bug in threading.Thread.join() ?

    Peter Hansen, Mar 24, 2005, in forum: Python
    Replies:
    3
    Views:
    455
    Peter Hansen
    Mar 26, 2005
  3. googleboy
    Replies:
    1
    Views:
    908
    Benji York
    Oct 1, 2005
  4. Gal Aviel
    Replies:
    8
    Views:
    895
    Dennis Lee Bieber
    Feb 29, 2008
  5. Roy Smith
    Replies:
    15
    Views:
    396
    Roy Smith
    Sep 3, 2011
Loading...

Share This Page