perl threading; ->join; best method?


R

robert.waters

Hi,

I have a function that does a repetitive task, each task identified by
a unique index; I would like this task to be shared amongst a number of
threads. My question is, in what manner should I call 'join' in order
to be sure that each thread has completed it's task before the end of
the program?
My current logic is this:
--------------------------------------------------------------------
use LWP::UserAgent;
use HTTP::Request;
use threads;
use threads::shared;

my @threads;
my $threadcnt = 20;

my $unique_index : shared = 1;
$unique_index = 1;
my $top_index = 1000;

# CREATE ALL THE THREADS
for (my $i=0;$i<$threadcnt;$i++) {

$threads[$i] = threads->new('workerFunction', $i);

}

# JOIN ALL THREADS
foreach $thread (threads->list) {

$thread->join;

}

print "done.\n";


sub workerFunction {

my $threadindex = shift;

print STDERR "thread $threadindex starting...";

my $ua = LWP::UserAgent->new;

while ($unique_index < $top_index) {

my $url = "http://something.com/index.php?tid=".$unique_index++;

my $req = HTTP::Request->new(GET => $url);
my $resp = $ua->request($req);

print $resp->content;


}

}

-----------------------------------------
Basically, each thread shares a variable representing the index (and
increments it each time it is used).
After all the threads have been created, I iterate through the thread
list, calling join on each one in order. This iteration seems hackish
to me though; can this really be the right way to go about waiting for
the threads to finish? If thread 6 finishes first (which is definitely
possible), I am still waiting for threads 1 through 5 to finish before
I ever even check to see if thread 6 completes.
This technique, while working fine enough, just does not seem to be
"The Right Way" to do it; I would assume that the process of waiting
for threads to finish should be asynchronous.

Any suggestions?

Thank you-
Robert
 
Ad

Advertisements

X

xhoster

robert.waters said:
Hi,

I have a function that does a repetitive task, each task identified by
a unique index; I would like this task to be shared amongst a number of
threads. My question is, in what manner should I call 'join' in order
to be sure that each thread has completed it's task before the end of
the program?
My current logic is this:
--------------------------------------------------------------------
use LWP::UserAgent;
use HTTP::Request;
use threads;
use threads::shared;

my @threads;
my $threadcnt = 20;

my $unique_index : shared = 1;
$unique_index = 1;
my $top_index = 1000;

# CREATE ALL THE THREADS
for (my $i=0;$i<$threadcnt;$i++) {

$threads[$i] = threads->new('workerFunction', $i);

}

# JOIN ALL THREADS
foreach $thread (threads->list) {

$thread->join;

}

print "done.\n";

sub workerFunction {

my $threadindex = shift;

print STDERR "thread $threadindex starting...";

my $ua = LWP::UserAgent->new;

while ($unique_index < $top_index) {

my $url =
"http://something.com/index.php?tid=".$unique_index++;

This doesn't seem safe to me. unique_index could change between the
while and the url construction. Also, if two threads create the url
at the same time, they might both get the same url, and the next url would
get skipped. You should lock the variable.

while (1) {
{
lock($unique_index);
my $private = $unique_index++;
} # release the lock;
last if $unique_index >= $top_index

or you could use Thread::Queue instead of all of this stuff.
....
Basically, each thread shares a variable representing the index (and
increments it each time it is used).
After all the threads have been created, I iterate through the thread
list, calling join on each one in order. This iteration seems hackish
to me though; can this really be the right way to go about waiting for
the threads to finish?

Yep.

If thread 6 finishes first (which is definitely
possible), I am still waiting for threads 1 through 5 to finish before
I ever even check to see if thread 6 completes.

So what?
This technique, while working fine enough, just does not seem to be
"The Right Way" to do it; I would assume that the process of waiting
for threads to finish should be asynchronous.

You could use is_joinable in threads.pm newer than 1.34, or for older
threads.pm then use Thread::State. But in the given situation, I don't
see how this would be an improvement. Making the change would be more
hackish then just leaving it as it is.

Xho
 
Z

zentara

I have a function that does a repetitive task, each task identified by
a unique index; I would like this task to be shared amongst a number of
threads. My question is, in what manner should I call 'join' in order
to be sure that each thread has completed it's task before the end of
the program?
My current logic is this:

Just remember that in order for a thread to be joinable, it must finish,
or somehow return from it's code block.

With LWP, you will need to set a timeout on the connection, or use an
alarm block, to detect LWP connections which hang. Otherwise, those
hanging threads will NEVER be joinable, and you will stay in your
join loop forever.

If you kill the main thread, you will get the error " a thread exited
while x threads running".
 
J

jdhedden

Robert said:
# JOIN ALL THREADS
foreach $thread (threads->list) {
$thread->join;
}

My question is: In what manner should I call 'join' in
order to be sure that each thread has completed it's task
before the end of the program?

The method you have above is perfectly correct. It can be
compacted a bit using:

$_->join() foreach threads->list();
This technique, while working fine enough, just does not
seem to be "The Right Way" to do it; I would assume that
the process of waiting for threads to finish should be
asynchronous.

If you upgrade to the lastest version of 'threads' off of
CPAN, you can do this asynchronously;

while (threads->list()) {
$_->join() foreach threads->list(threads::joinable);
sleep(1);
}

Hope this helps.
 
I

Ilya Zakharevich

[A complimentary Cc of this posting was sent to
jdhedden
while (threads->list()) {
$_->join() foreach threads->list(threads::joinable);
sleep(1);
}

Badly designed API; with a better API it should have been something like

1 while threads->join_a_thread();

Hope this helps,
Ilya
 
J

jdhedden

Jerry said:
while (threads->list()) {
$_->join() foreach threads->list(threads::joinable);
sleep(1);
}

Ilya said:
Badly designed API; with a better API it should have been something like
1 while threads->join_a_thread();

The problem I see with this is that it provides no mechanism for
identifying which thread was joined.
 
Ad

Advertisements

I

Ilya Zakharevich

[A complimentary Cc of this posting was sent to
jdhedden
The problem I see with this is that it provides no mechanism for
identifying which thread was joined.

Same as for the initial code. If you need this info, just inspect the
return value of join_a_thread(). I would think that something as

if (my ($thread_id_was, @retVal) = threads->join_a_thread()) {
} else { # Nothing joinable
}

would be sufficient...

Hope this helps,
Ilya

P.S. IMO, Any API that encourages using sleep(1) should go back to
the design board...
 
J

jdhedden

Ilya said:
Badly designed API; with a better API it should have been
something like
1 while threads->join_a_thread();

Jerry D. Hedden remarked:
The problem I see with this is that it provides no
mechanism for identifying which thread was joined.

Ilya Zakharevich replied:
Same as for the initial code. If you need this info, just
inspect the return value of join_a_thread(). I would
think that something as

if (my ($thread_id_was, @retVal) = threads->join_a_thread()) {
} else { # Nothing joinable
}

would be sufficient...

This is not consistent with your original idea in the
'while' loop. You need to block until at least one thread
is joined. Otherwise, you end up in a fast loop. However,
your if-else statement above returns if no thread is
joinable.
P.S. IMO, Any API that encourages using sleep(1) should
go back to the design board...

I don't disagree that using a sleep call is inefficient, and
if you read my original post you'll see that I stated that:

$_->join() foreach threads->list();

was just fine for the poster's purposes. My example of:

while (threads->list()) {
$_->join() foreach threads->list(threads::joinable);
sleep(1);
}

was not meant to be some sort of improvement. It's not. I
presented it just to illustrate some of the newer 'threads'
module capabilities. The presenter asked about asynchronous
joins. My example does that, but it is not optimal for the
original problem. (Again, it was notional, not definitive.)

I know that the blocking/non-blocking issue for
'join_a_thread' could be worked out using a flag argument,
and I know that using a 'sleep' to prevent a fast loop for
non-blocking calls is inefficient. My point is that your
ideas are only half-baked.

If you really think you can come up with a better API, then
instead of just complaining, you ought to do the work of
writing up the code, the docs and the test cases, and then
either submit the result as a patch to
(e-mail address removed), or if it is incompatible with the
existing API, generate a new module and post it to CPAN.
 
I

Ilya Zakharevich

[A complimentary Cc of this posting was sent to
jdhedden
This is not consistent with your original idea in the
'while' loop.

Why do you think so?
You need to block until at least one thread
is joined. Otherwise, you end up in a fast loop. However,
your if-else statement above returns if no thread is
joinable.

I think you do not understand what I meant by "joinable". It is not
what is finished; it is what is not "fire-and-forget".
if you read my original post you'll see that I stated that:

$_->join() foreach threads->list();

This assumes that threads->list() returns only joinable threads, and
that they all return in a suitable order.
My point is that your ideas are only half-baked.

Thanks. But your ideas about what my ideas are are wrong.
If you really think you can come up with a better API, then
instead of just complaining, you ought to do the work of
writing up the code, the docs and the test cases, and then

Thanks for teaching me what I ought to do. This is really appreciated.

Yours,
Ilya
 
X

xhoster

Ilya Zakharevich said:
This assumes that threads->list() returns only joinable threads,

Isn't that exactly what it does?
and
that they all return in a suitable order.

From the original post, there was no order dependent code. So any
order is suitable.

Xho
 
Ad

Advertisements

I

Ilya Zakharevich

[A complimentary Cc of this posting was sent to

Isn't that exactly what it does?

You mean *now*? Maybe... I do not care much, since last time I
checked, Perl threading API was pretty lame in this respect. E.g., is
there a way to specify that "I do not care to ever join this thread"
at thread start? This could shave some overhead...
From the original post, there was no order dependent code. So any
order is suitable.

For the OP, yes. But I was discussing an API, which should better
work in a more general case too.

Ilya
 
Ad

Advertisements


Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top