Question about system() in multiple threads

enstrophy.2000 · Feb 20, 2006

Hi,
I'm trying to write a Perl script for managing multiple tasks. I create
a queue of tasks via Thread::Queue and have two separate threads
retrieve task names from the queue and execute each one via a system()
call. However, on running the script I found that if I call system()
more than once after each retrieval of a task name, only the first
system() call actually executes, whereas the rest are ignored. Even
more counterintuitive, the remaining system() calls do return 0, which
indicates success. Is there anything fundamentally wrong with this
approach? Any feedback will be greatly appreciated.

Here is the script in question:

#!/usr/bin/perl -w

use threads;
use Thread::Queue;
use Cwd;

$num_thread = 2;

my $q = new Thread::Queue;

#populate this queue using directory names

opendir(DDIR,"..");
@ddir= grep(/^\d/,readdir(DDIR));

foreach $ddir(@ddir){
$q->enqueue($ddir);
}

for (1..$num_thread){
$thr[$_-1] = threads->new(\&sub1,$_);
# print "creating tid ", $thr[$_-1]->tid,"\n";
}

sub sub1 {
my ($id) = @_;
while($foo =$q->dequeue_nb){
batch($foo);
$left = $q->pending;
print "In the thread $id foo=$foo left=$left\n";
last if($left==0);
#sleep($id);
}
}

foreach $tt(@thr){
print "joining ", $tt->tid,"\n";
$tt->join();
}

sub batch{
my ($ddir) = @_;
chdir "../".$ddir||die "unable to enter $ddir\n";

print "rpt2dat.pl\n";
system("rpt2dat.pl");
print "$ddir R CMD BATCH cmb.obs.mod.R \n";
system("R CMD BATCH cmb.obs.mod.R")==0
or die "$ddir cmb.obs.mod failed $?";

print "$ddir R CMD BATCH comp.ann.sum.R \n";
system("R CMD BATCH comp.ann.sum.R")==0
or die "$ddir cmb.ann.sum failed $?";
print "$ddir R CMD BATCH comp.ann.sum.R finished\n";

wait();
print "$ddir R CMD BATCH comp.mon.sum.R \n";
system("R CMD BATCH comp.mon.sum.R")==0
or die "$ddir cmb.mon.sum failed $?";
wait();
print "$ddir R CMD BATCH comp.mon.sum.R finished\n";
}

xhoster · Feb 21, 2006

> Hi,
> I'm trying to write a Perl script for managing multiple tasks. I create
> a queue of tasks via Thread::Queue and have two separate threads
> retrieve task names from the queue and execute each one via a system()
> call. However, on running the script I found that if I call system()
> more than once after each retrieval of a task name, only the first
> system() call actually executes, whereas the rest are ignored.

Replacing R with echo, I cannot reproduce your results. How do you know
the system() calls are being ignored, rather than running and simply
producing no results?

> Here is the script in question:
>
> #!/usr/bin/perl -w
>
> use threads;
> use Thread::Queue;
> use Cwd;

$|=1; # at least for debugging purposes

> sub batch{
> my ($ddir) = @_;
> chdir "../".$ddir||die "unable to enter $ddir\n";

I think chdir changes the working directory of the entire process, not
on a per-thread basis. So this is a race condition: one thread can
chdir between another thread's chdir and its system() calls.
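A minimal sketch of one way to check this, assuming a Perl built with ithreads on a non-Windows platform. The helper name cwd_after_thread_chdir is made up for illustration. It lets a worker thread call chdir and then looks at the working directory the main thread sees afterwards.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Hypothetical helper: returns the cwd seen by the calling thread before
# and after a worker thread has done a chdir to $target.
sub cwd_after_thread_chdir {
    my ($target) = @_;
    my $before = getcwd();
    threads->create(sub { chdir $target or die "chdir $target: $!" })->join;
    my $after = getcwd();
    chdir $before or die "chdir $before: $!";   # restore for the caller
    return ($before, $after);
}

my $dir = tempdir(CLEANUP => 1);
my ($before, $after) = cwd_after_thread_chdir($dir);
print "main thread cwd before: $before\n";
print "main thread cwd after:  $after\n";
```

If the two printed paths differ, the worker's chdir moved the main thread as well, which is the situation batch() is exposed to when two threads run it at once.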

> print "rpt2dat.pl\n";
> system("rpt2dat.pl");
> print "$ddir R CMD BATCH cmb.obs.mod.R \n";
> system("R CMD BATCH cmb.obs.mod.R")==0
> or die "$ddir cmb.obs.mod failed $?";
>
> print "$ddir R CMD BATCH comp.ann.sum.R \n";
> system("R CMD BATCH comp.ann.sum.R")==0
> or die "$ddir cmb.ann.sum failed $?";
> print "$ddir R CMD BATCH comp.ann.sum.R finished\n";
>
> wait();

What are you waiting for?

Xho

enstrophy.2000 · Feb 22, 2006

xhoster,

Thank you so much for your kind reply. You are right that the wait()
is redundant; I put it there nevertheless because I found the system
calls were not executed.

Good point that chdir() may create a race condition.
I wonder if there is a way to test for this specifically?
I have multiple directories, and each script writes its output to
the current directory. Perhaps I need a semaphore to make this
work? I would appreciate any input.

xhoster · Feb 22, 2006

> xhoster,
>
> Thank you so much for your kind reply. You are right that the wait()
> is redundant; I put it there nevertheless because I found the system
> calls were not executed.
>
> Good point that chdir() may create a race condition.
> I wonder if there is a way to test for this specifically?
> I have multiple directories, and each script writes its output to
> the current directory. Perhaps I need a semaphore to make this
> work? I would appreciate any input.

I don't think that *simply* adding a semaphore will help you, because then
you might as well not use threads at all: they will run one at a time.

My first recommendation would be using Parallel::ForkManager rather than
threads. Then you should get an independent cwd per process (at least on
Linux; I don't know what would happen on Windows).
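A rough sketch of the process-per-task idea, assuming a Unix-like system. It uses only core fork and wait as a stand-in for Parallel::ForkManager, which is not used here. The helper name run_in_children and the done.txt marker file are made up for illustration. Each child does its own chdir, so siblings cannot disturb each other's working directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: run $job->($dir) in a child process for each
# directory, with at most $max children alive at once.
# Returns a hash of dir => exit code (0 means the job returned true).
sub run_in_children {
    my ($max, $job, @dirs) = @_;
    my (%dir_of, %exit_of);

    my $reap = sub {
        my $pid = wait();
        if ($pid == -1) { %dir_of = (); return }   # no children left
        my $dir = delete $dir_of{$pid};
        $exit_of{$dir} = $? >> 8 if defined $dir;
    };

    for my $dir (@dirs) {
        $reap->() while keys %dir_of >= $max;      # throttle to $max
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                           # child process
            chdir $dir or exit 2;                  # affects this child only
            exit( $job->($dir) ? 0 : 1 );
        }
        $dir_of{$pid} = $dir;                      # parent remembers the child
    }
    $reap->() while keys %dir_of;                  # wait for the stragglers
    return %exit_of;
}

# Demo: each child writes a marker file via a relative path.
my @dirs = map { tempdir(CLEANUP => 1) } 1 .. 3;
my %status = run_in_children(2, sub {
    open my $fh, '>', 'done.txt' or return 0;      # lands in the child's cwd
    return close $fh;
}, @dirs);
print "$_ => exit $status{$_}\n" for sort keys %status;
```

In the script from this thread, the job would be the body of batch() without its chdir, and the directory list would be the "../$ddir" paths. The parent never changes directory, so relative paths stay valid for every fork.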

If not that, then I'd try re-writing your processes to use full (or at
least fuller) paths, so you don't need to chdir at all.

A third choice would be to do the chdir in the "system" calls:
system "cd ../$ddir && R CMD BATCH cmb.obs.mod.R" and die ....;
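One way this could be wrapped up, as a sketch that assumes a POSIX shell. The helper name run_in_dir is made up for illustration. The cd happens inside the shell that system() starts, so the working directory of the Perl process, and of its other threads, is left alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: run shell command $cmd with $dir as its working
# directory. Returns true if both the cd and the command succeeded;
# the caller can inspect $? otherwise.
sub run_in_dir {
    my ($dir, $cmd) = @_;
    # Single-quote the directory for the shell, escaping embedded quotes.
    (my $quoted = $dir) =~ s/'/'\\''/g;
    return system("cd '$quoted' && $cmd") == 0;
}

# Demo: pwd reports the temp directory, not the script's own cwd.
run_in_dir(File::Spec->tmpdir, 'pwd')
    or die "command failed, \$? = $?";
```

Because of the &&, the command is skipped when the cd fails, and the failure shows up as a non-zero status instead of the command running in the wrong directory.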

Finally, maybe a semaphore method, combined with putting jobs in the
background:

{
lock $semaphore;
chdir "../$ddir" or die;
system "R CMD BATCH cmb.obs.mod.R &" and die ...;
}; # release the lock
wait;

But on second thought, this won't work: you (probably) can't be sure
that wait will reap its own background job rather than the other thread's
background job, and once you put a job in the background, the return
value of "system" is no longer an indicator of overall success.

Xho

enstrophy.2000 · Feb 23, 2006

xhoster,
Thanks again for offering these solutions. I tried the first one and
it worked fine.
I did some more testing on the script that I posted, and I found I can
reproduce the problem by replacing "R CMD BATCH..." with the execution
of two Perl scripts.
Here are the scripts:

####################################################
test1.pl
####################################################
#!/usr/bin/perl

print "this is test1\n";

####################################################
test2.pl
####################################################
#!/usr/bin/perl

print "this is test2\n";

####################################################
test.thread.pl
####################################################
#!/usr/bin/perl -w

use threads;
use Thread::Queue;
use Cwd;

$|=1;

$num_thread = 2;

my $q = new Thread::Queue;

#populate this queue using directory names

opendir(DDIR,"..");
@ddir= grep(/^\d/,readdir(DDIR));

foreach $ddir(@ddir){
$q->enqueue($ddir);
}

for (1..$num_thread){
$thr[$_-1] = threads->new(\&sub1,$_);
# print "creating tid ", $thr[$_-1]->tid,"\n";
}

sub sub1 {
my ($id) = @_;
while($foo =$q->dequeue_nb){
batch($foo);
$left = $q->pending;
print "In the thread $id foo=$foo left=$left\n";
last if($left==0);
#sleep($id);
}
}

foreach $tt(@thr){
print "joining ", $tt->tid,"\n";
$tt->join();
}

sub batch{
my ($ddir) = @_;
chdir "../".$ddir||die "unable to enter $ddir\n";

print system("test1.pl >test1.log "),"\n";
print system("test2.pl >test2.log "),"\n";
}

I put test1.pl and test2.pl in each directory and ran test.thread.pl.
I then found test1.log in every directory where test1.pl is, but
test2.log in only some of the directories. As you pointed out, chdir is
not thread-safe, so I guess it is possible that the current directory
changed after the first system() call due to activity in the other
thread. I will test this a little further. Thank you very much for your
direction; I really appreciate it.
