fork, exec, and parallel processing

drew

Hi,

I've a need for a script that can query multiple NetBackup servers for
a clientlist, in parallel (rather than sequentially) if possible.

I've developed the following code, which seems to work, but also seems
to only run sequentially. What can I do to correct the script so it
will run the children in parallel?

Thanks,

Drew

__BEGIN_SCRIPT__
use strict;

my $passes = 0;
my @kids = ();
my @clients;
my %backup_results = ();
my @serverlist = qw/server1 server2/;
my %client_to_master = ();

for my $server (@serverlist) {
    my $pid = open(KID_TO_READ,"-|");
    print "PID: $pid\n";
    if ($pid) {
        # parent
        @clients = ();
        while(<KID_TO_READ>) {
            next if (/Windows/);
            next if (/^Hardware /);
            next if (/^-----/);
            my ($hardware,$os,$client_name) = split;
            push(@clients,$client_name);
            $client_to_master{$client_name} = $server;
        }
        close(KID_TO_READ) or warn "Child exited: $?\n";
        $backup_results{$server}{'clientlist'} = [@clients];
        push(@kids, $pid);
    } elsif ($pid == 0) {
        # child
        my @options = qw/ssh/;
        my $path = '/usr/openv/netbackup/bin/admincmd';
        my $cmd = 'bpplclients';
        my $bpplclients = $path . '/' . $cmd;
        push(@options,$server,$bpplclients);
        exec('/usr/bin/sudo',@options) or die "Can't exec: $!\n";
        exit(0);
    } else {
        die "Can't fork: $!\n";
    }
}


foreach(@kids) {
    waitpid($_,0);
}

exit(0);

__END_SCRIPT__
 
xhoster

Hi,

I've a need for a script that can query multiple NetBackup servers for
a clientlist, in parallel (rather than sequentially) if possible.

Did you look for existing modules that will do that?

I've developed the following code, which seems to work, but also seems
to only run sequentially. What can I do to correct the script so it
will run the children in parallel?

Thanks,

Drew

__BEGIN_SCRIPT__
use strict;

my $passes = 0;
my @kids = ();
my @clients;
my %backup_results = ();
my @serverlist = qw/server1 server2/;
my %client_to_master = ();

for my $server (@serverlist) {
my $pid = open(KID_TO_READ,"-|");

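You should use lexical file handles. Test the result with defined(),
because the open returns 0 in the child and a plain "or die" would
kill every child:

defined(my $pid = open(my $fh, "-|")) or die "Can't fork: $!\n";
push @handles,$fh;   # declare "my @handles;" before the loop
### or maybe, depending on your needs:
### $ioselect->add($fh);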
print "PID: $pid\n";
if ($pid) {
# parent
@clients = ();
while(<KID_TO_READ>) {
next if (/Windows/);
next if (/^Hardware /);
next if (/^-----/);
my ($hardware,$os,$client_name) = split;
push(@clients,$client_name);
$client_to_master{$client_name} = $server;
}
close(KID_TO_READ) or warn "Child exited: $?\n";
$backup_results{$server}{'clientlist'} = [@clients];
push(@kids, $pid);

This should all be moved to after the parallel open loop. I don't
think you need to save the $pid, only the file handle.

} elsif ($pid == 0) {
# child
my @options = qw/ssh/;
my $path = '/usr/openv/netbackup/bin/admincmd';
my $cmd = 'bpplclients';
my $bpplclients = $path . '/' . $cmd;
push(@options,$server,$bpplclients);
exec('/usr/bin/sudo',@options) or die "Can't exec: $!\n";
exit(0);
} else {
die "Can't fork: $!\n";
}
}

The child branch above would stay pretty much the same.

foreach(@kids) {
waitpid($_,0);
}

I don't think this currently does anything, as the close of a pipe open
automatically does a waitpid.

Anyway, you need to put the above parent code here. The exact nature
depends on what it is you need to do, as "parallel" covers a vast range
of options. If the children can be harvested in an arbitrary order, then
just use a:

foreach my $handle (@handles) {
## harvesting code from above here, with suitable changes.
}

If a fast child gets stuck behind a slow one in the @handles list, then
it will just sit around idle until the slow one is done, before it can get
harvested. Often, that is not a problem.
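A minimal sketch of that two-phase layout, assuming the children can be harvested in list order. The child here only fakes a bpplclients-style listing (the server names, the sleep, and the printed lines are all made up for illustration); the real script would exec sudo/ssh at that point instead.

```perl
use strict;
use warnings;

# Hypothetical server names; the child below fakes the command output.
my @serverlist = qw/server1 server2 server3/;
my ( %handle_for, %clients_for );

# Phase 1: start every child before reading from any of them.
for my $server (@serverlist) {
    defined( my $pid = open( my $fh, '-|' ) ) or die "Can't fork: $!\n";
    if ( $pid == 0 ) {
        # child: the real script would exec sudo/ssh here
        sleep 1;    # pretend the remote query is slow
        print "Hardware OS Client\n";
        print "-------- -- ------\n";
        print "Linux RedHat client-of-$server\n";
        exit 0;
    }
    $handle_for{$server} = $fh;    # parent: keep the handle for phase 2
}

# Phase 2: harvest in list order; all children are already running.
for my $server (@serverlist) {
    my $fh = $handle_for{$server};
    while ( my $line = <$fh> ) {
        next if $line =~ /Windows/;
        next if $line =~ /^Hardware /;
        next if $line =~ /^-----/;
        my ( $hardware, $os, $client_name ) = split ' ', $line;
        push @{ $clients_for{$server} }, $client_name;
    }
    close($fh) or warn "Child for $server exited: $?\n";
}

print "$_: @{ $clients_for{$_} }\n" for @serverlist;
```

The three fake children sleep at the same time, so the whole run should take about one second rather than three.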

On the other hand, if the children must be harvested ASAP after each one is
done, then you would need to use IO::Select or something like it.
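A rough sketch of the IO::Select variant, under the same assumption that stand-in children replace the real sudo/ssh command. The %delay_for table and the printed line are hypothetical; the point is only that the parent picks up whichever child finishes first.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical per-server delays so that server2 finishes before server1.
my %delay_for = ( server1 => 1, server2 => 0 );
my $select    = IO::Select->new;
my ( %server_of, %output_for, @finished );

for my $server ( sort keys %delay_for ) {
    defined( my $pid = open( my $fh, '-|' ) ) or die "Can't fork: $!\n";
    if ( $pid == 0 ) {
        # child: stand-in for the real exec
        sleep $delay_for{$server};
        print "Linux RedHat client-of-$server\n";
        exit 0;
    }
    $server_of{ fileno($fh) } = $server;    # remember who owns this pipe
    $select->add($fh);
}

# Serve whichever handles are readable until every child has hit EOF.
while ( $select->count ) {
    for my $fh ( $select->can_read ) {
        my $server = $server_of{ fileno($fh) };
        my $n = sysread( $fh, my $chunk, 4096 );
        if ($n) {
            $output_for{$server} .= $chunk;
            next;
        }
        # 0 means EOF, undef means a read error: either way we are done
        $select->remove($fh);
        close($fh) or warn "Child for $server exited: $?\n";
        push @finished, $server;
    }
}

print "finished in this order: @finished\n";
```

sysread is used instead of <$fh> because mixing buffered reads with select can leave data stuck in Perl's buffer. The collected text in %output_for still has to be split into lines and parsed afterwards.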

Xho
 
John W. Krahn

I've a need for a script that can query multiple NetBackup servers for
a clientlist, in parallel (rather than sequentially) if possible.

I've developed the following code, which seems to work, but also seems
to only run sequentially. What can I do to correct the script so it
will run the children in parallel?

In addition to xhoster's comments:

__BEGIN_SCRIPT__
use strict;

You should also include the 'warnings' pragma.

use warnings;

my $passes = 0;
my @kids = ();
my @clients;
my %backup_results = ();
my @serverlist = qw/server1 server2/;
my %client_to_master = ();

for my $server (@serverlist) {
my $pid = open(KID_TO_READ,"-|");
print "PID: $pid\n";
if ($pid) {
# parent
@clients = ();
while(<KID_TO_READ>) {
next if (/Windows/);
next if (/^Hardware /);
next if (/^-----/);
my ($hardware,$os,$client_name) = split;
push(@clients,$client_name);
$client_to_master{$client_name} = $server;
}
close(KID_TO_READ) or warn "Child exited: $?\n";
$backup_results{$server}{'clientlist'} = [@clients];
push(@kids, $pid);
} elsif ($pid == 0) {

The else clause will never execute because 'undef' in numerical context is the
same as 0:

$ perl -Mstrict -le'my $x; print $x == 0 ? "\$x == 0" : "oops"'
$x == 0
$ perl -Mwarnings -Mstrict -le'my $x; print $x == 0 ? "\$x == 0" : "oops"'
Use of uninitialized value in numeric eq (==) at -e line 1.
$x == 0

At least with warnings enabled you will get some indication. You should use
the defined() function to test for failure, something like:

defined( my $pid = open KID_TO_READ, '-|' )
or die "Can't fork: $!\n";

# child
my @options = qw/ssh/;
my $path = '/usr/openv/netbackup/bin/admincmd';
my $cmd = 'bpplclients';
my $bpplclients = $path . '/' . $cmd;
push(@options,$server,$bpplclients);
exec('/usr/bin/sudo',@options) or die "Can't exec: $!\n";
exit(0);
} else {
die "Can't fork: $!\n";
}
}
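To make the three possible results of a forking open explicit, here is a small sketch built around the defined() test. The helper name fork_outcome is made up for this example and is not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: classify what fork or open(..., '-|') returned.
sub fork_outcome {
    my ($pid) = @_;
    return 'failed' if !defined $pid;    # undef: no child was created
    return 'child'  if $pid == 0;        # 0: we are running in the child
    return 'parent';                     # otherwise: the child's PID
}

my $pid  = open( my $kid, '-|' );
my $role = fork_outcome($pid);
die "Can't fork: $!\n" if $role eq 'failed';

if ( $role eq 'child' ) {
    print "hello from child\n";          # goes down the pipe to the parent
    exit 0;
}

my $line = <$kid>;
close($kid) or warn "Child exited: $?\n";
print "parent read: $line";

print fork_outcome(undef), "\n";    # failed
print fork_outcome(0),     "\n";    # child
print fork_outcome(4242),  "\n";    # parent
```

Checking defined() first means a failed fork can never fall through into the child branch.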



John
 
Juha Laiho

I've a need for a script that can query multiple NetBackup servers for
a clientlist, in parallel (rather than sequentially) if possible.

I've developed the following code, which seems to work, but also seems
to only run sequentially. What can I do to correct the script so it
will run the children in parallel?

If I read the code correctly, your reader side (parent) is completely
sequential. You're only forking the children to run alongside the parent
(which is something you have to do anyway), but you're forking and reading
them one after another.

So, you'll need to fork separately, before opening the pipes. But to
do this, you'll also have to devise a way for the children of this "top"
fork to pass information back to the parent. If you just fork
and start a reading loop (before forking the next child), you haven't
won anything. You'll need some amount of asynchronicity between the
top-level process and the children running the clientlist queries.

So, looking at what the parent does: it loops through the server list.
for my $server (@serverlist) {
It creates a child.
my $pid = open(KID_TO_READ,"-|");
print "PID: $pid\n";
if ($pid) {
# parent
@clients = ();
The parent then goes into a reading loop.
while(<KID_TO_READ>) {
next if (/Windows/);
next if (/^Hardware /);
next if (/^-----/);
my ($hardware,$os,$client_name) = split;
push(@clients,$client_name);
$client_to_master{$client_name} = $server;
}
close(KID_TO_READ) or warn "Child exited: $?\n";
This loop only exits after the child has done all its processing,
and the next round of the serverlist loop can only start after that.
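The effect is easy to measure with stand-in children. In this sketch each fake child just sleeps for one second and prints a line; start_child and drain are names invented for the example, and the timings will vary a little from machine to machine.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical stand-in for one server query: sleep, then print one line.
sub start_child {
    my ($name) = @_;
    defined( my $pid = open( my $fh, '-|' ) ) or die "Can't fork: $!\n";
    if ( $pid == 0 ) {
        sleep 1;
        print "reply from $name\n";
        exit 0;
    }
    return $fh;    # parent gets the read end of the pipe
}

# Read until EOF, then reap the child via close.
sub drain {
    my ($fh) = @_;
    my @lines = <$fh>;
    close($fh) or warn "Child exited: $?\n";
    return @lines;
}

my @names = qw/server1 server2 server3/;

# Pattern from the original script: read each child right after starting it.
my $t0 = time;
drain( start_child($_) ) for @names;
my $sequential = time - $t0;

# Start all the children first, then read them.
$t0 = time;
my @handles = map { start_child($_) } @names;
drain($_) for @handles;
my $parallel = time - $t0;

printf "read inside the loop: %.1fs\n", $sequential;
printf "read after the loop:  %.1fs\n", $parallel;
```

The first figure should come out near three seconds and the second near one, since only in the second case do the sleeps overlap.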

The "elsif" branch gets executed just fine (regardless of what was
said in another response); however, you're not catching fork failures
(where $pid would actually be undef; see the code example in
"perldoc perlipc"). Fork failures are relatively rare, but here one
would cause your parent to execute the child code, which would be
pretty confusing to debug.
 
Joe

Use:
http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.5/ForkManager.pm

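However, the children cannot add to an array or hash in the parent,
since each forked child works on its own copy of the data, and
unsynchronized writes from several children to one file can overwrite
each other.

Dump the data you need into a database, then parse the database once
the script is finished...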

Works like a champ...
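A small sketch of that idea, using plain fork and one temporary file per child in place of Parallel::ForkManager and a database, so that it runs with core modules only. The server names and the fake client line are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my @serverlist = qw/server1 server2/;    # hypothetical names
my $dir        = tempdir();              # removed by hand at the end
my @seen_in_child;                       # only ever changed in children
my %pid_for;

for my $server (@serverlist) {
    my $pid = fork;
    die "Can't fork: $!\n" unless defined $pid;
    if ( $pid == 0 ) {
        # child: this push changes the child's private copy only
        push @seen_in_child, $server;
        open( my $out, '>', "$dir/$server.txt" ) or die "Can't write: $!\n";
        print {$out} "client-of-$server\n";    # stand-in for real results
        close($out) or die "Can't close: $!\n";
        exit 0;
    }
    $pid_for{$server} = $pid;
}
waitpid( $_, 0 ) for values %pid_for;    # wait until every file is complete

# Parent: parse what the children left behind, then clean up.
my %clients_for;
for my $server (@serverlist) {
    my $file = "$dir/$server.txt";
    open( my $in, '<', $file ) or die "Can't read $file: $!\n";
    chomp( my @clients = <$in> );
    close($in);
    unlink $file;
    $clients_for{$server} = \@clients;
}
rmdir $dir;

print "array in parent has ", scalar(@seen_in_child), " entries\n";
print "$_: @{ $clients_for{$_} }\n" for @serverlist;
```

The array stays empty in the parent even though every child pushed onto it, while the per-child files come through intact because no two children write to the same file.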

Joe
 
