fork, exec, and parallel processing

Discussion in 'Perl Misc' started by drew, Mar 26, 2007.

  1. drew

    drew Guest

    Hi,

    I've a need for a script that can query multiple NetBackup servers for
    a clientlist, in parallel (rather than sequentially) if possible.

    I've developed the following code, which seems to work, but also seems
    to only run sequentially. What can I do to correct the script so it
    will run the children in parallel?

    Thanks,

    Drew

    __BEGIN_SCRIPT__
    use strict;

    my $passes = 0;
    my @kids = ();
    my @clients;
    my %backup_results = ();
    my @serverlist = qw/server1 server2/;
    my %client_to_master = ();

    for my $server (@serverlist) {
        my $pid = open(KID_TO_READ, "-|");
        print "PID: $pid\n";
        if ($pid) {
            # parent
            @clients = ();
            while (<KID_TO_READ>) {
                next if (/Windows/);
                next if (/^Hardware /);
                next if (/^-----/);
                my ($hardware, $os, $client_name) = split;
                push(@clients, $client_name);
                $client_to_master{$client_name} = $server;
            }
            close(KID_TO_READ) or warn "Child exited: $?\n";
            $backup_results{$server}{'clientlist'} = [@clients];
            push(@kids, $pid);
        } elsif ($pid == 0) {
            # child
            my @options = qw/ssh/;
            my $path = '/usr/openv/netbackup/bin/admincmd';
            my $cmd = 'bpplclients';
            my $bpplclients = $path . '/' . $cmd;
            push(@options, $server, $bpplclients);
            exec('/usr/bin/sudo', @options) or die "Can't exec: $!\n";
            exit(0);
        } else {
            die "Can't fork: $!\n";
        }
    }

    foreach (@kids) {
        waitpid($_, 0);
    }

    exit(0);

    __END_SCRIPT__
     
    drew, Mar 26, 2007
    #1

  2. drew

    xhoster Guest

    Did you look for existing modules that will do that?
    You should use lexical file handles.

    defined( my $pid = open(my $fh, "-|") ) or die;
    push @handles, $fh;
    ### or maybe, depending on your needs:
    ### $ioselect->add($fh);
    print "PID: $pid\n";
    This should all be moved to after the parallel open loop. I don't
    think you need to save the $pid, only the file-handle.

    This part would be pretty much the same.
    I don't think this currently does anything, as the close of a pipe open
    automatically does a waitpid.

    Anyway, you need to put the above parent code here. The exact nature
    depends on what it is you need to do, as "parallel" covers a vast range
    of options. If the children can be harvested in an arbitrary order, then
    just use a:

    foreach my $handle (@handles) {
        ## harvesting code from above here, with suitable changes.
    }

    If a fast child gets stuck behind a slow one in the @handles list, then
    it will just sit around idle until the slow one is done before it can
    get harvested. Often, that is not a problem.

    On the other hand, if the children must be harvested ASAP after each one is
    done, then you would need to use IO::Select or something like it.
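    A minimal sketch of that IO::Select approach. The commands and the
    output lines below are stand-ins, not the real sudo/ssh/bpplclients
    invocation; the idea is just that each pipe handle is registered with
    the selector and drained in whatever order the children finish:

```perl
use strict;
use warnings;
use IO::Select;

# Stand-in commands: one slow child, one fast child.
my %cmd_for = (
    server1 => [ 'perl', '-e', 'sleep 1; print "host-a\nhost-b\n"' ],
    server2 => [ 'perl', '-e', 'print "host-c\n"' ],
);

my $sel = IO::Select->new;
my %server_for;    # map each handle back to its server name
my %results;

for my $server ( sort keys %cmd_for ) {
    defined( my $pid = open( my $fh, '-|', @{ $cmd_for{$server} } ) )
        or die "Can't fork: $!\n";
    $sel->add($fh);
    $server_for{$fh} = $server;
}

# Harvest whichever child has output ready; remove a handle at EOF.
# (Mixing select with buffered readline is safe here only because the
# children emit short, complete lines and then exit.)
while ( $sel->count ) {
    for my $fh ( $sel->can_read ) {
        my $line = <$fh>;
        if ( defined $line ) {
            chomp $line;
            push @{ $results{ $server_for{$fh} } }, $line;
        }
        else {                 # EOF: child is done
            $sel->remove($fh);
            close $fh;         # implicit waitpid for a pipe open
        }
    }
}
```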

    Xho
     
    xhoster, Mar 26, 2007
    #2

  3. In addition to xhoster's comments:

    You should also include the 'warnings' pragma.

    use warnings;

    The else clause will never execute because 'undef' in numerical context is the
    same as 0:

    $ perl -Mstrict -le'my $x; print $x == 0 ? "\$x == 0" : "oops"'
    $x == 0
    $ perl -Mwarnings -Mstrict -le'my $x; print $x == 0 ? "\$x == 0" : "oops"'
    Use of uninitialized value in numeric eq (==) at -e line 1.
    $x == 0

    At least with warnings enabled you will get some indication. You should use
    the defined() function to test for failure, something like:

    defined( my $pid = open KID_TO_READ, '-|' )
    or die "Can't fork: $!\n";



    John
     
    John W. Krahn, Mar 26, 2007
    #3
  4. drew

    Juha Laiho Guest

    If I read the code correctly, your reader side (parent) is completely
    sequential. You're only forking the children to run alongside the
    parent (which is something you have to do anyway), but you're
    forking/reading them sequentially.

    So, you'll need to fork separately, before opening the pipes. But to
    do this, you'll also have to devise a way for the children of this
    "top" fork to pass information back to the parent. If you just fork
    and start a reading loop (before forking the next child), you haven't
    gained anything. You'll need some amount of asynchronicity between the
    top-level process and the children running the clientlist queries.

    So, looking at what the parent does: it loops through the server list,
    then goes into the reading loop, which only exits after the child has
    done all its processing, and only then can the next round of the
    serverlist loop start.

    The "elsif" branch gets executed just fine (regardless of what was
    said in another response); however, you're not catching fork failures
    (where $pid would actually be undef; see the code example in
    "perldoc perlipc"). Fork failures are relatively rare, though -- but
    here one would cause your parent to execute the child code, which
    would be pretty confusing to debug.
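    One way to restructure along those lines is to open all the pipes
    first and only then start reading, so the children overlap even though
    the parent drains the pipes one at a time. A sketch, with a plain echo
    standing in for the sudo/ssh/bpplclients command:

```perl
use strict;
use warnings;

my @serverlist = qw/server1 server2/;
my %backup_results;
my @readers;    # [ $server, $fh ] pairs

# Phase 1: fork every child; the parent just remembers the handles.
for my $server (@serverlist) {
    defined( my $pid = open( my $fh, '-|' ) )
        or die "Can't fork: $!\n";
    if ($pid) {                               # parent
        push @readers, [ $server, $fh ];
    }
    else {                                    # child
        exec( 'echo', "client-of-$server" )   # stand-in command
            or die "Can't exec: $!\n";
    }
}

# Phase 2: read; slow children overlap with fast ones meanwhile.
for my $r (@readers) {
    my ( $server, $fh ) = @$r;
    chomp( my @clients = <$fh> );
    close $fh or warn "Child for $server exited: $?\n";
    $backup_results{$server}{clientlist} = \@clients;
}
```

    The close of each pipe handle reaps that child, so no separate
    waitpid loop is needed.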
     
    Juha Laiho, Apr 4, 2007
    #4
  5. drew

    Joe Guest

    Use:
    http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.5/ForkManager.pm

    However, you cannot collect results into a shared file or array while
    this is forking: each child runs in its own process with its own copy
    of the data, so writes made in a child never propagate back to the
    parent (and uncoordinated writes to one file overwrite each other).

    Dump the data you need into a database, then parse the database once
    the script is finished...

    Works like a champ...
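    A sketch of the Parallel::ForkManager route. Note that newer versions
    than the 0.7.5 linked above can pass a data structure back from each
    child via the last argument to finish(), which sidesteps the
    shared-array problem without a database; the backtick command below is
    a stand-in for the real query:

```perl
use strict;
use warnings;
use Parallel::ForkManager;    # CPAN; data passing needs a newer version

my @serverlist = qw/server1 server2/;
my %backup_results;

my $pm = Parallel::ForkManager->new( scalar @serverlist );

# Runs in the parent as each child exits; $data is the reference the
# child handed to finish().
$pm->run_on_finish( sub {
    my ( $pid, $exit, $server, $signal, $core, $data ) = @_;
    $backup_results{$server} = $data if defined $data;
} );

for my $server (@serverlist) {
    $pm->start($server) and next;    # parent continues the loop

    # Child: run the query (stand-in command) and collect its lines.
    chomp( my @clients = `echo client-of-$server` );

    # Hand a reference back; ForkManager serializes it for the parent.
    $pm->finish( 0, { clientlist => \@clients } );
}
$pm->wait_all_children;
```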

    Joe
     
    Joe, Apr 18, 2007
    #5
