parallel computing in perl?

Discussion in 'Perl Misc' started by Jie, Sep 13, 2007.

  1. Jie

    Jie Guest

    Hi,

    I have to randomly select a sample set and run it 1,000 times. The
    following code that I am using now works fine, except that it takes a
    long time.
    for (1..1000) {
        ## get the random sample set and then run a command
    }

    Now I am thinking of splitting this program into 10 processes to reduce
    the processing time. Of course, I can just change the first line to
    "for (1..100)" and run this same program in 10 different locations,
    but that is really tedious, and I believe there is a better way to
    accomplish it so that a big job can be split into multiple small jobs.
    Since I am repeatedly running a random sample set, there would be no
    need to worry about where each process ends and another process begins.

    Your insight is appreciated!!

    jie
     
    Jie, Sep 13, 2007
    #1

  2. J. Gleixner

    J. Gleixner Guest

    Check CPAN for: Parallel::ForkManager
     
    J. Gleixner, Sep 13, 2007
    #2

  3. You might want to look at Parallel::ForkManager. Your code would look
    something along the lines of

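    use Parallel::ForkManager;

    # Run at most 10 child processes at a time.
    my $pm = Parallel::ForkManager->new(10);

    for my $data (1 .. 1000) {
        # Parent: fork a child, then move on to the next iteration.
        my $pid = $pm->start and next;

        ## get the random sample and process it

        # Child: exit once the work is done.
        $pm->finish;
    }

    # Parent: block until every child has exited.
    $pm->wait_all_children;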

    //Makholm
     
    Peter Makholm, Sep 13, 2007
    #3
  4. Jie

    Jie Guest

    However, a problem with parallel computing is that files may be
    shared and overwritten.
    For example, previously my code generated a temporary file and the
    next loop iteration overwrote it with a newly generated file. There was
    no problem because the overwriting happened only after each iteration
    had finished. Now when I run 10 processes in parallel, for example, will
    those 10 temporary files or 10 temporary hashes/arrays/variables get
    messed up?

    thanks!

    jie
     
    Jie, Sep 13, 2007
    #4
  5. Use File::Temp when dealing with temporary files. Then the loop
    iterations should not overwrite each other's files, not even when
    running in parallel. Also, Perl variables aren't shared between
    forked processes.

    //Makholm
     
    Peter Makholm, Sep 13, 2007
    #5
  6. Given the specs,

    perldoc -f fork
    perldoc perlipc
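
    A minimal sketch of the plain fork approach those two documents
    describe, using only core Perl. The names run_one_sample and
    run_in_parallel are hypothetical, and the body of run_one_sample is a
    placeholder for the real sampling work. Each child calls srand() so
    that children forked from the same parent do not repeat the same
    random sequence.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "get the random sample set and run a command".
sub run_one_sample {
    my ($iteration) = @_;
    return rand();    # placeholder work
}

# Split $total iterations across $workers forked children.
# Returns the number of children that exited with status 0.
sub run_in_parallel {
    my ($total, $workers) = @_;
    my $per_worker = int($total / $workers);
    my @pids;

    for my $w (0 .. $workers - 1) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: reseed, then work through this worker's share.
            srand();
            my $first = $w * $per_worker + 1;
            # The last worker also picks up any remainder.
            my $last = $w == $workers - 1 ? $total : $first + $per_worker - 1;
            run_one_sample($_) for $first .. $last;
            exit 0;
        }
        push @pids, $pid;    # parent: remember the child and keep forking
    }

    # Parent: wait for every child and count the clean exits.
    my $ok = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $ok++ if $? == 0;
    }
    return $ok;
}

print run_in_parallel(1000, 10), " of 10 children finished cleanly\n";
```

    Parallel::ForkManager wraps this same fork/wait pattern and also caps
    the number of children running at once, which is usually more
    convenient than managing the process IDs by hand.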


    Michele
     
    Michele Dondi, Sep 13, 2007
    #6
  7. Variables belong each to their own process. As far as the files are
    concerned, just create ten *different* ones. File::Temp may be useful.
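
    Both points can be checked with a small sketch, assuming plain fork
    and the core File::Temp module. The name demo_isolation, the
    "sample" file prefix, and the worker count are hypothetical. Each
    child changes its own copy of $counter and writes to its own
    uniquely named file; the parent's $counter is left untouched.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Fork $workers children. Each one modifies its private copy of $counter
# and creates its own temp file inside a directory owned by the parent.
# Returns the parent's $counter and the number of files the children made.
sub demo_isolation {
    my ($workers) = @_;
    my $dir     = tempdir(CLEANUP => 1);    # removed when the parent exits
    my $counter = 0;
    my @pids;

    for my $w (1 .. $workers) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            $counter += 100;    # changes only this child's copy
            # tempfile() picks a unique name, so children never collide.
            my ($fh, $name) = tempfile('sampleXXXXXX', DIR => $dir);
            print {$fh} "worker $w counter $counter\n";
            close $fh or die "close failed: $!";
            exit 0;
        }
        push @pids, $pid;
    }
    waitpid($_, 0) for @pids;

    # Count the files the children left behind.
    opendir my $dh, $dir or die "opendir failed: $!";
    my @files = grep { /^sample/ } readdir $dh;
    closedir $dh;

    return ($counter, scalar @files);
}

my ($counter, $files) = demo_isolation(10);
print "parent counter: $counter, temp files: $files\n";
```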


    Michele
     
    Michele Dondi, Sep 14, 2007
    #7
  8. Jie

    Jie Guest

    Hi, thank you very much for the replies.

    I think below would be the code to do it.
    I don't know if I used the right syntax to open a temporary file...
    Also, I don't know if I need to use "$pm->wait_all_children;" as
    suggested by Peter.

    ==========================================================
    use File::Temp
    use Parallel::ForkManager;

    my $pm = new Parallel::ForkManager(10);

    for $data (1 .. 1000) {
    my $pid = $pm->start and next;
    open TEMP_FILE, tempfile();
    ## Do something with this temp_file
    $pm->finish;
    }
    =========================================================






     
    Jie, Sep 14, 2007
    #8
  9. J. Gleixner

    J. Gleixner Guest

    Really??.. that works??..

    If you want to know the right syntax, or what a method does,
    you may get the answer by actually reading the documentation.

    perldoc File::Temp
    perldoc Parallel::ForkManager
     
    J. Gleixner, Sep 14, 2007
    #9
  10. Usual recommendations:

    1. use lexical filehandles;
    2. use three-args form of open();
    3. check for success.

    # tempfile() returns (handle, name) in list context; take the name.
    open my $tempfile, '+>', (tempfile())[1]
        or die "Cannot open temporary file: $!";

    I changed the open mode because I suppose that you want to create the
    tempfile for writing and then read stuff back out of it. If you don't
    need the file to have a name, or to know it, then you can avoid
    File::Temp and let perl do it easily for you:

    open my $tempfile, '+>', undef
        or die "Cannot open temporary file: $!";
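
    A short sketch of the write-then-read-back use of such an anonymous
    temporary file, using only core Perl. The sub name roundtrip and the
    sample strings are hypothetical. The handle has to be rewound with
    seek before reading.

```perl
use strict;
use warnings;

# Write lines to an anonymous temp file, rewind, and read them back.
sub roundtrip {
    my (@lines) = @_;

    open my $tempfile, '+>', undef
        or die "Cannot create anonymous temp file: $!";

    print {$tempfile} "$_\n" for @lines;

    # Move back to the start of the file before reading.
    seek $tempfile, 0, 0 or die "seek failed: $!";
    chomp(my @back = <$tempfile>);

    close $tempfile or die "close failed: $!";
    return @back;
}

print join(', ', roundtrip('sample 1', 'sample 2')), "\n";
```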


    Michele
     
    Michele Dondi, Sep 15, 2007
    #10