parallel computing in perl?

Discussion in 'Perl Misc' started by Jie, Sep 13, 2007.

  1. Jie

    Jie Guest

    Hi,

    I have to randomly select a sample set and run it 1,000 times. The
    following code that I am using now works fine, except it is taking a
    long time.
    for (1..1000) {
        ## get the random sample set and then run a command
    }

    Now I am thinking of splitting this program into 10 processes to reduce
    the processing time. Of course, I could just change the first line to
    "for (1..100)" and run this same program in 10 different locations,
    but that is really tedious, and I believe there is a better way to
    accomplish it so that a big job can be split into multiple small jobs.
    Since I am repeatedly running a random sample set, there is no
    need to worry about where each process ends and another begins.

    Your insight is appreciated!!

    jie
    Jie, Sep 13, 2007
    #1

  2. J. Gleixner

    J. Gleixner Guest

    Jie wrote:
    > Hi,
    >
    > I have to randomly select a sample set and run it 1,000 times. The
    > following code that I am using now works fine, except it is taking a
    > long time.
    > for (1..1000) {

    Are we golfing? :)
    >     ## get the random sample set and then run a command
    > }
    >
    > Now I am thinking of splitting this program into 10 processes to reduce
    > the processing time. Of course, I could just change the first line to
    > "for (1..100)" and run this same program in 10 different locations,
    > but that is really tedious, and I believe there is a better way to
    > accomplish it so that a big job can be split into multiple small jobs.
    > Since I am repeatedly running a random sample set, there is no
    > need to worry about where each process ends and another begins.
    >
    > Your insight is appreciated!!


    Check CPAN for: Parallel::ForkManager
    J. Gleixner, Sep 13, 2007
    #2

  3. Peter Makholm

    Jie <> writes:

    > for (1..1000) {
    >     ## get the random sample set and then run a command
    > }
    >
    > Now I am thinking of splitting this program into 10 processes to reduce
    > the processing time. Of course, I could just change the first line to
    > "for (1..100)" and run this same program in 10 different locations.


    You might want to look at Parallel::ForkManager. Your code would look
    something like this:

    use Parallel::ForkManager;
    my $pm = Parallel::ForkManager->new(10);

    for my $data (1 .. 1000) {
        my $pid = $pm->start and next;

        ## get the random sample and process it

        $pm->finish;
    }

    $pm->wait_all_children;

    //Makholm
    Peter Makholm, Sep 13, 2007
    #3
  4. Jie

    Jie Guest

    However, the problem with parallel computing is potential file
    sharing and overwriting.
    For example, previously my code would generate a temporary file and the
    next loop would overwrite it with a newly generated file. There was no
    problem because the overwriting happened after each process
    finished. Now when I open 10 parallel processes, for example, will
    those 10 temporary files or 10 temporary hashes/arrays/variables get
    messed up?

    thanks!

    jie


    On Sep 13, 4:33 pm, Peter Makholm <> wrote:
    > Jie <> writes:
    > > for (1..1000) {
    > >     ## get the random sample set and then run a command
    > > }
    >
    > > Now I am thinking of splitting this program into 10 processes to reduce
    > > the processing time. Of course, I could just change the first line to
    > > "for (1..100)" and run this same program in 10 different locations.
    >
    > You might want to look at Parallel::ForkManager. Your code would look
    > something like this:
    >
    > use Parallel::ForkManager;
    > my $pm = Parallel::ForkManager->new(10);
    >
    > for my $data (1 .. 1000) {
    >     my $pid = $pm->start and next;
    >
    >     ## get the random sample and process it
    >
    >     $pm->finish;
    > }
    >
    > $pm->wait_all_children;
    >
    > //Makholm
    Jie, Sep 13, 2007
    #4
  5. Peter Makholm

    Jie <> writes:

    > However, the problem with parallel computing is potential file
    > sharing and overwriting.
    > For example, previously my code would generate a temporary file and the
    > next loop would overwrite it with a newly generated file.


    Use File::Temp when dealing with temporary files. Then no loop
    iteration will overwrite another's file, even when running in parallel.

    > There was no problem because the overwriting happened after each
    > process finished. Now when I open 10 parallel processes, for
    > example, will those 10 temporary files or 10 temporary
    > hashes/arrays/variables get messed up?


    And no, Perl variables aren't shared between forked processes.
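
    To make that concrete, here is a sketch (untested) combining
    Parallel::ForkManager with File::Temp, so each child works on its own
    uniquely named file:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(10);   # at most 10 children at once

for my $run (1 .. 1000) {
    $pm->start and next;   # parent loops on; only the child continues

    # tempfile() hands each child its own unique handle/name pair,
    # so parallel children never clobber each other's files
    my ($fh, $fname) = tempfile(UNLINK => 1);
    print {$fh} "random sample for run $run\n";
    close $fh or die "close: $!";

    $pm->finish;           # child exits here
}

$pm->wait_all_children;
print "all runs finished\n";
```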

    //Makholm
    Peter Makholm, Sep 13, 2007
    #5
  6. Michele Dondi

    On Thu, 13 Sep 2007 13:04:12 -0700, Jie <> wrote:

    >I have to randomly select a sample set and run it 1,000 times. The
    >following code that I am using now works fine, except it is taking a
    >long time.
    >for (1..1000) {
    >    ## get the random sample set and then run a command
    >}
    >
    >Now I am thinking of splitting this program into 10 processes to reduce
    >the processing time. Of course, I could just change the first line to
    >"for (1..100)" and run this same program in 10 different locations,
    >but that is really tedious, and I believe there is a better way to
    >accomplish it so that a big job can be split into multiple small jobs.
    >Since I am repeatedly running a random sample set, there is no
    >need to worry about where each process ends and another begins.


    Given the specs,

    perldoc -f fork
    perldoc perlipc
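
    For instance, a bare-bones version with plain fork(), splitting the
    1,000 iterations across 10 workers (a sketch; the sampling command is
    left as a comment, and this assumes a unixish system):

```perl
use strict;
use warnings;

my $total   = 1000;
my $workers = 10;

my @pids;
for my $w (1 .. $workers) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: handle its share of the iterations
        for my $i (1 .. $total / $workers) {
            ## get the random sample set and run the command here
        }
        exit 0;
    }
    push @pids, $pid;    # parent records the child and keeps forking
}

# parent waits for every child before proceeding
waitpid $_, 0 for @pids;
print "all $workers workers done\n";
```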


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
    Michele Dondi, Sep 13, 2007
    #6
  7. Michele Dondi

    On Thu, 13 Sep 2007 13:39:51 -0700, Jie <> wrote:

    >next loop would overwrite it with a newly generated file. There was no
    >problem because the overwriting happened after each process
    >finished. Now when I open 10 parallel processes, for example, will
    >those 10 temporary files or 10 temporary hashes/arrays/variables get
    >messed up?


    Variables each belong to their own process. As far as the files are
    concerned, just create ten *different* ones. File::Temp may be useful.
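
    For example, two calls to tempfile() never hand back the same name, so
    ten forked processes can each safely grab their own:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# each call returns a fresh open handle and a unique filename
my ($fh1, $name1) = tempfile(UNLINK => 1);
my ($fh2, $name2) = tempfile(UNLINK => 1);

print $name1 ne $name2 ? "names differ\n" : "collision!\n";
```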


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
    Michele Dondi, Sep 14, 2007
    #7
  8. Jie

    Jie Guest

    Hi, thank you very much for the replies.

    I think the code below would do it.
    I don't know if I used the right syntax to open a temporary file...
    Also, I don't know if I need to use "$pm->wait_all_children;" as
    suggested by Peter.

    ==========================================================
    use File::Temp
    use Parallel::ForkManager;

    my $pm = new Parallel::ForkManager(10);

    for $data (1 .. 1000) {
    my $pid = $pm->start and next;
    open TEMP_FILE, tempfile();
    ## Do something with this temp_file
    $pm->finish;
    }
    =========================================================






    On Sep 14, 5:26 am, Michele Dondi <> wrote:
    > On Thu, 13 Sep 2007 13:39:51 -0700, Jie <> wrote:
    > >next loop would overwrite it with a newly generated file. There was no
    > >problem because the overwriting happened after each process
    > >finished. Now when I open 10 parallel processes, for example, will
    > >those 10 temporary files or 10 temporary hashes/arrays/variables get
    > >messed up?

    >
    > Variables each belong to their own process. As far as the files are
    > concerned, just create ten *different* ones. File::Temp may be useful.
    >
    > Michele
    > --
    > {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    > (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    > .'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    > 256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
    Jie, Sep 14, 2007
    #8
  9. J. Gleixner

    J. Gleixner Guest

    Jie wrote:
    > Hi, thank you very much for the replies.
    >
    > I think the code below would do it.
    > I don't know if I used the right syntax to open a temporary file...
    > Also, I don't know if I need to use "$pm->wait_all_children;" as
    > suggested by Peter.
    >
    > ==========================================================
    > use File::Temp
    > use Parallel::ForkManager;


    Really??.. that works??..

    If you want to know the right syntax, or what a method does,
    you may get the answer by actually reading the documentation.

    perldoc File::Temp
    perldoc Parallel::ForkManager
    J. Gleixner, Sep 14, 2007
    #9
  10. Michele Dondi

    On Fri, 14 Sep 2007 07:27:12 -0700, Jie <> wrote:

    >I think the code below would do it.
    >I don't know if I used the right syntax to open a temporary file...

    [snip]
    > open TEMP_FILE, tempfile();


    Usual recommendations:

    1. use lexical filehandles;
    2. use the three-arg form of open();
    3. check for success.

    open my $tempfile, '+>', $tmpname or die "open failed: $!";

    (here $tmpname would come from File::Temp's tempfile() in list
    context). I changed the open mode because I suppose you want to create
    the tempfile for writing and then read stuff back out of it. If you
    don't need the file to have a name, or to know it, then you can avoid
    File::Temp and let perl do it easily for you:

    open my $tempfile, '+>', undef or die "open failed: $!";
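
    A quick demonstration of that anonymous-tempfile trick, writing and
    then reading back (a sketch):

```perl
use strict;
use warnings;

# '+>' with undef gives an anonymous read/write temporary file;
# it vanishes automatically when the handle goes out of scope
open my $tmp, '+>', undef or die "open failed: $!";
print {$tmp} "hello from the tempfile\n";

seek $tmp, 0, 0 or die "seek failed: $!";   # rewind before reading
print scalar <$tmp>;
```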


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
    Michele Dondi, Sep 15, 2007
    #10