Parallel computing in Perl?

Jie

Hi,

I have to randomly select a sample set and run it 1,000 times. The
following code that I am using now works fine, except that it is taking a
long time:

for (1..1000) {
    ## get the random sample set and then run a command
}

Now I am thinking of splitting this program into 10 processes to reduce
the processing time. Of course, I could just change the first line to
"for (1..100)" and run the same program in 10 different locations,
but that is really tedious, and I believe there is a better way to
split a big job into multiple small jobs. Since I am repeatedly
running a random sample set, there is no need to worry about where one
process ends and another begins.

Your insight is appreciated!!

jie
 

J. Gleixner

Jie said:
Hi,

I have to randomly select a sample set and run it 1,000 times. The
following code that I am using now works fine, except that it is taking a
long time:
fore (1..1000) { Are we golfing? :)
##get the random sample set and then run a command
}

Now I am thinking of splitting this program into 10 processes to reduce
the processing time. Of course, I could just change the first line to
"for (1..100)" and run the same program in 10 different locations,
but that is really tedious, and I believe there is a better way to
split a big job into multiple small jobs. Since I am repeatedly
running a random sample set, there is no need to worry about where one
process ends and another begins.

Your insight is appreciated!!

Check CPAN for: Parallel::ForkManager
 

Peter Makholm

Jie said:
for (1..1000) {
##get the random sample set and then run a command
}

Now I am thinking of splitting this program into 10 processes to reduce
the processing time. Of course, I could just change the first line to
"for (1..100)" and run the same program in 10 different locations.

You might want to look at Parallel::ForkManager. Your code would look
something like this:

use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(10);

for my $data (1 .. 1000) {
    my $pid = $pm->start and next;   # parent: child launched, move on

    ## get the random sample and process it

    $pm->finish;                     # child exits here
}

$pm->wait_all_children;

//Makholm
 

Jie

However, one problem with parallel computing is potential file
sharing and overwriting. For example, previously my code would
generate a temporary file, and the next loop would overwrite it with a
newly generated file. That was no problem, because the overwriting
happened after each process finished. Now, when I open 10 parallel
processes, will those 10 temporary files, or 10 temporary
hashes/arrays/variables, get mixed up?

thanks!

jie
 

Peter Makholm

Jie said:
However, one problem with parallel computing is potential file
sharing and overwriting. For example, previously my code would
generate a temporary file, and the next loop would overwrite it with a
newly generated file.

Use File::Temp when dealing with temporary files. Then the loops
won't overwrite each other's files, even when running in parallel.
That was no problem, because the overwriting happened after each
process finished. Now, when I open 10 parallel processes, will those
10 temporary files, or 10 temporary hashes/arrays/variables, get
mixed up?

And Perl variables aren't shared between forked processes.

//Makholm
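[Editor's note: for what it's worth, a minimal sketch of the File::Temp pattern (not code from this thread). Each call to tempfile() creates a brand-new, uniquely named file, so parallel children can't collide on names:]

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile() returns a unique filehandle/filename pair on every call,
# so ten forked children calling it get ten different files.
my ($fh, $filename) = tempfile( UNLINK => 1 );   # removed at program exit

print {$fh} "sample data for this process\n";
seek $fh, 0, 0;                                  # rewind to read it back
my $line = <$fh>;
print "wrote to $filename: $line";
```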
 

Michele Dondi

I have to randomly select a sample set and run it 1,000 times. The
following code that I am using now works fine, except that it is taking a
long time:

for (1..1000) {
    ## get the random sample set and then run a command
}

Now I am thinking of splitting this program into 10 processes to reduce
the processing time. Of course, I could just change the first line to
"for (1..100)" and run the same program in 10 different locations,
but that is really tedious, and I believe there is a better way to
split a big job into multiple small jobs. Since I am repeatedly
running a random sample set, there is no need to worry about where one
process ends and another begins.

Given the specs,

perldoc -f fork
perldoc perlipc


Michele
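[Editor's note: for the curious, the bare-fork() approach those perldoc pages describe might be sketched like this. The 10/100 split mirrors the numbers in the original post; the real per-iteration work is just a comment:]

```perl
use strict;
use warnings;

my $workers   = 10;     # split the 1,000 iterations across 10 children
my $per_chunk = 100;

for my $chunk (1 .. $workers) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                  # this branch runs in the child
        for (1 .. $per_chunk) {
            ## get the random sample set and then run a command
        }
        exit 0;                       # child must exit, or it keeps forking
    }
}

1 while wait() != -1;                 # parent reaps all ten children
```

Parallel::ForkManager does essentially this bookkeeping for you, which is why the other replies recommend it.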
 

Michele Dondi

The next loop would overwrite it with a newly generated file. That
was no problem, because the overwriting happened after each process
finished. Now, when I open 10 parallel processes, will those 10
temporary files, or 10 temporary hashes/arrays/variables, get
mixed up?

Variables each belong to their own process. As far as the files are
concerned, just create ten *different* ones. File::Temp may be useful.


Michele
 

Jie

Hi, thank you very much for the replies.

I think the code below would do it.
I don't know if I used the right syntax to open a temporary file...
Also, I don't know if I need to use "$pm->wait_all_children;" as
suggested by Peter.

==========================================================
use File::Temp
use Parallel::ForkManager;

my $pm = new Parallel::ForkManager(10);

for $data (1 .. 1000) {
my $pid = $pm->start and next;
open TEMP_FILE, tempfile();
## Do something with this temp_file
$pm->finish;
}
=========================================================
The next loop would overwrite it with a newly generated file. That
was no problem, because the overwriting happened after each process
finished. Now, when I open 10 parallel processes, will those 10
temporary files, or 10 temporary hashes/arrays/variables, get
mixed up?

Variables each belong to their own process. As far as the files are
concerned, just create ten *different* ones. File::Temp may be useful.

Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
.'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
 

J. Gleixner

Jie said:
Hi, thank you very much for the replies.

I think the code below would do it.
I don't know if I used the right syntax to open a temporary file...
Also, I don't know if I need to use "$pm->wait_all_children;" as
suggested by Peter.

==========================================================
use File::Temp
use Parallel::ForkManager;

Really??.. that works??..

If you want to know the right syntax, or what a method does,
you may get the answer by actually reading the documentation.

perldoc File::Temp
perldoc Parallel::ForkManager
 

Michele Dondi

I think below would be the code to do it.
I don't know if I used the right syntax to open a temporary file... [snip]
open TEMP_FILE, tempfile();

Usual recommendations:

1. use lexical filehandles;
2. use the three-arg form of open();
3. check for success.

open my $tempfile, '+>', tempfile() or die "open failed: $!";

I changed the open mode because I suppose that you want to create the
tempfile for writing and then read stuff back out of it. If you don't
need the file to have a name, or to know it, then you can avoid
File::Temp and let perl do it easily for you:

open my $tempfile, '+>', undef or die "open failed: $!";


Michele
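[Editor's note: putting the thread's advice together, a corrected version of Jie's loop might look like the sketch below. The per-iteration work is still just a comment, and the 10/1000 numbers come from the original post:]

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(10);    # at most 10 children at once

for my $data (1 .. 1000) {
    my $pid = $pm->start and next;          # parent: child launched, next loop

    # each child creates its own uniquely named temp file, so no clashes
    my ($fh, $filename) = tempfile( UNLINK => 1 );

    ## write the random sample to $fh, then run the command on $filename

    $pm->finish;                            # child exits here
}

# Yes, you do want wait_all_children: without it the parent can reach
# the end of the script while children are still running.
$pm->wait_all_children;
```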