Parallel / Multithreading / Forking?

Prabh

Hi,
My task involves searching a huge number of files on the filesystem and,
based on the file type, performing some function. E.g. if a file is ".java",
perform a particular routine; if it's ".pl", perform another.

As the number of files is quite huge, I thought it'd be most efficient if
I used some parallel-processing technique.

Some have advised me to use forking, some parallel processing, and
others multi-threading. Needless to say, I'm confused.

Are these terms being used interchangeably?
How do they differ from one another?

Could anyone give me a clue what would be a good fit for my task?
Should I be downloading any CPAN modules and such?

Thanks for your time.
Prabh
 
ctcgag

> Hi,
> My task involves searching a huge number of files on the filesystem and,
> based on the file type, performing some function. E.g. if a file is ".java",
> perform a particular routine; if it's ".pl", perform another.
>
> As the number of files is quite huge, I thought it'd be most efficient if
> I used some parallel-processing technique.

The easiest way to process a directory tree in parallel is just to run
the program multiple times independently, each one operating from
a different root node.

perl do_stuff.pl /opt/partition1/mystuff &
perl do_stuff.pl /opt/partition1/yourstuff &
perl do_stuff.pl /opt/partition2 &

etc.

But that doesn't do load balancing very well (if one program finishes much
earlier than the others, it will just exit, rather than start working on
whatever the slower ones haven't gotten to yet).

> Some have advised me to use forking, some parallel processing, and
> others multi-threading. Needless to say, I'm confused.

Forking and multi-threading are two different ways to do parallel
processing.

> Are these terms being used interchangeably?
> How do they differ from one another?

In very, very brief and lying more than a little: in Perl, forking is
usually for Unix/Linux and threading is usually for Windows. For more
honest and accurate descriptions, see "perldoc perlfork" and "perldoc
perlthrtut".

> Could anyone give me a clue what would be a good fit for my task?
> Should I be downloading any CPAN modules and such?

Look at Parallel::ForkManager.
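
A minimal sketch of how that might look, assuming Parallel::ForkManager is installed from CPAN. The %handler table, the sub names and the worker count of 4 are made up for illustration; the handler bodies are placeholders for the real per-type routines. Because a new child is started as soon as a slot frees up, this also sidesteps the load-balancing problem of running fixed subtrees independently.

```perl
use strict;
use warnings;
use File::Find;
use Parallel::ForkManager;    # CPAN module, not part of core Perl

# Hypothetical per-type handlers; replace the bodies with the real routines.
my %handler = (
    java => sub { my ($file) = @_; print "[$$] java: $file\n" },
    pl   => sub { my ($file) = @_; print "[$$] perl: $file\n" },
);

# Walk the given roots in the parent and keep only files we have a handler for.
sub collect_files {
    my (@roots) = @_;
    my @files;
    find(sub {
        push @files, $File::Find::name
            if -f && /\.(\w+)\z/ && $handler{lc $1};
    }, @roots);
    return @files;
}

# Process the files with at most $max_workers children running at once.
sub process_files {
    my ($max_workers, @files) = @_;
    my $pm = Parallel::ForkManager->new($max_workers);
    for my $file (@files) {
        $pm->start and next;              # parent: go on to the next file
        my ($ext) = $file =~ /\.(\w+)\z/;
        my $h = $handler{lc($ext // '')};
        $h->($file) if $h;                # child: do the work
        $pm->finish;                      # child exits here
    }
    $pm->wait_all_children;
}

process_files(4, collect_files(@ARGV)) if @ARGV;
```

Forking once per file is expensive if the per-file work is small; handing each child a batch of files instead is probably a better fit in that case.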

Xho
 
Ben Morrow

Quoth (e-mail address removed):
> In very, very brief and lying more than a little: in Perl, forking is
> usually for Unix/Linux and threading is usually for Windows. For more
> honest and accurate descriptions, see "perldoc perlfork" and "perldoc
> perlthrtut".

I know you know this, but the OP probably doesn't: at the moment, perl's
threading implementation is not worth using on Unix if you can use fork
instead, and perl will use threads to emulate fork on Win32 (though in my
experience perl 5.8 and WinNT are required for it to work well). So the
best solution usually is to just pretend you're on Unix and use fork, and
that will do whatever is possible on your platform. If you're on Win9x, I
wouldn't hold out much hope for any performance gain by multitasking, as
the Win9x scheduler is *awful*.
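
For what "just use fork" can look like with nothing but core Perl, here is a rough sketch. The sub names partition and run_in_children and the worker count of 3 are invented for the example, and the work done per item is a placeholder.

```perl
use strict;
use warnings;

# Split @items into $n slices, dealing them out round-robin.
sub partition {
    my ($n, @items) = @_;
    my @slices = map { [] } 1 .. $n;
    push @{ $slices[ $_ % $n ] }, $items[$_] for 0 .. $#items;
    return @slices;
}

# Fork one child per non-empty slice; each child calls $work on its items.
# Returns the number of children that exited with status 0.
sub run_in_children {
    my ($work, @slices) = @_;
    my @pids;
    for my $slice (grep { @$_ } @slices) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                  # child
            $work->($_) for @$slice;
            exit 0;
        }
        push @pids, $pid;                 # parent
    }
    my $ok = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $ok++ if $? == 0;
    }
    return $ok;
}

# Example: each path on the command line is printed by one of 3 workers.
run_in_children(sub { print "[$$] $_[0]\n" }, partition(3, @ARGV)) if @ARGV;
```

The slices are fixed up front, so a child that finishes early simply exits; it does not pick up work from slower children.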

Ben
 
Anno Siegel

> The easiest way to process a directory tree in parallel is just to run
> the program multiple times independently, each one operating from
> a different root node.
>
> perl do_stuff.pl /opt/partition1/mystuff &
> perl do_stuff.pl /opt/partition1/yourstuff &
> perl do_stuff.pl /opt/partition2 &
>
> etc.
>
> But that doesn't do load balancing very well (if one program finishes much
> earlier than the others, it will just exit, rather than start working on
> whatever the slower ones haven't gotten to yet).

If there are symlinks criss-crossing the file system there is also
the possibility that it does part of the tree repeatedly.

Anno
 
ctcgag

> If there are symlinks criss-crossing the file system there is also
> the possibility that it does part of the tree repeatedly.

Well, that depends on the internal rules that do_stuff.pl uses for
traversing down from the top-level directories it is given. The same
need for care arises in threaded or forked situations, also.
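
One possible set of such rules, as a sketch only: don't follow symlinks (which is File::Find's default) and remember each file's device and inode number so nothing is handled twice. The sub name unique_files is made up, and this assumes a Unix-like filesystem where inode numbers are meaningful. It also only de-duplicates within a single process, so it fits best when one parent collects the file list before handing work to children.

```perl
use strict;
use warnings;
use File::Find;

# Walk @roots and return each regular file once, even if the roots overlap.
# Symlinks are skipped rather than followed, and files are de-duplicated
# by device:inode (so extra hard links to the same file are dropped too).
sub unique_files {
    my (@roots) = @_;
    my (%seen, @files);
    find({
        no_chdir => 1,                    # $_ holds the full path
        wanted   => sub {
            my @st = lstat $_ or return;  # lstat does not resolve symlinks
            return unless -f _;           # regular files only
            push @files, $_ unless $seen{"$st[0]:$st[1]"}++;
        },
    }, @roots);
    return @files;
}

if (@ARGV) {
    print "$_\n" for unique_files(@ARGV);
}
```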

Xho
 
