parallel computations: subprocess.Popen(...).communicate()[0] does not work with multiprocessing.Pool

Hseu-Ming Chen

Hi,
I am having an issue when making a shell call from within a
multiprocessing.Process(). Here is the story: I tried to parallelize
the computations in 800-ish Matlab scripts and then save the results
to MySQL. The non-parallel/serial version has been running fine for
about 2 years. However, in the parallel version via multiprocessing
that I'm working on, it appears that the Matlab scripts are never
kicked off and nothing happens with subprocess.Popen. The debug
printing below does not show up either.

Moreover, even if I replace the Matlab invocation with some trivial
"sed" call, still nothing happens.

Is it possible that the Python interpreter I'm using (version 2.6,
released on Oct. 1, 2008) is too old? Nevertheless, I would like to
make sure the basic framework I have now is not blatantly wrong.

Below is a skeleton of my Python program:

----------------------------------------------
import subprocess
import sys
from multiprocessing import Pool

def worker(DBrow, config):
    # run one Matlab script
    cmd1 = "/usr/local/bin/matlab ... myMatlab.1.m"
    subprocess.Popen([cmd1], shell=True, stdout=subprocess.PIPE).communicate()[0]
    print "this does not get printed"

    cmd2 = "sed ..."
    print subprocess.Popen(cmd2, shell=True,
                           stdout=subprocess.PIPE).communicate()[0]
    print "this does not get printed either"
    sys.stdout.flush()

### main program below
.......
# kick off parallel processing
pool = Pool()
for DBrow in DBrows:
    pool.apply_async(worker, (DBrow, config))
pool.close()
pool.join()
.......
----------------------------------------------

Furthermore, I also tried adding the following:
multiprocessing.current_process().curr_proc.daemon = False
at the beginning of the "worker" function above but to no avail.

Any help would really be appreciated.
 
Chris Torek

I obviously do not have your code, and have not even tried this as
an experiment in a simplified environment, but:
----------------------------------------------
import subprocess
from multiprocessing import Pool

def worker(DBrow, config):
    # run one Matlab script
    cmd1 = "/usr/local/bin/matlab ... myMatlab.1.m"
    subprocess.Popen([cmd1], shell=True, stdout=subprocess.PIPE).communicate()[0]
    print "this does not get printed"
    ...

# kick off parallel processing
pool = Pool()
for DBrow in DBrows:
    pool.apply_async(worker, (DBrow, config))
pool.close()
pool.join()
----------------------------------------------

The multiprocessing code makes use of pipes to communicate between
the various subprocesses it creates. I suspect these "extra" pipes
are interfering with your subprocesses, when the pool shutdown
(pool.close() followed by pool.join()) waits for the Matlab script
to do something with its copy of the pipes.
To make the subprocess module close them -- so that Matlab does
not have them in the first place and hence the shutdown cannot get
stuck there -- add "close_fds=True" to the Popen() call.
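
A minimal sketch of the close_fds suggestion above, written so that it runs on Python 3 as well as 2.6. The helper name run_command and the "echo hello" command are illustrative stand-ins, not part of the original program, and a Unix-like system with a POSIX shell is assumed.

```python
import subprocess


def run_command(cmd):
    """Run cmd through the shell and return its captured stdout as bytes."""
    proc = subprocess.Popen(
        cmd,                     # a single string, because shell=True
        shell=True,
        stdout=subprocess.PIPE,
        close_fds=True,          # do not hand inherited descriptors
                                 # (such as the pool's pipes) to the child
    )
    # communicate() reads stdout to EOF and waits for the child to exit
    return proc.communicate()[0]


if __name__ == "__main__":
    print(run_command("echo hello"))
```

On Python 3.2 and later, close_fds already defaults to True on POSIX systems, so the explicit argument mainly matters on older interpreters such as 2.6.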

There could still be issues with competing wait() and/or waitpid()
calls (assuming you are using a Unix-like system, or whatever the
equivalent is for Windows) "eating" the wrong subprocess completion
notifications, but that one is harder to solve in general :) so
if close_fds fixes things, it was just the pipes. If close_fds
does not fix things, you will probably need to defer the pool.close()
step until after all the subprocesses complete.
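
One way to defer the pool shutdown, sketched under assumptions: keep the AsyncResult objects that apply_async() returns and call get() on each one before closing the pool. get() blocks until the task has finished and re-raises any exception raised inside the worker, which apply_async() otherwise does not report anywhere. The names worker and run_all and the echo command are hypothetical stand-ins for the real Matlab job, and a Unix-like system is assumed.

```python
import subprocess
from multiprocessing import Pool


def worker(row):
    """Hypothetical stand-in for the real worker: echo the row via the shell."""
    proc = subprocess.Popen("echo %s" % row, shell=True,
                            stdout=subprocess.PIPE, close_fds=True)
    return proc.communicate()[0].decode().strip()


def run_all(rows):
    """Run worker() on every row in a pool and return the results in order."""
    pool = Pool(2)
    try:
        pending = [pool.apply_async(worker, (row,)) for row in rows]
        # get() waits for each task and re-raises worker exceptions here
        results = [p.get() for p in pending]
    finally:
        # shut the pool down only after every result has been collected
        pool.close()
        pool.join()
    return results


if __name__ == "__main__":
    print(run_all(["a", "b", "c"]))
```

If a worker fails (for example with a NameError from a missing import), the traceback surfaces at the get() call instead of the task silently producing no output.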