multiprocessing

fsaldan1

I am a beginner with Python, coming from R, and I am having problems with parallelization with the multiprocessing module. I know that other people have asked similar questions but the answers were mostly over my head.

Here is my problem: I tried to execute code in parallel in two ways:

1) In a plain xyz.py file without calling main()
2) In an xyz.py file that calls main()

Under 1) I was able to run parallel processes but:

a) The whole script runs from the beginning up to the line where p1.start() or p2.start() is called. That is, if I had 10 processes p1, p2, ..., p10 the whole file would be run from the beginning up to the line where the command pX.start() is called. Maybe it has to be that way so that these processes get the environment they need, but I doubt it.

b) I was not able to extract a value from the function called. I was able only to use print(). I tried to create a Queue object to get the return values but then I get error messages:
from multiprocessing import *

print('\nRunning ' + __name__ + "\n")

from multiprocessing import Process, Queue, freeze_support
freeze_support() # it does not make any difference to run this command or not

queue1 = Queue() # create a queue object


def multiply(a, b, que):
    que.put(a * b)

p = Process(target=multiply, args=(5, 4, queue1))
p.start()
p.join()


Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\forking.py", line 350, in main
    prepare(preparation_data)
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\forking.py", line 457, in prepare
    '__parents_main__', file, path_name, etc
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\imp.py", line 175, in load_module
    return load_source(name, filename, file)
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\imp.py", line 114, in load_source
    _LoadSourceCompatibility(name, pathname, file).load_module(name)
  File "<frozen importlib._bootstrap>", line 586, in _check_name_wrapper
  File "<frozen importlib._bootstrap>", line 1024, in load_module
  File "<frozen importlib._bootstrap>", line 1005, in load_module
  File "<frozen importlib._bootstrap>", line 562, in module_for_loader_wrapper
  File "<frozen importlib._bootstrap>", line 870, in _load_module
  File "<frozen importlib._bootstrap>", line 313, in _call_with_frames_removed
  File "C:\Files\Programs\Wush\Python\parallel_test_2.py", line 15, in <module>
    p.start()
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\process.py", line 111, in start
    self._popen = Popen(self)
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\forking.py", line 216, in __init__
    cmd = ' '.join('"%s"' % x for x in get_command_line())
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\forking.py", line 328, in get_command_line
    is not going to be frozen to produce a Windows executable.''')
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.

This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:

    if __name__ == '__main__':
        freeze_support()
        ...

The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.

Under 2) I get problems with pickling. See below:


from multiprocessing import *


def main():

    print('\nRunning ' + __name__ + "\n")

    freeze_support()

    def f(name):
        print('hello', name)

    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

if __name__ == '__main__':
    main()

Running __main__

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 562, in runfile
    execfile(filename, namespace)
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 41, in execfile
    exec(compile(open(filename).read(), filename, 'exec'), namespace)
  File "C:\Files\Programs\Wush\Python\parallel_test.py", line 18, in <module>
    main()
  File "C:\Files\Programs\Wush\Python\parallel_test.py", line 14, in main
    p.start()
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\process.py", line 111, in start
    self._popen = Popen(self)
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\forking.py", line 248, in __init__
    dump(process_obj, to_child, HIGHEST_PROTOCOL)
  File "C:\WinPython-64bit-3.3.2.1\python-3.3.2.amd64\lib\multiprocessing\forking.py", line 166, in dump
    ForkingPickler(file, protocol).dump(obj)
 
Ian Kelly

I am a beginner with Python, coming from R, and I am having problems with parallelization with the multiprocessing module. I know that other people have asked similar questions but the answers were mostly over my head.

Here is my problem: I tried to execute code in parallel in two ways:

1) In a plain xyz.py file without calling main()
2) In an xyz.py file that calls main()

Under 1) I was able to run parallel processes but:

a) The whole script runs from the beginning up to the line where p1.start() or p2.start() is called. That is, if I had 10 processes p1, p2, ..., p10 the whole file would be run from the beginning up to the line where the command pX.start() is called. Maybe it has to be that way so that these processes get the environment they need, but I doubt it.

See the multiprocessing programming guidelines at:
http://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming

In particular, read the section titled "Safe importing of main module"
under "17.2.3.2 Windows". The child process needs to import the main
module, which means that anything that isn't protected by "if __name__
== '__main__'" is going to get executed in the child process. This
also appears to be the cause of the error that you pasted.
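
As a rough sketch of that idiom applied to the Queue example (not the
original script; the helper name run_multiply is made up here for
illustration), something along these lines should let the child be
started safely and the result be read back in the parent. The
freeze_support() call is left out on the assumption that the script is
not being frozen into a Windows executable.

```python
from multiprocessing import Process, Queue


def multiply(a, b, que):
    # Runs in the child process; the result travels back through the queue.
    que.put(a * b)


def run_multiply(a, b):
    # Hypothetical helper (not in the original post): start one child
    # process and return the value it puts on the queue.
    que = Queue()
    p = Process(target=multiply, args=(a, b, que))
    p.start()
    result = que.get()  # read before join() so the child can flush its data
    p.join()
    return result


if __name__ == '__main__':
    # Only the parent runs this block; a child that re-imports the module
    # skips it, so no process is started during import.
    print(run_multiply(5, 4))  # prints 20
```

Everything that creates or starts a process sits either inside a
function or under the guard, so importing the module in the child has no
side effects.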
Under 2) I get problems with pickling. See below


from multiprocessing import *


def main():

    print('\nRunning ' + __name__ + "\n")

    freeze_support()

    def f(name):
        print('hello', name)

    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

if __name__ == '__main__':
    main()

Read the section "More picklability" under the above documentation
link. I suspect the problem here is that the target function f that
you're using is a nested function (a closure) defined inside main().
When multiprocessing tries to find it in the child process it can't,
because main() hasn't been called there and so the function isn't
defined. Try moving it to the module namespace.
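
A minimal sketch of that change, assuming nothing else in the script
needs f to live inside main() (main() returning the exit code is an
addition for illustration, not part of the original code):

```python
from multiprocessing import Process


def f(name):
    # Defined at module level, so the child process can find it by name
    # after importing the main module.
    print('hello', name)


def main():
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
    return p.exitcode  # 0 when the child finished without an error


if __name__ == '__main__':
    main()
```

The only structural difference from the posted script is that def f has
moved out of main(); the Process call itself is unchanged.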
 
