Parallelizing Python code - design/implementation questions

stdazi

Hello!

I'm about to parallelize some algorithm that turned out to be too
slow. Before I start doing it, I'd like to hear some suggestions/hints
from you.

The algorithm essentially works like this: there is an iterator
function "foo" yielding a special kind of permutation of [1,...,n]. The
main program then iterates through these permutations, calculating some
properties. Each time a calculation finishes, a counter is incremented,
and each time the counter is divisible by 100, the current progress is
printed.
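
In serial form, the loop might look roughly like this ("foo" and
"calculate" are stand-ins for the real generator and the real property
computation, here faked with itertools.permutations):

====
from itertools import permutations

def foo():                       # stand-in for the real generator
    return permutations(range(8))

def calculate(perm):             # stand-in for the property computation
    return sum(perm)

counter = 0
for perm in foo():
    calculate(perm)
    counter += 1
    if counter % 100 == 0:
        print "progress:", counter
====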

The classical idea is to spawn m threads, take a global lock around
each call to the shared iterator, and take another global lock when
incrementing the progress counter. Is there any better way? I'm
especially concerned about performance degradation due to locking - is
there any way to avoid it?
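
A minimal sketch of that classical design, using the stand-in foo and
calculate from above (note that in CPython the GIL will keep plain
threads from speeding up CPU-bound work, which is presumably why
multiprocessing comes up below):

====
import threading
from itertools import permutations

def foo():                       # stand-in for the real generator
    return permutations(range(8))

def calculate(perm):             # stand-in for the property computation
    return sum(perm)

m = 4
lock = threading.Lock()
counter = 0
permutation = foo()

def worker():
    global counter
    while True:
        with lock:               # serialize access to the shared iterator
            try:
                perm = permutation.next()
            except StopIteration:
                return
        calculate(perm)          # the real work runs outside the lock
        with lock:               # protect the shared progress counter
            counter += 1
            if counter % 100 == 0:
                print "progress:", counter

threads = [threading.Thread(target=worker) for i in xrange(m)]
for t in threads:
    t.start()
for t in threads:
    t.join()
====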

I've also read about the `multiprocessing' module, and as far as I've
understood:

====
from multiprocessing import Process

permutation = foo()
threadlst = []
for i in xrange(m):
    p = Process(target=permutation.next)
    threadlst.append(p)
    p.start()
for p in threadlst:
    p.join()
====

should do the trick. Am I right? Is there a better way to do this?
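
As written, each Process would only execute a single next() call on its
own copy of the iterator, so the snippet above doesn't actually
distribute the calculations. For comparison, a rough sketch of the same
job using a multiprocessing.Pool, which feeds permutations from the
parent to a fixed set of worker processes and needs no explicit locks
(foo and calculate are again the stand-ins from above):

====
from multiprocessing import Pool
from itertools import permutations

def foo():                       # stand-in for the real generator
    return permutations(range(8))

def calculate(perm):             # stand-in for the property computation
    return sum(perm)

if __name__ == '__main__':
    m = 4
    pool = Pool(processes=m)
    counter = 0
    # imap_unordered yields results as the workers finish them
    for result in pool.imap_unordered(calculate, foo()):
        counter += 1
        if counter % 100 == 0:
            print "progress:", counter
    pool.close()
    pool.join()
====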
 
Philip Semanchuk

Hello!

I'm about to parallelize some algorithm that turned out to be too
slow. Before I start doing it, I'd like to hear some suggestions/hints
from you.

Hi stdazi,
If you're communicating between multiple processes with Python, you
might find my IPC extensions useful. They're much less sophisticated
than multiprocessing; they just give access to IPC semaphores and
shared memory (no message queues yet) on Unix.

POSIX IPC:
http://semanchuk.com/philip/posix_ipc/

System V IPC:
http://semanchuk.com/philip/sysv_ipc/

More System V IPC:
http://nikitathespider.com/python/shm/

The two System V IPC extensions are similar; the latter (shm) is older
and better tested but won't be developed any further. The former
(sysv_ipc) is newer, has a couple more features, and is the future of
System V IPC with Python, at least as far as my work is concerned.
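
For a sense of the API, a minimal sketch of guarding a shared update
with a named POSIX semaphore via posix_ipc (the semaphore name
"/progress_lock" is just illustrative):

====
import posix_ipc

# create the named semaphore, or open it if it already exists; any
# process on the machine can open it by name
sem = posix_ipc.Semaphore("/progress_lock", posix_ipc.O_CREAT,
                          initial_value=1)

sem.acquire()
try:
    pass                         # update the shared state here
finally:
    sem.release()

sem.close()
sem.unlink()                     # remove the name once no longer needed
====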

Good luck,
Philip

