Auto-parallelizing with decorators?

Kirk Strauser

I was thinking about how a lot of Lisp proponents claim that Lisp is
inherently parallelizable because its functions don't have (or are not
supposed to have) side effects, and therefore the compiler can easily tell
which calls may be run in parallel instead of strictly serially. I'm not a
Lisp expert so I can't say whether that's true or not, but it seems like an
interesting idea for Python.

Suppose that Python had a new decorator, say "@parallelizable". Functions
so decorated would be eligible for multi-processed or multi-threaded
execution by such native constructs as list comprehensions, or the map()
function. Their logic could be something along the lines of:

1) Is every function in the "function expression" of map() or a list
comprehension decorated with @parallelizable? If not, use the normal
implementation.
2) Spawn an appropriate number of child threads and queue values to be
processed.
3) As results return, either store them in the result list (for map()), or
insert them into their place in the output queue to be consumed by users of
a list comprehension.

In #2, "appropriate" could default to something like 2 times the number of
CPUs detected in the system (which could be found the first time it was
needed so that programs that don't use this functionality don't pay for
finding that information). I could imagine that it'd be useful to pass
hints to the decorator to more explicitly size the thread pool, such as:

@parallelizable(number=100)
def lightweightfunction():
    """Launch as many parallel copies as you can"""

@parallelizable(perproc=1)
def heavierfunction():
    """Only run one copy per CPU"""

@parallelizable(number=2)
def bigslowfunction():
    """This benefits from parallel execution, but don't kill the system"""

Would something like this have merit? I like the idea because it's
completely optional; if you don't use it, you never have to deal with the
side effects. However, if you take care to make your functions loosely
coupled and not dependent on execution order, you could get a nice speedup
every single time you give Python permission to schedule your repeated
function calls.
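To make the idea concrete, here's a rough sketch of how such a decorator might record its pool-size hint for a parallel map() to consume. Everything here (the names parallelizable, parallel_map, and the _pool_size attribute) is hypothetical, built on the standard multiprocessing module:

```python
from multiprocessing import Pool, cpu_count

def parallelizable(number=None, perproc=None):
    """Hypothetical decorator: stash a suggested worker-pool size on the
    function so that a parallel map() can size its pool accordingly."""
    def decorate(func):
        if number is not None:
            func._pool_size = number
        elif perproc is not None:
            func._pool_size = perproc * cpu_count()
        else:
            func._pool_size = 2 * cpu_count()   # the default suggested above
        return func
    return decorate

def parallel_map(func, iterable):
    """Use the decorator's hint if present; otherwise fall back to the
    normal, serial map() implementation."""
    size = getattr(func, '_pool_size', None)
    if size is None:
        return list(map(func, iterable))
    with Pool(size) as pool:
        return pool.map(func, iterable)

@parallelizable(perproc=1)
def square(x):
    return x * x

if __name__ == "__main__":
    print(parallel_map(square, range(4)))   # [0, 1, 4, 9], computed in a pool
```

Undecorated functions take the serial path untouched, which keeps the feature opt-in exactly as described.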
 
Stefan Behnel

Kirk said:
Suppose that Python had a new decorator, say "@parallelizable". Functions
so decorated would be eligible for multi-processed or multi-threaded
execution by such native constructs as list comprehensions, or the map()
function.

Wouldn't that require parallelism in the interpreter first? Mind the GIL...

Stefan
 
Kirk Strauser

Stefan said:
Wouldn't that require parallelism in the interpreter first? Mind the
GIL...

That may be. I'd bet, though, that I could whip up a native Python workalike
using os.fork() that would work around the GIL, at least on Unix (I don't
know whether Windows has os.fork(), having never looked for it).

I know something like this wouldn't be the easiest thing to implement, but
honestly, having such beautiful constructs as map() and list comprehensions
available is just begging for an application like this.
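For what it's worth, the fork-based workalike really is only a few lines. Here's a Unix-only sketch (the name forked_map and its layout are mine, not any real library): each child computes one slice of the input and pickles its results back to the parent through a pipe.

```python
import os
import pickle

def forked_map(func, items, workers=2):
    """Unix-only parallel map sketch using os.fork().  Each child gets
    every workers-th item, computes serially, and writes its pickled
    results down a pipe; the parent reassembles the original order."""
    chunks = [items[i::workers] for i in range(workers)]
    read_fds = []
    for chunk in chunks:
        r, w = os.pipe()
        pid = os.fork()
        if pid == 0:                       # child: compute, write, exit
            os.close(r)
            with os.fdopen(w, 'wb') as pipe_out:
                pickle.dump([func(x) for x in chunk], pipe_out)
            os._exit(0)
        os.close(w)                        # parent keeps only the read end
        read_fds.append(r)
    results = [pickle.load(os.fdopen(r, 'rb')) for r in read_fds]
    for _ in chunks:
        os.wait()                          # reap the children
    # interleave the per-child slices back into the original order
    out = [None] * len(items)
    for i, res in enumerate(results):
        out[i::workers] = res
    return out
```

Because the children are forked, even unpicklable functions (lambdas, closures) work; only the results cross the pipe.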
 
John Nagle

Kirk said:
I was thinking about how a lot of Lisp proponents claim that Lisp is
inherently parallelizable because its functions don't have (or are not
supposed to have) side effects, and therefore the compiler can easily tell
which calls may be run in parallel instead of strictly serially. I'm not a
Lisp expert so I can't say whether that's true or not, but it seems like an
interesting idea for Python.

Suppose that Python had a new decorator, say "@parallelizable". Functions
so decorated would be eligible for multi-processed or multi-threaded
execution by such native constructs as list comprehensions, or the map()
function.

That implies much smarter compilers than we've seen to date for Python.

A more useful idea might be to provide a map/reduce library in the
sense that Google uses the term.

Google's concept of "map/reduce" is that the mapped function is applied
to all the elements of the list simultaneously, up to the limits of resources
available. The result of each function is a name/value tuple. The
reduction process consists of collecting all tuples with the same name
and feeding them to copies of the "reduce" function in no particular order,
with the "reduce" function producing fewer output tuples than input tuples.
The "reduce" function is applied repeatedly in a tree sense until all
output tuples have been reduced.

See:

http://labs.google.com/papers/mapreduce.html

Contrast this with Python's "reduce", which is inherently sequential.
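A toy serial version of that model (my own sketch, ignoring both the parallelism and the tree of reductions) makes the shape clear: the mapper emits (key, value) pairs, values are grouped by key, and each group goes to the reducer independently, in no particular order.

```python
from collections import defaultdict

def map_reduce(mapper, reducer, inputs):
    """Serial toy of the Google-style model: group the mapper's
    (key, value) pairs by key, then reduce each group independently."""
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# the classic word-count example
def count_words(line):
    return [(word, 1) for word in line.split()]

def sum_counts(word, counts):
    return sum(counts)

# map_reduce(count_words, sum_counts, ["a b a", "b c"])
# -> {'a': 2, 'b': 2, 'c': 1}
```

In the real system every mapper call and every per-key reduction is an independent unit of work, which is what makes it parallelizable.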

John Nagle
 
Kirk Strauser

Kirk Strauser said:
I was thinking about how a lot of Lisp proponents claim that Lisp is
inherently parallelizable because its functions don't have (or are not
supposed to have) side effects, and therefore the compiler can easily tell
which calls may be run in parallel instead of strictly serially. I'm not a
Lisp expert so I can't say whether that's true or not, but it seems like an
interesting idea for Python.

By the way, I uploaded a sample implementation (in Python) of what I had
in mind to http://www.honeypot.net/multi-processing-map-python . Please
let me know what you think of it and whether it seems remotely
interesting or goofy.
 
Josiah Carlson

Kirk said:
By the way, I uploaded a sample implementation (in Python) of what I had
in mind to http://www.honeypot.net/multi-processing-map-python . Please
let me know what you think of it and whether it seems remotely
interesting or goofy.

Try the Processing package available at the Python package index.
Create as many processes as you want, then toss the data you want
processed into a Queue. Watch magically as your data gets processed.

- Josiah
 
