Comments appreciated on Erlang-inspired Process class.


Brian L. Troutwine

Lately I've been tinkering around with Erlang and have begun to sorely want
some of its features in Python, mostly the ease with which new processes can
be forked off for computation. To that end I've coded up a class I call,
boringly enough, Process. It takes a function, its args and keywords, and runs
the function in another process using os.fork. Processes can be treated as
callables to retrieve the return value of the passed-in function.

The code is pasted here: http://deadbeefbabe.org/paste/4972. A simple
exposition of Process is included at the end of the code in some __main__
magic. (Note: the code will not run on any system that lacks os.fork, and it
spawns 100 child processes while running.)
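In case the paste goes away, here is a minimal sketch of the interface as
described above. This is a hypothetical reconstruction, not the pasted code:
it uses socket.socketpair() rather than the TCP sockets the paste uses, and
it shares defect 1 below.

    import os
    import pickle
    import socket

    class Process(object):
        """Run func(*args, **kwargs) in a forked child; call the
        instance later to block on and retrieve the return value."""

        def __init__(self, func, *args, **kwargs):
            self._sock, child_sock = socket.socketpair()
            self._pid = os.fork()
            if self._pid == 0:              # child process
                self._sock.close()
                child_sock.sendall(pickle.dumps(func(*args, **kwargs)))
                child_sock.close()
                os._exit(0)
            child_sock.close()              # parent keeps self._sock
            self._result = None
            self._done = False

        def __call__(self):
            if not self._done:
                chunks = []
                while True:                 # read until the child closes
                    chunk = self._sock.recv(4096)
                    if not chunk:
                        break
                    chunks.append(chunk)
                self._result = pickle.loads(b"".join(chunks))
                os.waitpid(self._pid, 0)    # reap the child
                self._done = True
            return self._result

    if __name__ == "__main__":
        p = Process(pow, 2, 10)
        print(p())                          # prints 1024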

I'd very much appreciate it if people would look the code over and give me
their reactions, or suggestions for improvement. As it stands, I see three
major defects with the current implementation:

1) Process hangs forever if its child errors out.

2) Process isn't portable because of its use of os.fork().

3) Process should use AF_UNIX when available instead of TCP.
 

Paul Boddie

Brian L. Troutwine said:
Lately I've been tinkering around with Erlang and have begun to sorely want
some of its features in Python, mostly the ease with which new processes can
be forked off for computation. To that end I've coded up a class I call,
boringly enough, Process. It takes a function, its args and keywords, and runs
the function in another process using os.fork. Processes can be treated as
callables to retrieve the return value of the passed-in function.

This sounds familiar...

http://wiki.python.org/moin/ParallelProcessing

Do you have any opinions about those projects listed on the above page
that are similar to your own? My contribution (pprocess), along with
others (processing, pp...), can offer similar facilities, but the
styles of interfacing with spawned processes may be somewhat
different.

Paul
 

Brian L. Troutwine

Paul Boddie said:
http://wiki.python.org/moin/ParallelProcessing

Ah, I'd forgotten about that page of the wiki; I hadn't seen it for a few
months.
Paul Boddie said:
Do you have any opinions about those projects listed on the above page
that are similar to your own? My contribution (pprocess), along with
others (processing, pp...), can offer similar facilities, but the
styles of interfacing with spawned processes may be somewhat
different.

The interface was my most important design goal, in that I wanted it to be a
dead simple "drop in some code and forget about it for a while; retrieve the
results later as if you'd called the function yourself" sort of thing.
Secondly, I wanted share-nothing parallelism in order to avoid the nastier
bits of implementing concurrent code.

delegate doesn't fit the bill because it returns its results in a dictionary.
pp is rather more featureful, and rather less simple as a result. processing
is a clone of threading, more or less, and rules itself out by not being
simple and by sharing objects between processes (a nifty trick, to be sure).
POSH uses a shared-memory approach and hasn't been updated, it would seem,
since 2003. pprocess isn't as simple as I wanted, though it is rather simpler
than all the others. remoteD uses shared memory.

I suppose, then, my opinion is that they're not brain-dead simple enough to
fulfill my desired style of process creation. (I'm smitten with Erlang.)
 

George Sakkis

Brian L. Troutwine said:
Ah, I'd forgotten about that page of the wiki; I hadn't seen it for a few
months.

The interface was my most important design goal, in that I wanted it to be a
dead simple "drop in some code and forget about it for a while; retrieve the
results later as if you'd called the function yourself" sort of thing.
Secondly, I wanted share-nothing parallelism in order to avoid the nastier
bits of implementing concurrent code.

Funny, I've been working on a similar library these days with the same
primary goal, a minimal intuitive API. The closest to what I had in mind
that I found on the parallel processing wiki was the
Scientific.DistributedComputing package (actually it has no dependencies
on the rest of the Scientific.* packages; the library comprises two modules
and one script, all in all). It's worth checking out, and there's at least
one idea I plan to copy (the "watchdog" thread that monitors whether a
process has died), but overall I found the API a bit clunky for my taste.
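For the curious, the idea is roughly this. A sketch of my own, assuming the
child is identified by a pid; this is not the package's actual code:

    import os
    import threading
    import time

    def start_watchdog(pid, on_death, interval=1.0):
        """Poll a forked child; call on_death(status) once it exits."""
        def watch():
            while True:
                wpid, status = os.waitpid(pid, os.WNOHANG)
                if wpid == pid:     # child has exited (and is now reaped)
                    on_death(status)
                    return
                time.sleep(interval)
        t = threading.Thread(target=watch)
        t.daemon = True             # don't keep the parent alive
        t.start()
        return t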

So I rolled yet another parallel processing package, and I'm glad to have a
first working version ready at this point, exposing a single lightweight API
and three distinct platform-independent implementations: one using multiple
threads, one using multiple processes on one or more hosts (through Pyro),
and one single-threaded (for the sake of completeness, probably not very
useful). I'll write up some docs and announce it, hopefully within the week.

George
 

Nick Craig-Wood

Brian L. Troutwine said:
Lately I've been tinkering around with Erlang and have begun to sorely want
some of its features in Python, mostly the ease with which new processes can
be forked off for computation. To that end I've coded up a class I call,
boringly enough, Process. It takes a function, its args and keywords, and runs
the function in another process using os.fork. Processes can be treated as
callables to retrieve the return value of the passed-in function.

The code is pasted here: http://deadbeefbabe.org/paste/4972. A simple
exposition of Process is included at the end of the code in some __main__
magic. (Note: the code will not run on any system that lacks os.fork, and it
spawns 100 child processes while running.)

I'd very much appreciate it if people would look the code over and give me
their reactions, or suggestions for improvement. As it stands, I see three
major defects with the current implementation:

1) Process hangs forever if its child errors out.

2) Process isn't portable because of its use of os.fork().

3) Process should use AF_UNIX when available instead of TCP.

You could use os.pipe() or socket.socketpair(). It would simplify the
code a lot too!
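For instance, here is a sketch of the os.pipe() variant, with hypothetical
names, pickling the result back over an anonymous pipe:

    import os
    import pickle

    def run_forked(func, *args, **kwargs):
        """Fork, run func in the child, and stream the pickled
        result back over an anonymous pipe."""
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:                        # child
            os.close(read_fd)
            payload = pickle.dumps(func(*args, **kwargs))
            while payload:                  # os.write may be partial
                payload = payload[os.write(write_fd, payload):]
            os.close(write_fd)
            os._exit(0)
        os.close(write_fd)                  # parent keeps the read end
        return pid, read_fd

    def collect(pid, read_fd):
        """Block until the child's result arrives, then reap it."""
        chunks = []
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:                   # child closed its end
                break
            chunks.append(chunk)
        os.close(read_fd)
        os.waitpid(pid, 0)
        return pickle.loads(b"".join(chunks))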

The biggest problem I see is that there is no non-blocking way of
seeing whether the Process() has finished or not, and no way to wait
on more than one Process() at once.
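Both are easy to bolt on once the pipe's read end is exposed. Something along
these lines, a sketch assuming the pid/read_fd pairs from the code above:

    import os
    import select

    def poll(pid):
        """Non-blocking check: True once the child has exited.
        Note this also reaps the child, so only waitpid once."""
        wpid, _status = os.waitpid(pid, os.WNOHANG)
        return wpid == pid

    def wait_any(read_fds):
        """Block until at least one child's pipe is readable,
        i.e. its result has started to arrive (or it died and
        closed the pipe); returns the ready descriptors."""
        ready, _, _ = select.select(read_fds, [], [])
        return ready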

If there is an exception then you should return it to the parent (see
the subprocess module for an example).
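A common trick, sketched here on top of the fork-and-pipe setup above: tag
the pickled payload so the parent can re-raise. This assumes the exception
itself pickles cleanly, which most built-in exceptions do.

    import os
    import pickle

    def child_main(func, write_fd, *args, **kwargs):
        """Child side: send ("ok", value) or ("err", exception)."""
        try:
            payload = pickle.dumps(("ok", func(*args, **kwargs)))
        except Exception as exc:
            payload = pickle.dumps(("err", exc))
        while payload:                      # os.write may be partial
            payload = payload[os.write(write_fd, payload):]
        os.close(write_fd)
        os._exit(0)

    def unwrap(payload):
        """Parent side: return the value, or re-raise the child's
        exception as if we had called the function ourselves."""
        tag, value = pickle.loads(payload)
        if tag == "err":
            raise value
        return value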
 
