run function in separate process

Discussion in 'Python' started by malkarouri@gmail.com, Apr 11, 2007.

  1. Guest

    Hi everyone,

    I have written a function that runs functions in separate processes. I
    hope you can help me improve it, and I would like to submit it to
    the Python cookbook if its quality is good enough.

    I was writing a numerical program (using numpy) which uses huge
    amounts of memory, the memory increasing with time. The program
    structure was essentially:

    for radius in radii:
        result = do_work(params)

    where do_work actually uses a large number of temporary arrays. The
    variable params is large as well and is the result of computations
    before the loop.

    After playing with gc for some time, trying to convince it to
    release the memory, I gave up. I would be happy, by the way, if
    somebody could point me to a web page/reference that says how to call
    a function and then reclaim all of its memory in Python.

    Meanwhile, the best that I could do is fork a process, compute the
    results, and return them back to the parent process. This I
    implemented in the following function, which is kinda working for me
    now, but I am sure it can be much improved. There should be a better
    way to return the result than a temporary file, for example. I
    actually thought of posting this after noticing that the pypy project
    had what I thought was a similar thing in their testing, but they
    probably dealt with it differently in the autotest driver [1]; I am
    not sure.

    Here is the function:

    def run_in_separate_process(f, *args, **kwds):
        from os import tmpnam, fork, waitpid, remove
        from sys import exit
        from pickle import load, dump
        from contextlib import closing
        fname = tmpnam()
        pid = fork()
        if pid > 0: #parent
            waitpid(pid, 0) # should have checked for correct finishing
            with closing(file(fname)) as f:
                result = load(f)
            remove(fname)
            return result
        else: #child
            result = f(*args, **kwds)
            with closing(file(fname,'w')) as f:
                dump(result, f)
            exit(0)


    To be used as:

    for radius in radii:
        result = run_in_separate_process(do_work, params)

    [1] http://codespeak.net/pipermail/pypy-dev/2006q3/003273.html
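For readers on Python 3, here is a minimal sketch of the same temp-file idea. It is not the code from the post: `tempfile.mkstemp` is swapped in for `os.tmpnam` (which, like the `file` builtin, no longer exists in Python 3), the file is opened in binary mode, and the wait status that the original comment says it "should have checked" is checked. It assumes a Unix-like system where `os.fork` is available.

```python
import os
import pickle
import tempfile


def run_in_separate_process(func, *args, **kwds):
    """Run func in a forked child and return its result via a temp file."""
    # mkstemp creates the file atomically, avoiding the tmpnam race.
    fd, fname = tempfile.mkstemp()
    os.close(fd)
    pid = os.fork()
    if pid == 0:  # child
        exit_code = 1
        try:
            result = func(*args, **kwds)
            with open(fname, "wb") as out:
                pickle.dump(result, out, pickle.HIGHEST_PROTOCOL)
            exit_code = 0
        finally:
            # _exit leaves immediately, without running the parent's
            # cleanup handlers a second time in the child.
            os._exit(exit_code)
    try:  # parent
        _, status = os.waitpid(pid, 0)
        if status != 0:
            raise RuntimeError("child failed with wait status %d" % status)
        with open(fname, "rb") as src:
            return pickle.load(src)
    finally:
        os.remove(fname)
```

Because the child is a fork of the parent, `func` and its arguments do not need to be picklable; only the result does.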



    Regards,

    Muhammad Alkarouri
    , Apr 11, 2007
    #1

  2. Guest

    On Apr 11, 9:23 am, wrote:
    > Hi everyone,
    >
    > I have written a function that runs functions in separate processes.
    > [...]


    I found a few posts on similar topics that look like they may give
    you some ideas:

    http://mail.python.org/pipermail/python-list/2004-October/285400.html
    http://www.artima.com/forums/flat.jsp?forum=106&thread=174099
    http://www.nabble.com/memory-manage-in-python-fu-t3386442.html
    http://www.thescripts.com/forum/thread620226.html

    Mike
    , Apr 11, 2007
    #2

  3. <malkarouri@gmail.com> wrote:
    ...
    > somebody points me to a web page/reference that says how to call a
    > function then reclaim the whole memory back in python.
    >
    > Meanwhile, the best that I could do is fork a process, compute the
    > results, and return them back to the parent process. This I


    That's my favorite way to ensure that all resources get reclaimed: let
    the operating system do the job.

    > implemented in the following function, which is kinda working for me
    > now, but I am sure it can be much improved. There should be a better
    > way to return the result than a temporary file, for example. I


    You can use a pipe. I.e. (untested code):

    def run_in_separate_process(f, *a, **k):
        import os, sys, cPickle
        pread, pwrite = os.pipe()
        pid = os.fork()
        if pid>0:
            os.close(pwrite)
            with os.fdopen(pread, 'rb') as f:
                return cPickle.load(f)
        else:
            os.close(pread)
            result = f(*a, **k)
            with os.fdopen(pwrite, 'wb') as f:
                cPickle.dump(f, -1)
            sys.exit()

    Using cPickle instead of pickle, and a negative protocol (on files
    pedantically opened in binary mode:), which selects the latest and
    greatest pickling protocol available rather than the default 0,
    should improve performance.
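A Python 3 rendering of the pipe approach might look like the sketch below. It is an adaptation, not the code posted above: `cPickle` no longer exists as a separate module in Python 3, so plain `pickle` is used with `pickle.HIGHEST_PROTOCOL`, and a `waitpid` call is added so the child is reaped. It assumes a Unix-like system with `os.fork`.

```python
import os
import pickle


def run_in_separate_process(func, *args, **kwds):
    """Run func in a forked child and return its result through a pipe."""
    pread, pwrite = os.pipe()
    pid = os.fork()
    if pid > 0:  # parent
        os.close(pwrite)
        try:
            with os.fdopen(pread, "rb") as src:
                return pickle.load(src)
        finally:
            # Reap the child so it does not linger as a zombie.
            os.waitpid(pid, 0)
    # child
    os.close(pread)
    try:
        result = func(*args, **kwds)
        with os.fdopen(pwrite, "wb") as out:
            pickle.dump(result, out, pickle.HIGHEST_PROTOCOL)
    finally:
        os._exit(0)
```

The parent reads before it waits, so a result larger than the pipe buffer does not deadlock. If `func` raises in the child, nothing is written and the parent sees an `EOFError` from `pickle.load`.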


    Alex
    Alex Martelli, Apr 11, 2007
    #3
  4. Guest

    Thanks, Mike, for your answer. I will use the occasion to add some
    comments on the links and on my approach.

    I am programming in Python 2.5, mainly to avoid the bug in earlier
    versions where memory arenas were never freed.
    The program runs on both Mac OS X (Intel) and Linux, so I prefer
    portable approaches.

    On Apr 11, 3:34 pm, wrote:
    [...]
    > I found a post on a similar topic that looks like it may give you some
    > ideas:
    >
    > http://mail.python.org/pipermail/python-list/2004-October/285400.html


    I see the comment about using mmap as valuable. I tried to do that
    with numpy.memmap but I wasn't successful. I don't remember why at
    the moment.
    The other tricks are problem-dependent, and my case is not like them
    (I believe).

    > http://www.artima.com/forums/flat.jsp?forum=106&thread=174099


    Good ideas. I hope that Python will grow a replaceable gc one day. I
    think that pypy already offers a choice at the moment.

    > http://www.nabble.com/memory-manage-in-python-fu-t3386442.html


    > http://www.thescripts.com/forum/thread620226.html


    Bingo! This thread actually reaches more or less the same conclusion.
    In fact, Alex Martelli describes the exact pattern in
    http://mail.python.org/pipermail/python-list/2007-March/431910.html

    I probably got the idea from a previous thread by him or somebody
    else. It must have been much earlier than March, though, as my
    program has been working since last year.

    So, let's say the function I have written is an implementation of
    Alex's architectural pattern. Probably makes it easier to get in the
    cookbook:)

    Regards,

    Muhammad
    , Apr 11, 2007
    #4
  5. Guest

    On Apr 11, 3:58 pm, (Alex Martelli) wrote:
    [...]
    > That's my favorite way to ensure that all resources get reclaimed: let
    > the operating system do the job.


    Thanks a lot, Alex, for confirming the basic idea. I will be playing
    with your function later today, and will give more feedback.
    I think I avoided the pipe in the mistaken belief that pipes cannot be
    binary. I know, I should've tested. And I avoided pickle at the time
    because I had a structure that was unpicklable (grown by me using a
    mixture of python, C, ctypes and pyrex at the time). The structure is
    improved now, and I will go for the more standard approach.

    Regards,

    Muhammad
    , Apr 11, 2007
    #5
  6. Guest

    On Apr 11, 4:36 pm, wrote:
    [...]
    > .. And I avoided pickle at the time
    > because I had a structure that was unpicklable (grown by me using a
    > mixture of python, C, ctypes and pyrex at the time). The structure is
    > improved now, and I will go for the more standard approach..


    Sorry, I was speaking about an older version of my code. The code is
    already using pickle, and yes, cPickle is better.

    Still trying the code. So far, after modifying the line:

    cPickle.dump(f, -1)

    to:

    cPickle.dump(result, f, -1)

    it is working.

    Regards,

    Muhammad
    , Apr 11, 2007
    #6
  7. Guest

    After playing with Alex's implementation, and adding some support for
    exceptions, this is what I came up with. I hope I am not getting too
    clever for my needs:

    import os, cPickle
    def run_in_separate_process_2(f, *args, **kwds):
        pread, pwrite = os.pipe()
        pid = os.fork()
        if pid > 0:
            os.close(pwrite)
            with os.fdopen(pread, 'rb') as f:
                status, result = cPickle.load(f)
            os.waitpid(pid, 0)
            if status == 0:
                return result
            else:
                raise result
        else:
            os.close(pread)
            try:
                result = f(*args, **kwds)
                status = 0
            except Exception, exc:
                result = exc
                status = 1
            with os.fdopen(pwrite, 'wb') as f:
                try:
                    cPickle.dump((status,result), f,
                                 cPickle.HIGHEST_PROTOCOL)
                except cPickle.PicklingError, exc:
                    cPickle.dump((2,exc), f, cPickle.HIGHEST_PROTOCOL)
                f.close()
            os._exit(0)



    Basically, the function is called in the child process, and a status
    code is returned in addition to the result. The status is 0 if the
    function returns normally, 1 if it raises an exception, and 2 if the
    result is unpicklable. Some cases are deliberately not handled: a
    SystemExit or a KeyboardInterrupt in the child shows up as an
    EOFError when unpickling in the parent. Some cases are inadvertently
    not handled; these are called bugs. And the original exception
    traceback is lost. Any comments?
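One possible way to keep the traceback, sketched for Python 3 and not taken from the thread: the child formats the traceback as text and sends that string instead of the exception object. `RemoteError` and `run_with_traceback` are hypothetical names introduced for this illustration. The payload is built with `pickle.dumps` before anything is written, on the assumption that this keeps a failed pickling attempt from leaving partial data on the pipe.

```python
import os
import pickle
import traceback


class RemoteError(Exception):
    """Hypothetical wrapper carrying the child's formatted traceback."""


def run_with_traceback(func, *args, **kwds):
    pread, pwrite = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        os.close(pread)
        try:
            try:
                payload = pickle.dumps((0, func(*args, **kwds)),
                                       pickle.HIGHEST_PROTOCOL)
            except Exception:
                # Covers a failing func and an unpicklable result alike;
                # the traceback text is a plain string, so it pickles.
                payload = pickle.dumps((1, traceback.format_exc()),
                                       pickle.HIGHEST_PROTOCOL)
            with os.fdopen(pwrite, "wb") as out:
                out.write(payload)
        finally:
            os._exit(0)
    os.close(pwrite)  # parent
    try:
        with os.fdopen(pread, "rb") as src:
            status, value = pickle.load(src)
    finally:
        os.waitpid(pid, 0)
    if status == 0:
        return value
    raise RemoteError(value)
```

The trade-off is that the parent no longer receives the original exception type, only its text, so callers cannot catch a specific exception class.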

    Regards,

    Muhammad Alkarouri
    , Apr 11, 2007
    #7
  8. Guest

    After playing a little with Alex's function, I got to:

    import os, cPickle
    def run_in_separate_process_2(f, *args, **kwds):
        pread, pwrite = os.pipe()
        pid = os.fork()
        if pid > 0:
            os.close(pwrite)
            with os.fdopen(pread, 'rb') as f:
                status, result = cPickle.load(f)
            os.waitpid(pid, 0)
            if status == 0:
                return result
            else:
                raise result
        else:
            os.close(pread)
            try:
                result = f(*args, **kwds)
                status = 0
            except Exception, exc:
                result = exc
                status = 1
            with os.fdopen(pwrite, 'wb') as f:
                try:
                    cPickle.dump((status,result), f,
                                 cPickle.HIGHEST_PROTOCOL)
                except cPickle.PicklingError, exc:
                    cPickle.dump((2,exc), f, cPickle.HIGHEST_PROTOCOL)
                f.close()
            os._exit(0)


    It handles exceptions as well, partially. Basically the child process
    returns a status code as well as a result. If the status is 0, then
    the function returned successfully and its result is returned. If the
    status is 1, then the function raised an exception, which will be
    raised in the parent. If the status is 2, then the function
    returned successfully but the result is not picklable, and the
    PicklingError is raised in the parent.
    Exceptions such as SystemExit and KeyboardInterrupt in the child are
    not caught and will result in an EOFError in the parent.
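The bare `EOFError` case could be made more informative by consulting the child's wait status, which the posted code discards. The Python 3 sketch below is one way to do that, not part of the thread; `ChildDied` and `run_checked` are hypothetical names, and only `SystemExit` is translated into an exit code here.

```python
import os
import pickle


class ChildDied(Exception):
    """Hypothetical error for a child that exited without a usable result."""


def run_checked(func, *args, **kwds):
    pread, pwrite = os.pipe()
    pid = os.fork()
    if pid == 0:  # child
        os.close(pread)
        code = 1
        try:
            result = func(*args, **kwds)
            with os.fdopen(pwrite, "wb") as out:
                pickle.dump(result, out, pickle.HIGHEST_PROTOCOL)
            code = 0
        except SystemExit as exc:
            # Pass an integer exit code through; anything else becomes 1.
            code = exc.code if isinstance(exc.code, int) else 1
        finally:
            os._exit(code)
    os.close(pwrite)  # parent
    result, got_result = None, False
    with os.fdopen(pread, "rb") as src:
        try:
            result = pickle.load(src)
            got_result = True
        except (EOFError, pickle.UnpicklingError):
            pass  # nothing usable arrived; the wait status says why
    _, status = os.waitpid(pid, 0)
    if got_result and status == 0:
        return result
    if os.WIFSIGNALED(status):
        raise ChildDied("child killed by signal %d" % os.WTERMSIG(status))
    raise ChildDied("child exited with code %d" % os.WEXITSTATUS(status))
```

With this, a child that calls `sys.exit(3)` or is killed by a signal surfaces in the parent as a `ChildDied` naming the exit code or signal number instead of an `EOFError`.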

    Any comments?

    Regards,

    Muhammad
    , Apr 11, 2007
    #8