Re: Newbie help for using multiprocessing and subprocess packages for creating child processes

Discussion in 'Python' started by Matt, Jun 16, 2009.

  1. Matt

    Matt Guest

    Try replacing:
    cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
    with:
    cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]

    Basically, the first is the conceptual equivalent of executing the
    following in BASH:
    ‘ls /path/to/file/FOO_info.pf’
    The second is this:
    ‘ls’ ‘/path/to/file/FOO_info.pf’

    The first searches for a command in your PATH named ‘ls /path...’. The
    second searches for a command named ‘ls’ and gives it the argument
    ‘/path...’

    Also, I think this is cleaner (but it’s up to personal preference):
    cmd = [ "ls", "/path/to/file/%s_info.pf" % staname]
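
    To make the difference concrete, here is a minimal sketch (written for Python 3, unlike the Python 2 script quoted below). The helper name run_ls and the temporary file are made up for illustration; the file only stands in for one of the *_info.pf files. It assumes a POSIX system with ls on the PATH.

```python
import subprocess
import tempfile


def run_ls(argv):
    """Run argv without a shell; return the exit status, or None if the
    program itself could not be found (hypothetical helper)."""
    try:
        return subprocess.call(
            argv,
            shell=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # This is the "[Errno 2] No such file or directory" case: the OS
        # looked for an executable whose name is the whole string.
        return None


# A throwaway file standing in for /path/to/file/STANAME_info.pf.
with tempfile.NamedTemporaryFile(suffix="_info.pf") as f:
    wrong = run_ls(["ls " + f.name])  # one element: program named "ls /tmp/..."
    right = run_ls(["ls", f.name])    # program "ls" plus one argument

print("single string:", wrong)
print("split argv:", right)
```

    With shell=False the list is handed to the OS as the argument vector, so the first element has to be the program name on its own.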

    ________________________
    ~Matthew Strax-Haber
    Northeastern University, CCIS & CBA
    Co-op, NASA Langley Research Center
    Student Government Association, Special Interest Senator
    Resident Student Association, SGA Rep & General Councilor
    Chess Club, Treasurer
    E-mail: strax-haber.m=AT=neu.edu

    On Tue, Jun 16, 2009 at 3:13 PM, Rob Newman wrote:
    > Hi All,
    >
    > I am new to Python, and have a very specific task to accomplish. I have a
    > command line shell script that takes two arguments:
    >
    > create_graphs.sh -v --sta=STANAME
    >
    > where STANAME is a string 4 characters long.
    >
    > create_graphs creates a series of graphs using Matlab (among other 3rd party
    > packages).
    >
    > Right now I can run this happily by hand, but I have to manually execute the
    > command for each STANAME. What I want is to have a Python script that I pass
    > a list of STANAMEs to, and it acts like a daemon and spawns as many child
    > processes as there are processors on my server (64), until it goes through
    > all the STANAMES (about 200).
    >
    > I posted a message on Stack Overflow (ref:
    > http://stackoverflow.com/questions/...sses-on-a-multi-processor-system-use-multipro) and
    > was recommended to use the multiprocessing and subprocess packages. In the
    > Stack Overflow answers, it was suggested that I use the process pool class
    > in multiprocessing. However, the server I have to use is a Sun Sparc (T5220,
    > Sun OS 5.10) and there is a known issue with sem_open() (ref:
    > http://bugs.python.org/issue3770), so it appears I cannot use the process
    > pool class.
    >
    > So, below is my script (controller.py) that I have attempted to use as a
    > test, that just calls the 'ls' command on a file I know exists rather than
    > firing off my shell script (which takes ~ 10 mins to run per STANAME):
    >
    > #!/path/to/python
    >
    > import sys
    > import os
    > import json
    > import multiprocessing
    > import subprocess
    >
    > def work(verbose,staname):
    >  print 'function:',staname
    >  print 'parent process:', os.getppid()
    >  print 'process id:', os.getpid()
    >  print "ls /path/to/file/"+staname+"_info.pf"
    >  # cmd will eventually get replaced with the shell script with the verbose
    >  # and staname options
    >  cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
    >  return subprocess.call(cmd, shell=False)
    >
    > if __name__ == '__main__':
    >
    >  report_sta_list = ['B10A','B11A','BNLO']
    >
    >  # Print out the complete station list for testing
    >  print report_sta_list
    >
    >  # Get the number of processors available
    >  num_processes = multiprocessing.cpu_count()
    >
    >  print 'Number of processes: %s' % (num_processes)
    >
    >  print 'Now trying to assign all the processors'
    >
    >  threads = []
    >
    >  len_stas = len(report_sta_list)
    >
    >  print "+++ Number of stations to process: %s" % (len_stas)
    >
    >  # run until all the threads are done, and there is no data left
    >  while len(threads) < len(report_sta_list):
    >
    >    # if we aren't using all the processors AND there is still data left to
    >    # compute, then spawn another thread
    >
    >    print "+++ Starting to set off all child processes"
    >
    >    if( len(threads) < num_processes ):
    >
    >      this_sta = report_sta_list.pop()
    >
    >      print "+++ Station is %s" % (this_sta)
    >
    >      p = multiprocessing.Process(target=work,args=['v',this_sta])
    >
    >      p.start()
    >
    >      print p, p.is_alive()
    >
    >      threads.append(p)
    >
    >    else:
    >
    >      for thread in threads:
    >
    >        if not thread.is_alive():
    >
    >          threads.remove(thread)
    >
    > However, I seem to be running into a whole series of errors:
    >
    > myhost{rt}62% controller.py
    > ['B10A', 'B11A', 'BNLO']
    > Number of processes: 64
    > Now trying to assign all the processors
    > +++ Number of stations to process: 3
    > +++ Starting to set off all child processes
    > +++ Station is BNLO
    > <Process(Process-1, started)> True
    > +++ Starting to set off all child processes
    > +++ Station is B11A
    > function: BNLO
    > parent process: 22341
    > process id: 22354
    > ls /path/to/file/BNLO_info.pf
    > <Process(Process-2, started)> True
    > function: B11A
    > parent process: 22341
    > process id: 22355
    > ls /path/to/file/B11A_info.pf
    > Process Process-1:
    > Traceback (most recent call last):
    >  File "/opt/csw/lib/python/multiprocessing/process.py", line 231, in
    > _bootstrap
    >    self.run()
    >  File "/opt/csw/lib/python/multiprocessing/process.py", line 88, in run
    >    self._target(*self._args, **self._kwargs)
    >  File "controller.py", line 104, in work
    >    return subprocess.call(cmd, shell=False)
    >  File "/opt/csw/lib/python/subprocess.py", line 444, in call
    >    return Popen(*popenargs, **kwargs).wait()
    >  File "/opt/csw/lib/python/subprocess.py", line 595, in __init__
    >    errread, errwrite)
    >  File "/opt/csw/lib/python/subprocess.py", line 1092, in _execute_child
    >    raise child_exception
    > OSError: [Errno 2] No such file or directory
    > Process Process-2:
    > Traceback (most recent call last):
    >  File "/opt/csw/lib/python/multiprocessing/process.py", line 231, in
    > _bootstrap
    >    self.run()
    >  File "/opt/csw/lib/python/multiprocessing/process.py", line 88, in run
    >    self._target(*self._args, **self._kwargs)
    >  File "controller.py", line 104, in work
    >    return subprocess.call(cmd, shell=False)
    >  File "/opt/csw/lib/python/subprocess.py", line 444, in call
    >    return Popen(*popenargs, **kwargs).wait()
    >  File "/opt/csw/lib/python/subprocess.py", line 595, in __init__
    >    errread, errwrite)
    >  File "/opt/csw/lib/python/subprocess.py", line 1092, in _execute_child
    >    raise child_exception
    > OSError: [Errno 2] No such file or directory
    >
    > The files are there:
    >
    > mhost{me}11% ls -la /path/to/files/BNLO_info.pf
    > -rw-rw-r--   1 me       group     391 May 19 22:40
    > /path/to/files/BNLO_info.pf
    > myhost{me}12% ls -la /path/to/file/B11A_info.pf
    > -rw-rw-r--   1 me       group     391 May 19 22:27
    > /path/to/files/B11A_info.pf
    >
    > I might be doing this completely wrong, but I thought this would be the way
    > to list the files dynamically. Admittedly this is just a stepping stone to
    > running the actual shell script I want to run. Can anyone point me in the
    > right direction or offer any advice for using these packages?
    >
    > Thanks in advance for any help or insight.
    > - Rob
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >
    Matt, Jun 16, 2009
    #1

  2. Re: Newbie help for using multiprocessing and subprocess packages for creating child processes

    >>>>> Matt <> (M) wrote:

    >M> Try replacing:
    >M> cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
    >M> with:
    >M> cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]


    In addition I would like to remark that -- if the only thing you want to
    do is to start up a new command with subprocess.Popen -- the use of the
    multiprocessing package is overkill. You could use threads as well.
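
    A rough sketch of the thread-based variant, for Python 3. The names run_with_threads and num_threads are invented for this example, and the commands are stand-ins that merely exit with a given status; in the original problem each argv list would be the shell script plus its options.

```python
import queue
import subprocess
import sys
import threading


def run_with_threads(commands, num_threads):
    """Run each argv list in `commands`, at most `num_threads` at a time.

    Returns the exit statuses in the same order as `commands`.
    """
    jobs = queue.Queue()
    for item in enumerate(commands):
        jobs.put(item)
    results = [None] * len(commands)

    def worker():
        # Each thread pulls the next job and blocks in subprocess.call()
        # until that external command has finished.
        while True:
            try:
                index, argv = jobs.get_nowait()
            except queue.Empty:
                return
            results[index] = subprocess.call(argv)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Stand-in commands: each one just exits with the given status.
commands = [[sys.executable, "-c", "import sys; sys.exit(%d)" % n]
            for n in (0, 3, 0)]
codes = run_with_threads(commands, num_threads=2)
print(codes)
```

    The threads spend nearly all their time waiting on the child processes, so the GIL is not a bottleneck here.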

    Moreover, if you don't expect any output from these processes and don't
    supply input to them through pipes there isn't even a need for these
    threads. You could just use os.wait() to wait for a child to finish and
    then start a new process if necessary.
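
    One possible shape for the os.wait() approach, sketched for Python 3 on a POSIX system. The names run_throttled and max_children are hypothetical, max_children is assumed to be at least 1, and the commands are placeholders that only exit with a given status.

```python
import os
import subprocess
import sys


def run_throttled(commands, max_children):
    """Run every argv list in `commands`, keeping at most `max_children`
    child processes alive at once.

    Returns the number of commands that did not exit with status 0.
    """
    pending = list(commands)
    running = {}  # pid -> Popen object
    failures = 0
    while pending or running:
        # Start children until the limit is reached or nothing is left.
        while pending and len(running) < max_children:
            child = subprocess.Popen(pending.pop(0))
            running[child.pid] = child
        # Block until any one child exits.
        pid, status = os.wait()
        child = running.pop(pid, None)
        if child is None:
            continue  # a child that was not started by this function
        # Decode the raw wait status; death by signal counts as a failure.
        code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
        # Record the result so the Popen object does not try to reap again.
        child.returncode = code
        if code != 0:
            failures += 1
    return failures


# Placeholder commands: each one just exits with the given status.
commands = [[sys.executable, "-c", "import sys; sys.exit(%d)" % n]
            for n in (0, 0, 1, 0)]
failures = run_throttled(commands, max_children=2)
print("failed commands:", failures)
```

    No threads and no multiprocessing are involved; the parent simply sleeps in os.wait() until a slot frees up.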
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
    Piet van Oostrum, Jun 16, 2009
    #2

  3. Re: Newbie help for using multiprocessing and subprocess packages for creating child processes

    On Tue, 16 Jun 2009 23:20:05 +0200
    Piet van Oostrum <> wrote:

    > >>>>> Matt <> (M) wrote:

    >
    > >M> Try replacing:
    > >M> cmd = [ "ls /path/to/file/"+staname+"_info.pf" ]
    > >M> with:
    > >M> cmd = [ "ls", "/path/to/file/"+staname+"_info.pf" ]

    >
    > In addition I would like to remark that -- if the only thing you want
    > to do is to start up a new command with subprocess.Popen -- the use
    > of the multiprocessing package is overkill. You could use threads as
    > well.
    >
    > Moreover, if you don't expect any output from these processes and
    > don't supply input to them through pipes there isn't even a need for
    > these threads. You could just use os.wait() to wait for a child to
    > finish and then start a new process if necessary.


    And even if there is a need to read/write data from/to the pipes more
    than once (aka communicate), using threads or any more Python
    subprocesses seems like hammering a nail with a sledgehammer - just
    _read_ or _write_ to the pipes asynchronously.
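
    As an illustration of one way to do that (not necessarily what was meant above), here is a sketch using the standard selectors module in Python 3 on a POSIX system. The function name collect_output is invented, and the child commands are placeholders that each print one line.

```python
import os
import selectors
import subprocess
import sys


def collect_output(commands):
    """Run all commands at once and gather each one's stdout without
    threads. Returns a list of bytes objects, one per command, in order."""
    sel = selectors.DefaultSelector()
    procs = []
    chunks = []
    for index, argv in enumerate(commands):
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
        # Remember which command this pipe belongs to via the data slot.
        sel.register(proc.stdout, selectors.EVENT_READ, index)
        procs.append(proc)
        chunks.append([])
    open_pipes = len(procs)
    while open_pipes:
        # Wake up only when at least one pipe has data or has hit EOF.
        for key, _events in sel.select():
            data = os.read(key.fd, 4096)
            if data:
                chunks[key.data].append(data)
            else:
                # Empty read means the child closed its end of the pipe.
                sel.unregister(key.fileobj)
                key.fileobj.close()
                open_pipes -= 1
    sel.close()
    for proc in procs:
        proc.wait()
    return [b"".join(parts) for parts in chunks]


# Placeholder children: each prints a single line.
commands = [[sys.executable, "-c", "print('%s')" % word]
            for word in ("first", "second")]
outputs = collect_output(commands)
print(outputs)
```

    A single loop services every pipe, so no child can stall because its output buffer filled up while the parent was busy reading another one.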

    --
    Mike Kazantsev // fraggod.net
    Mike Kazantsev, Jun 17, 2009
    #3
