subprocess.Popen and multiprocessing fail to execute external program

Discussion in 'Python' started by Niklas Berliner, Jan 10, 2013.

  1. I have a pipeline that involves processing some data, handing the data to an
    external program (t_coffee, used for sequence alignments in bioinformatics),
    and postprocessing the result. Since I have a lot of data, I need to run my
    pipeline in parallel, which I implemented using the multiprocessing module
    following Doug Hellmann's blog (

    My pipeline works perfectly fine when I run it with the multiprocessing
    implementation and one consumer, i.e. on one core. If I increase the number
    of consumers, i.e. run multiple instances of my pipeline in parallel,
    the external program fails with a core dump.

    To call the external program, I let Python write a bash wrapper script that
    is called by
    childProcess = subprocess.Popen(system_command, stdout=subprocess.PIPE,
                                    stderr=subprocess.PIPE, shell=True)
    result, error = childProcess.communicate()
    rc = childProcess.returncode
    (I also tried shell=False and calling the program directly, specifying the
    env for the call.)
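
A core dump means the child was terminated by a signal, so the value of rc can show which one. The following is a minimal sketch and not part of the original pipeline; the helper name describe_returncode is hypothetical. It relies on two conventions: on POSIX, Popen.returncode is -N when the direct child was killed by signal N, and shells such as bash report 128 + N when a command they ran was killed by signal N.

```python
import signal


def describe_returncode(rc, used_shell):
    """Return a human-readable description of a child's exit status.

    rc         -- the value of Popen.returncode after communicate()
    used_shell -- True if the command was started with shell=True
    """
    if rc < 0:
        # POSIX: the direct child was terminated by signal -rc.
        signum = -rc
    elif used_shell and rc > 128:
        # Shell convention: 128 + N usually means "killed by signal N".
        # Caveat: a program may also exit normally with such a status.
        signum = rc - 128
    else:
        return "exited with status %d" % rc

    # Map signal numbers to names (SIGSEGV, SIGABRT, ...), skipping the
    # SIG_DFL / SIG_IGN style constants, which are not signals.
    names = dict(
        (int(getattr(signal, name)), name)
        for name in dir(signal)
        if name.startswith("SIG") and not name.startswith("SIG_")
    )
    return "killed by signal %d (%s)" % (signum, names.get(signum, "unknown"))
```

With the call above, this could be used as print(describe_returncode(rc, used_shell=True)) right after rc is read, and logged together with the contents of error for each consumer.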

    To avoid conflicts between instances of the external program, each call
    gets a flushed environment, and the important environment variables are
    set to unique, existing paths. An example looks like this:
    env -i
    export HOME_4_TCOFFEE="/home/niklas/tcoffee/parallel/99-1-Consumer-2/"
    export CACHE_4_TCOFFEE="$HOME_4_TCOFFEE/cache/"
    export TMP_4_TCOFFEE="$HOME_4_TCOFFEE/tmp/"
    export LOCKDIR_4_TCOFFEE="$HOME_4_TCOFFEE/lock/"
    mkdir -p $CACHE_4_TCOFFEE
    mkdir -p $TMP_4_TCOFFEE
    mkdir -p $LOCKDIR_4_TCOFFEE

    t_coffee -mode expresso -seq
    /home/niklas/tcoffee/parallel/Consumer-2Q9FHL4_ARATH -blast_server=LOCAL
    -pdb_db=pdbaa -outorder=input -output fasta_aln -quiet -no_warning
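
For comparison, the shell=False variant mentioned above can set up the same variables in Python. This is a minimal sketch under stated assumptions, not the code actually used: the function names build_tcoffee_env and run_isolated are hypothetical, and keeping PATH in the environment is an assumption so that the program can be located. Note that env= on Popen replaces the child's environment wholesale, whereas a bare env -i line in a script does not clear the environment for the commands that follow it.

```python
import os
import subprocess


def build_tcoffee_env(home_dir):
    """Create the per-call directories and return a minimal environment."""
    env = {
        "HOME_4_TCOFFEE": home_dir,
        "CACHE_4_TCOFFEE": os.path.join(home_dir, "cache") + "/",
        "TMP_4_TCOFFEE": os.path.join(home_dir, "tmp") + "/",
        "LOCKDIR_4_TCOFFEE": os.path.join(home_dir, "lock") + "/",
        # Assumption: PATH is passed through so executables can be found.
        "PATH": os.environ.get("PATH", ""),
    }
    # Equivalent of the three "mkdir -p" lines in the wrapper script.
    for key in ("CACHE_4_TCOFFEE", "TMP_4_TCOFFEE", "LOCKDIR_4_TCOFFEE"):
        if not os.path.isdir(env[key]):
            os.makedirs(env[key])
    return env


def run_isolated(args, env):
    """Run args (a list of strings) without a shell, using only env."""
    child = subprocess.Popen(
        args,
        env=env,  # the child sees exactly these variables, nothing inherited
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = child.communicate()
    return child.returncode, out, err
```

Each consumer would call build_tcoffee_env with its own unique directory and pass the t_coffee arguments from the script as a list to run_isolated.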

    If I replace the t_coffee command with a simple 'touch I-<unique
    ID>-was-here', the files are created as expected and no error is produced.
    The developers of the external program assured me that running their
    program in parallel should not be a problem if the environment variables
    are set correctly. If I take the exact same bash scripts that were
    generated by Python and that failed when run in parallel through Python,
    and execute batches of them manually using a for loop in multiple
    terminals (i.e. in parallel), they don't produce an error.

    I am really puzzled and stuck. Python seems to work correctly on its own
    and the external program seems to work correctly on its own. But somehow,
    when combined, they won't work.
    Any help and hints would be really appreciated! I need that to work.

    I am using Ubuntu 12.04 with Python 2.7.3.

    Niklas Berliner, Jan 10, 2013