How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

Discussion in 'Python' started by Chris Seberino, Jun 10, 2010.

  1. How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

    Do complex commands with "|" in them mandate shell=True?

    cs
    Chris Seberino, Jun 10, 2010
    #1

  2. Chris Seberino

    Chris Rebert Guest

    On Wed, Jun 9, 2010 at 9:15 PM, Chris Seberino <> wrote:
    > How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?


    I would think:

    from subprocess import Popen, PIPE
    ls = Popen("ls", stdout=PIPE)
    grep = Popen(["grep", "foo"], stdin=ls.stdout)

    Cheers,
    Chris
    --
    http://blog.rebertia.com
    Chris Rebert, Jun 10, 2010
    #2

  3. Chris Seberino

    Nobody Guest

    Re: How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

    On Wed, 09 Jun 2010 21:15:48 -0700, Chris Seberino wrote:

    > How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?


    The same way that the shell does it, e.g.:

    from subprocess import Popen, PIPE
    p1 = Popen("ls", stdout=PIPE)
    p2 = Popen(["grep", "foo"], stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()
    result = p2.communicate()[0]
    p1.wait()

    Notes:

    Without the p1.stdout.close(), if the reader (grep) terminates before
    consuming all of its input, the writer (ls) won't terminate so long as
    Python retains the descriptor corresponding to p1.stdout. In this
    situation, the p1.wait() will deadlock.

    The communicate() method wait()s for the process to terminate. Other
    processes need to be wait()ed on explicitly, otherwise you end up with
    "zombies" (labelled "<defunct>" in the output from "ps").

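A minimal sketch of the two notes above, wrapped in a helper. The name run_pipeline and the yes/head commands are illustrative assumptions, not part of the original post; yes is used only because it never stops writing on its own, so the writer can only end once the read end of the pipe is gone.

```python
from subprocess import Popen, PIPE


def run_pipeline(writer_cmd, reader_cmd):
    """Run `writer_cmd | reader_cmd` without a shell.

    Returns (reader_output, writer_returncode, reader_returncode).
    """
    writer = Popen(writer_cmd, stdout=PIPE)
    reader = Popen(reader_cmd, stdin=writer.stdout, stdout=PIPE)
    # Drop the parent's copy of the pipe's read end. Once the reader
    # exits, nobody holds that end open and the writer's next write fails.
    writer.stdout.close()
    # communicate() reads the reader's output to EOF, then waits for it.
    output = reader.communicate()[0]
    # Reap the writer explicitly so it is not left behind as a zombie.
    writer.wait()
    return output, writer.returncode, reader.returncode


if __name__ == "__main__":
    # "yes" writes forever; "head -n 1" exits after reading one line.
    out, writer_rc, reader_rc = run_pipeline(["yes"], ["head", "-n", "1"])
    print(out, writer_rc, reader_rc)
```

If the writer.stdout.close() line is removed, the same call would be expected to hang in writer.wait(), because the parent still holds the read end open and yes simply blocks once the pipe is full.
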
    > Do complex commands with "|" in them mandate shell=True?


    No.

    Also, "ls | grep" may provide a useful tutorial for the subprocess module,
    but if you actually need to enumerate files, use e.g. os.listdir/os.walk()
    and re.search/fnmatch, or glob. Spawning child processes to perform tasks
    which can easily be performed in Python is inefficient (and often creates
    unnecessary portability issues).
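
As a rough sketch of the pure-Python route mentioned above, the helpers below (hypothetical names) cover the three flavours: plain substring, regular expression, and shell-style wildcard. One difference to keep in mind: os.listdir() also returns dot-files, which plain ls hides.

```python
import fnmatch
import os
import re


def names_containing(directory, substring):
    """Rough equivalent of `ls directory | grep -F substring`."""
    return sorted(name for name in os.listdir(directory) if substring in name)


def names_matching_regex(directory, pattern):
    """Like `ls directory | grep pattern`, but with Python regex syntax."""
    regex = re.compile(pattern)
    return sorted(name for name in os.listdir(directory) if regex.search(name))


def names_matching_glob(directory, pattern):
    """Match names against a shell-style wildcard such as "*.txt"."""
    return sorted(fnmatch.filter(os.listdir(directory), pattern))


if __name__ == "__main__":
    print(names_containing(".", "foo"))
```

No child process is started, so there is nothing to wait() for and no pipe that could block.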
    Nobody, Jun 10, 2010
    #3
  4. On 2010-06-10, Chris Seberino <> wrote:

    > How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?


    You'll have to build your own pipeline with multiple calls to
    subprocess.Popen.

    > Do complex commands with "|" in them mandate shell=True?


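    Yes, if the "|" is inside a single command string: the pipe character
    is shell syntax. With shell=False you connect the processes yourself.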

    Hey, I've got a novel idea!

    Read the documentation for the subprocess module:

    http://docs.python.org/library/subprocess.html#replacing-shell-pipeline

    --
    Grant Edwards               grant.b.edwards at gmail.com
    Yow! ... My pants just went on a wild rampage through a
    Long Island Bowling Alley!!
    Grant Edwards, Jun 10, 2010
    #4
  5. On Jun 10, 6:52 am, Nobody <> wrote:
    > Without the p1.stdout.close(), if the reader (grep) terminates before
    > consuming all of its input, the writer (ls) won't terminate so long as
    > Python retains the descriptor corresponding to p1.stdout. In this
    > situation, the p1.wait() will deadlock.
    >
    > The communicate() method wait()s for the process to terminate. Other
    > processes need to be wait()ed on explicitly, otherwise you end up with
    > "zombies" (labelled "<defunct>" in the output from "ps").


    You are obviously very wise on such things. I'm curious whether this
    deadlock issue is a rare event, since grep would (hopefully) rarely
    terminate before consuming all its input.

    Even if zombies are created, they will eventually get dealt with by the OS
    without any user intervention, right?

    I'm just trying to verify that the naive solution of not worrying about
    these deadlocks will still be OK and handled adequately by the OS. :)

    cs
    Chris Seberino, Jun 10, 2010
    #5
  6. Chris Seberino

    Lie Ryan Guest

    Re: How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

    On 06/10/10 21:52, Nobody wrote:
    > Spawning child processes to perform tasks
    > which can easily be performed in Python is inefficient


    Not necessarily. I recently wrote a script which takes the blink of an
    eye when I pipe the input through cat/grep to prefilter the lines before
    doing further complex filtering in Python. When I eliminated the
    cat/grep subprocesses and rewrote it in pure Python, what was done in the
    blink of an eye turned into ~8 seconds (not much to fret about, but it
    shows that using subprocess can be faster). I eventually optimized a
    couple of things and reduced it to ~1.5 seconds, at which point I stopped,
    since going even faster would require reading in larger chunks,
    something which I don't really want to do.

    The task was to take a directory of ~10 files, each containing thousands
    of short lines (~5-10 chars per line on average), and count the number of
    lines which match certain criteria. It is a very typical script job, but
    the overhead of reading the files line by line in pure Python can be
    significant (you can read in larger chunks, but that's not the point:
    eliminating grep may not come for free).
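
The script itself was not posted, so the following is only a guess at the shape of the two variants being compared. All names are hypothetical, and the keep callback stands in for the "further complex filtering". grep is given -F so that both versions do a fixed-string match.

```python
import subprocess


def count_with_grep_prefilter(paths, substring, keep):
    """Let grep discard non-matching lines; apply `keep` in Python to the rest."""
    # -h: no file name prefix, -F: fixed string, "--": end of options.
    cmd = ["grep", "-h", "-F", "--", substring] + list(paths)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    count = 0
    for raw_line in proc.stdout:       # read to EOF before waiting
        if keep(raw_line.rstrip(b"\n")):
            count += 1
    proc.stdout.close()
    proc.wait()
    if proc.returncode > 1:            # 0 = matches, 1 = no matches, >1 = error
        raise RuntimeError("grep failed with exit status %d" % proc.returncode)
    return count


def count_in_pure_python(paths, substring, keep):
    """Same count, reading every file line by line in Python."""
    needle = substring.encode()
    count = 0
    for path in paths:
        with open(path, "rb") as f:
            for raw_line in f:
                line = raw_line.rstrip(b"\n")
                if needle in line and keep(line):
                    count += 1
    return count
```

Which one is faster depends on the data and the filter, so it is worth timing both on the real files rather than assuming either way.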
    Lie Ryan, Jun 10, 2010
    #6
  7. Chris Seberino

    Nobody Guest

    Re: How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

    On Thu, 10 Jun 2010 08:40:03 -0700, Chris Seberino wrote:

    > On Jun 10, 6:52 am, Nobody <> wrote:
    >> Without the p1.stdout.close(), if the reader (grep) terminates before
    >> consuming all of its input, the writer (ls) won't terminate so long as
    >> Python retains the descriptor corresponding to p1.stdout. In this
    >> situation, the p1.wait() will deadlock.
    >>
    >> The communicate() method wait()s for the process to terminate. Other
    >> processes need to be wait()ed on explicitly, otherwise you end up with
    >> "zombies" (labelled "<defunct>" in the output from "ps").

    >
    > You are obviously very wise on such things. I'm curious whether this
    > deadlock issue is a rare event, since grep would (hopefully) rarely
    > terminate before consuming all its input.


    That depends; it might never start (missing grep, missing shared
    library), segfault, terminate due to a signal, etc. Also, the program
    might later be modified to use "grep -m <count> ..." which will terminate
    after finding <count> matches.

    > Even if zombies are created, they will eventually get dealt with by the OS
    > without any user intervention, right?


    They will persist until the parent either wait()s for them (I think that
    this will happen if the Popen object gets garbage-collected) or
    terminates. For short-lived parent processes, you can forget about them;
    for long-lived parent processes, they need to be dealt with.
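
For the long-lived case, one possible approach (a sketch, with a hypothetical helper name) is to keep the Popen objects around and poll() them now and then. poll() does not block, and it collects the exit status of any child that has already finished, which is what removes the zombie.

```python
import subprocess
import sys
import time


def reap_finished(children):
    """Reap children that have exited; return the ones still running."""
    still_running = []
    for child in children:
        # poll() returns None while the child runs; otherwise it records
        # the exit status in child.returncode and the zombie goes away.
        if child.poll() is None:
            still_running.append(child)
    return still_running


if __name__ == "__main__":
    # Three trivial children that exit immediately.
    pending = [subprocess.Popen([sys.executable, "-c", "pass"]) for _ in range(3)]
    while pending:
        pending = reap_finished(pending)
        time.sleep(0.05)
    print("all children reaped")
```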

    > I'm just trying to verify that the naive solution of not worrying about
    > these deadlocks will still be OK and handled adequately by the OS. :)


    Deadlock is deadlock. If you wait() on the child while it's blocked
    waiting for your Python program to consume its output, the wait() will
    block forever.
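
A small sketch of that last point, using a made-up child that writes 1 MiB to stdout, which is more than a pipe usually buffers. The helper name and the child command are assumptions for illustration only.

```python
import subprocess
import sys

# Hypothetical child: writes 1 MiB to its stdout and exits.
CHILD = [sys.executable, "-c", "import sys; sys.stdout.write('x' * (1 << 20))"]


def run_and_collect(cmd):
    """Run cmd, returning (stdout_bytes, returncode) without deadlocking."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    # Calling proc.wait() at this point could block forever: the child
    # stalls once the pipe is full, and nobody is reading from it.
    # communicate() drains the pipe first and only then waits.
    output, _ = proc.communicate()
    return output, proc.returncode


if __name__ == "__main__":
    data, status = run_and_collect(CHILD)
    print(len(data), status)
```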
    Nobody, Jun 12, 2010
    #7