How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

Chris Seberino · Jun 10, 2010

How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

Do complex commands with "|" in them mandate shell=True?

cs

Chris Rebert · Jun 10, 2010

How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

I would think:

from subprocess import Popen, PIPE
ls = Popen("ls", stdout=PIPE)
grep = Popen(["grep", "foo"], stdin=ls.stdout)

Cheers,
Chris

Nobody · Jun 10, 2010

How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

The same way that the shell does it, e.g.:

from subprocess import Popen, PIPE
p1 = Popen("ls", stdout=PIPE)
p2 = Popen(["grep", "foo"], stdin=p1.stdout, stdout = PIPE)
p1.stdout.close()
result = p2.communicate()[0]
p1.wait()

Notes:

Without the p1.stdout.close(), if the reader (grep) terminates before
consuming all of its input, the writer (ls) won't terminate so long as
Python retains the descriptor corresponding to p1.stdout. In this
situation, the p1.wait() will deadlock.

The communicate() method wait()s for the process to terminate. Other
processes need to be wait()ed on explicitly, otherwise you end up with
"zombies" (labelled "<defunct>" in the output from "ps").

Do complex commands with "|" in them mandate shell=True?

No.

Also, "ls | grep" may provide a useful tutorial for the subprocess module,
but if you actually need to enumerate files, use e.g. os.listdir/os.walk()
and re.search/fnmatch, or glob. Spawning child processes to perform tasks
which can easily be performed in Python is inefficient (and often creates
unnecessary portability issues).
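
As a minimal sketch of that suggestion, assuming the goal is simply "names in a directory that match a regular expression", something like the following could stand in for "ls | grep foo". The function name list_matching and its parameters are hypothetical, chosen only for illustration.

```python
import os
import re


def list_matching(directory=".", pattern="foo"):
    """Return the sorted names in `directory` that match regex `pattern`."""
    regex = re.compile(pattern)
    # os.listdir() also returns dotfiles, which plain "ls" hides,
    # so skip them to stay close to what "ls | grep" would print.
    return sorted(
        name
        for name in os.listdir(directory)
        if not name.startswith(".") and regex.search(name)
    )
```

Calling list_matching(".", "foo") returns a list of names rather than printing them, so the result can be filtered further without spawning any child process.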

Grant Edwards · Jun 10, 2010

How do subprocess.Popen("ls | grep foo", shell=True) with shell=False?

You'll have to build your own pipeline with multiple calls to subprocess.

Do complex commands with "|" in them mandate shell=True?

Yes.

Hey, I've got a novel idea!

Read the documentation for the subprocess module:

http://docs.python.org/library/subprocess.html#replacing-shell-pipeline
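
For pipelines with more than two stages, the same pattern can be wrapped in a small helper. This is only a sketch under stated assumptions: run_pipeline is a hypothetical name, each stage is given as an argv list, and the whole output is small enough to hold in memory.

```python
from subprocess import Popen, PIPE


def run_pipeline(commands):
    """Run argv lists as a shell-style pipeline without shell=True.

    Returns (output_bytes, list_of_return_codes).
    """
    if not commands:
        raise ValueError("need at least one command")
    procs = []
    prev_stdout = None
    for argv in commands:
        proc = Popen(argv, stdin=prev_stdout, stdout=PIPE)
        if prev_stdout is not None:
            # The child now holds its own copy of the pipe's read end.
            # Close ours so the upstream writer is not kept alive by the
            # parent if the downstream reader exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    # communicate() drains the last stage and waits for it.
    output = procs[-1].communicate()[0]
    # Wait on every stage so that none is left behind as a zombie.
    return output, [p.wait() for p in procs]
```

With this helper, run_pipeline([["ls"], ["grep", "foo"]]) would play the role of "ls | grep foo".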

Chris Seberino · Jun 10, 2010

Without the p1.stdout.close(), if the reader (grep) terminates before
consuming all of its input, the writer (ls) won't terminate so long as
Python retains the descriptor corresponding to p1.stdout. In this
situation, the p1.wait() will deadlock.

The communicate() method wait()s for the process to terminate. Other
processes need to be wait()ed on explicitly, otherwise you end up with
"zombies" (labelled "<defunct>" in the output from "ps").

You are obviously very wise on such things. I'm curious whether this
deadlock issue is a rare event, since grep (hopefully) would rarely
terminate before consuming all its input.

Even if zombies are created, they will eventually get dealt with by the
OS w/o any user intervention needed, right?

I'm just trying to verify that the naive solution of not worrying about
these deadlocks will still be ok and handled adequately by the OS.

cs

Lie Ryan · Jun 10, 2010

Spawning child processes to perform tasks
which can easily be performed in Python is inefficient

Not necessarily. I recently wrote a script that runs in the blink of an
eye when I pipe the input through cat/grep to prefilter the lines before
doing further complex filtering in Python. When I eliminated the cat/grep
subprocess and rewrote it in pure Python, the run time grew to ~8 seconds
(not much to fret about, but it shows that using subprocess can be
faster). I eventually optimized a couple of things and got it down to
~1.5 seconds, at which point I stopped, since going any faster would
require reading in larger chunks, something I don't really want to do.

The task was to take a directory of ~10 files, each containing thousands
of short lines (~5-10 chars per line on average), and count the lines that
match a certain criterion. It is a very typical script job, but the
overhead of reading the files line by line in pure Python can be
significant (you can read in larger chunks, but that's not the point:
eliminating grep may not come for free).
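
For reference, a pure-Python version of that kind of job might look like the sketch below. It is not the script described above: the matching criterion is assumed to be a regular expression, the function name count_matching_lines is hypothetical, and each file is read in one go as bytes (the "larger chunks" approach taken to its extreme) to avoid per-line read and decode overhead.

```python
import os
import re


def count_matching_lines(directory, pattern):
    """Count lines matching the bytes regex `pattern` in all files in `directory`."""
    regex = re.compile(pattern)
    total = 0
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        # Read the whole file at once; fine for small files like these.
        with open(entry.path, "rb") as f:
            data = f.read()
        total += sum(1 for line in data.splitlines() if regex.search(line))
    return total
```

Whether this beats a grep prefilter would have to be measured on the actual data.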

Nobody · Jun 12, 2010

You are obviously very wise on such things. I'm curious if this
deadlock issue is a rare event since grep (hopefully) would rarely
terminate before consuming all its input.

That depends; it might never start (missing grep, missing shared
library), segfault, terminate due to a signal, etc. Also, the program
might later be modified to use "grep -m <count> ..." which will terminate
after finding <count> matches.

Even if zombies are created, they will eventually get dealt with by the OS
w/o any user intervention needed right?

They will persist until the parent either wait()s for them (I think that
this will happen if the process gets garbage-collected) or terminates. For
short-lived processes, you can forget about them; for long-lived
processes, they need to be dealt with.

I'm just trying to verify the naive solution of not worrying about
these deadlocks will still be ok and handled adequately by the OS.

Deadlock is deadlock. If you wait() on the child while it's blocked
waiting for your Python program to consume its output, the wait() will
block forever.
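
The effect of p1.stdout.close() can be observed with a small experiment. This is a sketch under stated assumptions: it targets a POSIX system, WRITER and READER are hypothetical stand-ins for "ls" and "grep -m 1" (a writer that never stops on its own and a reader that exits after one line), and a timeout replaces the wait() that would otherwise hang.

```python
import sys
from subprocess import Popen, PIPE, TimeoutExpired

# Hypothetical stand-ins, not from the thread: an endless writer and a
# reader that quits after one line.
WRITER = [sys.executable, "-c", (
    "import os\n"
    "try:\n"
    "    while True:\n"
    "        os.write(1, b'x' * 1024 + b'\\n')\n"
    "except OSError:\n"  # raised once every read end of the pipe is closed
    "    pass\n"
)]
READER = [sys.executable, "-c", "import sys; sys.stdin.readline()"]


def writer_exits(close_parent_copy, timeout=5.0):
    """Return True if the writer terminates after the reader has exited."""
    p1 = Popen(WRITER, stdout=PIPE)
    p2 = Popen(READER, stdin=p1.stdout, stdout=PIPE)
    if close_parent_copy:
        p1.stdout.close()
    p2.communicate()  # the reader exits after one line
    try:
        p1.wait(timeout=timeout)
        return True
    except TimeoutExpired:
        # The parent still holds the read end, so the writer just blocks
        # on a full pipe. A plain wait() here would never return.
        p1.kill()
        p1.wait()
        return False
    finally:
        p1.stdout.close()  # harmless if already closed
```

On a POSIX system, writer_exits(True) should return True, while writer_exits(False) should return False once the timeout expires, which is the situation where an untimed p1.wait() would block forever.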
