Piping processes works with 'shell = True' but not otherwise.

Luca Cerone · May 24, 2013

Hi everybody,
I am new to the group (and relatively new to Python)
so I am sorry if this issue has already been discussed (although searching the topics in the group I couldn't find a solution to my problem).

I am using Python 2.7.3 to analyse the output of two third-party programs that can be launched in a Linux shell as:

program1 | program2

To do this I have written a function that pipes program1 into program2 (using subprocess.Popen) and returns the stdout of the second subprocess, and a function that parses the output:

A basic example:

from subprocess import Popen, STDOUT, PIPE

def run():
    p1 = Popen(['program1'], stdout = PIPE, stderr = STDOUT)
    p2 = Popen(['program2'], stdin = p1.stdout, stdout = PIPE, stderr = STDOUT)
    p1.stdout.close()
    return p2.stdout

def parse(out):
    parsed_output = []  # placeholder for the parsed results
    for row in out:
        print row
        #do something else with each line
    out.close()
    return parsed_output

# main block here

pout = run()

parsed = parse(pout)

#--- END OF PROGRAM ----#

I want to parse the output of 'program1 | program2' line by line because the output is very large.

When running the code above, an error occasionally occurs (IOError: [Errno 0]). However, this error doesn't occur if I code the run() function as:

def run():
    p = Popen('program1 | program2', shell = True, stderr = STDOUT, stdout = PIPE)
    return p.stdout

I really can't understand why the first version causes errors, while the second one doesn't.

Can you please help me understand what the difference between the two cases is?

Thanks a lot in advance for the help,
Cheers, Luca
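
The no-shell pipeline described in this post can be sketched in a form that runs as-is. This is only an illustration, not a diagnosis of the Errno 0 error: the real commands are not given, so `PROGRAM1` and `PROGRAM2` are hypothetical stand-ins built from `sys.executable`. Unlike the original, `run()` returns both `Popen` objects, which lets the caller stream the second program's output line by line and then wait on both processes to collect their exit codes.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical stand-ins for program1 and program2 (assumptions, not the real tools).
PROGRAM1 = [sys.executable, "-c", "for i in range(5): print(i)"]
PROGRAM2 = [sys.executable, "-c",
            "import sys\nfor line in sys.stdin: sys.stdout.write('row ' + line)"]

def run():
    """Start 'program1 | program2' without a shell and return both processes."""
    p1 = Popen(PROGRAM1, stdout=PIPE)
    p2 = Popen(PROGRAM2, stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()  # the parent no longer needs its copy of the read end
    return p1, p2

def parse(p1, p2):
    """Consume p2's output one line at a time, then wait for both processes."""
    parsed_output = []
    for row in p2.stdout:
        parsed_output.append(row.rstrip())  # do something else with each line
    p2.stdout.close()
    exit_codes = (p1.wait(), p2.wait())
    return parsed_output, exit_codes

if __name__ == "__main__":
    rows, codes = parse(*run())
    print(rows)
    print(codes)
```

Checking the exit codes after the loop is one way to notice when either program failed, which the shell version hides unless the shell's own status is inspected.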

Luca Cerone · May 26, 2013

> Can you please help me understand what's the difference between the two cases?

Hi guys, do any of you have ideas about what is causing my issue?

Chris Rebert · May 26, 2013

> Hi everybody,
> I am new to the group (and relatively new to Python)
> so I am sorry if this issue has been discussed (although searching for
> topics in the group I couldn't find a solution to my problem).
>
> I am using Python 2.7.3 to analyse the output of two third-party programs
> that can be launched in a Linux shell as:
>
> program1 | program2
>
> To do this I have written a function that pipes program1 and program2
> (using subprocess.Popen) and the stdout of the subprocess, and a function
> that parses the output:
>
> A basic example:
>
> from subprocess import Popen, STDOUT, PIPE
> def run():
>     p1 = Popen(['program1'], stdout = PIPE, stderr = STDOUT)
>     p2 = Popen(['program2'], stdin = p1.stdout, stdout = PIPE, stderr = STDOUT)

Could you provide the *actual* commands you're using, rather than the
generic "program1" and "program2" placeholders? It's *very* common for
people to get the tokenization of a command line wrong (see the Note box in
http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some
relevant advice).

>     p1.stdout.close()
>     return p2.stdout
>
> def parse(out):
>     for row in out:
>         print row
>         #do something else with each line
>     out.close()
>     return parsed_output
>
> # main block here
>
> pout = run()
>
> parsed = parse(pout)
>
> #--- END OF PROGRAM ----#
>
> I want to parse the output of 'program1 | program2' line by line because the output is very large.
>
> When running the code above, occasionally some error occurs (IOERROR: [Errno 0]).

Could you provide the full & complete error message and exception traceback?

> However this error doesn't occur if I code the run() function as:
>
> def run():
>     p = Popen('program1 | program2', shell = True, stderr = STDOUT, stdout = PIPE)
>     return p.stdout
>
> I really can't understand why the first version causes errors, while the second one doesn't.
>
> Can you please help me understand what's the difference between the two cases?

One obvious difference between the 2 approaches is that the shell doesn't
redirect the stderr streams of the programs, whereas you /are/ redirecting
the stderrs to stdout in the non-shell version of your code. But this is
unlikely to be causing the error you're currently seeing.

You may also want to provide /dev/null as p1's stdin, out of an abundance
of caution.

Lastly, you may want to consider using a wrapper library such as
http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do
pipelining and other such "fancy" things with subprocesses, while still
avoiding the many perils of the shell.

Cheers,
Chris
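
One concrete consequence of the stderr handling mentioned above can be shown with a small sketch. With `stderr=STDOUT` on the first process, its error messages travel down the pipe and become input to the second process; in a plain `program1 | program2` shell pipeline they do not. The commands `PRODUCER` and `UPPERCASER` and the function `pipeline_words` are hypothetical stand-ins, and the first process is also given the null device as stdin, as suggested. Whether any of this relates to the Errno 0 error is not established.

```python
import os
import sys
from subprocess import Popen, PIPE, STDOUT

# Hypothetical stand-ins: the first writes one line to stdout and one to stderr,
# the second upper-cases whatever arrives on its stdin.
PRODUCER = [sys.executable, "-c",
            "import sys; sys.stdout.write('out\\n'); sys.stderr.write('err\\n')"]
UPPERCASER = [sys.executable, "-c",
              "import sys; sys.stdout.write(sys.stdin.read().upper())"]

def pipeline_words(merge_p1_stderr):
    """Run PRODUCER | UPPERCASER and return the sorted words printed by UPPERCASER."""
    with open(os.devnull, "rb") as null_in, open(os.devnull, "wb") as null_out:
        p1 = Popen(PRODUCER, stdin=null_in, stdout=PIPE,
                   stderr=STDOUT if merge_p1_stderr else null_out)
        p2 = Popen(UPPERCASER, stdin=p1.stdout, stdout=PIPE)
        p1.stdout.close()
        data = p2.communicate()[0]
        p1.wait()
    return sorted(data.decode().split())

if __name__ == "__main__":
    print(pipeline_words(merge_p1_stderr=True))   # stderr of p1 reaches p2
    print(pipeline_words(merge_p1_stderr=False))  # stderr of p1 is discarded
```

If the second program is strict about its input format, stray diagnostic lines from the first program could matter, so it may be worth testing the non-shell version without `stderr=STDOUT` on `p1`.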

Luca Cerone · May 27, 2013

> Could you provide the *actual* commands you're using, rather than the generic "program1" and "program2" placeholders? It's *very* common for people to get the tokenization of a command line wrong (see the Note box in http://docs.python.org/2/library/subprocess.html#subprocess.Popen for some relevant advice).

Hi Chris, first of all thanks for the help. Unfortunately I can't provide the actual commands because they are tools that are not publicly available.
I think I get the tokenization right, though. The problem is not that the programs don't run; it is just that sometimes I get that error.

Just to be clear I run the process like:

p = subprocess.Popen(['program1','--opt1','val1',...'--optn','valn'], ...the rest)

which I think is the right way to pass arguments (it works fine for other commands)..

> Could you provide the full & complete error message and exception traceback?

Yes, as soon as I get to my work laptop.

> One obvious difference between the 2 approaches is that the shell doesn't redirect the stderr streams of the programs, whereas you /are/ redirecting the stderrs to stdout in the non-shell version of your code. But this is unlikely to be causing the error you're currently seeing.
>
> You may also want to provide /dev/null as p1's stdin, out of an abundance of caution.

I tried to redirect the output to /dev/null using the Popen argument:
'stdin = os.path.devnull' (having imported os of course).
But this seemed to cause even more trouble.

> Lastly, you may want to consider using a wrapper library such as http://plumbum.readthedocs.org/en/latest/ , which makes it easier to do pipelining and other such "fancy" things with subprocesses, while still avoiding the many perils of the shell.

Thanks, I didn't know this library, I'll give it a try.
Though I forgot to mention that I was using the subprocess module because I want the code to be portable (even though for now it is OK if it works on Unix platforms).

Thanks a lot for your help,
Cheers,
Luca
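
On the tokenization question raised in this exchange, one way to check an argument list is to compare it with what `shlex.split` produces from the command line as it would be typed in a POSIX shell. The command line below is a made-up example; the option names and values are assumptions, not the real tools' options.

```python
import shlex

# Hypothetical command line, as it might be typed in a shell.
command_line = "program1 --opt1 'value with spaces' --opt2 val2"

# shlex.split applies shell-like quoting rules and returns the argument list
# that can be passed to subprocess.Popen without shell=True.
argv = shlex.split(command_line)
print(argv)
```

Each option and each value ends up as a separate list element, and a quoted value containing spaces stays in one element, which is the form used in the `Popen([...])` call shown in the post above.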

Carlos Nepomuceno · May 27, 2013

Pipes usually consume disk storage at '/tmp'. Are you sure you have enough room on that filesystem? Make sure no other processes are competing for that space. Just my 50c, because I don't know what's causing Errno 0. I don't even know what the possible causes of such an error are. Good luck!


Luca Cerone · May 27, 2013

> Will it violate privacy / NDA to post the command line? Even if we
> can't actually replicate your system, we may be able to see something
> from the commands given.

Unfortunately yes.

Chris Rebert · May 29, 2013


> I tried to redirect the output to /dev/null using the Popen argument:
> 'stdin = os.path.devnull' (having imported os of course)..
> But this seemed to cause even more troubles...

That's because stdin/stdout/stderr take file descriptors or file
objects, not path strings.

Cheers,
Chris

Thomas Rachel · May 29, 2013

On 27.05.2013 02:14, Carlos Nepomuceno wrote:

> pipes usually consumes disk storage at '/tmp'.

Good that my pipes don't know about that.

Why should that happen?

Thomas

Carlos Nepomuceno · May 29, 2013

> Good that my pipes don't know about that.
>
> Why should that happen?
>
> Thomas

Oops! My mistake! We've been using 'tee' when in debugging mode and I thought that would apply to this case. Never mind!

Cameron Simpson · May 29, 2013

| On 27.05.2013 02:14, Carlos Nepomuceno wrote:
| >pipes usually consumes disk storage at '/tmp'.
|
| Good that my pipes don't know about that.
| Why should that happen?

It probably doesn't on anything modern. On V7 UNIX at least there
was a kernel notion of the "pipe fs", where pipe storage existed;
usually /tmp; using small real (but unnamed) files is an easy way
to implement them, especially on systems where RAM is very small
and without a paging VM - for example, V7 UNIX ran on PDP-11s amongst
other things. And files need a filesystem.

But even then pipes are still small fixed length buffers; they don't
grow without bound as you might have inferred from the quoted
statement.

Cheers,
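
The point that a pipe is a small fixed-size buffer can be checked with a short sketch. It fills a fresh pipe using non-blocking writes until the kernel refuses more data, then reports how many bytes fit. The function name `pipe_capacity` is made up for this illustration, the code is Unix-only because it relies on `fcntl`, and the number it reports depends on the operating system.

```python
import errno
import fcntl
import os

def pipe_capacity(chunk_size=1024):
    """Return how many bytes a new pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Make the write end non-blocking so a full pipe raises instead of hanging.
        flags = fcntl.fcntl(write_fd, fcntl.F_GETFL)
        fcntl.fcntl(write_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        chunk = b"x" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except OSError as exc:
                if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
                    return total  # the buffer is full
                raise
    finally:
        os.close(read_fd)
        os.close(write_fd)

if __name__ == "__main__":
    print(pipe_capacity())
```

Because the buffer is bounded, a writer simply blocks once the pipe is full until the reader catches up, which is why a very large output can be streamed through a pipeline without being stored anywhere.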

Luca Cerone · May 31, 2013

> That's because stdin/stdout/stderr take file descriptors or file
> objects, not path strings.

Thanks Chris, how do I set the file descriptor to /dev/null then?

Peter Otten · May 31, 2013

Luca said:
> Thanks Chris, how do I set the file descriptor to /dev/null then?

For example:

with open(os.devnull, "wb") as stderr:
    p = subprocess.Popen(..., stderr=stderr)
    ...

In Python 3.3 and above:

p = subprocess.Popen(..., stderr=subprocess.DEVNULL)

Luca Cerone · Aug 5, 2013

> In Python 3.3 and above:
>
> p = subprocess.Popen(..., stderr=subprocess.DEVNULL)

Thanks, and what about Python 2.7?

P.s. sorry for the late reply, I discovered I don't receive notifications from google groups..
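
For the Python 2.7 question: the `open(os.devnull, ...)` form from the earlier reply does not depend on `subprocess.DEVNULL`, so it should work on 2.7 as well as on Python 3. A minimal sketch, in which the helper name `run_quietly` and the child command are made up for illustration:

```python
import os
import subprocess
import sys

def run_quietly(argv):
    """Run argv with stdin and stderr connected to the null device.

    Returns (exit code, captured stdout as bytes).
    """
    # File objects (not path strings) are what Popen expects for these arguments.
    with open(os.devnull, "rb") as null_in, open(os.devnull, "wb") as null_err:
        p = subprocess.Popen(argv, stdin=null_in,
                             stdout=subprocess.PIPE, stderr=null_err)
        out, _ = p.communicate()
    return p.returncode, out

if __name__ == "__main__":
    # Hypothetical child: reads (empty) stdin, writes noise to stderr, prints 'done'.
    child = [sys.executable, "-c",
             "import sys; sys.stderr.write('noise'); "
             "sys.stdout.write(sys.stdin.read() + 'done')"]
    print(run_quietly(child))
```

Reading from the null device returns end-of-file immediately, so the child never waits for input, and anything it writes to stderr is discarded.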

Tobiah · Aug 5, 2013

> Unfortunately yes..

p1 = Popen(['nsa_snoop', 'terror_suspect', '--no-privacy', '--dispatch-squad'], ...
