polling for output from a subprocess module

Thomas Bellman · Feb 4, 2008

try:
    test = Popen(test_path,
                 stdout=PIPE,
                 stderr=PIPE,
                 close_fds=True,
                 env=test_environ)

    while test.poll() == None:
        ready = select.select([test.stderr], [], [])

        if test.stderr in ready[0]:
            t_stderr_new = test.stderr.readlines()
            if t_stderr_new != []:
                print "STDERR:", "\n".join(t_stderr_new)
                t_stderr.extend(t_stderr_new)
[...]
The problem is that all the output from the subprocess seems to
arrive at once. Do I need to take a different approach?

The readlines() method will read until it reaches end of file (or
an error occurs), not just what is available at the moment. You
can see that for yourself by running:

$ python -c 'import sys; print sys.stdin.readlines()'

The call to sys.stdin.readlines() will not return until you press
Ctrl-D (or, I think, Ctrl-Z if you are using MS-Windows).

However, the os.read() function will only read what is currently
available. Note, though, that os.read() does not do line-based
I/O, so depending on the timing you can get incomplete lines, or
multiple lines in one read.
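
To make the last point concrete, here is a minimal sketch of os.read() returning only what is currently available, followed by reassembling the chunks into lines. It assumes Python 3 on a POSIX system (the thread itself uses Python 2 syntax). The child program and the names CHILD, split_lines and demo are hypothetical, made up for this illustration.

```python
import os
import subprocess
import sys

# Hypothetical child, used only for this demonstration: it emits half a
# line, waits for the parent, then emits the rest.
CHILD = r"""
import sys
sys.stdout.write("partial ")
sys.stdout.flush()
sys.stdin.readline()
sys.stdout.write("line\nsecond line\n")
sys.stdout.flush()
"""


def split_lines(pending, chunk):
    """Add chunk to pending; return (complete_lines, new_pending)."""
    *lines, rest = (pending + chunk).split(b"\n")
    return lines, rest


def demo():
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()

    # Returns as soon as some data is available, here an incomplete line.
    first = os.read(fd, 1024)

    # Let the child continue, then read until EOF (an empty result).
    proc.stdin.write(b"\n")
    proc.stdin.close()
    chunks = [first]
    while True:
        chunk = os.read(fd, 1024)
        if not chunk:
            break
        chunks.append(chunk)
    proc.stdout.close()
    proc.wait()

    # Reassemble complete lines regardless of how the chunks were split.
    pending, lines = b"", []
    for chunk in chunks:
        complete, pending = split_lines(pending, chunk)
        lines.extend(complete)
    return first, lines, pending


if __name__ == "__main__":
    print(demo())
```

The first os.read() call comes back with only the half line the child has written so far, which readlines() would never do. Keeping the unfinished tail in pending is one simple way to cope with chunks that end in the middle of a line.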

jakub.hrozek · Feb 4, 2008

Hello,
My program uses the subprocess module to spawn a child and capture its
output. What I'd like to achieve is that stdout is parsed after the
subprocess finishes, but anything that goes to stderr is printed
immediately. The code currently looks like:

try:
    test = Popen(test_path,
                 stdout=PIPE,
                 stderr=PIPE,
                 close_fds=True,
                 env=test_environ)

    while test.poll() == None:
        ready = select.select([test.stderr], [], [])

        if test.stderr in ready[0]:
            t_stderr_new = test.stderr.readlines()
            if t_stderr_new != []:
                print "STDERR:", "\n".join(t_stderr_new)
                t_stderr.extend(t_stderr_new)

except OSError, e:
    print >>sys.stderr, _("Test execution failed"), e
else:
    self.result.return_code = test.returncode
    self.result.process(test.stdout.readlines(), t_stderr)

The problem is that all the output from the subprocess seems to
arrive at once. Do I need to take a different approach?

Christian Heimes · Feb 4, 2008

Thomas said:
The readlines() method will read until it reaches end of file (or
an error occurs), not just what is available at the moment. You
can see that for yourself by running:

Bad idea

readlines() on a subprocess Popen instance will block when you PIPE more
than one stream and the buffer of the other stream is full.

You can find some insight at http://bugs.python.org/issue1606. I
discussed the matter with Guido a while ago.

Christian

jakub.hrozek · Feb 4, 2008

Thomas Bellman said:

The readlines() method will read until it reaches end of file (or
an error occurs), not just what is available at the moment. You
can see that for yourself by running:

$ python -c 'import sys; print sys.stdin.readlines()'

The call to sys.stdin.readlines() will not return until you press
Ctrl-D (or, I think, Ctrl-Z if you are using MS-Windows).

However, the os.read() function will only read what is currently
available. Note, though, that os.read() does not do line-based
I/O, so depending on the timing you can get incomplete lines, or
multiple lines in one read.

Right, I didn't realize that. I'll try the os.read() method. Reading
what's available (as opposed to whole lines) shouldn't be an issue in
this specific case. Thanks for the pointer!

Thomas Bellman · Feb 5, 2008

Bad idea

Why is it a bad idea to see how the readlines() method behaves?

readlines() on a subprocess Popen instance will block when you PIPE more
than one stream and the buffer of the other stream is full.

You can find some insight at http://bugs.python.org/issue1606. I
discussed the matter with Guido a while ago.

Umm... Yes, you are correct that the code in the original post
also has a deadlock problem. I missed that. But saying that it
is the readlines() method that is blocking is a bit misleading,
IMHO. Both processes will be blocking, in a deadly embrace.
It's a problem that has been known since the concept of
inter-process communication was invented, and isn't specific to
the readlines() method in Python.

But the OP *also* has the problem that I described in my reply.
Even if he only piped one of the output streams from his
subprocess, he would only receive its output when the subprocess
finished (if it ever does), not as it is produced.

(For those who don't understand why the OP's code risks a deadly
embrace: if a process (A) writes significant amounts of data to
both its standard output and standard error, but the process that
holds the other end of those streams (process B) only reads data
from one of those streams, process A will after a while fill the
operating system's buffers for the other stream. When that
happens, the OS will block process A from running until process B
reads data from that stream too, freeing up buffer space. If
process B never does that, then process A will never run again.

The OP must therefore do a select() on both the standard output
and standard error of his subprocess, and use os.read() to
retrieve the output from both streams to free up buffer space in
the pipes.)
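
As a rough illustration of the select()-plus-os.read() approach described above, here is a minimal sketch. It assumes Python 3 on a POSIX system (select() does not accept pipes on Windows). The function name run_and_drain, its parameters and the on_stderr callback are hypothetical and do not come from the thread.

```python
import os
import select
import subprocess
import sys


def run_and_drain(argv, on_stderr=None, chunk_size=4096):
    """Run argv and drain both pipes so neither can fill up.

    Each stderr chunk is passed to on_stderr (if given) as soon as it
    arrives; stdout is only collected. Returns
    (returncode, stdout_bytes, stderr_bytes).
    """
    proc = subprocess.Popen(argv,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out_fd = proc.stdout.fileno()
    err_fd = proc.stderr.fileno()
    collected = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]

    while open_fds:
        # Block until at least one of the pipes has data (or hit EOF).
        readable, _, _ = select.select(open_fds, [], [])
        for fd in readable:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                # An empty read means the child closed this pipe.
                open_fds.remove(fd)
                continue
            collected[fd].append(chunk)
            if fd == err_fd and on_stderr is not None:
                on_stderr(chunk)

    proc.stdout.close()
    proc.stderr.close()
    returncode = proc.wait()
    return returncode, b"".join(collected[out_fd]), b"".join(collected[err_fd])


if __name__ == "__main__":
    # Hypothetical child that writes to both streams.
    child = "import sys; print('out'); sys.stderr.write('err\\n')"
    print(run_and_drain([sys.executable, "-c", child],
                        on_stderr=lambda data: sys.stderr.buffer.write(data)))
```

Looping until both pipes report EOF, rather than polling the process, avoids losing output written just before the child exits. The chunks handed to on_stderr are raw bytes and may contain partial lines.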

Ivo · Feb 5, 2008

Thomas said:
The readlines() method will read until it reaches end of file (or
an error occurs), not just what is available at the moment. You
can see that for yourself by running:

$ python -c 'import sys; print sys.stdin.readlines()'

The call to sys.stdin.readlines() will not return until you press
Ctrl-D (or, I think, Ctrl-Z if you are using MS-Windows).

However, the os.read() function will only read what is currently
available. Note, though, that os.read() does not do line-based
I/O, so depending on the timing you can get incomplete lines, or
multiple lines in one read.

Be careful to specify how much you want to read at a time;
otherwise it can happen that you keep on reading.

Specify read(1024) or somesuch.

In case of my PPCEncoder I recompiled the mencoder subprocess to deliver
me lines that end with \n.

If anyone can tell me how to read a continuous stream, then I am really
interested.

cya

Thomas Bellman · Feb 6, 2008

Ivo said:
Be careful to specify how much you want to read at a time;
otherwise it can happen that you keep on reading.

Specify read(1024) or somesuch.

Well, of course you need to specify how much you want to read.
Otherwise os.read() throws an exception:
Traceback (most recent call last):

In case of my PPCEncoder I recompiled the mencoder subprocess to deliver
me lines that end with \n.

If anyone can tell me how to read a continuous stream, then I am really
interested.

I have never had any problem when using the os.read() function,
as long as I understand the effects of output buffering in the
subprocess. The file.read() method is a quite different animal.

(And then there's the problem of getting mplayer/mencoder to
output any *useful* information, but that is outside the scope of
this newsgroup.)
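
The remark above about output buffering in the subprocess can be illustrated with a small experiment. This is only a sketch, assuming Python 3 on a POSIX system and a Python child; the -u flag is specific to the Python interpreter, and other programs have their own ways (if any) of controlling buffering. The names CHILD and output_arrives_early are hypothetical.

```python
import os
import select
import subprocess
import sys

# Hypothetical child: prints one line without flushing, then waits on stdin.
CHILD = "import sys\nprint('ready')\nsys.stdin.readline()\n"


def output_arrives_early(unbuffered, timeout):
    """Return True if the child's output becomes readable within timeout
    seconds while the child is still running."""
    argv = [sys.executable] + (["-u"] if unbuffered else []) + ["-c", CHILD]
    # Drop PYTHONUNBUFFERED so that only the -u flag decides the buffering.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    proc = subprocess.Popen(argv,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            env=env)
    try:
        readable, _, _ = select.select([proc.stdout.fileno()], [], [], timeout)
        return bool(readable)
    finally:
        # Closes stdin (so the child can exit), drains stdout and waits.
        proc.communicate()


if __name__ == "__main__":
    print("with -u:   ", output_arrives_early(True, 10.0))
    print("without -u:", output_arrives_early(False, 1.0))
```

When its standard output is a pipe, the child keeps 'ready' in its own buffer until it exits, so no amount of select() or os.read() in the parent can see it earlier. With -u the line shows up while the child is still waiting on stdin.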
