what happens to Popen()'s parent-side file descriptors?


Roger Davis

Hi, I am new to this group, please forgive me if this is a repeat
question. I am a new Python programmer but experienced in C/Unix. I am
converting a shell script to Python which essentially loops
infinitely, each pass through the loop running commands like

set output = `cat this | grep that | whatever ...`

My understanding is that this functionality is best coded via
subprocess.Popen(). I need to read output from these spawned children
via a pipe from their stdout, hence something like

p= subprocess.Popen(args, stdout=subprocess.PIPE)

This means that somewhere a pipe file descriptor is opened on the
parent side to read from the child's stdout. When, if ever, is that
descriptor closed? Per-process FDs are limited and I am looping
infinitely so I need to be very careful about not running out of them.
Are there any other FDs related to this operation that also need to be
closed?

Testing with the interpreter (2.6, MacOSX) it appears that p.stdout is
being closed somehow by someone other than me:

import subprocess
args = ["echo", "This is a mystery!"]
i = 0
while True:
    p = subprocess.Popen(args, stdout=subprocess.PIPE)
    for line in p.stdout:
        print "[%5d] %s" % (i, line.strip())
    i += 1

The above code closes nothing but appears to run indefinitely without
running the parent out of FDs. WTF is going on here?

Popen.communicate() similarly appears to be closing the parent's pipe
FDs, although that seems more understandable as it appears to be
designed to encapsulate a lot of cleanup activity. In either case
(code snippet above or communicate()), an attempt to manually close
p.stdout goes as follows:
None

Attempts to close anything else fail spectacularly as one might
expect:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'close'

Can anyone explain the treatment of the pipe FDs opened in the parent
by Popen() to me or point me to some documentation? The python.org
docs say absolutely nothing about this as far as I can tell, a glaring
deficiency IMO (hint to all you underworked volunteer developers ;-)

Also, does Popen.returncode contain only the child's exit code or
does it also contain signal info like the return of os.wait()?
Documentation on this is also unclear to me.

Thanks very much!
 

Seebs

Hi, I am new to this group, please forgive me if this is a repeat
question. I am a new Python programmer but experienced in C/Unix. I am
converting a shell script to Python which essentially loops
infinitely, each pass through the loop running commands like

set output = `cat this | grep that | whatever ...`

My understanding is that this functionality is best coded via
subprocess.Popen().

Actually...

It's probably BEST coded, when possible, by doing the work internally.

So:
import re

thisfile = open('this')
lines = []
for line in thisfile:
    if re.match('that', line):
        lines.append(line)
whatever(lines)

.... or something like that.

(Which you can write as a oneliner list comprehension, but I don't know
that it's got clarity)
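For what it's worth, the one-liner version would look something like the sketch below (a throwaway temp file stands in for the hypothetical "this" file; note also that grep's match-anywhere behavior corresponds to re.search, whereas re.match anchors at the start of the line):

```python
import os
import re
import tempfile

# A throwaway temp file plays the part of "this" (hypothetical name),
# so the snippet is self-contained.
fd, path = tempfile.mkstemp()
os.write(fd, b"that one\nsomething else\nthat two\n")
os.close(fd)

with open(path) as f:
    # In-process equivalent of `cat this | grep that`:
    # re.search matches anywhere in the line, like grep.
    matches = [line for line in f if re.search('that', line)]
os.remove(path)
```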

Basically, don't use popen() unless the thing you'd be calling is something
hard enough to do that you really, really, want to run an external program
for it. This is not a shell script; it is not just there to be glue
around other programs.

-s
 

Nobody

My understanding is that this functionality is best coded via
subprocess.Popen(). I need to read output from these spawned children
via a pipe from their stdout, hence something like

p= subprocess.Popen(args, stdout=subprocess.PIPE)

This means that somewhere a pipe file descriptor is opened on the
parent side to read from the child's stdout. When, if ever, is that
descriptor closed? Per-process FDs are limited and I am looping
infinitely so I need to be very careful about not running out of them.
Are there any other FDs related to this operation that also need to be
closed?

Testing with the interpreter (2.6, MacOSX) it appears that p.stdout is
being closed somehow by someone other than me:

import subprocess
args = ["echo", "This is a mystery!"]
i = 0
while True:
    p = subprocess.Popen(args, stdout=subprocess.PIPE)
    for line in p.stdout:
        print "[%5d] %s" % (i, line.strip())
    i += 1

The above code closes nothing but appears to run indefinitely without
running the parent out of FDs. WTF is going on here?

Garbage collection is going on here.

On the second and subsequent invocations of:

p= subprocess.Popen(args, stdout=subprocess.PIPE)

the assignment to p causes its previous value to become unreferenced.

This will eventually result in a call to the __del__ method of the
now-unreferenced Popen object, which will poll to see if the child is
still running.

If it has terminated, the child process will be reaped and the Popen
object can be discarded. Otherwise, it will be put onto a list
(subprocess._active) of processes to be polled by subprocess._cleanup(),
which is called whenever a new Popen object is created.

IOW, Popen objects are garbage-collected in a "sane" manner; the
associated child processes are allowed to continue running and will
be reaped semi-automatically once they terminate (at least, they won't
accumulate without bound).
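One can watch this happen in CPython with a small experiment (Python 3 syntax; the os.fstat probe assumes the freed descriptor number is not reused by something else in between, which holds in this simple script):

```python
import os
import subprocess

# Watch CPython's reference counting close the pipe fd when the last
# reference to the Popen object is dropped.
p = subprocess.Popen(["echo", "hi"], stdout=subprocess.PIPE)
fd = p.stdout.fileno()
out = p.stdout.read()
p.wait()

os.fstat(fd)          # fd is still open at this point
p = None              # last reference gone -> __del__ -> fd closed

try:
    os.fstat(fd)
    fd_closed = False
except OSError:
    fd_closed = True
```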
Also, does Popen.returncode contain only the child's exit code or
does it also contain signal info like the return of os.wait()?

If the child terminated normally, .returncode will be non-negative, and
contains the "exit status", i.e. the value returned from main() or passed
to exit() etc.

If the child terminated on a signal, .returncode will be negative, and
contains the negation of the signal number.

If the child is still running (or is a zombie), .returncode will be None.
 

Chris Torek

My understanding is that this functionality is best coded via
subprocess.Popen().

"Best" is always a big question mark. :)
I need to read output from these spawned children
via a pipe from their stdout, hence something like

p= subprocess.Popen(args, stdout=subprocess.PIPE)

This means that somewhere a pipe file descriptor is opened on the
parent side to read from the child's stdout. When, if ever, is that
descriptor closed?

(I am going to tell this tale in a slightly different order than
your question asked, as I think it works out better that way.)

subprocess.Popen() creates the instance variable and any pipes
needed, forks (on a Unix system) the target process, but has not
yet done any I/O with it (except to read a success/fail indicator
for whether the exec worked and/or any exception that occurred
before then, e.g., during the preexec_fn). It then makes the stdin,
stdout, and/or stderr attributes (p.stdout, for the example above)
using os.fdopen(). Streams not requested in the call are set to
None (so p.stderr, for instance, will be None in this case).

At this point, then, the underlying open pipe is still around.
But your next step is (normally) to use p.communicate(); this is
where most of the magic happens. The Unix implementation loops,
using select() to read and write from/to whichever pipe(s) are open
to the child process, until *all* data are sent and received. As
each data stream is finished, it is closed (in this case, via
self.stdout.close()). Lastly, p.communicate() invokes p.wait() (via
self.wait()), to wait for the child process to exit.

By the time p.communicate() returns, the pipe is closed and the
command has finished. The entire output text, however large it
is, is returned as the first element of the return-value 2-tuple
(remember that p.communicate() returns both the stdout and the
stderr -- stderr will be the empty string in this case, as stderr
was not redirected in the subprocess.Popen() call).
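A minimal sketch of that flow (Python 3 syntax, with `echo` standing in for a real command):

```python
import subprocess

# communicate() drains the pipe, closes it, and reaps the child.
p = subprocess.Popen(["echo", "hello"], stdout=subprocess.PIPE)
out, err = p.communicate()
# err is None here because stderr was not redirected to a pipe,
# and p.stdout has been closed on our behalf.
```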
Per-process FDs are limited and I am looping
infinitely so I need to be very careful about not running out of them.
Are there any other FDs related to this operation that also need to be
closed?

Only if you (or code you call) have opened them and not set
FD_CLOEXEC. In this case, you can set close_fds = True in your
call to subprocess.Popen(). That will make the child of fork()
loop over higher-number fd's, calling os.close() on each one.
Testing with the interpreter (2.6, MacOSX) it appears that p.stdout is
being closed somehow by someone other than me:

import subprocess
args = ["echo", "This is a mystery!"]
i = 0
while True:
    p = subprocess.Popen(args, stdout=subprocess.PIPE)
    for line in p.stdout:
        print "[%5d] %s" % (i, line.strip())
    i += 1

The above code closes nothing but appears to run indefinitely without
running the parent out of FDs. WTF is going on here?

The above reads from p.stdout -- the os.fdopen() result on the
underlying pipe -- directly. In the general case (multiple input
and output pipes), this is not safe as you can deadlock with
constipated pipes (hence the existence of p.communicate()). In
this specific case, there is just one pipe so the deadlock issue
goes away. Instead, the file descriptor remains open while the
inner loop runs (i.e., while "line in p.stdout" is able to fetch
lines via the file's iterator). When the loop stops the pipe is
still open in the parent, but the child has finished and is now
exiting (or has exited or will exit soon). You then reach the
"i+=1" line and resume the loop, calling subprocess.Popen()
anew.

Now we get to the even deeper magic. :)

What happens to the *old* value in p? Answer: because p is
reassigned, the (C implementation, interpreted Python bytecode
runtime) reference count drops [%]. Since p was the only live
reference, the count drops from 1 to 0. This makes the old instance
variable go away, invoking old_p.__del__() as it were. The deletion
handler cleans up a few things itself, including a call to
os.waitpid() if needed, and then simply lets the reference to
old_p.stdout go away. That in turn decrements old_p.stdout's
reference count. Since that, too, reaches zero, its __del__ is
run ... and *that* closes the underlying file descriptor.

[% This is all simplified -- the Python documentation mentions that
reference counting for local variables is somewhat tricked-out by
the compiler to avoid unnecessary increments and decrements. The
principles apply, though.]

Running the above code fragment in a different implementation, in
which garbage collection is deferred, would *not* close the file
descriptor, and the system would potentially run out (depending on
when a gc occurred, and/or whether the system would attempt gc on
running out of file descriptors, in the hope that the gc would free
some up).

The subprocess module does go through a bunch of extra work to make
sure that any as-yet-uncollected fork()ed processes are eventually
waitpid()-ed for.
Can anyone explain the treatment of the pipe FDs opened in the parent
by Popen() to me or point me to some documentation?

The best documentation seems generally to be the source. Fortunately
subprocess.py is written in Python. (Inspecting C modules is less
straightforward. :) )
Also, does Popen.returncode contain only the child's exit code or
does it also contain signal info like the return of os.wait()?
Documentation on this is also unclear to me.

"A negative value -N indicates that the child was terminated by
signal N (Unix only)." Again, the Python source is handy:

def _handle_exitstatus(self, sts):
    if os.WIFSIGNALED(sts):
        self.returncode = -os.WTERMSIG(sts)
    elif os.WIFEXITED(sts):
        self.returncode = os.WEXITSTATUS(sts)
    else:
        # Should never happen
        raise RuntimeError("Unknown child exit status!")

The only things left out are the core-dump flag, and stopped/suspended.
The latter should never occur as os.waitpid() is called with only
os.WNOHANG, not os.WUNTRACED (of course a process being traced,
stopping at a breakpoint, would mess this up, but subprocess.Popen
is not a debugger :) ).

It might be nice to capture os.WCOREDUMP(sts), though.

Also, while I was writing this, I discovered what appears to be a
buglet in _cleanup(), with regard to "abandoned" Unix processes that
terminate due to a signal. Note that _handle_exitstatus() will
set self.returncode to (e.g.) -1 if the child exits due to SIGHUP.
The _cleanup() function, however, does this in part:

if inst.poll(_deadstate=sys.maxint) >= 0:
    try:
        _active.remove(inst)

The Unix-specific poll() routine, however, reads:

if self.returncode is None:
    try:
        pid, sts = os.waitpid(self.pid, os.WNOHANG)
        if pid == self.pid:
            self._handle_exitstatus(sts)
    except os.error:
        if _deadstate is not None:
            self.returncode = _deadstate
return self.returncode

Hence if pid 12345 is abandoned (and thus on _active), and we
os.waitpid(12345, os.WNOHANG) and get a status that has a termination
signal, we set self.returncode to -N, and return that. Hence
inst.poll returns (e.g.) -1 and we never attempt to remove it from
_active. Now that its returncode is not None, though, every later
poll() will continue to return -1. It seems it would be better to
have _cleanup() read:

if inst.poll(_deadstate=sys.maxint) is not None:

(Note, this is python 2.5, which is what I have installed on my
Mac laptop, where I am writing this at the moment).
 

Roger Davis

Many thanks to all who responded to my question! It's nice to know, as
someone new to Python, that there are lots of well-informed people out
there willing to help with such issues.

Thanks, Mike, for your pipes suggestion, I will keep that in mind for
future projects.

Seebs, you are of course correct that the example I quoted (`cat |
grep | whatever`) is best done internally with the re module and built-
in language features, and in fact that has already been done wherever
possible. I should have picked a better example, there are numerous
cases where I am calling external programs whose functionality is not
duplicated by Python features.

'Nobody' (clearly a misnomer!) and Chris, thanks for your excellent
explanations about garbage collection. (Chris, I believe you must have
spent more time looking at the subprocess source and writing your
response than I have spent writing my code.) GC is clearly at the
heart of my lack of understanding on this point. It sounds like, from
what Chris said, that *any* file descriptor
would be closed when GC occurs if it is no longer referenced,
subprocess-related or not. BTW, and this comment is not at all
intended for any of you who have already very generously and patiently
explained this stuff to me, it does seem like it might be a good idea
to provide documentation on some of these more important GC details
for pretty much any class, especially ones which have lots of murky OS
interaction. I have to admit that in this case it makes perfect sense
to close parent pipe descriptors there as I can't think of any reason
why you might want to keep one open after your object is no longer
referenced or your child exits.

It sounds to me that, although my code might be safe now as is, I
probably need to do an explicit p.stdXXX.close() myself for any pipes
which I open via Popen() as soon as I am done with them. Documentation
on python.org states that GC can be postponed or omitted altogether, a
possibility that Chris mentions in his comments. Other documentation
states that there is no harm in doing multiple close()es on the same
file, so I assume that neither my code nor the subprocess GC code will
break if the other does the deed first. If anybody thinks this is a
bad idea, please comment.
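A quick sanity check of that assumption (Python 3 syntax): closing an already-closed Python file object really is a no-op, so an explicit close cannot collide with later cleanup.

```python
import subprocess

# Close the pipe explicitly as soon as we're done; a second close
# (e.g. from cleanup code) raises no error on a Python file object.
p = subprocess.Popen(["echo", "x"], stdout=subprocess.PIPE)
data = p.stdout.read()
p.stdout.close()
p.stdout.close()      # second close: silently does nothing
p.wait()
```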

On a related point here, I have one case where I need to replace the
shell construct

externalprog <somefile >otherfile

I suppose I could just use os.system() here but I'd rather keep the
Unix shell completely out of the picture (which is why I am moving
things to Python to begin with!), so I'm just doing a simple open() on
somefile and otherfile and then passing those file handles into
Popen() for stdin and stdout. I am already closing those open()ed file
handles after the child completes, but I suppose that I probably
should also explicitly close Popen's p.stdin and p.stdout, too. (I'm
guessing they might be dup()ed from the original file handles?)
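That plan can be sketched as follows (Python 3 syntax; `cat` stands in for externalprog, and temp files stand in for the hypothetical somefile and otherfile so the snippet is self-contained):

```python
import os
import subprocess
import tempfile

# Equivalent of `externalprog <somefile >otherfile`, no shell involved.
src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"payload\n")
src.close()
dst_path = src.name + ".out"

# Pass plain open file objects as stdin/stdout; no pipes are created,
# so p.stdin and p.stdout will both be None.
with open(src.name, "rb") as fin, open(dst_path, "wb") as fout:
    p = subprocess.Popen(["cat"], stdin=fin, stdout=fout)
    p.wait()

with open(dst_path, "rb") as f:
    result = f.read()
os.remove(src.name)
os.remove(dst_path)
```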


Thanks again to all!
 

Chris Torek

'Nobody' (clearly a misnomer!) and Chris, thanks for your excellent
explanations about garbage collection. (Chris, I believe you must have
spent more time looking at the subprocess source and writing your
response than I have spent writing my code.)

Well, I just spent a lot of time looking at the code earlier
this week as I was thinking about using it in a program that is
required to be "highly reliable" (i.e., to never lose data, even
if Things Go Wrong, like disks get full and sub-commands fail).

(Depending on shell version, "set -o pipefail" can allow
"cheating" here, i.e., with subprocess, using shell=True and
commands that have the form "a | b":

$ (exit 0) | (exit 2) | (exit 0)
$ echo $?
0
$ set -o pipefail
$ (exit 0) | (exit 2) | (exit 0)
$ echo $?
2

but -o pipefail is not POSIX and I am not sure I can count on
it.)
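From Python, the same trick looks roughly like this (a sketch that assumes /bin/bash is available; pipefail is a bash feature, not POSIX sh, hence the explicit executable):

```python
import subprocess

# bash's `set -o pipefail` makes a pipeline report the rightmost
# non-zero status instead of the last command's status.
rc = subprocess.call(
    "set -o pipefail; (exit 0) | (exit 2) | (exit 0)",
    shell=True, executable="/bin/bash")
```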
GC is clearly at the heart of my lack of understanding on this
point. It sounds like, from what Chris said, that *any* file
descriptor would be closed when GC occurs if it is no longer
referenced, subprocess-related or not.

Yes -- but, as noted elsethread, "delayed" failures from events
like "disk is full, can't write last bits of data" become problematic.
It sounds to me that, although my code might be safe now as is, I
probably need to do an explicit p.stdXXX.close() myself for any pipes
which I open via Popen() as soon as I am done with them.

Or, use the p.communicate() function, which contains the explicit
close. Note that if you are using a unidirectional pipe and do
your own I/O -- as in your example -- calling p.communicate()
will just do the one attempt to read from the pipe and then close
it, so you can ignore the result:

import subprocess
p = subprocess.Popen(["cat", "/etc/motd"], stdout=subprocess.PIPE)
for line in p.stdout:
    print line.rstrip()
p.communicate()

The last call returns ('', None) (note: not ('', '') as I suggested
earlier, I actually typed this one in on the command line). Run
python with strace and you can observe the close call happen --
this is the [edited to fit] output after entering the p.communicate()
line:

read(0, "\r", 1) = 1
write(1, "\n", 1
) = 1
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0

[I push "enter", readline echos a newline and does tty ioctl()s]

rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGWINCH, {SIG_DFL}, {0xb759ed10, [], SA_RESTART}, 8) = 0
time(NULL) = 1287075471

[no idea what these are really for, but the signal manipulation
appears to be readline()]

fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
_llseek(3, 0, 0xbf80d490, SEEK_CUR) = -1 ESPIPE (Illegal seek)
read(3, "", 8192) = 0
close(3) = 0

[fd 3 is the pipe reading from "cat /etc/motd" -- no idea what the
fstat64() and _llseek() are for here, but the read() and close() are
from the communicate() function]

waitpid(13775, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 13775

[this is from p.wait()]

write(1, "(\'\', None)\n", 11('', None)
) = 11

[this is the result being printed, and the rest is presumably
readline() again]

ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
ioctl(0, TIOCGWINSZ, {ws_row=44, ws_col=80, ...}) = 0
ioctl(0, TIOCSWINSZ, {ws_row=44, ws_col=80, ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
ioctl(0, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost ...}) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigaction(SIGWINCH, {0xb759ed10, [], SA_RESTART}, {SIG_DFL}, 8) = 0
write(1, ">>> ", 4>>> ) = 4
select(1, [0], NULL, NULL, NULL
On a related point here, I have one case where I need to replace the
shell construct

externalprog <somefile >otherfile

I suppose I could just use os.system() here but I'd rather keep the
Unix shell completely out of the picture (which is why I am moving
things to Python to begin with!), so I'm just doing a simple open() on
somefile and otherfile and then passing those file handles into
Popen() for stdin and stdout. I am already closing those open()ed file
handles after the child completes, but I suppose that I probably
should also explicitly close Popen's p.stdin and p.stdout, too. (I'm
guessing they might be dup()ed from the original file handles?)

There is no dup()ing going on so this is not necessary, but again,
using the communicate function will close them for you. In this
case, though, I am not entirely sure subprocess is the right hammer
-- it mostly will give you portability to Windows (well, plus the
magic for preexec_fn and reporting exec failure).

Once again, peeking at the source is the trick :) ... the arguments
you provide for stdin, stdout, and stderr are used thus:

if stdin is None:
    pass
elif stdin == PIPE:
    p2cread, p2cwrite = os.pipe()
elif isinstance(stdin, int):
    p2cread = stdin
else:
    # Assuming file-like object
    p2cread = stdin.fileno()

(this is repeated for stdout and stderr) and the resulting
integer file descriptors (or None if not applicable) are
passed to os.fdopen() on the parent side.

(On the child side, the code does the usual shell-like dance
to move the appropriate descriptors to 0 through 2.)
 

Seebs

Seebs, you are of course correct that the example I quoted (`cat |
grep | whatever`) is best done internally with the re module and built-
in language features, and in fact that has already been done wherever
possible. I should have picked a better example, there are numerous
cases where I am calling external programs whose functionality is not
duplicated by Python features.

Fair enough. It's just a common pitfall when moving from shell to basically
any other language. My first attempts to code in C involved a lot of
building up of system() calls. :p

-s
 

Nobody

On a related point here, I have one case where I need to replace the
shell construct

externalprog <somefile >otherfile

I suppose I could just use os.system() here but I'd rather keep the
Unix shell completely out of the picture (which is why I am moving
things to Python to begin with!), so I'm just doing a simple open() on
somefile and otherfile and then passing those file handles into
Popen() for stdin and stdout. I am already closing those open()ed file
handles after the child completes, but I suppose that I probably
should also explicitly close Popen's p.stdin and p.stdout, too. (I'm
guessing they might be dup()ed from the original file handles?)

p.stdin will be None unless you use stdin=subprocess.PIPE; similarly for
stdout.

Another gotcha regarding pipes: the reader only sees EOF once there are no
writers, i.e. when the *last* writer closes their end.

If Python has a descriptor for the write end of a pipe, any child process
will inherit it unless you use close_fds=True or close it via a function
specified by the preexec_fn argument. Allowing it to be inherited can
prevent the reader from seeing EOF on the pipe.

E.g. if you do:

p1 = Popen(..., stdin = PIPE, stdout = PIPE)
p2 = Popen(..., stdin = p1.stdout)

p2 will inherit p1.stdin (that's the write end) from Python. Subsequently
calling p1.stdin.close() *won't* cause p1 to see EOF on its stdin because
p2 still has its inherited copy of the descriptor open.

On Windows, only stdin, stdout and stderr are inherited, so this isn't an
issue there.
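For the record, here is the shape of that two-stage pipeline in modern Python 3, where close_fds defaults to True so the gotcha above does not bite (`cat` is a stand-in for real commands):

```python
from subprocess import Popen, PIPE

# cat | cat pipeline: p1's stdout feeds p2's stdin.
p1 = Popen(["cat"], stdin=PIPE, stdout=PIPE)
p2 = Popen(["cat"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close()   # parent's copy of the read end; p2 keeps its own

p1.stdin.write(b"through the pipeline\n")
p1.stdin.close()    # p1 sees EOF -- safe here because close_fds=True
                    # (the Python 3 default) kept p2 from inheriting
                    # the write end of p1's stdin pipe
out, _ = p2.communicate()
p1.wait()
```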
 

Lawrence D'Oliveiro

Running the above code fragment in a different implementation, in
which garbage collection is deferred, would *not* close the file
descriptor, and the system would potentially run out (depending on
when a gc occurred, and/or whether the system would attempt gc on
running out of file descriptors, in the hope that the gc would free
some up).

Aren’t you glad IronPython is dead?
 

Lawrence D'Oliveiro

In message
Roger said:
Documentation on python.org states that GC can be postponed or omitted
altogether ...

Yes, but no sensible Python implementation would do that, as it’s a recipe
for resource bloat.
 

Lawrence D'Oliveiro

Another gotcha regarding pipes: the reader only sees EOF once there are no
writers, i.e. when the *last* writer closes their end.

Been there, been bitten by that.
 

Chris Torek

Been there, been bitten by that.

"Nobody" mentioned the techniques of setting close_fds = True and
passing a preexec_fn that closes the extra pipe descriptors. You
can also use fcntl.fcntl() to set the fcntl.FD_CLOEXEC flag on the
underlying file descriptors (this of course requires that you are
able to find them).
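A sketch of the fcntl approach (Unix-only; note that in Python 3, os.pipe() descriptors are created close-on-exec already per PEP 446, so the F_SETFD below is belt-and-suspenders there, but it is exactly what you would do by hand in Python 2):

```python
import fcntl
import os

r, w = os.pipe()

# Mark the write end close-on-exec so exec'd children don't inherit it.
flags = fcntl.fcntl(w, fcntl.F_GETFD)
fcntl.fcntl(w, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

cloexec_set = bool(fcntl.fcntl(w, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
os.close(r)
os.close(w)
```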

The subprocess module sets FD_CLOEXEC on the pipe it uses to pass
back a failure to exec, or even to reach the exec, e.g., due to an
exception during preexec_fn. One could argue that perhaps it should
set FD_CLOEXEC on the parent's remaining pipe descriptors, once
the child is successfully started, if it created them (i.e., if
the corresponding arguments were PIPE). In fact, thinking about it
now, I *would* argue that.
 
