Python becoming orphaned over ssh

David

Hi there, I have a strange situation.

If I do this:

1. Make a script /tmp/test.py on a remote server, with these contents:

#!/usr/bin/python
from subprocess import check_call
check_call(['ping', 'www.google.com'])

2. Call the script like this over SSH:

ssh root@testbox /tmp/test.py

3. Interrupt the script with Ctrl+C.

Then what happens:

The SSH session terminates, as expected.

However:

On the testing box, the Python script is still running, and so is the
ping session.

However, if I make an equivalent shell script, /tmp/test.sh, with these contents:

#!/bin/bash
ping www.google.com

And then run it over ssh like this:

ssh root@testbox /tmp/test.sh

And then hit Ctrl+C, then the shell script and ping are both
interrupted remotely, as expected.

Here is how 'pstree -p' looks for the python script on the test box,
before Ctrl+C:

<init (1) up here>
├─sshd(1158)─┬─sshd(19756)───test.py(19797)───ping(19798)
│            └─sshd(20233)───bash(20269)───pstree(19875)

And after Ctrl+C:

<init (1) up here>
├─sshd(1158)───sshd(20233)───bash(20269)───pstree(20218)
├─test.py(19797)───ping(19798)

Basically, the server-side sshd sub-process has disconnected, but the
Python script (and its ping subprocess) have become orphaned, and now
belong to the init process.

Note, this only seems to happen if Python is executing a subprocess,
and only while Python is being run through a non-interactive ssh
session.

How can I make Python behave better? I want it to shut down itself
and its subprocess, and not orphan itself, when I hit Ctrl+C.

PS:

The Python version on the testing box is 2.6.4, and the box itself is
running Ubuntu Karmic. Also, it's not just ping, but other utilities
too, e.g. wget.

PPS:

I did also try adding logic to the Python script to keep an eye on
all the ppids (parent, grandparent, etc.), and then to interrupt itself
and kill its subprocess, but that doesn't seem to work: for whatever
reason, Python seems to be unable to kill its subprocess in this
situation. The Python process closes, and ping becomes a child of
init. But I can then kill ping manually, from a separate ssh session.
 
John Nagle

Python's signal handling for multithread and multiprocess programs
leaves something to be desired.

John Nagle
 
David

> Python's signal handling for multithread and multiprocess programs
> leaves something to be desired.

Thanks for the confirmation (that I'm not missing something obvious).

I've reported a bug for this behavior in the Python issue tracker.

In the meantime, I've written a workaround function called
"check_call_and_monitor_ppids" that behaves like
subprocess.check_call, except that it regularly checks whether the
parent pid "chain" (up to the init process) changes during execution;
if it does, it terminates the subprocess and raises an exception.

Actually I tried this before, and it didn't work. But strangely, it
seems to work fine so long as I don't try to print any warning
messages to stderr or stdout from the Python script (though, the
called tool itself may print to stdout or stderr without problems).
Quite peculiar...
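
For anyone curious what such a workaround could look like, here is a minimal sketch. It is not David's actual function (which was not posted). The name check_call_and_monitor_ppid, the ParentChangedError exception and the poll_interval parameter are all made up for illustration, it is written for Python 3, and it only watches the direct parent pid rather than the whole chain up to init.

```python
import os
import subprocess
import time


class ParentChangedError(Exception):
    """Hypothetical error: this process was re-parented while waiting."""


def check_call_and_monitor_ppid(cmd, poll_interval=0.5):
    """Like subprocess.check_call, but stop the child if our parent changes.

    Simplification (an assumption of this sketch): only the direct parent
    pid is watched, not the whole chain up to init.
    """
    original_ppid = os.getppid()
    proc = subprocess.Popen(cmd)
    try:
        while True:
            returncode = proc.poll()
            if returncode is not None:
                break
            if os.getppid() != original_ppid:
                # Our parent (e.g. sshd) went away and we were adopted.
                raise ParentChangedError("parent pid changed; stopping child")
            time.sleep(poll_interval)
    except BaseException:
        # Covers re-parenting as well as KeyboardInterrupt:
        # never leave the child running behind us.
        if proc.poll() is None:
            proc.terminate()
            proc.wait()
        raise
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)
    return 0
```

Polling keeps the sketch simple, at the cost of reacting up to poll_interval seconds late.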

Anyway, I hope that one of the Python developers will fix this sometime.

David.
 
Jean-Paul Calderone

Python ignores SIGPIPE by default. The default SIGPIPE behavior is to
exit. This is sort of what people on POSIX expect. If you're talking
to another process over a pipe and that process goes away, and then
you write to the pipe, you get a SIGPIPE and you exit (of course, if
it takes you 20 minutes before you do another write, then it's 20
minutes before you exit).

But with SIGPIPE ignored, a Python process won't do exactly this.
Instead, you'll get an exception from the write. If you don't handle
the exception, then it'll propagate to the top-level and you'll exit.
Just like with a "normal" process. Except you also get the option of
doing something other than exiting. Pretty nice.
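
A small sketch of that behavior, assuming Python 3 on a POSIX system (the helper names sigpipe_is_ignored and write_to_closed_pipe are made up for this example). It checks the interpreter's SIGPIPE disposition and then writes to a pipe whose read end has already been closed:

```python
import errno
import os
import signal


def sigpipe_is_ignored():
    """Return True if this interpreter currently ignores SIGPIPE."""
    return signal.getsignal(signal.SIGPIPE) == signal.SIG_IGN


def write_to_closed_pipe():
    """Write to a pipe with no reader; return the errno of the failure."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # nobody is left to read, like a vanished ssh
    try:
        os.write(write_fd, b"hello\n")
    except OSError as exc:  # BrokenPipeError on Python 3
        return exc.errno
    finally:
        os.close(write_fd)
    return None
```

With SIGPIPE ignored, the write comes back as an exception carrying errno.EPIPE instead of killing the process, so the program gets to decide what to do next.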

But signal dispositions are inherited by child processes. So you run
ping from your short Python program, and it inherits SIGPIPE being
ignored. And it's written in C, not Python, so when it writes to the
pipe, there's no exception. So ping never gets any indication that it
should exit. No Python writes ever happen in this scenario. The
SSH-supplied stdout is shared with the ping process, which writes to it
directly.

You can fix this by resetting the signal disposition of SIGPIPE for
the ping process:

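#!/usr/bin/python
import signal
from subprocess import check_call

def reset():
    # Runs in the child after fork() and before exec(): restore the
    # default SIGPIPE disposition so ping dies when its output goes away.
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

check_call(['ping', 'www.google.com'], preexec_fn=reset)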

Very likely the subprocess module should be resetting the disposition
of signals that Python itself has fiddled with (and resetting any
other unusual state that the child is going to inherit, but nothing
else comes immediately to mind).
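
For readers on Python 3: since 3.2 the subprocess module has a restore_signals argument (default True) that resets SIGPIPE to its default disposition in the child before exec. I have not checked whether that is the same fix mentioned later in this thread. The sketch below compares the two settings. The PROBE command and the probe_child helper are made up for this example, and it assumes a POSIX sh whose kill accepts -PIPE.

```python
import signal
import subprocess

# A shell that sends itself SIGPIPE, then reports whether it is still alive.
PROBE = ["sh", "-c", "kill -PIPE $$; echo survived"]


def probe_child(restore_signals):
    """Run the probe and return (returncode, stdout) of the child."""
    proc = subprocess.run(
        PROBE,
        stdout=subprocess.PIPE,
        restore_signals=restore_signals,
    )
    return proc.returncode, proc.stdout


# Child inherits Python's "ignore SIGPIPE" setting and shrugs the signal off.
inherited = probe_child(restore_signals=False)

# Child gets the default disposition back and is killed by the signal.
restored = probe_child(restore_signals=True)
```

A negative returncode from subprocess means the child was terminated by that signal number, so the second call should report -signal.SIGPIPE.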
 
David

(Sending again to the list, I mailed Jean-Paul off-list by mistake, sorry).

Thanks, that works for me. Also, thanks to Gregory (Python developer)
for fixing it so quickly (I don't want to litter the bug tracker with
thanks).

Small annoyance: though it's fixed with ping, wget still has the
problem. However, I compared the two script versions again (test.sh
and test.py) for wget, and bash actually has the same problem as
Python, so I assume it's a wget bug this time (I'll go report the bug
to them next). Is it that unusual to run scripts over non-interactive
SSH, and then interrupt with Ctrl+C? :) So I guess for now I need to
keep my previous workaround, rather than switching to the
preexec_fn syntax.

Also, I don't 100% follow that explanation about the pipes. I read a
bit more over here:

http://en.wikipedia.org/wiki/SIGPIPE

And it seems like SIGPIPE would apply in a case like this:

ping www.google.com | test.py

But I'm not calling ping like that. I guess it applies to regular
stream, non-piping operations too? i.e., something like this is
happening:

[ping output] passes through==> [test.py] ==> passes through [ssh] ==>
passes through => [more ssh, bash, etc, up to my terminal]

And when ssh disappears, Linux automatically sends ping a SIGPIPE,
because its output can no longer be "passed through" to anywhere.
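
A rough way to try this out, sketched for Python 3 on a POSIX system (the function name run_writer_into_closed_pipe is made up, and a short shell loop stands in for ping). The child is handed a pipe as its stdout and writes to it directly, with Python not involved in the writes at all. Since the read end is already closed, the kernel raises SIGPIPE in the child at the moment it writes.

```python
import os
import signal
import subprocess


def run_writer_into_closed_pipe():
    """Start a child whose stdout is a pipe with no reader; return its status."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader is gone, like a disconnected ssh
    try:
        proc = subprocess.Popen(
            # Bounded loop so the sketch cannot run forever.
            ["sh", "-c", "for i in 1 2 3; do echo x; done; exit 0"],
            stdout=write_fd,        # child writes straight into the pipe
            restore_signals=True,   # child gets the default SIGPIPE action
        )
    finally:
        os.close(write_fd)  # the parent no longer needs its copy
    return proc.wait()
```

The return value should be -signal.SIGPIPE, meaning the child was killed by the signal on its first write. No special "piping" syntax is needed for this to apply: any stdout that happens to be a pipe or socket behaves this way.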

Pretty interesting, but I'm a noob with this stuff. Would this WP
article (and its linked pages) be a good place to learn more?

http://en.wikipedia.org/wiki/Signal_(computing)

Thanks,

David.
 
Antoine Pitrou

> But signal dispositions are inherited by child processes. So you run
> ping from your short Python program, and it inherits SIGPIPE being
> ignored. And it's written in C, not Python, so when it writes to the
> pipe, there's no exception. So ping never gets any indication that it
> should exit.

But doesn't write() fail with EPIPE in that situation?
That would mean `ping` ignores any error return from write().

Regards

Antoine.
 
Jean-Paul Calderone

> But doesn't write() fail with EPIPE in that situation?
> That would mean `ping` ignores any error return from write().

Quite so. A quick look at ping.c from iputils confirms this - there
are many write() and fprintf() calls with no error handling.

Jean-Paul
 
