Re: Missing SIGCHLD

Discussion in 'Python' started by Dan Stromberg, Feb 15, 2011.

  1. On Tue, Feb 15, 2011 at 2:57 AM, Dinh wrote:
    > Hi,
    >
    > I currently build a process management system which is able to fork child
    > processes (fork()) and keep them alive (waitpid() ).
    >
    >              if pid in self.current_workers:
    >                  os.waitpid(pid, 0)
    >
    > If a child process dies, it should trigger a SIGCHLD signal and a handler is
    > installed to catch the signal and start a new child process. The code is
    > nothing special, just can be seen in any Python tutorial you can find on the
    > net.
    >
    >             signal.signal(signal.SIGCHLD, self.restart_child_process)
    >             signal.signal(signal.SIGHUP, self.handle) # reload
    >             signal.signal(signal.SIGINT, self.handle)
    >             signal.signal(signal.SIGTERM, self.handle)
    >             signal.signal(signal.SIGQUIT, self.handle)
    >
    > However, this code does not always work as expected. Most of the time, it
    > works. When a child process exits, the master process receives a SIGCHLD and
    > restart_child_process() method is invoked automatically to start a new child
    > process. But the problem is that sometimes, I know a child process exits due
    > to an unexpected exception (via log file) but it seems that master process
    > does not know about it. No SIGCHLD and so restart_child_process() is not
    > triggered. Therefore, no new child process is forked.
    >
    > Could you please kindly tell me why this happens? Is there any special code
    > that need being installed to ensure that every dead child will be informed
    > correctly?
    >
    > Mac OSX 10.6
    > Python 2.6.6


    Hi Dinh.

    I've done no Mac OS/X programming, but I've done Python and *ix
    signals some - so I'm going to try to help you, but it'll be kind of
    stabbing in the dark.

    *ix signals have historically been rather unreliable and troublesome
    when used heavily.

    There are BSD signals, SysV signals, and POSIX signals - they all try
    to solve the problems in different ways. Oh, and Linux has a way of
    doing signals using file descriptors that apparently helps quite a
    bit. I'm guessing your Mac will have available BSD and maybe POSIX
    signals, but you might check on that.

    You might try using ktrace on your Mac to see if any SIGCHLD signals
    are getting lost (it definitely happens in some scenarios), and
    hopefully, which kind of (C level) signal API CPython is using on your
    Mac also.

    You might also make sure your SIGCHLD signal handler is not just
    waitpid'ing once per invocation, but rather doing a nonblocking
    waitpid in a loop until no process is found, in case signals are lost
    (especially if/when signals occur during signal handler processing).
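
A minimal sketch of that reaping loop, assuming Python 3.3 or later (for ChildProcessError). The name reap_children is made up for illustration and is not from the original code. It calls os.waitpid(-1, os.WNOHANG) repeatedly, so one SIGCHLD delivery can account for several dead children:

```python
import os


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, status) tuples, possibly empty.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call return at once.
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped
```

A handler such as restart_child_process() could call a function like this and start one replacement per returned pid, rather than assuming one signal means exactly one exit.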

    If the loop in your signal handler doesn't help (enough), you could
    also try using a nonblocking waitpid in a SIGALRM handler in addition
    to your SIGCHLD handler.

    Some signal APIs want you to reenable the signal as your first action
    in your signal handler to shorten a race window. Hopefully Mac OS/X
    doesn't need this, but you might check on it.

    BTW, CPython signals and CPython threads don't play very nicely
    together; if you're combining them, you might want to study up on
    this.

    Oh, also, signals in CPython will tend to cause system calls to return
    without completing, and giving an EINTR in errno, and not all CPython
    modules will understand what to do with that. :( Sadly, many
    application programmers tend to ignore the EINTR possibility.
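
One common way to cope with EINTR is to retry the interrupted call. The helper below is a sketch only, and retry_on_eintr is a hypothetical name. On Python 2.6 this kind of wrapper has to be written by hand; Python 3.5 and later retry most interrupted system calls themselves (PEP 475), so there it is rarely needed:

```python
import errno


def retry_on_eintr(func, *args):
    """Call func(*args), repeating the call while it fails with EINTR."""
    while True:
        try:
            return func(*args)
        except (OSError, IOError) as exc:
            if exc.errno != errno.EINTR:
                raise  # a real error, not just an interrupted call
```

For example, retry_on_eintr(os.waitpid, pid, 0) keeps waiting even if an unrelated signal arrives in the middle of the call.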

    HTH
     
    Dan Stromberg, Feb 15, 2011

  2. Dan Stromberg

    Adam Skutt Guest

    On Feb 15, 1:28 pm, Dan Stromberg wrote:
    > *ix signals have historically been rather unreliable and troublesome
    > when used heavily.
    >
    > There are BSD signals, SysV signals, and POSIX signals - they all try
    > to solve the problems in different ways.


    No, there are just signals[1]. There are several different APIs for
    handling signals, depending on the situation, but they're all driving
    the same functionality underneath the covers. These days, only
    sigaction(2) is remotely usable (in C) for installing handlers and all
    the other APIs should normally be ignored.

    > You might also make sure your SIGCHLD signal handler is not just
    > waitpid'ing once per invocation, but rather doing a nonblocking
    > waitpid in a loop until no process is found, in case signals are lost
    > (especially if/when signals occur during signal handler processing).


    This is most likely the issue. Multiple pending instances of the same
    signal are coalesced automatically, so a single SIGCHLD delivery may
    stand for several dead children.

    It would also help to make sure the signal handler just sets a flag;
    the application's main loop should then check that flag and respond
    appropriately. Running anything inside a signal handler is a recipe
    for disaster.
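
A rough sketch of the flag approach, assuming Python 3.3 or later and a single-threaded master. All names here (child_exited, note_sigchld, spawn_worker, supervise) are hypothetical, and the worker exits immediately only so that the demo finishes:

```python
import os
import signal
import time

child_exited = False  # set by the handler, cleared by the main loop


def note_sigchld(signum, frame):
    """Do no real work here; only record that some child changed state."""
    global child_exited
    child_exited = True


def spawn_worker():
    """Hypothetical worker: the child exits at once so the demo terminates."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)
    return pid


def supervise(restarts_wanted=3):
    """Keep one worker running and restart it up to restarts_wanted times."""
    global child_exited
    previous = signal.signal(signal.SIGCHLD, note_sigchld)
    workers = {spawn_worker()}
    restarts = 0
    try:
        while workers:
            if not child_exited:
                time.sleep(0.01)  # a real master would do its normal work here
                continue
            child_exited = False  # clear first, so a signal during reaping is kept
            while True:
                try:
                    pid, _status = os.waitpid(-1, os.WNOHANG)
                except ChildProcessError:
                    break  # no children left
                if pid == 0:
                    break  # remaining children are still running
                if pid in workers:
                    workers.discard(pid)
                    if restarts < restarts_wanted:
                        workers.add(spawn_worker())
                        restarts += 1
    finally:
        signal.signal(signal.SIGCHLD, previous)  # restore the old handler
    return restarts
```

Clearing the flag before reaping, not after, means a SIGCHLD that arrives while the loop is busy just triggers one more pass.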

    Also, SIGCHLD handlers may not get reinstalled on some operating
    systems (even in Python), so the application code needs to reinstall
    them. If this is not done within the signal handler, signals can get
    "lost".

    That being said, I'd just spawn a thread and wait there and avoid
    SIGCHLD altogether. It's typically not worth the hassle.
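
The thread-based alternative might look roughly like this. watch_child and on_exit are hypothetical names, and the sketch assumes one watcher thread per child. Each thread blocks in os.waitpid() for its own pid, so no SIGCHLD handler is involved:

```python
import os
import threading


def watch_child(pid, on_exit):
    """Wait for pid in a background thread, then call on_exit(pid, status)."""

    def wait():
        _pid, status = os.waitpid(pid, 0)  # blocks only this thread
        on_exit(pid, status)

    watcher = threading.Thread(target=wait)
    watcher.daemon = True  # do not keep the master alive just for the wait
    watcher.start()
    return watcher
```

Note that on_exit runs in the watcher thread, so anything it shares with the main loop needs its own synchronization, and calling fork() from a process that already has threads needs care.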

    > Oh, also, signals in CPython will tend to cause system calls to return
    > without completing, and giving an EINTR in errno, and not all CPython
    > modules will understand what to do with that.  :(  Sadly, many
    > application programmers tend to ignore the EINTR possibility.


    This can be disabled per signal with signal.siginterrupt(signum, False). Regardless, the signal
    handling facilities provided by Python are rather poor.
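
A small sketch of the siginterrupt() call, with a hypothetical helper name. The order matters: the Python documentation notes that signal.signal() resets the signal to interrupting behaviour, so siginterrupt() has to come after the handler is installed:

```python
import signal


def install_restarting_handler(signum, handler):
    """Install handler for signum and ask the OS to restart interrupted calls.

    Returns the previously installed handler.
    """
    previous = signal.signal(signum, handler)
    # Must follow signal.signal(), which switches interruption back on.
    signal.siginterrupt(signum, False)
    return previous
```

Not every system call is restartable, so code that must be robust may still need to handle EINTR.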

    Adam

    [1] Ok, I lied, there's regular signals and realtime signals, which
    have a few minor differences.
     
    Adam Skutt, Feb 16, 2011