atexit + threads = bug?

Discussion in 'Python' started by David Rushby, Jan 12, 2006.

  1. David Rushby

    David Rushby Guest

    Consider the following program (underscores are used to force
    indentation):
    ------------------------------------------------
    import atexit, threading, time

    def atExitFunc():
    ____print 'atExitFunc called.'

    atexit.register(atExitFunc)

    class T(threading.Thread):
    ____def run(self):
    ________assert not self.isDaemon()
    ________print 'T before sleep.'
    ________time.sleep(1.0)
    ________print 'T after sleep.'

    T().start()
    print 'Main thread finished.'
    ------------------------------------------------

    I would expect the program to print 'atExitFunc called.' after 'T after
    sleep.', but instead, it prints (on Windows XP with Python 2.3.5 or
    2.4.2):
    ------------------------------------------------
    T before sleep.
    Main thread finished.
    atExitFunc called.
    T after sleep.
    ------------------------------------------------

    atExitFunc is called when the main thread terminates, rather than when
    the process exits. The atexit documentation contains several warnings,
    but nothing about this. Is this a bug?
    David Rushby, Jan 12, 2006
    #1

  2. David Rushby

    Guest

    David> atExitFunc is called when the main thread terminates, rather than
    David> when the process exits. The atexit documentation contains
    David> several warnings, but nothing about this. Is this a bug?

    This might be a bug, but I can't see how it can be in atexit. Atexit just
    registers its own sys.exitfunc function, then when it's called, calls all
    the individual exit functions that have been registered with it. It has no
    control over when sys.exitfunc is invoked. sys.exitfunc is called as the
    first action of Py_Finalize. It appears that Py_Finalize is called when the
    main thread exits.

    Skip
    Skip, Jan 12, 2006
    #2

  3. David Rushby

    Tim Peters Guest

    [David Rushby]
    > Consider the following program (underscores are used to force
    > indentation):
    > ------------------------------------------------
    > import atexit, threading, time
    >
    > def atExitFunc():
    > ____print 'atExitFunc called.'
    >
    > atexit.register(atExitFunc)
    >
    > class T(threading.Thread):
    > ____def run(self):
    > ________assert not self.isDaemon()
    > ________print 'T before sleep.'
    > ________time.sleep(1.0)
    > ________print 'T after sleep.'
    >
    > T().start()
    > print 'Main thread finished.'
    > ------------------------------------------------
    >
    > I would expect the program to print 'atExitFunc called.' after 'T after
    > sleep.',


    Why? I expect very little ;-)

    > but instead, it prints (on Windows XP with Python 2.3.5 or
    > 2.4.2):
    > ------------------------------------------------
    > T before sleep.
    > Main thread finished.
    > atExitFunc called.
    > T after sleep.
    > ------------------------------------------------


    That's not what I saw just now on WinXP Pro SP2. With 2.3.5 and 2.4.2
    I saw this order instead:

    Main thread finished
    atExitFunc called.
    T before sleep.
    T after sleep.

    The relative order of "Main thread finished." and "T before sleep" is
    purely due to timing accidents; it's even possible for "T after
    sleep." to appear before "Main thread finished.", although it's not
    possible for "T after sleep." to appear before "T before sleep.". In
    fact, there are only two orderings you can count on here:

    T before sleep < T after sleep
    Main thread finished < atExitFunc called

    If you need more than that, you need to add synchronization code.
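A minimal sketch of what such synchronization could look like, written in Python 3 syntax (the program under discussion is Python 2). The function name `run_ordered` and the "cleanup called." marker are made up for illustration. An `Event` forces "T before sleep." ahead of "Main thread finished.", and an explicit `join()` forces "T after sleep." ahead of the cleanup step:

```python
import threading
import time


def run_ordered():
    """Return the events in the order they were recorded."""
    events = []
    started = threading.Event()

    def worker():
        events.append("T before sleep.")
        started.set()            # tell the main thread we got this far
        time.sleep(0.1)
        events.append("T after sleep.")

    t = threading.Thread(target=worker)
    t.start()
    started.wait()               # T before sleep < Main thread finished
    events.append("Main thread finished.")
    t.join()                     # T after sleep < cleanup called
    events.append("cleanup called.")
    return events
```

The relative order of "Main thread finished." and "T after sleep." is still left to timing here; only the two orderings named in the comments are enforced.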

    > atExitFunc is called when the main thread terminates, rather than when
    > the process exits.


    Is there a difference between "main thread terminates" and "the
    process exits" on Windows? Not in C. It so happens that Python's
    threading module _also_ registers an atexit callback, which does a
    join() on all the threads you created and didn't mark as daemon
    threads. Because threading.py's atexit callback was registered first,
    it gets called last when Python is shutting down, and it doesn't
    return until it joins all the non-daemon threads still sitting around.
    Your atexit callback runs first because it was registered last. That
    in turn makes it _likely_ that you'll see (as we both saw)
    "atExitFunc called." before seeing "T after sleep.", but doesn't
    guarantee that.
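The last-registered, first-called rule can be seen in isolation with a small sketch like the one below, in Python 3 syntax. The names `LIFO_SCRIPT` and `run_lifo_demo` are hypothetical; the child script is run in a fresh interpreter only so that its exit-time output can be captured:

```python
import subprocess
import sys

# Child script: register two callbacks, then exit normally.
LIFO_SCRIPT = """
import atexit

def first():
    print("registered first")

def second():
    print("registered second")

atexit.register(first)
atexit.register(second)
"""


def run_lifo_demo():
    """Run LIFO_SCRIPT in a fresh interpreter and return its output lines."""
    result = subprocess.run(
        [sys.executable, "-c", LIFO_SCRIPT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

The callback registered second is the one that prints first.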

    Don't be fooled by _printing_ "Main thread finished", BTW: that's
    just a sequence of characters ;-). The main thread still does a lot
    of work after that point, to tear down the interpreter in a sane
    order. Part of that work is threading.py waiting for your threads to
    finish.

    > The atexit documentation contains several warnings,
    > but nothing about this. Is this a bug?


    It doesn't look like a bug to me, and I doubt Python wants to make
    stronger promises than it does now about the exact order of assorted
    exit gimmicks.

    You can reliably get "atExitFunc called." printed last by delaying
    your import of the threading module until after you register your
    atExitFunc callback. If you register that first, it's called last,
    and threading.py's wait-for-threads-to-end callback gets called first
    then. That callback won't return before your worker thread finishes.

    There's no promise that will continue to work forever, though. This
    is fuzzy stuff vaguely covered by the atexit doc's "In particular,
    other core Python modules are free to use atexit without the
    programmer's knowledge." threading.py happens to be such a module
    today, but maybe it won't be tomorrow.
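An approach that does not depend on import order at all is to have the callback wait for the worker itself. The following is a sketch under that assumption, in Python 3 syntax; `JOIN_SCRIPT` and `run_join_demo` are made-up names, and the child script is run in a separate interpreter so its exit-time output can be inspected:

```python
import subprocess
import sys

# Child script: the atexit callback joins the worker before reporting.
JOIN_SCRIPT = """
import atexit, threading, time

def work():
    print("T before sleep.", flush=True)
    time.sleep(0.2)
    print("T after sleep.", flush=True)

t = threading.Thread(target=work)

def at_exit_func():
    t.join()  # do not report until the worker has finished
    print("atExitFunc called.", flush=True)

atexit.register(at_exit_func)
t.start()
print("Main thread finished.", flush=True)
"""


def run_join_demo():
    """Run JOIN_SCRIPT in a fresh interpreter and return its output lines."""
    result = subprocess.run(
        [sys.executable, "-c", JOIN_SCRIPT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

Because the callback joins the thread explicitly, "atExitFunc called." comes after "T after sleep." regardless of which module registered its exit hook first.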
    Tim Peters, Jan 12, 2006
    #3
  4. David Rushby

    David Rushby Guest

    >> I would expect...
    > The relative order of "Main thread finished." and "T before
    > sleep" is purely due to timing accidents...


    Sure, I realize that the interactions between threads have no
    guaranteed order except what the programmer imposes upon them. I
    should have qualified my statement of expectation more carefully.

    > In fact, there are only two orderings you can count on here:
    > T before sleep < T after sleep
    > Main thread finished < atExitFunc called


    I understand your explanation and can live with the consequences, but
    the atexit docs sure don't prepare the reader for this.

    They say, "Functions thus registered are automatically executed upon
    normal interpreter termination." It seems like sophistry to argue that
    "normal interpreter termination" has occurred when there are still
    threads other than the main thread running.

    Suppose that today I promise to donate my body to science "upon my
    death", and tomorrow, I'm diagnosed with a gradual but inexorable
    illness that will kill me within ten years. I wouldn't expect to be
    strapped down and dissected immediately after hearing the diagnosis, on
    the basis that the mere prophecy of my death is tantamount to the death
    itself.
    David Rushby, Jan 12, 2006
    #4
  5. David Rushby

    Tim Peters Guest

    [David Rushby]
    > ...
    > I understand your explanation and can live with the consequences, but
    > the atexit docs sure don't prepare the reader for this.


    In fact, they don't mention threading.py at all.

    > They say, "Functions thus registered are automatically executed upon
    > normal interpreter termination." It seems like sophistry to argue that
    > "normal interpreter termination" has occurred when there are still
    > threads other than the main thread running.


    Well, since atexit callbacks are written in Python, it's absurd on the
    face of it to imagine that they run after the interpreter has torn
    itself down. Clearly Python is still running at that point, or they
    wouldn't get run at all.

    It's also strained to imagine that threads have nothing to do with
    shutdown, since the threading docs say "the entire Python program
    exits when only daemon threads are left". It's not magic that
    prevents Python from exiting when non-daemon threads are still
    running. You happened to use the same non-magical hack that
    threading.py uses to fulfill that promise, and you're seeing
    consequences of their interaction. In Python as well as in C, atexit
    only works well when it's got exactly zero or one users <0.1 wink>.

    You're welcome to suggest text you'd like better, but microscopic
    examination of details most people will never care about makes for bad
    docs in a different way. To get a full picture of how CPython's
    shutdown works, you need to explain all of Py_Finalize() in English,
    and you need to get agreement on which details are accidents and which
    are guaranteed.

    Now it's probably a fact that you couldn't care less about 99.9% of
    those finalization details: you only care about the one that just bit
    you. How are you going to beef up the docs in such a way that you
    would have _found_ the bit you cared about, among the vast bulk of new
    detail you don't care about?

    You aren't, so you could settle for suggesting new words that just
    cover the bit you care about. Give it a try!

    > Suppose that today I promise to donate my body to science "upon my
    > death", and tomorrow, I'm diagnosed with a gradual but inexorable
    > illness that will kill me within ten years. I wouldn't expect to be
    > strapped down and dissected immediately after hearing the diagnosis, on
    > the basis that the mere prophecy of my death is tantamount to the death
    > itself.


    Next time, quit while you're ahead ;-)
    Tim Peters, Jan 12, 2006
    #5
  6. David Rushby

    David Rushby Guest

    [Tim Peters]
    > [David Rushby]
    >> They say, "Functions thus registered are automatically executed upon
    >> normal interpreter termination." It seems like sophistry to argue that
    >> "normal interpreter termination" has occurred when there are still
    >> threads other than the main thread running.
    >
    > Well, since atexit callbacks are written in Python, it's absurd on the
    > face of it to imagine that they run after the interpreter has torn
    > itself down.


    Of course.

    > It's also strained to imagine that threads have nothing to do
    > with shutdown...


    I don't imagine that.

    > You're welcome to suggest text you'd like better...


    What I'd like is for the behavior to become less surprising, so that
    the text could describe reasonable behavior, instead of retrofitting
    the text to more clearly explain (what I regard as) flawed behavior.

    What would be unreasonable about adding a join_nondaemonic_threads()
    call before the call to call_sys_exitfunc() near the beginning of
    Py_Finalize?

    Instead of _MainThread.__exitfunc having to rely on atexit.register to
    ensure that it gets called, join_nondaemonic_threads would call
    _MainThread.__exitfunc (or some functional equivalent). Both
    join_nondaemonic_threads and call_sys_exitfunc would execute while the
    interpreter "is still entirely intact", as the Py_Finalize comment
    says.

    The opening paragraph of the atexit docs could then read:
    "The atexit module defines a single function to register cleanup
    functions. Functions thus registered are automatically executed when
    the main thread begins the process of tearing down the interpreter,
    which occurs after all other non-daemonic threads have terminated and
    the main thread has nothing but cleanup code left to execute."

    This seems simple. Am I overlooking something?
    David Rushby, Jan 12, 2006
    #6
  7. David Rushby

    Guest

    David> What would be unreasonable about adding a
    David> join_nondaemonic_threads() call before the call to
    David> call_sys_exitfunc() near the beginning of Py_Finalize?

    David> Instead of _MainThread.__exitfunc having to rely on
    David> atexit.register to ensure that it gets called,
    David> join_nondaemonic_threads would call _MainThread.__exitfunc (or
    David> some functional equivalent). Both join_nondaemonic_threads and
    David> call_sys_exitfunc would execute while the interpreter "is still
    David> entirely intact", as the Py_Finalize comment says.

    ...

    David> This seems simple. Am I overlooking something?

    A patch? <0.5 wink>

    Skip
    Skip, Jan 13, 2006
    #7
  8. David Rushby

    David Rushby Guest

    [Skip]
    > [David]
    >> This seems simple. Am I overlooking something?

    > A patch? <0.5 wink>


    I'm willing to write a patch if it stands a good chance of being
    accepted. So far, though, Tim has seemed resistant to the idea. Maybe
    he has reasons that I'm ignorant of?
    David Rushby, Jan 13, 2006
    #8