multiprocessing module and os.close(sys.stdin.fileno())

Discussion in 'Python' started by Graham Dumpleton, Feb 18, 2009.

  1. Why is the multiprocessing module, i.e., multiprocessing/process.py, in
    _bootstrap() doing:

    os.close(sys.stdin.fileno())

    rather than:

    sys.stdin.close()

    Technically it is feasible that stdin could have been replaced with
    something other than a file object, where the replacement doesn't have
    a fileno() method.

    In that sort of situation an AttributeError would be raised, which
    isn't going to be caught as either OSError or ValueError, which is all
    the code watches out for.
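
A minimal sketch of the failure mode described above. It does not call into multiprocessing itself; `NoFilenoStdin` and `close_like_bootstrap` are hypothetical names that only mimic the pattern of closing by descriptor while catching OSError and ValueError.

```python
import os


class NoFilenoStdin:
    """Hypothetical stdin replacement: file-like, but without fileno()."""

    def read(self, size=-1):
        return ""

    def close(self):
        pass


def close_like_bootstrap(stream):
    """Close by descriptor, guarding only against OSError and ValueError."""
    try:
        os.close(stream.fileno())
    except (OSError, ValueError):
        return "handled"
    return "closed"


def demo():
    # The missing fileno() raises AttributeError, which the except clause
    # above does not cover, so it propagates to the caller.
    try:
        return close_like_bootstrap(NoFilenoStdin())
    except AttributeError as exc:
        return "escaped: " + type(exc).__name__


result = demo()
print(result)
```

The sketch passes a stand-in object rather than touching the real sys.stdin, so running it leaves the interpreter's own stdin open.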

    Graham
     
    Graham Dumpleton, Feb 18, 2009
    #1

  2. Jesse Noller


    I don't know why it was implemented that way. File an issue on the
    tracker and assign it to me (jnoller) please.
     
    Jesse Noller, Feb 19, 2009
    #2

  3. On Feb 19, 1:16 pm, Jesse Noller <> wrote:
    > I don't know why it was implemented that way. File an issue on the
    > tracker and assign it to me (jnoller) please.


    Created as:

    http://bugs.python.org/issue5313

    I don't see an option to assign it, so you are on the nosy list to
    start with.

    Graham
     
    Graham Dumpleton, Feb 19, 2009
    #3
  4. On Feb 21, 4:20 pm, Joshua Judson Rosen <> wrote:
    > My guess would be: because it's also possible for sys.stdin to be a
    > file that's open in read+*write* mode, and for that file to have
    > pending output buffered (for example, in the case of a socketfile).


    If you are going to have a file that is writable as well as readable,
    such as a socket, then it is likely that sys.stdout/sys.stderr are
    going to be bound to it at the same time. If that is the case then one
    should not be using close() at all, as it will also close the write
    side of the pipe and cause errors when code subsequently attempts to
    write to sys.stdout/sys.stderr.

    In the case of socket you would actually want to use shutdown() to
    close just the input side of the socket.

    What this all means is that the appropriate thing to do depends on the
    environment in which the code is used. Thus, having the behaviour hard
    wired a certain way is really bad. Perhaps there should instead be a
    way for a user to provide a hook function that is called to perform
    any case-specific cleanup of stdin, stdout and stderr, or otherwise
    reassign them.

    That this is currently in the _bootstrap() function, which does other
    important stuff, doesn't exactly make it look like it is easily
    overridden to work for a specific execution environment which differs
    from the norm.
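
One way such a hook might look, purely as an illustration: `cleanup_stdin`, its `hook` parameter and the `FdHolder` helper are hypothetical names and not part of the multiprocessing API. Without a hook, the sketch falls back to closing by descriptor only when the object actually offers fileno().

```python
import os


def cleanup_stdin(stream, hook=None):
    """Hypothetical cleanup step with an optional user-supplied hook."""
    if hook is not None:
        # The caller knows the environment; let it decide what to do.
        hook(stream)
        return "hook"
    fileno = getattr(stream, "fileno", None)
    if fileno is None:
        # Not backed by a descriptor; leave the object alone.
        return "skipped"
    try:
        os.close(fileno())
    except (OSError, ValueError):
        return "error-ignored"
    return "closed"


class FdHolder:
    """Tiny stand-in for a stream that only exposes a descriptor."""

    def __init__(self, fd):
        self._fd = fd

    def fileno(self):
        return self._fd


read_fd, write_fd = os.pipe()
print(cleanup_stdin(FdHolder(read_fd)))   # descriptor-backed object
print(cleanup_stdin(object()))            # object without fileno()
print(cleanup_stdin(object(), hook=lambda s: None))
os.close(write_fd)
```

The fallback deliberately does nothing for objects without fileno(), since calling their close() unconditionally could bring back the double-flush concern raised elsewhere in the thread.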

    > There's a general guideline, inherited from C, that one should ensure
    > that the higher-level close() routine is invoked on a given
    > file-descriptor in at most *one* process after that descriptor has
    > passed through a fork(); in the other (probably child) processes, the
    > lower-level close() routine should be called to avoid a
    > double-flush--whereby buffered data is flushed out of one process, and
    > then the *same* buffered data is flushed out of the (other)
    > child-/parent-process' copy of the file-object.
    >
    > So, if you call sys.stdin.close() in the child-process in
    > _bootstrap(), then it could lead to a double-flush corrupting output
    > somewhere in the application that uses the multiprocessing module.
    >
    > You can expect similar issues with just about /any/ `file-like objects'
    > that might have `file-like semantics' of buffering data and flushing
    > it on close, also--because you end up with multiple copies of the same
    > object in `pre-flush' state, and each copy tries to flush at some point.
    >
    > As such, I'd recommend against just using .close(); you might use
    > something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
    > `else' clause unconditionally calls sys.stdin.close(), then you still
    > have double-flush problems if someone's set sys.stdin to a file-like
    > object with output-buffering.
    >
    > I guess you could try calling that an `edge-case' and seeing if anyone
    > screams. It'd be sort-of nice if there was finer granularity in the
    > file API--maybe if file.close() took a boolean `flush' argument....
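
The double-flush described in the quoted text can be reproduced with a small POSIX-only sketch (it relies on os.fork()). The function name `double_flush_demo` and the temporary file are assumptions made for illustration; this is not code from multiprocessing.

```python
import os
import tempfile


def double_flush_demo(child_uses_high_level_close):
    """Write one buffered line, fork, and return what ends up in the file."""
    fd, path = tempfile.mkstemp()
    try:
        f = os.fdopen(fd, "w")
        f.write("hello\n")  # still sitting in the userspace buffer
        pid = os.fork()
        if pid == 0:
            # Child: it holds its own copy of the unflushed buffer.
            try:
                if child_uses_high_level_close:
                    f.close()             # flushes the child's copy
                else:
                    os.close(f.fileno())  # closes the descriptor, no flush
            finally:
                os._exit(0)               # exit without flushing anything
        os.waitpid(pid, 0)
        f.close()  # parent flushes its copy of the same buffer
        with open(path) as check:
            return check.read()
    finally:
        os.unlink(path)


print(repr(double_flush_demo(True)))
print(repr(double_flush_demo(False)))
```

With the high-level close() in the child the line is written twice, because parent and child each flush their own copy of the buffer; with os.close() in the child it is written once.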


    Graham
     
    Graham Dumpleton, Feb 21, 2009
    #4
  5. On Feb 22, 12:52 pm, Joshua Judson Rosen <> wrote:
    > Graham Dumpleton <> writes:
    >
    > > If you are going to have a file that is writable as well as readable,
    > > such as a socket, then likely that sys.stdout/sys.stderr are going to
    > > be bound to it at the same time.

    >
    > Yes.
    >
    > > If that is the case then one should not be using close() at all

    >
    > If you mean stdin.close(), then that's what I said :)


    Either. The problem is the same: it closes the file for both reading
    and writing, so if you were expecting to still be able to write because
    it is also used for stdout or stderr, that will no longer work.

    > > as it will then also close the write side of the pipe and cause
    > > errors when code subsequently attempts to write to
    > > sys.stdout/sys.stderr.

    >
    > > In the case of socket you would actually want to use shutdown() to
    > > close just the input side of the socket.

    >
    > Sure--but isn't this "you" the /calling/ code that set the whole thing
    > up? What the /caller/ does with its stdio is up to /him/, and beyond
    > the scope of the present discourse. I can appreciate a library forking
    > and then using os.close() on stdio (it protects my files from any I/O
    > the subprocess might think it wants to do with them), but I think I
    > might be even more annoyed if it *shutdown my sockets*


    Ah, yeah, I forgot that shutdown() shuts the connection down end to
    end rather than just dropping that file object's reference to it. :)
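
A small sketch of that difference, using socket.socketpair() and a duplicated socket object. The function name `close_vs_shutdown` is made up for illustration, and SHUT_WR is used instead of the input side only because its effect is easy to observe from the peer.

```python
import socket


def close_vs_shutdown():
    """Compare closing a duplicate with shutting the connection down."""
    a, b = socket.socketpair()
    try:
        # close() on a duplicate only drops that one reference;
        # the original socket object keeps working.
        extra = a.dup()
        extra.close()
        a.sendall(b"x")
        after_close = b.recv(1)

        # shutdown() acts on the connection itself, so it takes effect
        # even though the original socket object is still open.
        extra = a.dup()
        extra.shutdown(socket.SHUT_WR)
        extra.close()
        after_shutdown = b.recv(1)  # empty bytes: the peer sees EOF
        return after_close, after_shutdown
    finally:
        a.close()
        b.close()


print(close_vs_shutdown())
```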

    Graham

    > than if it
    > caused double-flushes (there's at least a possibility that I could
    > cope with the double-flushes by just ensuring that *I* flushed before
    > the fork--not so with socket.shutdown()!)
    >
    > > What this all means is that what is the appropriate thing to do is
    > > going to depend on the environment in which the code is used. Thus,
    > > having the behaviour hard wired a certain way is really bad. There
    > > perhaps instead should be a way of a user providing a hook function to
    > > be called to perform any case specific cleanup of stdin, stdout and
    > > stderr, or otherwise reassign them.

    >
    > Usually, I'd say that that's what the methods on the passed-in object
    > are for. Though, as I said--the file-object API is lacking, here :(
    >
    > --
    > Don't be afraid to ask (Lf.((Lx.xx) (Lr.f(rr)))).
     
    Graham Dumpleton, Feb 22, 2009
    #5
