multiprocessing module and os.close(sys.stdin.fileno())

G

Graham Dumpleton

Why is the multiprocessing module, ie., multiprocessing/process.py, in
_bootstrap() doing:

os.close(sys.stdin.fileno())

rather than:

sys.stdin.close()

Technically it is feasible that stdin could have been replaced with
something other than a file object, where the replacement doesn't have
a fileno() method.

In that sort of situation an AttributeError would be raised, which
isn't going to be caught as either OSError or ValueError, which is all
the code watches out for.

Graham
 
J

Jesse Noller

Why is the multiprocessing module, ie., multiprocessing/process.py, in
_bootstrap() doing:

os.close(sys.stdin.fileno())

rather than:

sys.stdin.close()

Technically it is feasible that stdin could have been replaced with
something other than a file object, where the replacement doesn't have
a fileno() method.

In that sort of situation an AttributeError would be raised, which
isn't going to be caught as either OSError or ValueError, which is all
the code watches out for.

Graham

I don't know why it was implemented that way. File an issue on the
tracker and assign it to me (jnoller) please.
 
G

Graham Dumpleton

My guess would be: because it's also possible for sys.stdin to be a
file that's open in read+*write* mode, and for that file to have
pending output buffered (for example, in the case of a socketfile).

If you are going to have a file that is writable as well as readable,
such as a socket, then likely that sys.stdout/sys.stderr are going to
be bound to it at the same time. If that is the case then one should
not be using close() at all as it will then also close the write side
of the pipe and cause errors when code subsequently attempts to write
to sys.stdout/sys.stderr.

In the case of socket you would actually want to use shutdown() to
close just the input side of the socket.

What this all means is that what is the appropriate thing to do is
going to depend on the environment in which the code is used. Thus,
having the behaviour hard wired a certain way is really bad. There
perhaps instead should be a way of a user providing a hook function to
be called to perform any case specific cleanup of stdin, stdout and
stderr, or otherwise reassign them.

That this is currently in the _bootstrap() function, which does other
important stuff, doesn't exactly make it look like it is easily
overridden to work for a specific execution environment which is
different to the norm.
There's a general guideline, inherited from C, that one should ensure
that the higher-level close() routine is invoked on a given
file-descriptor in at most *one* process after that descriptor has
passed through a fork(); in the other (probably child) processes, the
lower-level close() routine should be called to avoid a
double-flush--whereby buffered data is flushed out of one process, and
then the *same* buffered data is flushed out of the (other)
child-/parent-process' copy of the file-object.

So, if you call sys.stdin.close() in the child-process in
_bootstrap(), then it could lead to a double-flush corrupting output
somewhere in the application that uses the multiprocessing module.

You can expect similar issues with just about /any/ `file-like objects'
that might have `file-like semantics' of buffering data and flushing
it on close, also--because you end up with multiple copies of the same
object in `pre-flush' state, and each copy tries to flush at some point.

As such, I'd recommend against just using .close(); you might use
something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
`else' clause unconditionally calls sys.stdin.close(), then you still
have double-flush problems if someone's set sys.stdin to a file-like
object with output-buffering.

I guess you could try calling that an `edge-case' and seeing if anyone
screams. It'd be sort-of nice if there was finer granularity in the
file API--maybe if file.close() took a boolean `flush' argument....

Graham
 
G

Graham Dumpleton

If you mean stdin.close(), then that's what I said :)

Either. The problem is that same, it close for both read and write and
if was expecting to still be able to write because used for stdout or
stderr, then will not work.
Sure--but isn't this "you" the /calling/ code that set the whole thing
up? What the /caller/ does with its stdio is up to /him/, and beyond
the scope of the present discourse. I can appreciate a library forking
and then using os.close() on stdio (it protects my files from any I/O
the subprocess might think it wants to do with them), but I think I
might be even more annoyed if it *shutdown my sockets*

Ah, yeah, forgot that shutdown does end to end shutdown rather than
just that file object reference. :)

Graham
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top