The opener parameter of Python 3 open() built-in

Marco · Sep 3, 2012

Does anyone have an example of utilisation?

Dave Angel · Sep 3, 2012

Does anyone have an example of utilisation?

As of Python 3.2.3, there is no "opener" parameter in the open() function.
http://docs.python.org/py3k/library/functions.html

I don't know of any such parameter in earlier or later versions, but I
could be wrong there.

Marco · Sep 3, 2012

As of Python 3.2.3, there is no "opener" parameter in the open() function.
http://docs.python.org/py3k/library/functions.html

I don't know of any such parameter in earlier or later versions, but I
could be wrong there.

It's new in Python 3.3:

http://docs.python.org/dev/library/functions.html#open

Steven D'Aprano · Sep 4, 2012

On 03.09.2012 14:32, Marco wrote:

The opener argument is a new 3.3 feature. For example you can use the
feature to implement exclusive creation of a file to avoid symlink
attacks.

import os

def opener(file, flags):
    return os.open(file, flags | os.O_EXCL)

open("newfile", "w", opener=opener)
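A self-contained version of the example above, as a minimal sketch. The temporary directory and the names `excl_opener`, `path` and `second_open_failed` are illustrative and not from the original post; it assumes Python 3.3 or later.

```python
import os
import tempfile


def excl_opener(path, flags):
    # open() passes the path and the integer flags it derived from the
    # mode string; the opener must return an open file descriptor.
    return os.open(path, flags | os.O_EXCL)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "newfile")

    # The first call creates the file: mode "w" already implies O_CREAT.
    with open(path, "w", opener=excl_opener) as f:
        f.write("hello")

    # The second call fails: O_CREAT | O_EXCL refuses an existing path.
    try:
        open(path, "w", opener=excl_opener)
    except FileExistsError:
        second_open_failed = True
    else:
        second_open_failed = False
```

The result of open() is still an ordinary Python file object; only the step that obtains the file descriptor is replaced.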

Why does the open builtin need this added complexity? Why not just call
os.open directly? Or for more complex openers, just call the opener
directly?

What is the rationale for complicating open instead of telling people to
just call their opener directly?

Dennis Lee Bieber · Sep 4, 2012

Why does the open builtin need this added complexity? Why not just call
os.open directly? Or for more complex openers, just call the opener
directly?

Because os.open() returns a low-level file descriptor, not a Python
file object?

What is the rationale for complicating open instead of telling people to
just call their opener directly?

To avoid the new syntax would mean coding the example as

f = os.fdopen(os.open("newfile", os.O_WRONLY | os.O_CREAT | os.O_EXCL), "w")

which does NOT look any cleaner to me... Especially not if "opener" is
to be used in more than one location. Furthermore, using "opener" could
allow for a localized change to affect all open statements in the module
-- change file path, open for string I/O rather than file I/O, etc.
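The "localized change" idea can be sketched as an opener that redirects every path into a base directory. The helper `opener_in`, the file name `notes.txt` and the use of `functools.partial` are assumptions made for illustration, not something proposed in the thread.

```python
import os
import tempfile
from functools import partial


def opener_in(base_dir, path, flags):
    # Hypothetical helper: resolve every path relative to base_dir.
    return os.open(os.path.join(base_dir, path), flags)


with tempfile.TemporaryDirectory() as tmp:
    in_tmp = partial(opener_in, tmp)

    # The call site keeps using the built-in open(); only the opener changes.
    with open("notes.txt", "w", opener=in_tmp) as f:
        f.write("redirected")

    landed_in_tmp = os.path.exists(os.path.join(tmp, "notes.txt"))

    with open("notes.txt", opener=in_tmp) as f:
        content = f.read()
```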

Steven D'Aprano · Sep 4, 2012

Because os.open() returns a low-level file descriptor, not a
Python file object?

Good point.

But you can wrap the call to os.open, as you mention below. The only
complication is that you have to give the mode twice, converting between
low-level O_* integer modes and high-level string modes:

a = os.open('/tmp/foo', os.O_WRONLY | os.O_CREAT)
b = os.fdopen(a, 'w')

But to some degree, you still have to do that with the opener argument,
at least in your own head.

To avoid the new syntax would mean coding the example as

f = os.fdopen(os.open("newfile", os.O_WRONLY | os.O_CREAT | os.O_EXCL), "w")

which does NOT look any cleaner to me...

Well, I don't know about that. Once you start messing about with low-level
O_* flags, it's never going to exactly be clean no matter what you do. But
I think a one-liner like the above *is* cleaner than a three-liner like the
original:

def opener(file, flags):
    return os.open(file, flags | os.O_EXCL)

open("newfile", "w", opener=opener)

although I accept that this is a matter of personal taste.

Particularly if the opener is defined far away from where you eventually
use it. A lambda is arguably better from that perspective:

open("newfile", "w",
     opener=lambda file, flags: os.open(file, flags | os.O_EXCL)
     )

but none of these solutions are exactly neat or clean. You still have to
mentally translate between string modes and int modes, and make sure
you're not passing the wrong mode:

py> open('junk', 'w').write('hello world')
11
py> open('junk', 'r', opener=lambda file, flags:
...      os.open(file, flags | os.O_TRUNC)).read()  # oops
''

so it's not exactly a high-level interface.
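One way to see the translation between string modes and integer flags is to have an opener record what it receives. This is only a sketch: `make_recorder` and `seen` are made-up names, and the exact flag values are platform-dependent, so only individual bits are inspected.

```python
import os
import tempfile

seen = {}


def make_recorder(mode):
    def recorder(path, flags):
        seen[mode] = flags  # remember what open() derived from the mode string
        return os.open(path, flags)
    return recorder


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "junk")
    # "w" comes first so the file exists for the other modes.
    for mode in ("w", "r", "a", "r+"):
        open(path, mode, opener=make_recorder(mode)).close()

w_truncates = bool(seen["w"] & os.O_TRUNC)
r_truncates = bool(seen["r"] & os.O_TRUNC)
a_appends = bool(seen["a"] & os.O_APPEND)
```

Mode 'r' does not carry O_TRUNC on its own, which is why OR-ing it in by hand, as in the session above, silently empties the file.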

In my opinion, a cleaner, more Pythonic interface would be either:

* allow built-in open to take numeric modes:

open(file, os.O_CREAT | os.O_WRONLY | os.O_EXCL)

* or even more Pythonic, expose those numeric modes using strings:

open(file, 'wx')

That's not as general as an opener, but it covers the common use-case and
for everything else, write a helper function.
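For comparison, Python 3.3 also added an 'x' mode to the built-in open() for exclusive creation, which is close to the second suggestion (the spelling is 'x' on its own rather than 'wx'). A minimal sketch, with a temporary directory standing in for a real path and `refused` as an illustrative name:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "newfile")

    # 'x' creates the file for writing and fails if it already exists.
    with open(path, "x") as f:
        f.write("created")

    try:
        open(path, "x")
    except FileExistsError:
        refused = True
    else:
        refused = False
```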

Especially not if "opener" is to be used in more than one location.

The usual idiom for fixing the "used more than once" problem is "write a
helper", not "add a callback function to a builtin".

Furthermore, using "opener" could
allow for a localized change to affect all open statements in the module
-- change file path, open for string I/O rather than file I/O, etc.

A common idiom for that is to shadow open in the module, like this:

_open = open
def open(file, *args, **kwargs):
    file = file.lower()  # str has lower(), not lowercase()
    return _open(file, *args, **kwargs)

Serhiy Storchaka · Sep 4, 2012

Why does the open builtin need this added complexity? Why not just call
os.open directly? Or for more complex openers, just call the opener
directly?

What is the rationale for complicating open instead of telling people to
just call their opener directly?

See http://bugs.python.org/issue12797.

Serhiy Storchaka · Sep 4, 2012

Does anyone have an example of utilisation?

http://bugs.python.org/issue13424

Dennis Lee Bieber · Sep 4, 2012

why not call that directly?

f = opener(file, flags)

It certainly is cleaner than either of the alternatives so far, and it
doesn't add a parameter to the builtin.

But it returns an OS file descriptor... It doesn't return a Python
file object. From what I can tell, (I've just upgraded to Python 2.7
<G>) the opener is meant to replace the low-level function normally used
by Python's open(), and supplies an fd which gets wrapped by Python's
open().
<type 'file'>

The two are not compatible except by using os.fdopen(fd) to get a
file object, or fo.fileno() to get the low-level file descriptor.
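The two conversions can be sketched as follows, under Python 3 and with a temporary file standing in for a real one; the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    # os.open() returns a plain integer file descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    fd_is_int = isinstance(fd, int)

    # os.fdopen() wraps the descriptor in a Python file object.
    fo = os.fdopen(fd, "w")
    # fileno() goes the other way and hands back the same descriptor.
    same_fd = fo.fileno() == fd

    fo.write("hello")
    fo.close()  # closing the file object also closes the descriptor

    with open(path) as f:
        text = f.read()
```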

I don't know of any real-life code which would be significantly improved
by that. Can you point us to some?

Not really -- but if they went one step further and supplied
"reader" and "writer" operations too, they'd get close to what I once
had to do in FORTRAN 77 under DEC VMS (by hooking in code to do double
buffering when reading data from magtape, while keeping the program
using regular F77 I/O statements; the open statement would do a pre-read
of one buffer and return; subsequent read statements would find a
pre-filled buffer, and issue a non-blocking read to fill the other
buffer -- cut the runtime for the program into a third or less as it was
no longer stuck waiting for slow mag-tape operations each time it did a
read). Implementing something like this in Python would likely require
"opener" to spawn a reader thread to do the I/O asynchronously, using a
limited Queue (1 buffer worth -- the reader thread would be the second
buffer, blocked on Q.put()), and a "reader" that would do Q.get() and
return the result to the Python read() logic for any parsing.
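A rough sketch of that reader-thread arrangement, assuming a wrapper class rather than any real hook in open(). `PrefetchReader`, `read_block` and the block size are invented for illustration, and an in-memory `io.BytesIO` stands in for the slow device.

```python
import io
import queue
import threading


class PrefetchReader:
    """Hypothetical wrapper: a background thread reads ahead of the consumer."""

    def __init__(self, raw, block_size=4096):
        # One block waits in the queue; a second is held by the reader
        # thread while it is blocked on put().
        self._queue = queue.Queue(maxsize=1)
        self._thread = threading.Thread(
            target=self._fill, args=(raw, block_size), daemon=True)
        self._thread.start()

    def _fill(self, raw, block_size):
        while True:
            block = raw.read(block_size)
            self._queue.put(block)
            if not block:  # an empty bytes object marks end of file
                break

    def read_block(self):
        """Return the next block, or b"" once at end of file."""
        return self._queue.get()


source = io.BytesIO(b"abcdefghij")
reader = PrefetchReader(source, block_size=4)

blocks = []
while True:
    block = reader.read_block()
    if not block:
        break
    blocks.append(block)
```

Calling `read_block()` again after the end-of-file marker would block forever in this sketch; a fuller version would remember that the stream is exhausted.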

Okay, in Python, one could probably subclass "file", and override
the read methods -- but one would not be able to use the Python
open()... You'd have to do something like f = myFile(normal, open, args)
instead...

Terry Reedy · Sep 4, 2012

See http://bugs.python.org/issue12797.

io.open depends on a function that returns an open file descriptor.
opener exposes that dependency so it can be replaced. (Obviously, one
could go crazily overboard with this idea.) I believe this is a simple
form of dependency injection, though it might be hard to discern from
the Java-inspired verbiage of the Wikipedia article. Part of the
rationale in the issue is to future-proof io.open from any future needs
for alternate fd fetching. It could also be used to decouple a test of
io.open from os.open.
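As a sketch of the testing use, an opener can hand open() a descriptor that never came from os.open() on the named path. Here the read end of a pipe is used; `make_pipe_opener` and the file name are hypothetical, and the payload is kept small so that writing it to the pipe does not block.

```python
import os


def make_pipe_opener(payload):
    """Hypothetical test helper: ignore the path, serve payload from a pipe."""
    def opener(path, flags):
        read_fd, write_fd = os.pipe()
        os.write(write_fd, payload)
        os.close(write_fd)  # closing the write end gives the reader EOF
        return read_fd
    return opener


# The named file never has to exist; open() only sees the injected descriptor.
with open("no-such-file.txt", "r", opener=make_pipe_opener(b"injected\n")) as f:
    data = f.read()
```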

Chris Angelico · Sep 4, 2012

io.open depends on a function that returns an open file descriptor. opener
exposes that dependency so it can be replaced.

I skimmed the bug report comments but didn't find an answer to this:
Why not just monkey-patch? When a module function calls on a support
function and you want to change that support function's behaviour,
isn't monkey-patching the usual approach?

Several possibilities come to mind, but without knowledge of
internals, I have no idea what's actually the case.
* Patching builtins is too confusing or dangerous, and should be avoided?
* You want to narrow the scope of the patch rather than do it globally?
* Explicit is better than implicit?

It just strikes me as something where an API change may not be necessary.

ChrisA

Terry Reedy · Sep 5, 2012

I skimmed the bug report comments but didn't find an answer to this:
Why not just monkey-patch?

As far as I know, one can only use normal Python code to monkey patch
modules written in Python. Even then, one can only rebind names stored
in writable dicts -- the module dict and class attribute dicts. The
attributes of function code objects are readonly. Replacing a code
object is not for the faint of heart.
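These limits can be illustrated in general terms; the sketch below says nothing about the internals of io itself. The module `demo_mod` and its functions are made up for the example.

```python
import types

# A module written in Python: its namespace is an ordinary writable dict.
mod = types.ModuleType("demo_mod")
exec(
    "def helper():\n"
    "    return 'original'\n"
    "def api():\n"
    "    return helper()\n",
    mod.__dict__,
)

# Rebinding a module-level name works, and api() picks up the new helper.
mod.helper = lambda: "patched"
patched_result = mod.api()

# Built-in types implemented in C refuse attribute assignment.
try:
    str.lower = None
except TypeError:
    builtin_type_patchable = False
else:
    builtin_type_patchable = True

# Attributes of a code object are read-only.
try:
    mod.api.__code__.co_name = "other"
except AttributeError:
    code_attr_writable = False
else:
    code_attr_writable = True
```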

io.py mostly loads _io compiled from C.

Antoine Pitrou · Sep 6, 2012

Chris Angelico said:
I skimmed the bug report comments but didn't find an answer to this:
Why not just monkey-patch? When a module function calls on a support
function and you want to change that support function's behaviour,
isn't monkey-patching the usual approach?

Monkey-patching globals is not thread-safe: other threads will see your
modification, which is risky and fragile.

Regards

Antoine.

Steven D'Aprano · Sep 6, 2012

Monkey-patching globals is not thread-safe: other threads will see your
modification, which is risky and fragile.

Isn't that assuming that you don't intend the other threads to see the
modification?

If I have two functions in my module that call "open", and I monkey-patch
the global (module-level) name "open" to intercept that call, I don't see
that there is more risk of breakage just because one function is called
from a thread.

Obviously monkey-patching the builtin module itself is much riskier,
because it doesn't just affect code in my module, it affects *everything*.
