Python's "only one way to do it" philosophy isn't good?

Lenard Lindstrom

Douglas said:
No problem:

[...]
class MyFile(file):
    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is not None:
            self.my_last_posn = self.tell()
        return file.__exit__(self, exc_type, exc_val, exc_tb)
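(A hypothetical usage sketch, assuming Python 2.x, where file is a
builtin type; "data.txt" and process() are made-up stand-ins:)

from __future__ import with_statement  # needed in Python 2.5

with MyFile("data.txt") as f:
    process(f)  # if this raises, __exit__ records f.my_last_posn
                # (how far we got in the file) before closing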

I'm not sure I understand you here. You're saying that I should have
the foresight to wrap all my file opens in a special class to
facilitate debugging?

If so, (1) I don't have that much foresight and don't want to have
to. (2) I debug code that other people have written, and they often
have less foresight than me. (3) It would make my code less clear to
have every file open wrapped in some special class.

Obviously you had the foresight to realize with statements could
compromise debugging. I never considered it myself. I don't know the
specifics of your particular project, what the requirements are. So I
can't claim this is the right solution for you. But the option is available.

Or are you suggesting that early in __main__.main(), when I wish to
debug something, I do something like:

__builtins__.open = __builtins__.file = MyFile

?

I suppose that would work.

No, I would never suggest replacing a builtin like that. Even replacing
a definite hook like __import__ is risky, should more than one package
try and do it in a program.

I'd still prefer to clear exceptions,
though, in those few cases in which a function has caught an exception,
isn't going to be returning soon, and would otherwise keep the
resources alive in the traceback. To me, that's the more elegant and
general solution.

As long as the code isn't dependent on explicitly cleared exceptions.
But if it is, I assume it is well documented.
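(For concreteness, a minimal sketch of the exception-clearing idiom
under discussion; Python 2 only, since sys.exc_clear() was removed in
Python 3, and do_work()/report_error() are hypothetical stand-ins:)

import sys

def main_loop():
    while True:
        try:
            do_work()       # stand-in for real work
        except Exception:
            report_error()  # stand-in for error handling
            # Drop the saved exception state so the traceback -- and
            # any open files its frames still reference -- can be
            # reclaimed before the next long-running iteration.
            sys.exc_clear()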
 

Douglas Alan

Obviously you had the foresight to realize with statements could
compromise debugging. I never considered it myself.

It's not really so much a matter of having foresight, as much as
having had experience debugging a fair amount of code. And, at times,
having benefited from the traditional idiomatic way of coding in
Python, where files are not explicitly closed.
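(That traditional idiom, for concreteness; the filename is made up:)

data = open("config.txt").read()
# No explicit close: in CPython the file's refcount hits zero as soon
# as read() returns, so the descriptor is released right away.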

Since there are benefits with the typical coding style, and I find
there to be no significant downside, other than if, perhaps some code
holds onto tracebacks, I suggest that the problem be idiomatically
addressed in the *few* code locations that hold onto tracebacks,
rather than in all the *myriad* code locations that open and close
files.

No, I would never suggest replacing a builtin like that. Even
replacing a definite hook like __import__ is risky, should more than
one package try and do it in a program.

That misinterpretation of your idea would only be reasonable while
actually debugging, not for standard execution. Standard rules of
coding elegance don't apply while debugging, so I think the
misinterpretation might be a reasonable alternative. Still, I think
I'd just prefer to stick to the status quo in this regard.

As long as the code isn't dependent on explicitly cleared
exceptions. But if it is I assume it is well documented.

Typically the resource in question is an open file. These usually
don't have to be closed in a particularly timely fashion. If, for
some reason, a file absolutely needs to be closed rapidly, then it's
probably best to use "with" in such a case. Otherwise, I vote for the
de facto standard idiom of relying on the refcounter, along with
explicitly clearing exceptions in the situations we've previously
discussed.

If some code doesn't explicitly clear an exception, though, and holds
onto the most recent one while running in a loop (or what have
you), in the cases we are considering, it hardly seems like the end of
the world. It will just take a little bit longer for a single file to
be closed than might ideally be desired. But this lack of ideal
behavior is usually not going to cause much trouble.

|>oug
 

Douglas Alan

That's the point. Python takes care of clearing the traceback. Calls
to exc_clear are rarely seen.

But that's probably because it's very rare to catch an exception and
then not return quickly. Typically, the only place this would happen
is in main(), or one of its helpers.

If they are simply a performance tweak then it's not an issue *. I
was just concerned that the calls were necessary to keep resources
from being exhausted.

Well, if you catch an exception and don't return quickly, you have to
consider not only the possibility that there could be some open files
left in the traceback, but also that there could be large and now
useless data structures stored in the traceback.

Some people here have been arguing that all code should use "with" to
ensure that the files are closed. But this still wouldn't solve the
problem of the large data structures being left around for an
arbitrary amount of time.
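(A minimal sketch of that situation; the names are hypothetical:)

def crunch():
    big = range(10 ** 7)  # a large temporary list (Python 2)
    raise RuntimeError("crunch failed")

try:
    crunch()
except RuntimeError:
    # The saved traceback references crunch()'s frame, which still
    # references `big`, so the whole list stays alive for as long as
    # this handler keeps running -- and "with" wouldn't change that.
    handle_error_slowly()  # stand-in for a long-running handler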
But some things will make it into ISO Python.

Is there a movement afoot of which I'm unaware to make an ISO standard
for Python?

Just as long as you have weighed the benefits against a future move
to a JIT-accelerated, continuation supporting PyPy interpreter that
might not use reference counting.

I'll worry about that day when it happens, since many of my calls to
the standard library will probably break anyway at that point. Not to
mention that I don't stay within the confines of Python 2.2, which is
where Jython currently is. (E.g., Jython does not have generators.)
Etc.

Yet improved performance appeared to be a priority in Python 2.4
development, and Python's speed continues to be a concern.

I don't think the refcounting semantics should slow Python down much
considering that it never has aimed for C-level performance anyway.
(Some people claim it's a drag on supporting threads. I'm skeptical,
though.) I can see it being a drag on something like Jython, though,
where you are going through a number of different layers to get from
Jython code to the hardware.

Also, I imagine that no one wants to put in the work in Jython to have
a refcounter when the garbage collector comes with the JVM for free.

|>oug
 

Lenard Lindstrom

Douglas said:
That misinterpretation of your idea would only be reasonable while
actually debugging, not for standard execution. Standard rules of
coding elegance don't apply while debugging, so I think the
misinterpretation might be a reasonable alternative. Still I think
I'd just prefer to stick to the status quo in this regard.

I totally missed the "when I wish to debug something". Skimming when I
should be reading.
 

Lenard Lindstrom

Douglas said:
Is there a movement afoot of which I'm unaware to make an ISO standard
for Python?

Not that I know of. But it would seem that any language that lasts long
enough becomes an ISO standard. I just meant that even though Python
has no formal standard, there are documented promises that will not be
broken lightly by any implementation.
 

Jorgen Grahn

I'm not convinced that Python has *any* semantics at all outside of
specific implementations. It has never been standardized to the rigor
of your typical barely-readable language standards document.



Yes, I have no interest at the moment in trying to make my code
portable between every possible implementation of Python, since I have
no idea what features such implementations may or may not support.
When I code in Python, I'm coding for CPython. In the future, I may
do some stuff in Jython, but I wouldn't call it "Python" -- I'd call
it "Jython".

Yeah, especially since Jython is currently (according to the Wikipedia
article) an implementation of Python 2.2 ... not even *I* use
versions that are that old these days!

[I have, for a long time, been meaning to post here about refcounting
and relying on CPython's __del__ semantics, but I've never had the
energy to write clearly or handle the inevitable flame war. So I'll
just note that my view on this seems similar to Doug's.]

/Jorgen
 

Chris Mellon

Well, if you catch an exception and don't return quickly, you have to
consider not only the possibility that there could be some open files
left in the traceback, but also that there could be large and now
useless data structures stored in the traceback.

Some people here have been arguing that all code should use "with" to
ensure that the files are closed. But this still wouldn't solve the
problem of the large data structures being left around for an
arbitrary amount of time.

I don't think anyone has suggested that. Let me be clear about *my*
position: When you need to ensure that a file has been closed by a
certain time, you need to be explicit about it. When you don't care,
just that it will be closed "soonish," then relying on normal object
lifetimes is sufficient. This is true regardless of whether
object lifetimes are handled via refcount or via "true" garbage
collection. Relying on the specific semantics of refcounting to give
certain lifetimes is a logic error.

For example:

f = some_file() #maybe it's the file store for a database implementation
f.write('a bunch of stuff')
del f
#insert code that assumes f is closed.

This is the sort of code that I warn against writing.

f = some_file()
with f:
    f.write("a bunch of stuff")
#insert code that assumes f is closed, but correctly this time

is better.

On the other hand,
f = some_file()
f.write("a bunch of stuff")
#insert code that doesn't care about the state of f

is also fine. It *remains* fine no matter what kind of object lifetime
policy we have. The very worst case is that the file will never be
closed. However, this is exactly the sort of guarantee that GC can't
make, just as it can't ensure that you won't run out of memory. That's
a general case argument about refcounting semantics vs GC semantics,
and there are benefits and disadvantages to both sides.

What I am arguing against are explicit assumptions based on implicit
behaviors. Those are always fragile, and doubly so when the implicit
behavior isn't guaranteed (and, in fact, is explicitly *not*
guaranteed, as with refcounting semantics).
 

Falcolas

I don't think anyone has suggested that. Let me be clear about *my*
position: When you need to ensure that a file has been closed by a
certain time, you need to be explicit about it. When you don't care,
just that it will be closed "soonish," then relying on normal object
lifetimes is sufficient. This is true regardless of whether
object lifetimes are handled via refcount or via "true" garbage
collection. Relying on the specific semantics of refcounting to give
certain lifetimes is a logic error.

For example:

f = some_file() #maybe it's the file store for a database implementation
f.write('a bunch of stuff')
del f
#insert code that assumes f is closed.

This is the sort of code that I warn against writing.

f = some_file()
with f:
    f.write("a bunch of stuff")
#insert code that assumes f is closed, but correctly this time

is better.

This has raised a few questions in my mind. So, here's my newbie
question based off this.

Is this:

f = open(xyz)
f.write("wheee")
f.close()
# Assume file is closed properly.

as "safe" as your code:

f = some_file()
with f:
    f.write("a bunch of stuff")
#insert code that assumes f is closed, but correctly this time

Thanks!

G
 

John Nagle

We may need a guarantee that if you create a local object and
don't copy a strong reference to it to an outer scope, upon exit from
the scope, the object will be destroyed.

John Nagle
 

Douglas Alan

I don't think anyone has suggested that. Let me be clear about *my*
position: When you need to ensure that a file has been closed by a
certain time, you need to be explicit about it. When you don't care,
just that it will be closed "soonish," then relying on normal object
lifetimes is sufficient. This is true regardless of whether
object lifetimes are handled via refcount or via "true" garbage
collection.

But it's *not* true at all when relying only on a "true GC"! Your
program could easily run out of file descriptors if you only have a
real garbage collector and code this way (and are opening lots of
files). This is why destructors are useless in Java -- you can't rely
on them *ever* being called. In Python, however, destructors are
quite useful due to the refcounter.
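(A sketch of that failure mode; `filenames` is a hypothetical list:)

# Under CPython each file is closed as soon as `f` is rebound, because
# its refcount drops to zero.  Under a pure tracing GC every file stays
# open until a collection happens to run, so a long enough list can
# exhaust the process's file descriptor limit first.
for name in filenames:
    f = open(name)
    first_line = f.readline()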
Relying on the specific semantics of refcounting to give
certain lifetimes is a logic error.

For example:

f = some_file() #maybe it's the file store for a database implementation
f.write('a bunch of stuff')
del f
#insert code that assumes f is closed.

That's not a logic error if you are coding in CPython, though I agree
that in this particular case the explicit use of "with" would be
preferable due to its clarity.

|>oug
 

Lenard Lindstrom

Falcolas said:
This has raised a few questions in my mind. So, here's my newbie
question based off this.

Is this:

f = open(xyz)
f.write("wheee")
f.close()
# Assume file is closed properly.

This will not immediately close f if f.write raises an exception, since
the stack frames are kept alive by the traceback.

as "safe" as your code:

f = some_file()
with f:
    f.write("a bunch of stuff")
#insert code that assumes f is closed, but correctly this time

The with statement is designed to be safer. It contains an implicit
try/finally that lets the file close itself in case of an exception.
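(Roughly, the "with" version behaves like this hand-written equivalent;
a sketch only, since the full expansion in PEP 343 also covers the
general context manager protocol:)

f = some_file()
try:
    f.write("a bunch of stuff")
finally:
    f.close()  # runs whether or not write() raised
# code here may assume f is closed, even after an exception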
 

Chris Mellon

But it's *not* true at all when relying only on a "true GC"! Your
program could easily run out of file descriptors if you only have a
real garbage collector and code this way (and are opening lots of
files). This is why destructors are useless in Java -- you can't rely
on them *ever* being called. In Python, however, destructors are
quite useful due to the refcounter.

Sure, but that's part of the general refcounting vs GC argument -
refcounting gives (a certain level of) timeliness in resource
collection, GC often only runs under memory pressure. If you're saying
that we should keep refcounting because it provides better handling of
non-memory limited resources like file handles, I probably wouldn't
argue. But saying we should keep refcounting because people like to
and should write code that relies on implicit scope level object
destruction I very strongly argue against.

That's not a logic error if you are coding in CPython, though I agree
that in this particular case the explicit use of "with" would be
preferable due to its clarity.

I stand by my statement. I feel that writing code in this manner is
like writing C code that assumes uninitialized pointers are 0 -
regardless of whether it works, it's erroneous and bad practice at
best, and actively harmful at worst.
 

Douglas Alan

Chris Mellon said:
Sure, but that's part of the general refcounting vs GC argument -
refcounting gives (a certain level of) timeliness in resource
collection, GC often only runs under memory pressure. If you're
saying that we should keep refcounting because it provides better
handling of non-memory limited resources like file handles, I
probably wouldn't argue. But saying we should keep refcounting
because people like to and should write code that relies on implicit
scope level object destruction I very strongly argue against.

And why would you do that? People rely very heavily in C++ on when
destructors will be called, and they are in fact encouraged to do so.
They are, in fact, encouraged to do so *so* much that constructs like
"finally" and "with" have been rejected by the C++ BDFL. Instead, you
are told to use smart pointers, or what have you, to clean up your
allocated resources.

I see no reason not to make Python at least as expressive a programming
language as C++.

I stand by my statement. I feel that writing code in this manner is
like writing C code that assumes uninitialized pointers are 0 -
regardless of whether it works, it's erroneous and bad practice at
best, and actively harmful at worst.

That's a poor analogy. C doesn't guarantee that pointers will be
initialized to 0, and in fact, they typically are not. CPython, on
the other hand, guarantees that the refcounter behaves a certain
way.

There are languages other than C that guarantee that values are
initialized in certain ways. Are you going to also assert that in
those languages you should not rely on the initialization rules?

|>oug
 

Chris Mellon

And why would you do that? People rely very heavily in C++ on when
destructors will be called, and they are in fact encouraged to do so.
They are, in fact, encouraged to do so *so* much that constructs like
"finally" and "with" have been rejected by the C++ BDFL. Instead, you
are told to use smart pointers, or what have you, to clean up your
allocated resources.

For the record, C++ doesn't have a BDFL. And yes, I know that it's
used all the time in C++ and is heavily encouraged. However, C++ has
totally different object semantics than Python, and there's no reason
to think that we should use it because a different language with
different rules does it. For one thing, Python doesn't have the
concept of stack objects that C++ does.

I see no reason not to make Python at least as expressive a programming
language as C++.

I have an overwhelming urge to say something vulgar here. I'm going to
restrain myself and point out that this isn't a discussion about
expressiveness.

That's a poor analogy. C doesn't guarantee that pointers will be
initialized to 0, and in fact, they typically are not. CPython, on
the other hand, guarantees that the refcounter behaves a certain
way.

It's a perfect analogy, because the value of an uninitialized pointer
in C is *implementation dependent*. The standard gives you no guidance
one way or the other, and an implementation is free to assign any
value it wants. Including 0, and it's not uncommon for implementations
to do so, at least in certain configurations.

The Python language reference explicitly does *not* guarantee the
behavior of the refcounter. By relying on it, you are relying on an
implementation specific, non-specified behavior. Exactly like you'd be
doing if you rely on the value of uninitialized variables in C.

There are languages other than C that guarantee that values are
initialized in certain ways. Are you going to also assert that in
those languages you should not rely on the initialization rules?

Of course not. Because they *do* guarantee and specify that. C
doesn't, and neither does Python.
 

Douglas Alan

For the record, C++ doesn't have a BDFL.

Yes, I know.

http://dictionary.reference.com/browse/analogy

And yes, I know that it's used all the time in C++ and is heavily
encouraged. However, C++ has totally different object semantics than
Python,

That would depend on how you program in C++. If you use a framework
based on refcounted smart pointers, then it is rather similar.
Especially if you back that up in your application with a conservative
garbage collector, or what have you.

I have an overwhelming urge to say something vulgar here. I'm going
to restrain myself and point out that this isn't a discussion about
expressiveness.

Says who?

It's a perfect analogy, because the value of an uninitialized pointer
in C is *implementation dependent*.

Name one common C implementation that guarantees that uninitialized
pointers will be initialized to null. None that I have *ever* used
make such a guarantee. In fact, uninitialized values have always been
garbage with every C compiler I have ever used.

If gcc guaranteed that uninitialized variables would always be zeroed,
and you knew that your code would always be compiled with gcc, then
you would be perfectly justified in coding in a style that assumed
null values for uninitialized variables. Those are some big if's,
though.

The Python language reference explicitly does *not* guarantee the
behavior of the refcounter.

Are you suggesting that it is likely to change? If so, I think you
will find a huge uproar about it.

By relying on it, you are relying on an implementation specific,
non-specified behavior.

I'm relying on a feature that has worked fine since the early '90s,
and if it is ever changed in the future, I'm sure that plenty of other
language changes will come along with it that will make adapting code
that relies on this feature to be the least of my porting worries.

Exactly like you'd be doing if you rely on the value of
uninitialized variables in C.

Exactly like I'd be doing if I made Unix system calls in my C code.
After all, system calls are implementation dependent, aren't they?
That doesn't mean that I don't rely on them every day.

Of course not. Because they *do* guarantee and specify that. C
doesn't, and neither does Python.

CPython does by tradition *and* by popular will.

Also the language reference manual specifically indicates that CPython
uses a refcounter and documents that it collects objects as soon as
they become unreachable (with the appropriate caveats about circular
references, tracing, debugging, and stored tracebacks).
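(For instance, the circular-reference caveat in miniature, on CPython;
the Node class is a made-up example:)

import gc

class Node(object):
    pass

a = Node()
b = Node()
a.other = b
b.other = a   # a reference cycle
del a, b      # refcounts never reach zero here...
gc.collect()  # ...so the cyclic collector must reclaim the pair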

|>oug
 

Steve Holden

Douglas said:
Are you suggesting that it is likely to change? If so, I think you
will find a huge uproar about it.


I'm relying on a feature that has worked fine since the early '90s,
and if it is ever changed in the future, I'm sure that plenty of other
language changes will come along with it that will make adapting code
that relies on this feature to be the least of my porting worries.

Damn, it seems to be broken on my Jython/IronPython installations, maybe
I should complain. Oh no, I can't, because it *isn't* *part* *of* *the*
*language*. ...

Exactly like I'd be doing if I made Unix system calls in my C code.
After all, system calls are implementation dependent, aren't they?
That doesn't mean that I don't rely on them every day.

That depends on whether you program to a specific standard or not.

CPython does by tradition *and* by popular will.

But you make the mistake of assuming that Python is CPython, which it isn't.

Also the language reference manual specifically indicates that CPython
uses a refcounter and documents that it collects objects as soon as
they become unreachable (with the appropriate caveats about circular
references, tracing, debugging, and stored tracebacks).

Indeed, but that *is* implementation dependent. As long as you stick to
CPython you'll be fine. That's allowed. Just be careful about the
discussions you get into :)

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
 

Douglas Alan

Damn, it seems to be broken on my Jython/IronPython installations,
maybe I should complain. Oh no, I can't, because it *isn't* *part*
*of* *the* *language*. ...

As I have mentioned *many* times, I'm coding in CPython 2.5, and I
typically make extensive use of Unix-specific calls. Consequently, I
have absolutely no interest in making my code compatible with Jython
or IronPython, since Jython is stuck at 2.2, IronPython at 2.4, and
neither provides full support for the Python Standard Library or access
to Unix-specific functionality.

I might at some point want to write some Jython code to make use of
Java libraries, but when I code in Jython, I will have absolutely no
interest in trying to make that code compatible with CPython, since
that cannot be if my Jython code calls libraries that are not
available to CPython.

That depends on whether you program to a specific standard or not.

What standard would that be? Posix is too restrictive.
BSD/OSX/Linux/Solaris are all different. I make my program work on
the platform I'm writing it for (keeping in mind what platforms I
might want to port to in the future, in order to avoid obvious
portability pitfalls), and then if the program needs to be ported
eventually to other platforms, I figure out how to do that when the
time comes.

But you make the mistake of assuming that Python is CPython, which it isn't.

I do not make that mistake. I refer to CPython as "Python" as does
99% of the Python community. When I talk about Jython, I call it
"Jython" and when I talk about "IronPython" I refer to it as
"IronPython". None of this implies that I don't understand that
CPython has features in it that a more strict interpretation of the
word "Python" doesn't necessarily have, just as when I call a tomato a
"vegetable" that doesn't mean that I don't understand that it is
really a fruit.

Indeed, but that *is* implementation dependent. As long as you stick
to CPython you'll be fine. That's allowed. Just be careful about the
discussions you get into :)

I've stated over and over again that all I typically care about is
CPython, and what I'm criticized for is for my choice to program for
CPython, rather than for a more generalized least-common-denominator
"Python".

When I program for C++, I also program for the compilers and OS'es
that I will be using, as trying to write C++ code that will compile
under all C++ compilers and OS'es is an utterly losing proposition.

|>oug
 
