"Don't rebind built-in names*" - it confuses readers

Terry Jan Reedy

Many long-time posters have advised "Don't rebind built-in names*".

* Unless you really mean to mask it, or more likely wrap it, such as
wrapping print to modify some aspect of its operation that one cannot
change with its keyword parameters. The point for this post is that such
wrappers modify or extend the basic meaning of the builtin, but do not
abolish it.

Reasons have been given in various related forms: 'my long experience
tells me it's bad', 'you may need the builtin later', 'you may forget
that you rebound the builtin', 'it can lead to subtle bugs', etc.

Leaving aside the code writer and code operation, I recently discovered
that it is not nice for readers, whether humans or programs.

For instance, open Lib/idlelib/GrepDialog.py in an editor that colorizes
Python syntax, such as Idle's editor, jump down to the bottom and read
up, and (until it is patched) find
    list.append(fn)
with 'list' colored as a builtin. Stop. That looks wrong. Called on the
builtin class, list.append needs two arguments: a list instance and an
object to append to the list. The 'solution' is in a previous line
    list = []
Reading further, one sees that the function works with two lists, a list
of file names, unfortunately called 'list', and a list of
subdirectories, more sensibly called 'subdirs'. I was initially confused
and reading the code still takes a small bit of extra mental energy.
Idle stays confused and will wrongly color the list instance name until
it is changed. Calling the file list 'fnames' or 'filenames' would have
been clearer to both me and Idle.
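
To make the two readings concrete, here is a minimal sketch. The function names collect_shadowed and collect_clear are made up for illustration and are not taken from GrepDialog.py; the point is only that the same text, list.append(...), means different things depending on whether 'list' has been rebound.

```python
import builtins

# Called on the builtin class, append is an ordinary function that
# needs the instance passed explicitly: two arguments.
target = []
builtins.list.append(target, "a.py")

def collect_shadowed(names):
    list = []              # rebinds 'list' locally; the builtin is hidden here
    for fn in names:
        list.append(fn)    # bound-method call on the instance: one argument
    return list

def collect_clear(names):
    filenames = []         # nothing shadowed; the name says what it holds
    for fn in names:
        filenames.append(fn)
    return filenames
```

Both functions behave identically at run time; only the second reads the same way to a human and to a highlighter.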
 
rusi

Pascal introduced the idea of block structure -- introduce a name at
one level, override it at a lower level. [OK, Algol introduced it; Pascal
popularized it.]
This has caused more trouble than it has solved. And so languages
nowadays tend to 'protect' against this feature.

Here is Erlang's 'sticky' feature
http://books.google.co.in/books?id=...w#v=onepage&q=erlang sticky directory&f=false
that prevents a programmer from overriding a builtin module unless he
explicitly asks for that.
 
Steven D'Aprano

For instance, open Lib/idlelib/GrepDialog.py in an editor that colorizes
Python syntax, such as Idle's editor, jump down to the bottom and read
up, and (until it is patched) find
list.append(fn)
with 'list' colored as a builtin. Stop. That looks wrong. List.append
needs two arguments: a list instance and an object to append to the
list. The 'solution' is in a previous line
list = []
Reading further, one sees that the function works with two lists, a list
of file names, unfortunately called 'list', and a list of
subdirectories, more sensibly called 'subdirs'.

Yes, that is a poor choice of names.

But sometimes you are dealing with a generic list, and calling it
"filenames" would be equally inappropriate :)

I was initially confused
and reading the code still takes a small bit of extra mental energy.
Idle stays confused and will wrongly color the list instance name until
it is changed. Calling the file list 'fnames' or 'filenames' would have
been clearer to both me and Idle.

Correct. The downside of editors that colourise text is that sometimes
they colourise it wrong. In this case, how is the editor supposed to know
that list no longer refers to the built-in list?

This is yet another good argument for being cautious about shadowing
built-ins.
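
A tool can at least notice the rebinding if it parses the code instead of just matching tokens. The following is a rough sketch, not how Idle's colorizer works: it uses the standard ast module to report places where a builtin name is assigned to or used as a parameter. The function name find_shadowed_builtins is my own, and the sketch ignores def, class, and import statements that also bind names.

```python
import ast
import builtins

BUILTIN_NAMES = set(dir(builtins))

def find_shadowed_builtins(source):
    """Return sorted (line, name) pairs where a builtin name is rebound."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        # Assignment targets, loop variables and similar binding sites.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if node.id in BUILTIN_NAMES:
                hits.add((node.lineno, node.id))
        # Function parameters.
        elif isinstance(node, ast.arg) and node.arg in BUILTIN_NAMES:
            hits.add((node.lineno, node.arg))
    return sorted(hits)

sample = "list = []\nfor fn in names:\n    list.append(fn)\n"
found = find_shadowed_builtins(sample)
```

Here found contains the single entry (1, 'list'): the assignment on line 1 is flagged, while the later use in list.append(fn) is only a read.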
 
Steven D'Aprano

Pascal introduced the idea of block structure -- introduce a name at one
level, override it at a lower level. [Ok ALgol introduced, Pascal
popularized].
This has caused more trouble than it has solved.

I take it you have never programmed in a programming language with a
single, flat, global namespace? :)

And so languages nowadays tend to 'protect' against this feature.

Apart from Erlang, got any other examples? Because it seems to me that in
languages with nested scopes or namespaces, shadowing higher levels is
exactly the right thing to do. Certainly it would be a PITA, and defeat
the purpose of having nested scopes, if inner names had to be globally
unique. Wouldn't it be absolutely horrible if adding a global variable
"foo"[1] suddenly meant that all your functions that used "foo" as a
local variable stopped working?




[1] For some value of "foo".
 
rusi

Pascal introduced the idea of block structure -- introduce a name at one
level, override it at a lower level. [Ok ALgol introduced, Pascal
popularized].
This has caused more trouble than it has solved.
And so languages nowadays tend to 'protect' against this feature.

Apart from Erlang, got any other examples? Because it seems to me that in
languages with nested scopes or namespaces, shadowing higher levels is
exactly the right thing to do.

This is just opening up the definition of block-structure and saying
it's a good thing.
Certainly it would be a PITA, and defeat
the purpose of having nested scopes, if inner names had to be globally
unique. Wouldn't it be absolutely horrible if adding a global variable
"foo"[1] suddenly meant that all your functions that used "foo" as a
local variable stopped working?

[1] For some value of "foo".

Your opinion.

Not so convincing if the sequence of composing the program was the
other-way-round:
if I have a global variable, say errno, and 'lose' it by introducing a
local variable errno.

And in fact for a reader of a program, the order of its writing should
not matter.
Which brings us pat into Terry's example. [Also notice that changing
from a 'parametric-semantic' name like foo to a more 'fixed-semantic'
name like 'errno' or 'list' changes the desirability of this feature.]
I take it you have never programmed in a programming language with a
single, flat, global namespace? :)

Well, I've used Basic and Assembler -- which are fun in the way that
childhood and mountaineering respectively are fun.

What it seems you are not getting about Erlang's outlook on
block-structure is this: there are two separable aspects to it:
1. Names can be created in local scopes which don't leak into (more)
global scopes -- a desirable feature

2. Names in local scopes can override names in global scope -- a
dangerous feature [BTW which is what this thread is about]. And
Erlang's approach seems to be the most nuanced -- you can do it if you
go out of your way to say: "unstick the global namespace".

This is somewhat analogous to gotos in Pascal. For Pascal goto was a
sufficiently undesirable feature that using it was not completely
easy. However if you did surely want it, you had to declare the goto
label.

Or by example:

def foo(x): ...
def bar(x, y): ...
There is no reason to confuse the two x's.

Whereas

x = ...
def foo(x): ...
Now there is!

The first should be encouraged, the second discouraged.
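
Spelled out as runnable Python, the two cases look like this. It is only a sketch of the distinction; baz is an extra function of mine, added to show where the global x is still visible.

```python
x = 10

def foo(x):
    # The parameter x hides the module-level x inside this body.
    return x + 1

def bar(x, y):
    # This x is unrelated to foo's x; each belongs to its own function.
    return x * y

def baz():
    # No local x here, so the name resolves to the module-level x.
    return x + 1
```

A reader of foo has to notice the parameter list to know that the x in its body is not the x defined at the top of the file.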
 
Chris Angelico

Apart from Erlang, got any other examples? Because it seems to me that in
languages with nested scopes or namespaces, shadowing higher levels is
exactly the right thing to do. Certainly it would be a PITA, and defeat
the purpose of having nested scopes, if inner names had to be globally
unique.

I agree, and it's one of the reasons that I like the explicitness of
C's variable declarations. Sure, Python makes it easier to write code;
but it's easier to figure out what's a global and what's not when
locals are all declared. (Yes, it's not that hard for a human to
recognize when a name is being rebound as opposed to merely used, but
it's extra work for a lexer/syntax highlighter.)

ChrisA
 
Steven D'Aprano

Certainly it would be a PITA, and defeat the purpose of having nested
scopes, if inner names had to be globally unique. Wouldn't it be
absolutely horrible if adding a global variable "foo"[1] suddenly meant
that all your functions that used "foo" as a local variable stopped
working?

[1] For some value of "foo".

Your opinion.

Well duh :)

Mind you, I don't hear very many people *defending* the idea that local
variables should be globally unique, or designing languages where this is
the case. So if it's just an opinion, it's an opinion shared by the
majority of programmers and language designers.

Not so convincing if the sequence of composing the program was the
other-way-round:
if I have a global variable, say errno, and 'lose' it by introducing a
local variable errno.

The consequences of inadvertent local-shadows-global are *much* less than
the other way around. Any harm is local to the one function.

If you've shadowed a global with a local, there are two possibilities:

- You intended to shadow the global, in which case, good for you. I'm not
going to pass judgement and say you mustn't do this, so long as you're
aware of what you're doing and have your reasons.

- You didn't intend to shadow the global, in which case you've just made
a bug, and you'll soon find out and fix it.

And in fact for a reader of a program, the order of its writing should
not matter.

It doesn't. However, edits to working code can make it become non-
working. That's part of the business of being a programmer. Consider two
scenarios:

- Add a local variable, and suddenly the function which you were editing
stops working? Painful, but business as usual. At least you know that the
bug exists within the function you just edited.

- Add a global variable, and suddenly dozens of functions all over the
place stop working? Or worse, only a small handful of functions stop
working, and you don't find out for a while. It's a lot harder to fix a
bug caused by a new global shadowing your local when you might not even
know that global exists.

Which brings us pat into Terry's example. [Also notice that changing
from a 'parametric-semantic' name like foo to a more 'fixed-semantic'
name like 'errno' or 'list' changes the desirability of this feature.

Absolutely! It makes the ability to shadow globals *more desirable*.

def myfunc(arg, list=list):
    do_this()
    do_that()
    return list(arg)


Now you have a nicely localised, safe, tame monkey-patch, without
compromising on the best name for "list".
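
For anyone who wants to try this, here is a self-contained variant. It drops the do_this() and do_that() placeholders, and sorted_list is a made-up replacement used only to show the override; treat it as a sketch of the idea rather than a recommended pattern.

```python
def myfunc(arg, list=list):
    # 'list' is a parameter; its default is the builtin, captured when
    # the def statement runs.
    return list(arg)

def sorted_list(iterable):
    # Hypothetical stand-in that builds a sorted list instead.
    return sorted(iterable)

plain = myfunc("cab")                       # uses the builtin list
patched = myfunc("cab", list=sorted_list)   # override for this call only
```

The override affects only the call that passes it; the builtin list is untouched everywhere else.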

Well Ive used Basic and Assembler -- which are fun in the way that
childhood and mountaineering respectively are fun.

What it seems you are not getting about Erlang's outlook about block-
structure is this: There are two separable aspects to it: 1. Names can
be created in local scopes which dont leak into (more) global scopes --
a desirable feature

I can see that it is desirable, although I don't know how this works in
practice in Erlang. If you have a global x, and a local x, how do you
refer to them both?

x = x

Which one is which?

2. Names in local scopes can override names in global scope -- a
dangerous feature [BTW which is what this thread is about]. And
Erlang's approach seems to be the most nuanced -- you can do it if you
go out of your way to say: "unstick the global namespace".

I can see that this is also desirable, especially in a more "bondage and
discipline" language that makes you ask permission before doing anything.
I don't think it is desirable *in Python*, which is a lot more
laissez-faire.

This is somewhat analogous to gotos in Pascal. For Pascal goto was a
sufficiently undesirable feature that using it was not completely easy.
However if you did surely want it, you had to declare the goto label.

Or by example:

def foo(x)...
def bar(x,y)...
there is no reason to confuse the two xes.

Whereas

x = ...
def foo(x)...
Now there is!

The first should be encouraged, the second discouraged.

Discouraging it means telling people that every time they need a local
variable, they should consider the entire global environment before
choosing a name. I call that bogus. Why shouldn't I call a local variable
"id" if that's the best name for it, just because there's a global "id"
that hardly anyone ever uses? If there's a global "x", and my function
doesn't use it, why shouldn't it reuse "x" for a local if "x" is the best
name in context?

Shadowing has both uses and abuses, pros and cons, and there's no doubt
that it can be confusing to beginners. There are arguments against it,
and I agree with them. But there are also arguments in favour, and I
agree with them too. A good programmer[1] will weigh up the pros and cons
of "use the most readable, descriptive name for the variable" versus
"shadow a global or built-in with the same name" and decide on the merits
of the specific case in question -- should I use a less-appropriate name
("mylist", blah) in the interest of not confusing some readers, or the
right name but risk shadowing a name in a higher scope?

Python takes a very hands-off approach to this. Other languages are more
in-your-face. There is room in the world for both philosophies.



[1] In my opinion of good *wink*
 
Terry Jan Reedy

Correct. The downside of editors that colourise text is that sometimes
they colourise it wrong. In this case, how is the editor supposed to know
that list no longer refers to the built-in list?

This is yet another good argument for being cautious about shadowing
built-ins.

After posting I remembered that there are also colorized text blocks on
web pages. Each person will have to decide for themselves whether the
convenience of reusing a builtin name is worth having their code
mis-colorized. As a reader, I decided that it is not.

tjr
 
Chris Angelico

Or by example:

def foo(x)...
def bar(x,y)...
there is no reason to confuse the two xes.

Whereas

x = ...
def foo(x)...
Now there is!

The first should be encouraged, the second discouraged.

Again, there can be good reason for it, such as snapshotting globals:

qwer=123
def asdf(qwer=qwer):
    print("qwer",qwer)

asdf()
qwer=234
asdf()

Done for performance (avoiding lookups), could also be done for
stability (as depicted here) though I've never seen it needed for
that.

ChrisA
 
Rick Johnson

For instance, open Lib/idlelib/GrepDialog.py in an editor that colorizes
Python syntax, such as Idle's editor, jump down to the bottom and read
up, and (until it is patched) find
list.append(fn)
with 'list' colored as a builtin. Stop. That looks wrong. List.append
needs two arguments: a list instance and an object to append to the
list. The 'solution' is in a previous line
list = []
Reading further, one sees that the function works with two lists, a list
of file names, unfortunately called 'list', and a list of
subdirectories, more sensibly called 'subdirs'.

Yes, that is a poor choice of names. But sometimes you are
dealing with a generic list, and calling it "filenames"
would be equally inappropriate :)

I agree, however hopefully you're joking, because in the past you've argued that programmers should never use variables as generic as "list", "string", "integer", "float", etc... even though there are instances when all you need to know is what type you're working with.

Correct. The downside of editors that colourise text is
that sometimes they colourise it wrong.

One of the most important side-effects of using an editor with colorizing capabilities is to show you that you're using a built-in or keyword as a variable! I love when people comment on subjects they have no direct experience with, like for instance, you commenting on colonizers or GUI's -- LOL!

In this case, how is the editor supposed to know that list
no longer refers to the built-in list?

Colonizers should ALWAYS colorize built-in symbols (as built-in symbols) EXCEPT when that symbol is part of a string or comment.

This is yet another good argument for being cautious about
shadowing built-ins.

In a language designed like Python, yes. Unfortunately Python not only decided to expose built-in functions for constructing types instead of class identifiers, they also stole the best generic names!

Sometimes all you need to know is the type of an object, not what it contains. I remember someone *cough-steven* talking about duck typing and how great it was to just treat a duck like a duck. Well, here we find ourselves treating a list like a list and you're taking the opposite argument... why am I not surprised?

PS: Is that "D" in last name short for "DevilsAdvocate"? Steven "DevilsAdvocate" Prano.
 
Serhiy Storchaka

11.06.13 06:02, Steven D'Aprano wrote:
Apart from Erlang, got any other examples?

C#? At least a local variable can't shadow another local variable in an
outer scope (which looks reasonable). I'm not sure about globals and
instance fields.
 
Steven D'Aprano

I agree, however hopefully you're joking, because in the past you've
argued that programmers should never use variables as generic as "list",
"string", "integer", "float", etc... even though there are instances
when all you need to know is what type you're working with.

Do you have a reference for me saying that one should NEVER use generic
names? That doesn't sound like something I would say. Sometimes you're
writing a generic function that operates on a generic variable in a
generic fashion, so of course you should use a generic name.

One of the most important side-effects of using an editor with
colorizing capabilities is to show you that you're using a built-in or
keyword as a variable!

I wouldn't exactly call it a "side-effect", since distinguishing tokens
in your source code by category is the whole purpose of colouring source
code in the first place.

I love when people comment on subjects they have
no direct experience with, like for instance, you commenting on
colonizers or GUI's -- LOL!

I must admit I have no experience with colonizers, although of course I
do have a colon of my very own. It works away absorbing water and
nutrients without my active supervision.
 
Chris Angelico

PS: Is that "D" in last name short for "DevilsAdvocate"? Steven "DevilsAdvocate" Prano.

I don't think so. Somehow it seems unlikely that he'll argue for you.

ChrisA
 
Mark Janssen

This has caused more trouble than it has solved.
I take it you have never programmed in a programming language with a
single, flat, global namespace? :)

Hey, the purpose of a programming language (i.e. a language which has a
consistent lexical specification) is to provide some modicum of
structure. Yes, that implies that you're implicitly following a
language designer's tacit philosophy (their "ObjectArchitecture") for
relating data to computers, but that's fine. People always have the
option of going back to assembly and starting over.
Apart from Erlang, got any other examples? Because it seems to me that in
languages with nested scopes or namespaces, shadowing higher levels is
exactly the right thing to do.
Really?
int="five"
[int(i) for i in ["1","2","3"]]
TypeError: str is not callable

Now how are you going to get the original int type back?
Certainly it would be a PITA, and defeat
the purpose of having nested scopes, if inner names had to be globally
unique. Wouldn't it be absolutely horrible if adding a global variable
"foo"[1] suddenly meant that all your functions that used "foo" as a
local variable stopped working?

Not necessarily, but this is what I'm talking about in defining an
ObjectArchitecture (or in some circles a "type system").
 
Chris Angelico

Ethan Furman

Steven said:
Apart from Erlang, got any other examples? Because it seems to me that in
languages with nested scopes or namespaces, shadowing higher levels is
exactly the right thing to do.

Really?

--> int="five"
--> [int(i) for i in ["1","2","3"]]
TypeError: str is not callable

Now how are you going to get the original int type back?

--> del int

Mark Janssen*, you would increase your credibility if you actually *learned* Python.
 
Chris Angelico

Magic. :)

Traceback (most recent call last):

5

Not sure of the magic necessary in Python 3. This is definitely
something you don't want to make a habit of...

Same thing works but with a different name:

from builtins import int as _int

ChrisA
 
Steven D'Aprano

Apart from Erlang, got any other examples? Because it seems to me that
in languages with nested scopes or namespaces, shadowing higher levels
is exactly the right thing to do.
Really?
int="five"
[int(i) for i in ["1","2","3"]]
TypeError: str is not callable

Yes, really. Not for the example shown above, of course, that's pretty
useless. But one might define a custom int() function, or more common,
you want to define a local variable x without caring whether or not there
is a global variable x.

If you, the programmer, have a good reason for re-defining int as the
string "five", then presumably you *wanted* to get that TypeError. If
not, then it's simply a bug, like any other bug that you get when you
use the wrong name:

x = 23 # I want x to equal 23, always and forever.
x = 42 # I don't actually want to rebind x, but I can't help myself.
assert x == 23 # This now fails, because I am an idiot.

Should we conclude that, because somebody might accidentally assign a
value to a name without considering the consequences, that assigning
values to names should be forbidden? No, of course not. The solution is
to think before you code, or fix the bug afterwards.

Shadowing builtins is confusing to newbies, I get that. But anyone with
even a modicum of experience will be able to deal with such errors
trivially. If you (generic you) cannot work out what is going on, then
you're not a Python programmer. You're a Python dabbler.

Now how are you going to get the original int type back?

Trivial. Here are three ways:

py> int = "five"
py> int
'five'
py> del int
py> int("42")
42


Or:

py> int = "five"
py> int
'five'
py> type(5)("42")
42


Or:

py> int = "five"
py> import builtins # Use __builtin__ in Python 2.
py> builtins.int("42")
42
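
The same recoveries can be sketched as a script rather than an interactive session. This is only an illustration; recover_int is a made-up name. Here the shadowing happens inside a function, so "del int" is not needed: the shadow disappears when the function returns. At the interactive prompt or at module level, "del int" removes the global binding in the same way and uncovers the builtin.

```python
import builtins

def recover_int():
    int = "five"                       # local shadow; the builtin is untouched
    via_type = type(5)("42")           # the class of an int literal is still int
    via_module = builtins.int("42")    # explicit lookup in the builtins module
    return int, via_type, via_module

shadow, a, b = recover_int()
restored = int("42")                   # outside the function, int is the builtin
```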
 
Steven D'Aprano

int="five"
[__builtins__.int(i) for i in ["1","2","3"]]

Don't use __builtins__, it's an implementation detail.

In Python 2.x, there is __builtins__ with an "s" in the global namespace
if you are running CPython, but not necessarily other implementations.
There is __builtin__ with no "s" which is defined by the language, but
you have to import it first.

In Python 3.x, you just import builtins with an "s" and no underscores,
no matter what implementation you use.

It's shadowed, not overwritten.

But even if you override it, you can get it back:


py> import builtins
py> builtins.int = "five" # My monkey has a patch.
py> int("42") # Oh-oh, trouble ahead
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not callable
py> builtins.int = type(5)
py> int("42")
42


It may not be quite so simple to recover from *all* such monkey-patches,
but you can always exit Python, edit your code, and start it up again.
It's not like you've patched the actual compiler.
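
If you really must patch the builtins module, a somewhat safer shape is to save the original first and restore it in a finally clause. This is a sketch only; the names original_int and broken are mine.

```python
import builtins

original_int = builtins.int      # keep a reference before patching

def broken():
    return int("42")             # looked up in globals, then in builtins

builtins.int = "five"            # patches the name for every module
try:
    broken()
    failed = False
except TypeError:                # 'str' object is not callable
    failed = True
finally:
    builtins.int = original_int  # always put the real type back

recovered = broken()
```

Saving the original avoids relying on tricks like type(5) to find it again, and the finally clause restores it even if something unexpected goes wrong in between.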
 
