Scope rule peculiarities


Antoon Pardon

On 2004-05-12, someone wrote:
Why wouldn't you just return the second value (or a copy of it) as a result
from the function? That is usually a more flexible choice since it gives
the caller the option of either replacing the original value or using the
modified value somewhere different.

Well because the object could be bound with more than one name.
Anyway, if you really need to do this then the mutable object should have
some sort of updateState method which takes the second object as a
parameter. That way the original object can have control over which
attributes get overwritten and which don't.

And I think it is a pain in the butt to always have to write such an
updateState method whenever you want one object to be copied in place
from another.
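For reference, a minimal sketch of what such an updateState method might look like (the class and attribute names here are invented for illustration):

class Settings:
    def __init__(self, host='localhost', port=80, secret=None):
        self.host = host
        self.port = port
        self.secret = secret

    def updateState(self, other):
        # the receiving object decides which attributes may be overwritten
        self.host = other.host
        self.port = other.port
        # 'secret' is deliberately left alone

old = Settings(secret='keep me')
new = Settings(host='example.org', port=8080)
old.updateState(new)   # 'old' is modified in place; any other name bound to it sees the change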
 

Donn Cave

Antoon Pardon said:
Well because the object could be bound with more than one name.


And I think it is a pain in the butt to always have to write such an
updateState method whenever you want one object to be copied in place
from another.

I really think this is a case where you will write better
programs if you let yourself be guided by what you can do
in idiomatic Python.

In modern programming, now that we have pretty much adopted
the "structured" programming model, one of the main remaining
burdens on the programmer is "mutable state". You need to
be able to account for possible changes in values, if you
want to be able to reason about your program. Python isn't
any kind of extreme solution to this problem, it isn't a
"pure" functional language, but it does have some helpful
constraints and this is one of them - you know a function
can't rebind names that you supply as its parameters. The
inability to do this is a feature.
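A minimal sketch of that constraint (the names here are invented for illustration):

def rebind(x):
    x = ['rebound']   # rebinds only the local name 'x' inside the function
    return x

a = [1, 2, 3]
rebind(a)
print(a)              # prints [1, 2, 3]; the caller's name 'a' still refers to the original list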

Donn Cave, (e-mail address removed)
 

Greg Ewing

Antoon said:
But it creates confusion because the semantics
are not consistent.

Practicality beats purity, though, and this does seem
to be very practical.

Besides, the semantics are as consistent as anything
else in Python, where the objects being operated on
get to determine the meaning of just about everything.
 

Antoon Pardon

On 2004-05-12, Donn Cave said:
I really think this is a case where you will write better
programs if you let yourself be guided by what you can do
in idiomatic Python.

I don't know. My attitude is that I don't program in a language.
I program by outlining structures, functionality, algorithms,
etc., which are then translated
into a language. I judge languages by how much I have to fight
them in getting this translation. Of course the knowledge of
languages has an influence on how I outline these things,
but I don't do the outline with a particular language in
mind.
In modern programming, now that we have pretty much adopted
the "structured" programming model, one of the main remaining
burdens on the programmer is "mutable state". You need to
be able to account for possible changes in values, if you
want to be able to reason about your program. Python isn't
any kind of extreme solution to this problem, it isn't a
"pure" functional language, but it does have some helpful
constraints and this is one of them - you know a function
can't rebind names that you supply as its parameters. The
inability to do this is a feature.

But I don't ask for rebinding, I ask for easy in-place copying.

Let's name a new operator "@=". Now given variables A and C
of the same class, I would like to see the following:

B = A
A @= C
B == C
True

And maybe, if D is of another class:

D @= C
TypeError
 

Antoon Pardon

On 2004-05-13, Greg Ewing said:
Practicality beats purity, though, and this does seem
to be very practical.

Well one could drop the immutability of certain classes
instead. One would then lose the immutability purity
and gain the consistency purity. In a choice between
those two purities I would choose the latter.

But I don't expect python development to follow
my preference.
Besides, the semantics are as consistent as anything
else in Python, where the objects being operated on
get to determine the meaning of just about everything.

So? Would you argue that the developers could just
as easily have implemented "a += b" as equivalent to
"a = a - b" with half of the core classes and called that
just as consistent as choosing it equivalent to
"a = a + b" for all core classes because the objects
being operated on get to determine the meaning?
 

Andrew Bennetts

But I don't ask for rebinding, I ask for easy in-place copying.

Let's name a new operator "@=". Now given variables A and C
of the same class, I would like to see the following:

B = A
A @= C
B == C
True

It's already easy:

from copy import copy

b = a
a = copy(c)

[I'm using lowercase variables for instances, which is the usual
convention... uppercase suggests that it's a class name (or maybe a
constant) to me.]

No new syntax necessary.
And maybe, if D is of another class:

D @= C
TypeError

Assignments in Python don't care about what the name is currently bound to,
if anything. They just (re)bind the name to an object -- so having an
assignment to D depend on the type of D's old value would be inconsistent
with the rest of Python.

Many classes support a convention where you can pass a single argument to
their constructor to construct a copy, e.g.:

l2 = list(l1)

So your example could become:

d = d.__class__(c)

Although in practice it's hard to think of an example where I'd want to
dynamically care about the type of an object I'm about to discard like
that... I certainly can't think of a time when I've wanted to do that.

-Andrew.
 

Andrew Bennetts

On 2004-05-13, Greg Ewing wrote <[email protected]>: [...]
Besides, the semantics are as consistent as anything
else in Python, where the objects being operated on
get to determine the meaning of just about everything.

So? Would you argue that the developers could just
as easily have implemented "a += b" as equivalent to
"a = a - b" with half of the core classes and called that
just as consistent as choosing it equivalent to
"a = a + b" for all core classes because the objects
being operated on get to determine the meaning?

Well, classes get to do whatever they think makes sense: in the end it's all
calls to methods of the object, whether via an actual method call
"obj.foo()" or by an operator "obj * 2" (which calls obj.__mul__(2)) or an
augmented assignment "obj += 'x'" (which calls obj.__iadd__('x')). It's up
to the objects to make sense, not the Python language.

Even though Python chooses to have some immutable builtin types (like
ints and strings) for a variety of practical reasons, augmented assignment
does what people expect in all these cases:

s = 'abc'
s += 'd'

i = 7
i += 3

l = [1, 2, 3]
l += [4, 5, 6]

Augmented assignments are still assignments, and that makes perfect sense to
me -- each of those behaves exactly like its obvious longer version:

s = 'abc'
s = s + 'd'

i = 7
i = i + 3

l = [1, 2, 3]
l = l + [4, 5, 6]

(Yes, there are tricky examples that do behave differently -- but I try to
avoid tricky things, because they tend to be hard to read. I almost only
find I want to use augmented assignment on integers and occasionally
strings.)
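One such tricky case, shown as a minimal sketch: when two names share one mutable object, += mutates that object in place, while the longer form rebinds only one name.

a = b = [1, 2]
b += [3]          # list.__iadd__ extends the shared list in place
print(a)          # prints [1, 2, 3]

a = b = [1, 2]
b = b + [3]       # builds a new list and rebinds only 'b'
print(a)          # prints [1, 2]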

-Andrew.
 

Antoon Pardon

On 2004-05-13, Andrew Bennetts said:
It's already easy:

from copy import copy

b = a
a = copy(c)

Sorry, you are wrong. After this you get

b == c
False


[I'm using lowercase variables for instances, which is the usual
convention... uppercase suggests that it's a class name (or maybe a
constant) to me.]

No new syntax necessary.
And maybe, if D is of another class:

D @= C
TypeError

Assignments in Python don't care about what the name is currently bound to,
if anything.

This is not assignment (as it is understood in Python).
They just (re)bind the name to an object -- so having an
assignment to D depend on the type of D's old value would be inconsistent
with the rest of Python.

Many classes support a convention where you can pass a single argument to
their constructor to construct a copy, e.g.:

l2 = list(l1)

So your example could become:

d = d.__class__(c)

No, to get the same effect as what I want you need something like this:

class M:

    def cp_from(self, a):
        for key in dir(self):
            value = getattr(a, key)
            setattr(self, key, value)

d.cp_from(c)
 

Antoon Pardon

On 2004-05-13, Andrew Bennetts said:
On 2004-05-13, Greg Ewing wrote <[email protected]>: [...]
Besides, the semantics are as consistent as anything
else in Python, where the objects being operated on
get to determine the meaning of just about everything.

So? Would you argue that the developers could just
as easily have implemented "a += b" as equivalent to
"a = a - b" with half of the core classes and called that
just as consistent as choosing it equivalent to
"a = a + b" for all core classes because the objects
being operated on get to determine the meaning?

Well, classes get to do whatever they think makes sense: in the end it's all
calls to methods of the object, whether via an actual method call
"obj.foo()" or by an operator "obj * 2" (which calls obj.__mul__(2)) or an
augmented assignment "obj += 'x'" (which calls obj.__iadd__('x')). It's up
to the objects to make sense, not the Python language.

And how do you think objects can make sense where the language doesn't?
I don't want to imply the language doesn't make sense at all, but you
can't claim the responsibility is all for the objects if the language
or its core classes have deficiencies or are not consistent with each
other.
Even though Python chooses to have some immutable builtin types (like
ints and strings) for a variety of practical reasons, augmented assignment
does what people expect in all these cases:

s = 'abc'
s += 'd'

i = 7
i += 3

l = [1, 2, 3]
l += [4, 5, 6]

Augmented assignments are still assignments, and that makes perfect sense to
me -- each of those behaves exactly like its obvious longer version:

s = 'abc'
s = s + 'd'

i = 7
i = i + 3

l = [1, 2, 3]
l = l + [4, 5, 6]

(Yes, there are tricky examples that do behave differently -- but I try to
avoid tricky things, because they tend to be hard to read. I almost only
find I want to use augmented assignment on integers and occasionally
strings.)

But these things are only hard to read (to you, I don't find it so) because
you are using constants in the expression.

a += b, is just as hard to read whether a and b are lists or integers.

The problem is that because the behaviour with strings and lists is
different I can't write a program in which the behaviour of += is
consistent for all classes because there will always be core classes
for which the behaviour will be different.


It is a bit like having some classes use "+" for subtraction. In itself
that wouldn't be so bad, you just have to pay attention. But then
you want to write a class that will work with whatever number type
and it needs to do additions. Now suddenly things get difficult because
a + b doesn't behave consistently among the different number types.

You have the same kind of difficulty now if you want to write a
class/function/module that works with any kind of sequence.

Using the "+=" operator you can't guarantee any kind of consistency.
If I had code like this:

a = b
b += c

I would have no idea at all whether a was changed or not. So
writing code that can work with any sequence becomes a possible
problem spot if you use these kind of operators.
 

Andrew Bennetts

Sorry, you are wrong. After this you get

False

Oh, sorry, I misread -- as you can see, I assumed you meant that C was a
copy of A, when actually you wanted A to be a copy of C.

Now I understand what you mean about an "in-place copy".
This is not assignment (as is understood in python).

Indeed! But it *looks* like assignment -- it has an equals sign. In fact,
it's only one character different to Python's assignment. My point is that
the proposed syntax is confusing.
No, to get the same effect as what I want you need something like this:

class M:

    def cp_from(self, a):
        for key in dir(self):
            value = getattr(a, key)
            setattr(self, key, value)

d.cp_from(c)

You can do that already, without defining any new methods:

d.__dict__.update(c.__dict__)

[This won't work for the minority of types that don't have __dict__s on
their instances, but it does work for virtually all user-defined classes.]
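A minimal runnable sketch of that idiom (the class here is invented for illustration):

class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

c = Point(3, 4)
d = Point()
d.__dict__.update(c.__dict__)   # copies c's instance attributes onto d, in place
assert (d.x, d.y) == (3, 4)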

Assuming that d and c have the same set of instance variable names (which I
would expect most code following good style would do), I don't see why
modifying an existing instance of d to be a copy of c is any better than
simply making a copy from scratch with the clearer:

d = copy(c)

What's the big advantage to mutating an existing instance?

-Andrew.
 

Andrew Bennetts

And how do you think objects can make sense where the language doesn't?
I don't want to imply the language doesn't make sense at all, but you
can't claim the responsibility is all for the objects if the language
or its core classes have deficiencies or are not consistent with each
other.

I'm saying that in this case, the objects have a responsibility to make
sense, and this is necessarily true in any language that allows operator
overloading. If an object insists on always returning 7 regardless of what
is added to it, then there's not much hope that the '+' operator will do
what users expect.
Even though Python chooses to have some immutable builtin types (like
ints and strings) for a variety of practical reasons, augmented assignment
does what people expect in all these cases:

s = 'abc'
s += 'd'

i = 7
i += 3

l = [1, 2, 3]
l += [4, 5, 6]

Augmented assignments are still assignments, and that makes perfect sense to
me -- each of those behaves exactly like its obvious longer version:

s = 'abc'
s = s + 'd'

i = 7
i = i + 3

l = [1, 2, 3]
l = l + [4, 5, 6]

(Yes, there are tricky examples that do behave differently -- but I try to
avoid tricky things, because they tend to be hard to read. I almost only
find I want to use augmented assignment on integers and occasionally
strings.)

But these things are only hard to read (to you, I don't find it so) because
you are using constants in the expression.

I didn't say those examples are hard to read. Personally, I find both
forms equally readable.

I did say that there are cases where:

<lvalue> += x

does not behave the same way as:

<lvalue> = <lvalue> + x

But you need relatively tricky lvalues for this to be the case.
a += b, is just as hard to read whether a and b are lists or integers.

The problem is that because the behaviour with strings and lists is
different I can't write a program in which the behaviour of += is
consistent for all classes because there will always be core classes
for which the behaviour will be different.

It seems to me that your problem isn't really with "+=" as such, it's that
Python has immutable types at all, including in the commonly used builtins.
Immutability of strings, numbers and tuples has worked very well for Python
so far, so you'll have a struggle convincing many people that it should be
otherwise.

Personally, I like that when I do:

x = 7
func(x)

That I know that x will still be 7 after func returns, regardless of what
happens inside func. It's just not possible for func to accidentally change
the value of x in my scope, because I know that numbers are immutable in
Python.

It would take some *massive* benefits to convince me to change my mind.
It is a bit like having some classes use "+" for subtraction. In itself
that wouldn't be so bad, you just have to pay attention. But then
you want to write a class that will work with whatever number type
and it needs to do additions. Now suddenly things get difficult because
a + b doesn't behave consistently among the different number types.

Again, if the objects you're using don't "make sense", you're stuck with a
difficult life. If a library's API is hard to use, then chances are the
library won't get used much.
You have the same kind of difficulty now if you want to write a
class/function/module that works with any kind of sequence.

Using the "+=" operator you can't guarantee any kind of consistency.
If I had code like this:

a = b
b += c

I would have no idea at all whether a was changed or not. So
writing code that can work with any sequence becomes a possible
problem spot if you use these kind of operators.

Then take a copy of the sequence first, using list(seq) or tuple(seq), and
then you'll know exactly what you have. Or even simpler -- use: "b = b + c"
instead. Then you know that a is unmodified.

If you don't know what kind of objects your functions are working with, how
can you expect them to behave correctly? If your API assumes that a
particular method receives a tuple, then either make sure it does by calling
tuple(x), or document that it does in the docstring.

Otherwise, a user of your library code might legitimately try to pass a
list, a dictionary, None, or perhaps their own custom type, unaware of your
hidden assumptions.

In my experience, I've never had the issue of mutable vs. immutable sequence
types cause the sort of problems you're worried about. Perhaps I'm lucky,
or subconsciously careful, or it has happened and I'm just plain forgetful :)
... Regardless, I'm not going to start worrying about it now. Python is
working just fine for me despite this.

-Andrew.
 

Antoon Pardon

On 2004-05-13, Andrew Bennetts said:
Oh, sorry, I misread -- as you can see, I assumed you meant that C was a
copy of A, when actually you wanted A to be a copy of C.

Now I understand what you mean about an "in-place copy".


Indeed! But it *looks* like assignment -- it has an equals sign. In fact,
it's only one character different to Python's assignment. My point is that
the proposed syntax is confusing.

Well it is a kind of assignment. It is how assignment works in
some other languages. Now I find both kinds of assignment useful.
I don't like it when a language restricts me to one specific kind.

If you don't like the proposed syntax, I'm open to suggestions.
class M:

    def cp_from(self, a):
        for key in dir(self):
            value = getattr(a, key)
            setattr(self, key, value)

d.cp_from(c)

You can do that already, without defining any new methods:

d.__dict__.update(c.__dict__)

[This won't work for the minority of types that don't have __dict__s on
their instances, but it does work for virtually all user-defined classes.]

You can't expect people to type that in.
Assuming that d and c have the same set of instance variable names (which I
would expect most code following good style would do), I don't see why
modifying an existing instance of d to be a copy of c is any better than
simply making a copy from scratch with the clearer:

d = copy(c)

What's the big advantage to mutating an existing instance?

The difference (whether you want to call it an advantage or not
is up to you) is that with d = copy(c), d is rebound to a new
object. The object itself, which could still be bound to other
names, stays the same. What I sometimes want is a change of the
object itself so that the change is visible to all names to which
the object is currently bound.
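A minimal sketch of the difference, using a made-up class:

from copy import copy

class Box:
    def __init__(self, value=None):
        self.value = value

c = Box('new state')

# rebinding: the other name keeps seeing the old object
d = other = Box('old state')
d = copy(c)
assert other.value == 'old state'

# in-place copy: every name bound to the object sees the new state
d = other = Box('old state')
d.__dict__.update(c.__dict__)
assert d is other and other.value == 'new state'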
 

Antoon Pardon

On 2004-05-13, Andrew Bennetts said:
On 2004-05-13, Andrew Bennetts wrote <[email protected]>:
Augmented assignments are still assignments, and that makes perfect sense to
me -- each of those behaves exactly like its obvious longer version:

s = 'abc'
s = s + 'd'

i = 7
i = i + 3

l = [1, 2, 3]
l = l + [4, 5, 6]

(Yes, there are tricky examples that do behave differently -- but I try to
avoid tricky things, because they tend to be hard to read. I almost only
find I want to use augmented assignment on integers and occasionally
strings.)

But these things are only hard to read (to you, I don't find it so) because
you are using constants in the expression.

I didn't say those examples are hard to read. Personally, I find both
forms equally readable.

I did say that there are cases where:

<lvalue> += x

does not behave the same way as:

<lvalue> = <lvalue> + x

But you need relatively tricky lvalues for this to be the case.
a += b, is just as hard to read whether a and b are lists or integers.

The problem is that because the behaviour with strings and lists is
different I can't write a program in which the behaviour of += is
consistent for all classes because there will always be core classes
for which the behaviour will be different.

It seems to me that your problem isn't really with "+=" as such, it's that
Python has immutable types at all, including in the commonly used builtins.
Immutability of strings, numbers and tuples has worked very well for Python
so far, so you'll have a struggle convincing many people that it should be
otherwise.

Well I didn't have this problem when I first encountered Python. It
had no "+=" operators then. But with the introduction of these
and the inconsistencies they introduce I have begun to think
differently. You are probably right that people will be
hard to convince to change this, so I won't try. I'll just
use the language that suits me best for a particular task
and often enough it will be Python, with the warts I think
it has.
Personally, I like that when I do:

x = 7
func(x)

That I know that x will still be 7 after func returns, regardless of what
happens inside func. It's just not possible for func to accidentally change
the value of x in my scope, because I know that numbers are immutable in
Python.

Well personally I like it to be that way for any object, unless func
has somehow announced that it can change the argument. I think a
copy-in parameter is a far better way to assure you of that, because
it indeed works for any object, or maybe a separate attribute that
allows any object to be mutable or immutable depending on the
needs of the moment.
It would take some *massive* benefits to convince me to change my mind.


Again, if the objects you're using don't "make sense", you're stuck with a
difficult life. If a library's API is hard to use, then chances are the
library won't get used much.

Well that is my point, "+=" operators don't make sense for
immutable types.
Then take a copy of the sequence first, using list(seq) or tuple(seq), and
then you'll know exactly what you have.

I consider that a work around.
Or even simpler -- use: "b = b + c"
instead. Then you know that a is unmodified.

Right. In order to write reliable code in a function, class or module
that can work with any sequence I'd better not write "b += c" but
write "b = b + c", because only the latter guarantees a particular result.
So how good are those "+=" operators if you can't use them in generic
code?

I don't find it convincing that it is convenient to be able to
write "a += b" if you then learn that in generic code you better
write "a = a + b" anyway.
If you don't know what kind of objects your functions are working with, how
can you expect them to behave correctly?

I know what kind of objects: sequences. It is not my fault the operators
on sequences are not consistent.
If your API assumes that a
particular method receives a tuple, then either make sure it does by calling
tuple(x), or document that it does in the docstring.

Otherwise, a user of your library code might legitimately try to pass a
list, a dictionary, None, or perhaps their own custom type, unaware of your
hidden assumptions.

But I want my code to work with any sequence type, the problem is that
the core library doesn't provide a consistent behaviour for all its
sequence types. That is not my fault.
In my experience, I've never had the issue of mutable vs. immutable sequence
types cause the sort of problems you're worried about. Perhaps I'm lucky,
or subconsciously careful, or it has happened and I'm just plain forgetful :)
... Regardless, I'm not going to start worrying about it now. Python is
working just fine for me despite this.

Well that doesn't surprise me much, since the += operators are
relatively new and it is the introduction of those that can
cause such problems. Yes, there are workarounds, such as taking
a copy with tuple or list, just as you can embed a variable
in an object if you want to change something in an intermediate
scope. I consider the need for such workarounds a sign that
the language would be better off changed in these parts. It is
not so big a deal that it will have a big influence on which
language I'll use, but I still think of these kinds of things
as warts.
 

Aahz

Well one could drop the immutability of certain classes instead. One
would then lose the immutability purity and gain the consistency
purity. In a choice between those two purities I would choose the
latter.

That won't fly. What do you use for dict keys?
 

Josiah Carlson

The real thing to remember is that Python isn't <insert the language [...]>

I think it is reasonable to expect a language to behave consistently.

I would like to be able to understand what += does to
an object without the need of knowing whether it is mutable or
not.

Well then, that is easy, += does whatever the object says it should do.
The only builtin type that is actually modified in place (via +=) is the
list. Everything else uses single-name rebinding. Is that a simple
enough rule? Had you done /any/ sort of critical investigation,
you would have discovered it yourself. Your investigation would have
turned up that immutables behave as they should, and a list's __iadd__
method is an alias for its extend method.
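A minimal sketch of that rule:

l = m = [1, 2]
l += [3]            # list.__iadd__ extends the object in place, like l.extend([3])
assert m == [1, 2, 3]

i = j = 7
i += 3              # ints are immutable; only the name 'i' is rebound
assert j == 7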

Python does more than enough things in an implicit way.

And automatically rebinding /every/ name that references an object is
more explicit? I don't think so.

If all you can say is, "I don't like it, it should be more consistent",
then I ask, "what would be more consistent?" Should we keep a list of
every name that references an object so that functionally, everything is
passed by reference and can be modified at will? Should we force
everything to be mutable, only interning (and making immutable) objects
used as keys in a dictionary? Honestly, I don't believe either option
is a good idea. Let us look at an example and find out why.

Hmmm, dictionaries. Dictionaries are hash tables that contain a key and
value pair. In order to handle dynamic expansion of the hash table, we
must keep the key (Python keeps a pointer to the key, it is much faster
that way). Right now, only immutables are able to be keys. Why is this
so? Let us imagine that we use a mutable string as a key in a
dictionary. Everything is fine and dandy, until we modify the string.
At that point, in order for the dictionary to stay consistent, we must
make a copy of that string to keep it as a key. No big deal, right?
Wrong. What if that string were 500k? 500M? It becomes a big deal.

Similarly, rebinding all names that point to an immutable, when that
immutable is modified via += is a fool's errand, /especially/ when we
can use l = 1000000*[0]. According to the rebinding method, all million
pointers would need to be rebound to the new object, which would be the
same as making an integer mutable, which has the problem described in
the previous paragraph.

I think you'll find Python's handling of all immutable values to
coincide with how C handles single integers (except for the
pass-by-reference in function calls).

So? Is what Guido thinks should happen above criticism? I think
that if what Guido thinks should happen makes the language
behave inconsistently then [...]

No, Guido is not above criticism. The mistakes he believes he has made
will be fixed in the 3.0 release. If you can't wait for it, then
perhaps you should write the interpreter/compiler for the language you
want to use. It seems to be in vogue these days.

- Josiah
 

Antoon Pardon

On 2004-05-13, Josiah Carlson said:
Well then, that is easy, += does whatever the object says it should do.
The only builtin type that is actually modified in place (via +=) is the
list. Everything else uses single-name rebinding. Is that a simple
enough rule? Had you done /any/ sort of critical investigation,
you would have discovered it yourself. Your investigation would have
turned up that immutables behave as they should, and a list's __iadd__
method is an alias for its extend method.

Well I think that if you need such a critical investigation, something
is wrong. I could just as well write a vector and matrix module that
uses "-" for addition and if people then complain tell them that they
should have done a critical investigation, then they would have found
out that the "-" works as it should.
And automatically rebinding /every/ name that references an object is
more explicit? I don't think so.

I didn't say that. I just want to point out that python does enough
things implicitly so that the "explicit is better than implicit"
argument is invalid.
If all you can say is, "I don't like it, it should be more consistent",
then I ask, "what would be more consistent?" Should we keep a list of
every name that references an object so that functionally, everything is
passed by reference and can be modified at will? Should we force
everything to be mutable, only interning (and making immutable) objects
used as keys in a dictionary? Honestly, I don't believe either option
is a good idea. Let us look at an example and find out why.

IMO Python being the language that it is with immutable objects, the
only consistent solution would have been to just translate a += b
into something equivalent to a = a + b and leave it at that.
No __ixxx__ methods, no in-place modifications.
Hmmm, dictionaries. Dictionaries are hash tables that contain a key and
value pair. In order to handle dynamic expansion of the hash table, we
must keep the key (Python keeps a pointer to the key, it is much faster
that way).

I doubt it is much faster overall. The faster dictionary expansion
is at the expense of having to turn your mutable objects you
want to use as a key into an immutable equivalent.
Right now, only immutables are able to be keys. Why is this
so? Let us imagine that we use a mutable string as a key in a
dictionary. Everything is fine and dandy, until we modify the string.
At that point, in order for the dictionary to stay consistent, we must
make a copy of that string to keep it as a key. No big deal, right?
Wrong. What if that string were 500k? 500M? It becomes a big deal.

It already is a big deal. Because you can't mutate strings, that means
that every time you need a minor change to one a lot of copying will
go on instead of in-place modifications.

Each time you need to do something like name = name[:-3] + newsuffix
you make a copy; with strings that are 500k or 500M that is just
as big an issue.

So either you work with mutable values and then you need to copy
them into an immutable before they can be used as a key, or you
work with immutables and then all manipulations you do on them
will involve copying. So the fact that Python uses immutables as keys
because that is faster is irrelevant to me, because it is just a
shifting of the cost, not a reduction within the whole program.

And besides, you can have mutable objects within immutables.
Similarly, rebinding all names that point to an immutable, when that
immutable is modified via += is a fool's errand, /especially/ when we
can use l = 1000000*[0]. According to the rebinding method, all million
pointers would need to be rebound to the new object, which would be the
same as making an integer mutable, which has the problem described in
the previous paragraph.

It would only be a problem with a specific kind of implementation.
I think you'll find Python's handling of all immutable values to
coincide with how C handles single integers (except for the
pass-by-reference in function calls).

Except that if we compare with C, then Python doesn't use integers
but pointers to integers with automatic referencing. And if in
C I do *a += 1, then all pointers to that integer will see the
new value and it wouldn't take a million rebindings in case
of an array/list with a million elements.
No, Guido is not above criticism. The mistakes he believes he has made
will be fixed in the 3.0 release. If you can't wait for it, then
perhaps you should write the interpreter/compiler for the language you
want to use. It seems to be in vogue these days.

I sometimes dream about that, but I fear I have neither the time
nor the motivation to finish such a project properly. I'll
just make a choice among the languages that are available.
In general I like Python; the warts sometimes frustrate me,
but I see that as a token of affection, e.g. I don't care
about the warts of C++, because I find it so ugly and
complicated that I avoid using it. So I'll wait to see what 3.0
has in store and see what I like about it.
 

Josiah Carlson

Well then, that is easy, += does whatever the object says it should do.
Well I think that if you need such a critical investigation, something
is wrong. I could just as well write a vector and matrix module that
uses "-" for addition and if people then complain tell them that they
should have done a critical investigation, then they would have found
out that the "-" works as it should.

In this case, you were talking about builtin Python datatypes.
Considering there are a limited number of them (import
types;dir(types)), 5 minutes of checking for yourself would have done
the job.

I didn't say that. I just want to point out that python does enough
things implicitly so that the "explicit is better than implicit"
argument is invalid.

Everything in Python is passed by reference. This makes passing any
kind of value quick and easy. However, how those references are handled
differs depending on the object. What seems to be happening is that we
are arguing over what is the best way to handle those references, after
we know that they are references.

As you mentioned later, in C, you can use *a = 1 to modify a single
instance of 'a'. However, immutables in Python never do the *a = 1
assignment. Mutables do so, which is why the final line of...

m = n = []
m.append(7)
m == n

...is true.

For immutables, on modification, a new instance of the immutable is
created, the value being set accordingly.

IMO Python being the language that it is with immutable objects, the
only consistent solution would have been to just translate a += b
into something equivalent to a = a + b and leave it at that.
No __ixxx__ methods, no in-place modifications.


Now, whether or not list.__iadd__ should do the rough equivalent of
list.extend with a return, I would lean towards yes. += implies an
object modification, where object modifications make sense. In the case
of lists, it does make sense, because they are mutable. For immutables,
it doesn't make sense, because the objects themselves are immutable.

Now that we've gotten to the point, __iadd__ and friends, we know we
disagree. If you feel terribly strongly about it, post a bug report or
feature request to SF. I would venture a guess that within a week it
will be closed with a message stating, "yeah, we like it the way it is".

I doubt it is much faster overall. The faster dictionary expansion
is at the expense of having to turn your mutable objects you
want to use as a key into an immutable equivalent.

Except Python doesn't allow mutables as keys! Go ahead, try using a
mutable builtin Python datatype as a key. Notice how it doesn't work?
Yeah, that's because Python doesn't do it. One reason, as you mention,
is that the only practical thing to do when given a mutable is to
translate it (somehow) into an immutable object. Such a translation
would need to occur any time you wanted to use a mutable as a key, which
is one reason why Python doesn't allow such things.

Right now, only immutables are able to be keys. Why is this
so? Let us imagine that we use a mutable string as a key in a
dictionary. Everything is fine and dandy, until we modify the string.
At that point, in order for the dictionary to stay consistent, we must
make a copy of that string to keep it as a key. No big deal, right?
Wrong. What if that string were 500k? 500M? It becomes a big deal.

It already is a big deal. Because you can't mutate strings, that means
that every time you need a minor change to one a lot of copying will
go on instead of in-place modifications.

Each time you need to do something like name = name[:-3] + newsuffix
you make a copy; with strings that are 500k or 500M that is just
as big an issue.

So either you work with mutable values and then you need to copy
them into an immutable before they can be used as a key, or you
work with immutables and then all manipulations you do on them
will involve copying. So the fact that Python uses immutables as keys
because that is faster is irrelevant to me, because it is just a
shifting of the cost, not a reduction within the whole program.

People who mutate large strings repeatedly, and know what the hell they
are doing, tend to use either the array or cStringIO module. For
repeated string manipulations, they tend to be quite fast (a few
encryption algorithms in Python use array).
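A minimal sketch of that approach, using the cStringIO module named above (Python 2 era):

from cStringIO import StringIO

buf = StringIO()
for piece in ('spam', 'eggs', 'ham'):
    buf.write(piece)        # appends without re-copying the accumulated string each time
result = buf.getvalue()     # 'spameggsham'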

And besides, you can have mutable objects within immutables.

Python already covers this case. Immutable objects (that are
containers) recursively hash their contents:

>>> hash((1, [2, 3]))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list objects are unhashable

I sometimes dream about that, but I fear I have neither the time
nor the motivation to finish such a project properly. I'll
just make a choice among the languages that are available.
In general I like Python; the warts sometimes frustrate me,
but I see that as a token of affection, e.g. I don't care
about the warts of C++, because I find it so ugly and
complicated that I avoid using it. So I'll wait to see what 3.0
has in store and see what I like about it.

Don't hold your breath, 3.0 (by the sounds of it) is roughly 6+ years
off. Once you get used to the "warts", I think you'll find they
disappear in Python quite quickly.

- Josiah
 

Terry Reedy

Josiah Carlson said:
As you mentioned later, in C, you can use *a = 1 to modify a single
instance of 'a'. However, immutables in Python never do the *a = 1
assignment. Mutables do so,

I am not sure what you mean to say that mutables 'do the *a=1 assignment'.
This does not strike me as a useful viewpoint for understanding Python.
which is why the final line of...
m = n = []

At this point, m is n, not merely m == n. Both names are bound to one and
the same object.
m.append(7)

At this point m is still n since mutation of the one and same object (a
list, in this case) may bind or rebind indexed *slots* within the
collection object but does not rebind either 'm', 'n', or any other name.
I don't see that the C-ism '*a=1' adds anything to this explanation.

Because m is n ==> m == n for anything without funny special methods that
disable the implication.
For immutables, on modification, a new instance of the immutable is
created, the value being set accordingly.

Since immutable objects cannot be modified, I am not sure what you are
saying. In any case,

name = expression

always binds or rebinds name to the object calculated by expression,
regardless of the mutability of the object, if any, previously bound to
name. Similarly, except for possible bizarre methods,

name.method(args)

never rebinds name, regardless of the mutability of the associated object.
(One possible exception: name is bound at global scope and method checks
globals to find all names bound to self and then modifies globals to rebind
all such names to something else. But no builtin method does anything like
this.)
Now that we've gotten to the point, __iadd__ and friends, we know we
disagree. If you feel terribly strongly about it, post a bug report or
feature request to SF.

No, please do not burden the volunteer developers with false bug reports or
useless feature requests. The semantics of += and family were debated when
added a few years ago and will not change for the foreseeable future. Keep
repeated debate and explanation of such things to c.l.p.
I would venture a guess that within a week it
will be closed with a message stating, "yeah, we like it the way it is".

Or it might hang around a couple of years while people focus on real
problems.


About dictionaries: keys must be hashable (have __hash__ method);
mutability is not directly involved (and there is indeed no direct test for
such). It would have been possible to key dictionaries by object id, and
one can do so for user classes with def __hash__(s): return id(s). But for
fairly obvious reasons, builtins are hashed and keyed by value. Perhaps
less obviously, this can be true even across types.
>>> d = {0: 'zero'}
>>> d[0.0]
'zero'
>>> d[0+0j]
'zero'
>>> d[''] = 'empty'
>>> d[u'']
'empty'

Terry J. Reedy
 

Josiah Carlson

As you mentioned later, in C, you can use *a = 1 to modify a single
I am not sure what you mean to say that mutables 'do the *a=1 assignment'.
This does not strike me as a useful viewpoint for understanding Python.

If you read Antoon's post, he mentions that in C you can have multiple
integer pointers pointing to the same memory location, and changing the
contents of that one memory location (*a = 1) changes the "value" of
other names.

As you state, since Python only ever rebinds names on assignments, the
only even near equivalent would be using something like 'a[0] = 1' or
even 'a.p = 1' (for some mutable mapping or settable attribute
respectively), and claim that many names can all reference the same 'a'.

I agree it is not terribly useful for understanding Python, unless one
has background with C or C++, which seems to be the case with Antoon.

which is why the final line of...
m = n = []

At this point, m is n, not merely m == n. Both names are bound to one and
the same object.

Indeed, I should have used 'is'.

Since immutable objects cannot be modified, I am not sure what you are
saying. In any case,

What I should have said is "On any /attempted/ mutation of an immutable
via the (+-/*|&^%)= (and any other that I have forgotten, if any)
operators, a new object is created with the appropriate modifications,
leaving all previously bound names bound to the original immutable, but
changing the binding of the current name to the new immutable. This is
in stark contrast with a mutable mutation, which will modify and return
the original object (via the __i***__ instance methods), leaving all
previously bound names bound to the mutable."



[snip good example]
No, please do not burden the volunteer developers with false bug reports or
useless feature requests. The semantics of += and family were debated when
added a few years ago and will not change for the foreseeable future. Keep
repeated debate and explanation of such things to c.l.p.

I had no problem with the semantics in the first place. I was just
offering a place where the poster could go and get a definitive "go
home, the semantics are not changing". Thank you for giving him the
definitive answer here, saves someone some time on SF.

Or it might hang around a couple of years while people focus on real
problems.

Sorry about that.

About dictionaries: keys must be hashable (have __hash__ method);
mutability is not directly involved (and there is indeed no direct test for
such). It would have been possible to key dictionaries by object id, and
one can do so for user classes with def __hash__(s): return id(s). But for
fairly obvious reasons, builtins are hashed and keyed by value. Perhaps
less obviously, this can be true even across types.
[snip good example]

I agree, it also provides a very convenient method for looking up values
in a dictionary when you no longer have access to the key originally
used. My favorite example of this is...
>>> a = b = 1.0
>>> a is b
True
>>> d = {a:'hello'}
>>> b = (b + 1) - 1
>>> a is b
False
>>> d[b]
'hello'

The problem with mutable keys for dictionaries is that the only real
solution either involves keeping an 'immutable copy' of the original
item for hash-by-value (which Antoon offers in another leaf of this
thread), or keeping a pointer to the mutable key. The first has various
issues, an the second requires that you must have the original object
itself in order to access the value again, which kind of defeats one of
the functional reasons for a dictionary structure.

- Josiah
 
