copy on write

Steven D'Aprano · Feb 2, 2012

Yeah, there's a word for that: INTUITIVE. And I've been preaching its
virtues (sadly in vain, it seems!) to these folks for some time now.

Intuitive to whom?

Expert Python programmers?

VB coders?

Perl hackers?

School children who have never programmed before?

Mathematicians?

Babies?

Rocket scientists?

Hunter-gatherers from the Kalahari desert?

My intuition tells me you have never even considered that intuition
depends on who is doing the intuiting.

Devin Jeanpierre · Feb 2, 2012

Steven said:

Normally this is harmless, but there is one interesting little
glitch you can get:

>>> t = ('a', [23])
>>> t[1] += [42]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t
('a', [23, 42])

IMHO, this is worthy of bug-hood: shouldn't we be able to conclude from the TypeError that the assignment failed?

It did fail. The mutation did not.
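
A minimal sketch of both halves of that statement, using only built-ins (the variable names are illustrative):

```python
# The augmented assignment raises, yet the list inside the tuple is extended.
t = ('a', [23])

try:
    t[1] += [42]      # list.__iadd__ runs first, then the tuple item assignment fails
    raised = False
except TypeError:
    raised = True

print(raised)   # True
print(t)        # ('a', [23, 42])
```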

I can't think of any way out of this misleadingness, although if you
can that would be pretty awesome.

-- Devin

John O'Hagan · Feb 2, 2012

Steven D'Aprano wrote:
Normally this is harmless, but there is one interesting little
glitch you can get:

>>> t = ('a', [23])
>>> t[1] += [42]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t
('a', [23, 42])

IMHO, this is worthy of bug-hood: shouldn't we be able to conclude
from the TypeError that the assignment failed?

It did fail. The mutation did not.

You're right, in fact, for me the surprise is that "t[1] +=" is interpreted as an assignment at all, given that for lists (and other mutable objects which use "+=") it is a mutation. Although as Steven says elsewhere, it actually is an assignment, but one which ends up reassigning to the same object.

But it shouldn't be both. I can't think of another example of (what appears to be but is not) a single operation failing with an exception, but still doing exactly what you intended.

I can't think of any way out of this misleadingness, although if you
can that would be pretty awesome.

In the case above, the failure of the assignment is of no consequence. I think it would make more sense if applying "+=" to a tuple element were treated (by the interpreter I suppose) only on the merits of the element, and not as an assignment to the tuple.

John

Steven D'Aprano · Feb 2, 2012

You're right, in fact, for me the surprise is that "t[1] +=" is
interpreted as an assignment at all, given that for lists (and other
mutable objects which use "+=") it is a mutation. Although as Steven
says elsewhere, it actually is an assignment, but one which ends up
reassigning to the same object.

But it shouldn't be both.

Do you expect that x += 1 should succeed? After all, "increment and
decrement numbers" is practically THE use-case for the augmented
assignment operators.

How can you expect x += 1 to succeed without an assignment? Numbers in
Python are immutable, and they have to stay immutable. It would cause
chaos and much gnashing of teeth if you did this:

x = 2
y = 7 - 5
x += 1
print(y * 100)
=> prints 300

So if you want x += 1 to succeed, += must do an assignment.
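
A small sketch of what that assignment buys in practice: because += on an int binds the name to a new object, other names that shared the old value are unaffected. The names are illustrative only.

```python
x = 2
y = x            # both names refer to the same int object
x += 1           # builds a new int and rebinds x; the object 2 is untouched

print(x, y)      # 3 2
print(x is y)    # False
```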

Perhaps you are thinking that Python could determine ahead of time
whether x[1] += y involved a list or a tuple, and not perform the final
assignment if x was a tuple. Well, maybe, but such an approach (if
possible!) is fraught with danger and mysterious errors even harder to
debug than the current situation. And besides, what should Python do
about non-built-in types? There is no way in general to predict whether
x[1] = something will succeed except to actually try it.

I can't think of another example of (what
appears to be but is not) a single operation failing with an exception,
but still doing exactly what you intended.

Neither can I, but that doesn't mean that the current situation is not
the least-worst alternative.

In the case above, the failure of the assignment is of no consequence. I
think it would make more sense if applying "+=" to a tuple element were
treated (by the interpreter I suppose) only on the merits of the
element, and not as an assignment to the tuple.

How should the interpreter deal with other objects which happen to raise
TypeError? By always ignoring it?

x = [1, None, 3]
x[1] += 2 # apparently succeeds

Or perhaps by hard-coding tuples and only ignoring errors for tuples? So
now you disguise one error but not others?

Thomas Rachel · Feb 2, 2012

On 13.01.2012 at 13:30, Chris Angelico wrote:

It seems there's a distinct difference between a+=b (in-place
addition/concatenation) and a=a+b (always rebinding),

There is indeed.

a = a + b is a = a.__add__(b), while

a += b is a = a.__iadd__(b).

__add__() is supposed to leave the original object intact and return a
new one, while __iadd__() is free either to modify the object in place
(preferred, where possible) or to return a new one.

An immutable object can only return a new one, and its __iadd__()
behaviour is the same as __add__().

A mutable object, however, is free to and supposed to modify itself and
then return self.
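
A sketch of that contract, assuming a made-up mutable class `Bag` (not from any library). Note that immutable built-ins such as tuple do not define __iadd__ at all, so += falls back to __add__ for them.

```python
class Bag:
    """Hypothetical mutable container, used only for illustration."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Leave self intact and return a new object.
        return Bag(self.items + other.items)

    def __iadd__(self, other):
        # Modify self in place and return self, so the rebinding is a no-op.
        self.items.extend(other.items)
        return self


a = Bag([1])
alias = a
a += Bag([2])                    # in place: alias sees the change
same_after_iadd = a is alias

b = Bag([1])
alias_b = b
b = b + Bag([2])                 # rebinding: alias_b keeps the old object
same_after_add = b is alias_b

tup = (1,)
alias_t = tup
tup += (2,)                      # no tuple.__iadd__, so a new tuple is built
```

Here `same_after_iadd` is true and `same_after_add` is false, which is the observable difference between the two spellings for a mutable object.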

Thomas

Hrvoje Niksic · Feb 2, 2012

Steven D'Aprano said:
Perhaps you are thinking that Python could determine ahead of time
whether x[1] += y involved a list or a tuple, and not perform the
final assignment if x was a tuple. Well, maybe, but such an approach
(if possible!) is fraught with danger and mysterious errors even
harder to debug than the current situation. And besides, what should
Python do about non-built-in types? There is no way in general to
predict whether x[1] = something will succeed except to actually try
it.

An alternative approach is to simply not perform the final assignment if
the in-place method is available on the contained object. No prediction
is needed, because the contained object has to be examined anyway.
Currently, lhs[ind] += rhs is implemented like this:

item = lhs[ind]
if hasattr(item, '__iadd__'):
    lhs.__setitem__(ind, item.__iadd__(rhs))
else:
    lhs.__setitem__(ind, item + rhs)
# (Note item assignment in both "if" branches.)

It could, however, be implemented like this:

item = lhs[ind]
if hasattr(item, '__iadd__'):
    item += rhs  # no assignment, item supports in-place change
else:
    lhs.__setitem__(ind, lhs[ind] + rhs)

This would raise the exact same exception in the tuple case, but without
executing the in-place assignment. On the other hand, some_list[ind] += 1
would continue working exactly the same as it does now.
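
The two variants can be compared side by side. This is a rough sketch, assuming hypothetical helper names `current_iadd_item` and `proposed_iadd_item` (neither exists in Python) and using `operator.iadd` as a stand-in for the interpreter's in-place dispatch:

```python
import operator


def current_iadd_item(lhs, ind, rhs):
    """Roughly what lhs[ind] += rhs does today: always assign back."""
    item = lhs[ind]
    lhs[ind] = operator.iadd(item, rhs)   # __iadd__ if present, else __add__


def proposed_iadd_item(lhs, ind, rhs):
    """The alternative: skip the item assignment when __iadd__ exists."""
    item = lhs[ind]
    if hasattr(item, '__iadd__'):
        item.__iadd__(rhs)                # return value ignored, nothing stored
    else:
        lhs[ind] = item + rhs


t1 = ('a', [23])
try:
    current_iadd_item(t1, 1, [42])        # mutates the list, then raises
    current_raised = False
except TypeError:
    current_raised = True

t2 = ('a', [23])
proposed_iadd_item(t2, 1, [42])           # mutates the list, no exception

t3 = (1, 2)
try:
    proposed_iadd_item(t3, 0, 1)          # int has no __iadd__: assignment is tried
    proposed_raised = False
except TypeError:
    proposed_raised = True
```

Under this reading, a tuple holding a list is extended without an error, while a tuple holding an int still raises TypeError and is left unchanged.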

In the same vein, in-place methods should not have a return value
(i.e. they should return None), as per Python convention that functions
called for side effect don't return values.

The alternative behavior is unfortunately not backward-compatible (it
ignores the return value of augmented methods), so I'm not seriously
proposing it, but I believe it would have been a better implementation
of augmented assignments than the current one. The present interface
doesn't just bite those who try to use augmented assignment on tuples
holding mutable objects, but also those who do the same with read-only
properties, which is even more reasonable. For example, obj.list_attr
being a list, one would expect that obj.list_attr += [1, 2, 3] does the
same thing as obj.list_attr.extend([1, 2, 3]). And it almost does,
except it also follows up with an assignment after the list has already
been changed, and the assignment to a read-only property raises an
exception. Refusing to modify the list would have been fine, modifying
it without raising an exception (as described above) would have been
better, but modifying it and *then* raising an exception is a surprise
that takes some getting used to.
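
The read-only property case can be reproduced in a few lines. `Holder` is a made-up class for illustration; only `list_attr` comes from the description above.

```python
class Holder:
    """Hypothetical class exposing a list through a read-only property."""

    def __init__(self):
        self._items = []

    @property
    def list_attr(self):
        return self._items


obj = Holder()
try:
    obj.list_attr += [1, 2, 3]   # extends the list, then tries to set the property
    raised = False
except AttributeError:
    raised = True

print(raised)          # True
print(obj.list_attr)   # [1, 2, 3]
```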

88888 Dihedral · Feb 2, 2012

On Saturday, January 14, 2012 at 6:48:29 AM UTC+8, Evan Driscoll wrote:

I was talking about the combination of + and =, since the discussion is
about 'a = a + b' vs 'a += b', not 'a + b' vs 'a += b' (where the
differences are obvious).

And I stand by my statement. In 'a = a + b', operator+ obviously returns
a new object, but operator= should then go and assign the result to and
return a reference to 'a', just like how 'a += b' will return a
reference to 'a'.

The operation a+b means add(a,b) and returns a result instance; furthermore, a and b can't be modified.

The expression a = a+b is two operations, not one. But the problem in C or C++ is that operations and expressions are allowed to be mixed freely.

The operation a+=b means a is modified by b, and b can't be changed.
Note that no new instance is necessary in a+=b.

If you're working in C++ and overload your operators so that 'a += b'
and 'a = a + b' have different observable behaviors (besides perhaps
time), then either your implementation is buggy or your design is very
bad-mannered.

Evan

Do you mean the resulting instances after 'a+=b' and 'a=a+b', or
the actions or behaviors of the instances involved in performing 'a+=b' and 'a=a+b'?

John O'Hagan · Feb 2, 2012

You're right, in fact, for me the surprise is that "t[1] +=" is
interpreted as an assignment at all, given that for lists (and other
mutable objects which use "+=") it is a mutation. Although as Steven
says elsewhere, it actually is an assignment, but one which ends up
reassigning to the same object.

But it shouldn't be both.

Do you expect that x += 1 should succeed? After all, "increment and
decrement numbers" is practically THE use-case for the augmented
assignment operators.

How can you expect x += 1 to succeed without an assignment?

I don't; obviously, for immutable objects assignment is the only possibility.

[...]

Perhaps you are thinking that Python could determine ahead of time
whether x[1] += y involved a list or a tuple, and not perform the
final assignment if x was a tuple. Well, maybe, but such an
approach (if possible!) is fraught with danger and mysterious errors
even harder to debug than the current situation. And besides, what
should Python do about non-built-in types? There is no way in general
to predict whether x[1] = something will succeed except to actually
try it.

It's not so much about the type of x but that of x[1]. Wouldn't it be possible to omit the assignment simply if the object referred to by x[1] uses "+=" without creating a new object? That way, some_tuple[i] += y will succeed if some_tuple[i] is a list but not with, say, an int. That seems reasonable to me.

[...]

In the case above, the failure of the assignment is of no
consequence. I think it would make more sense if applying "+=" to a
tuple element were treated (by the interpreter I suppose) only on
the merits of the element, and not as an assignment to the tuple.

How should the interpreter deal with other objects which happen to
raise TypeError? By always ignoring it?

x = [1, None, 3]
x[1] += 2 # apparently succeeds

Or perhaps by hard-coding tuples and only ignoring errors for tuples?
So now you disguise one error but not others?

I'm not suggesting either of those. None can't be modified in place. But for objects which can, wouldn't omitting the final assignment prevent the TypeError in the first place?

John

MRAB · Feb 2, 2012

Steven D'Aprano said:

Perhaps you are thinking that Python could determine ahead of time
whether x[1] += y involved a list or a tuple, and not perform the
final assignment if x was a tuple. Well, maybe, but such an approach
(if possible!) is fraught with danger and mysterious errors even
harder to debug than the current situation. And besides, what should
Python do about non-built-in types? There is no way in general to
predict whether x[1] = something will succeed except to actually try
it.

An alternative approach is to simply not perform the final assignment if
the in-place method is available on the contained object. No prediction
is needed, because the contained object has to be examined anyway.
Currently, lhs[ind] += rhs is implemented like this:

item = lhs[ind]
if hasattr(item, '__iadd__'):
    lhs.__setitem__(ind, item.__iadd__(rhs))
else:
    lhs.__setitem__(ind, item + rhs)
# (Note item assignment in both "if" branches.)

It could, however, be implemented like this:

item = lhs[ind]
if hasattr(item, '__iadd__'):
    item += rhs  # no assignment, item supports in-place change
else:
    lhs.__setitem__(ind, lhs[ind] + rhs)

This would raise the exact same exception in the tuple case, but without
executing the in-place assignment. On the other hand, some_list[ind] += 1
would continue working exactly the same as it does now.

In the same vein, in-place methods should not have a return value
(i.e. they should return None), as per Python convention that functions
called for side effect don't return values.

The alternative behavior is unfortunately not backward-compatible (it
ignores the return value of augmented methods), so I'm not seriously
proposing it, but I believe it would have been a better implementation
of augmented assignments than the current one.

[snip]

Could it skip the assignment if the reference returned by
__iadd__ is the same as the current reference?

For example:

t[0] += x

would do:

r = t[0].__iadd__(x)
if t[0] is not r:
    t[0] = r
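
As a sketch, that rule can be emulated with a helper function. The name `iadd_item_skip_same` is hypothetical, and `operator.iadd` stands in for the interpreter's own dispatch (it also covers types without __iadd__):

```python
import operator


def iadd_item_skip_same(container, index, value):
    """Emulate container[index] += value, assigning only if a new object came back."""
    old = container[index]
    result = operator.iadd(old, value)
    if result is not old:
        container[index] = result


t = ([1], 5)
iadd_item_skip_same(t, 0, [2])       # list extended in place; no assignment attempted

try:
    iadd_item_skip_same(t, 1, 1)     # 5 + 1 is a different object, so assignment is tried
    raised = False
except TypeError:
    raised = True
```

With this rule the list inside the tuple is extended quietly, while the int case still fails with TypeError because a new object has to be stored.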

Should failed assignment be raising TypeError? Is it really a type
error?

Devin Jeanpierre · Feb 2, 2012

Should failed assignment be raising TypeError? Is it really a type
error?

A failed setitem should be a TypeError as much as a failed getitem
should. Should 1[0] be a TypeError?

-- Devin

Terry Reedy · Feb 2, 2012

It's not so much about the type of x but that of x[1]. Wouldn't it be
possible to omit the assignment simply if the object referred to by
x[1] uses "+=" without creating a new object? That way, some_tuple[i]
+= y will succeed if some_tuple[i] is a list but not with, say, an
int. That seems reasonable to me.

There was considerable discussion of the exact semantics of augmented
operations when they were introduced. I do not remember if that
particular idea was suggested (and rejected) or not. You could try to
look at the PEP, if there is one, or the discussion (probably on the
pydev list).

Evan Driscoll · Feb 2, 2012

Do you mean the resulting instances after 'a+=b' and 'a=a+b', or
the actions or behaviors of the instances involved in performing 'a+=b' and 'a=a+b'?

I mean "if which operation you called is distinguishable in any way
besides the time it takes to run or by tracing it through in a debugger"

That means:

1. The value of 'a' should be the same after executing 'a+=b' and
'a=a+b'
2. The actual result of the expression should be the same in both cases
(in both cases it should be a reference to a)
3. Any additional side effects performed (ew!) should be the same in
both cases

Evan

John O'Hagan · Feb 3, 2012

It's not so much about the type of x but that of x[1]. Wouldn't it
be possible to omit the assignment simply if the object referred to
by x[1] uses "+=" without creating a new object? That way,
some_tuple[i] += y will succeed if some_tuple[i] is a list but not
with, say, an int. That seems reasonable to me.

There was considerable discussion of the exact semantics of augmented
operations when they were introduced. I do not remember if that
particular idea was suggested (and rejected) or not. You could try to
look at the PEP, if there is one, or the discussion (probably on the
pydev list).

I think we're 12 years late on this one. It's PEP 203 from 2000 and the key phrase was:

"The in-place function should always return a new reference, either
to the old `x' object if the operation was indeed performed
in-place, or to a new object."

If this had read:

"The in-place function should return a reference to a new object
if the operation was not performed in-place."

or something like that, we wouldn't be discussing this.

The discussion on py-dev at the time was quite limited but there was some lively debate on this list the following year (in the context of widespread controversy over new-fangled features which also included list comprehensions and generators), to which the BDFL's response was:

"You shouldn't think "+= is confusing because sometimes it modifies an
object and sometimes it doesn't". Gee, there are lots of places where
something that's *spelled* the same has a different effect depending [...]

>>> t=([],)
>>> l=t[0]
>>> l is t[0]
True
>>> l+=[1]
>>> t
([1],)
>>> t[0]+=[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t
([1, 1],)
>>> l is t[0]
True

Same object, same operator, different name, different outcome. Maybe that was obvious from the foregoing discussion, but it shocked me when put that way.

John

Steven D'Aprano · Feb 3, 2012

I think we're 12 years late on this one. It's PEP 203 from 2000 and the
key phrase was:

"The in-place function should always return a new reference, either to
the old `x' object if the operation was indeed performed in-place, or to
a new object."

If this had read:

"The in-place function should return a reference to a new object if the
operation was not performed in-place."

or something like that, we wouldn't be discussing this.

And what should it return if the operation *is* performed in-place?
"Don't return anything" is not an option; Python doesn't have procedures.
That implies that __iadd__ etc. should return None. But two problems come
to mind:

1) Using None as an out-of-band signal to the interpreter to say "don't
perform the assignment" makes it impossible for the augmented assignment
method to return None as the result. If we only think about numeric
operations like x += 1 then we might not care, but once you consider the
situation more widely the problem is clear:

x = Fact(foo)
y = Fact(bar)
x & y # Returns a composite Fact, or None if they are contradictory

With your suggestion, x &= y fails to work, but only sometimes. And when
it fails, it doesn't fail with an explicit exception, but silently fails
and then does the wrong thing. This makes debugging a horror.

2) And speaking of debugging, sometimes people forget to include the
return statement in methods. Normally, the left hand side of the
assignment then gets set to None, and the error is pretty obvious as soon
as you try to do something with it. But with your suggestion, instead of
getting an exception, it silently fails, and your code does the wrong
thing.
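
To make the first problem concrete, here is a sketch that emulates the hypothetical "None means done in place" rule in plain Python. The body of `Fact`, its contradiction rule, and the helper `iand_none_means_in_place` are all assumptions invented for illustration.

```python
class Fact:
    """Hypothetical stand-in for the Fact class mentioned above."""

    def __init__(self, claims):
        self.claims = frozenset(claims)

    def __and__(self, other):
        # Assumed rule: "x" together with "not x" is a contradiction.
        merged = self.claims | other.claims
        if any(("not " + c) in merged for c in merged):
            return None
        return Fact(merged)

    __iand__ = __and__      # returns a new Fact or None, never mutates


def iand_none_means_in_place(namespace, name, rhs):
    """Emulate the hypothetical rule: a None result means 'do not rebind'."""
    result = namespace[name].__iand__(rhs)
    if result is not None:
        namespace[name] = result


ns = {"x": Fact({"rain"})}
iand_none_means_in_place(ns, "x", Fact({"not rain"}))
# Under the hypothetical rule x silently keeps its old value.
unchanged = ns["x"].claims == frozenset({"rain"})

# Under the actual semantics the contradiction is visible: y becomes None.
y = Fact({"rain"})
y &= Fact({"not rain"})
```

In the emulated version the contradictory result is dropped without any error, which is the silent failure described in point 1.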

I suppose that they could have invented a new sentinel, or a special
exception to be raised as a signal, but that's piling complication on top
of complication, and it isn't clear to me that it's worth it for an
obscure corner case.

Yes, the current behaviour is a Gotcha, but it's a Gotcha that makes good
sense compared to the alternatives.

Ultimately, augmented assignment is *assignment*, just like it says on
the tin. t[1] += x is syntactic sugar for t[1] = t[1].__iadd__(x). It
can't and shouldn't fail to raise an exception if t is a tuple, because
tuple item assignment *must* fail.
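
One way to watch the read-then-assign sequence is a container that logs its own item access. `TracingList` is a made-up helper class, not part of any library.

```python
class TracingList(list):
    """Hypothetical list subclass that records item reads and writes."""

    def __init__(self, *args):
        super().__init__(*args)
        self.log = []

    def __getitem__(self, index):
        self.log.append("getitem")
        return super().__getitem__(index)

    def __setitem__(self, index, value):
        self.log.append("setitem")
        super().__setitem__(index, value)


outer = TracingList([[23]])
inner = outer[0]
outer.log.clear()

outer[0] += [42]

print(outer.log)            # ['getitem', 'setitem']
print(outer[0] is inner)    # True: the same list object was stored back
```

The item assignment happens even though the inner list was extended in place, which is why a tuple in the outer position has to raise.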

The problem is that lists treat __iadd__ as an in-place optimization, and
this clashes with tuple immutability. But if lists *didn't* treat
__iadd__ as in-place, people would complain when they used it directly
without a tuple wrapper.

Perhaps lists shouldn't define += at all, but then people will complain
that mylist += another_list is slow. Telling them to use mylist.extend
instead just makes them cranky. After all, mylist + another_list works,
so why shouldn't += work?

Ultimately, there is no right answer, because the multitude of
requirements are contradictory. No matter what Python did, somebody would
complain.

Chris Angelico · Feb 3, 2012

No matter what Python did, somebody would complain.

+1

This is, I think, the ultimate truth of the matter.

ChrisA

Antoon Pardon · Feb 3, 2012

Ultimately, there is no right answer, because the multitude of
requirements are contradictory. No matter what Python did, somebody would
complain.

Which makes me wonder why it was introduced at all, or at least so fast.
If you compare how quickly augmented assignment was introduced with how
long it took to get conditional expressions, I start thinking of a
bikeshed. In the first case we have something that raises semantic
questions that are difficult to resolve, in the second case the
semantics were clear, the big problem that delayed introduction was what
syntax to use.

But the second took a lot longer to become part of the language than the
first, which seems very odd to me.

John O'Hagan · Feb 3, 2012

And what should it return if the operation *is* performed in-place?

Not knowing anything about the inner workings of the interpreter, I'm agnostic on that as long as it's not "a new reference". Perhaps the old reference?

[...snip undoubted reasons why returning None wouldn't work...]

I don't know what would work. Maybe it is insoluble. But didn't Hrvoje Niksic's post in this thread suggest it could have been implemented to work the way I'm saying, even supplying code to demonstrate it?

All I'm saying is that however it's implemented, x += y should simply mutate x in-place if x implements that, otherwise it should do x = x + y. If I can say it in under 25 words, surely it's implementable? (Whether it's practical to do so is another question.)

The x in x += y can be seen as a reference to an object to be incremented rather than an assignment (despite the name). In that view, whether the name x needs to be rebound to a new object, resulting in an assignment, depends on the capabilities of x, not x.

Yes, the current behaviour is a Gotcha, but it's a Gotcha that makes
good sense compared to the alternatives.

I think it's worse than a Gotcha. IMHO a Gotcha is, for example, the mutable default arguments thing, which makes sense once you get it. This one has the bizarre consequence that what happens when you operate on an object depends on which name you use for the object. Not to mention that it succeeds after raising an exception.

Ultimately, augmented assignment is *assignment*, just like it says
on the tin. t[1] += x is syntactic sugar for t[1] = t[1].__iadd__(x).
It can't and shouldn't fail to raise an exception if t is a tuple,
because tuple item assignment *must* fail.

That makes sense if we view it strictly as assignment (but in that case the mutation of t[1] should not occur either).
But isn't it equally true if we say that z = t[1], then t[1] += x is syntactic sugar for z = z.__iadd__(x)? Why should that fail, if z can handle it?

[...]

Ultimately, there is no right answer, because the multitude of
requirements are contradictory. No matter what Python did, somebody
would complain.

Not complaining, just trying to contribute to the best of my ability.

John

Rick Johnson · Feb 3, 2012

+1

This is, I think, the ultimate truth of the matter.

People would not complain if they did not care. The only useless
complaint is people complaining about other people complaining. And
the only thing worse than that is rabid fanboi brown-nosing!

OKB (not okblacke) · Feb 3, 2012

Steven said:
Perhaps lists shouldn't define += at all, but then people will
complain that mylist += another_list is slow. Telling them to use
mylist.extend instead just makes them cranky. After all, mylist +
another_list works, so why shouldn't += work?

It would work, it just wouldn't work in-place.

--
--OKB (not okblacke)
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is
no path, and leave a trail."
--author unknown
