copy on write

  • Thread starter Eduardo Suarez-Santana

Eduardo Suarez-Santana

I wonder whether this is normal behaviour.

I would expect equal sign to copy values from right to left. However, it
seems there is a copy-on-write mechanism that is not working.

Can anyone explain and provide a working example?

Thanks,
-Eduardo

$ python
Python 2.7.2 (default, Oct 31 2011, 11:54:55)
[GCC 4.5.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class n:
...     def __init__(self, id, cont):
...         self.id = id;
...         self.cont = cont;
...
>>> r={'a':1};
>>> d={};
>>> d['x']=r;
>>> d['y']=r;
>>> x1 = n('x',d['x']);
>>> y1 = n('y',d['y']);
>>> x1.cont['a']=2;
>>> y1.cont
{'a': 2}
 

Steven D'Aprano

I wonder whether this is normal behaviour.

I would expect equal sign to copy values from right to left.

Assignment in Python never copies values.

However, it seems there is a copy-on-write mechanism that is not working.

There is no copy-on-write.

Assignment in Python is name binding: the name on the left hand side is
bound to the object on the right. An object can have zero, one or many
names. If the object is mutable, changes to the object will be visible
via any name:
>>> x = []  # lists are mutable objects
>>> y = x  # not a copy of x, but x and y point to the same object
>>> x.append(42)  # mutates the object in place
>>> print y
[42]

The same rules apply not just to names, but also to list items and dict
items, as well as attributes, and any other reference:
>>> z = [x, y]  # z is a list containing the same sublist twice
>>> z[0].append(23)
>>> print z
[[42, 23], [42, 23]]
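
Applied to the original question, this means both n instances share one dict. A minimal sketch of one way to get independent contents, assuming a shallow copy is enough (the class name Node is a hypothetical stand-in for n, and the code is written for Python 3):

```python
import copy


class Node:
    """Hypothetical stand-in for the class `n` in the original question."""

    def __init__(self, id, cont):
        self.id = id
        self.cont = cont


r = {'a': 1}
d = {'x': r, 'y': r}   # both keys refer to the same dict object

# copy.copy() builds a new dict holding the same items (a shallow copy).
x1 = Node('x', copy.copy(d['x']))
y1 = Node('y', copy.copy(d['y']))

x1.cont['a'] = 2       # changes only x1's own dict
print(y1.cont)         # {'a': 1}
print(r)               # {'a': 1}
```

If the values were themselves mutable containers, copy.deepcopy() would be the tool to reach for, since a shallow copy still shares the nested objects.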

When you work with floats, ints or strings, you don't notice this because
those types are immutable: you can't modify those objects in place. So
for example:
42
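
A minimal sketch of the immutable case described above, as an illustration rather than the poster's own code (Python 3 syntax): rebinding one name to a new int or str leaves every other name bound to the old object.

```python
a = 42
b = a            # b is bound to the same int object as a
a += 1           # ints cannot change in place, so a is rebound to a new int
print(a)         # 43
print(b)         # 42

s = "spam"
t = s
s += " eggs"     # same story for strings: s now names a new string
print(t)         # spam
```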
 

Chris Angelico

>>> z = [x, y]  # z is a list containing the same sublist twice
>>> z[0].append(23)
>>> print z
[[42, 23], [42, 23]]

When you work with floats, ints or strings, you don't notice this because
those types are immutable: you can't modify those objects in place. So
for example:
42

I was about to say that it's a difference between ".append()" which is
a method on the object, and "+=" which is normally a rebinding, but
unfortunately:
>>> a=[]
>>> b=a
>>> a+=[1]
>>> a
[1]
>>> b
[1]
>>> b+=[2]
>>> a
[1, 2]
>>> a
[1, 2]
>>> a=a+[3]
>>> a
[1, 2, 3]
>>> b
[1, 2]

(tested in Python 3.2 on Windows)

It seems there's a distinct difference between a+=b (in-place
addition/concatenation) and a=a+b (always rebinding), which is sorely
confusing to C programmers. But then, there's a lot about Python
that's sorely confusing to C programmers.

ChrisA
 

Steven D'Aprano

It seems there's a distinct difference between a+=b (in-place
addition/concatenation) and a=a+b (always rebinding),

Actually, both are always rebinding. It just happens that sometimes a+=b
rebinds to the same object that it was originally bound to.

In the case of ints, a+=b creates a new object (a+b) and rebinds a to it.
In the case of lists, a+=b nominally creates a list a+b, but in fact it
implements that as an in-place operation a.extend(b), and then rebinds
the name a to the list already bound to a.

It does that because the Python VM doesn't know at compile time whether
a+=b will be in-place or not, and so it has to do the rebinding in order
to support the fall-back case of a+=b => a=a+b. Or something -- go read
the PEP if you really care :)
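
One way to see the "rebinds to the same object" behaviour is to keep a second reference and compare identities afterwards. This is only a sketch in Python 3 syntax; the names old_a and old_n are illustrative.

```python
# Lists define __iadd__, ints do not.
print(hasattr(list, '__iadd__'))   # True
print(hasattr(int, '__iadd__'))    # False

a = [1, 2]
old_a = a
a += [3]              # extends in place, then rebinds a to that same list
print(a is old_a)     # True
print(old_a)          # [1, 2, 3]

n = 1000
old_n = n
n += 1                # no __iadd__, so this is n = n + 1: a new int object
print(n is old_n)     # False
print(old_n)          # 1000
```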

Normally this is harmless, but there is one interesting little glitch you
can get:
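>>> t = ('a', [23])
>>> t[1] += [42]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t
('a', [23, 42])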
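
The glitch follows from the order of operations in an augmented assignment to a tuple item. A rough sketch of the steps, spelled out by hand (this approximates what the interpreter does and is not its actual implementation; Python 3 syntax):

```python
t = ('a', [23])

# Roughly what `t[1] += [42]` does:
item = t[1]                    # 1. fetch the list stored in the tuple
item = item.__iadd__([42])     # 2. extend it in place; this step succeeds
try:
    t[1] = item                # 3. store the result back; tuples reject this
except TypeError as exc:
    print("TypeError:", exc)

# The list was already mutated in step 2, before step 3 failed.
print(t)                       # ('a', [23, 42])
```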



which is sorely
confusing to C programmers. But then, there's a lot about Python
that's sorely confusing to C programmers.

I prefer to think of it as "there's a lot about C that is sorely
confusing to anyone who isn't a C programmer" <wink>
 

Devin Jeanpierre

It seems there's a distinct difference between a+=b (in-place
addition/concatenation) and a=a+b (always rebinding), which is sorely
confusing to C programmers. But then, there's a lot about Python
that's sorely confusing to C programmers.

I think this is confusing to just about everyone, when they first encounter it.

-- Devin
 

Grant Edwards

I think this is confusing to just about everyone, when they first
encounter it.

That depends on what languages they've used in the past and whether
they skip reading any documentation and just assume that all languages
work the same way.

I would agree that the majority of new users previously used
only languages where an assignment operator does a "copy value", and
that 90+ percent of the time those new users assume all languages
work that way.

I'm not sure what we can do about that -- Python's semantics are well
documented.
 

Devin Jeanpierre

That depends on what languages they've used in the past and whether
they skip reading any documentation and just assume that all languages
work the same way.

I would agree that the majority of new users previously used
only languages where an assignment operator does a "copy value", and
that 90+ percent of the time those new users assume all languages
work that way.

That isn't what I was referring to. Specifically, it confuses almost
everyone the first time they encounter it that "a += b" is not the
same as "a = a + b".

And sure, it's documented. That's a bit of a cop-out though... it
isn't in the tutorial, and even if it were, it's not as if people
remember everything they read. It's not about whether you _can_ know
it as much as whether it is "obvious". There's a bit of a feeling
that code should "do what it looks like" and be sort of understandable
without exactly understanding everything. Maybe this idea is wrong if
taken to an extreme (since it's really impossible to do completely),
but the feeling of it is probably decent. It's why we use "+" for
addition and "-" for subtraction, and not the other way around. You
don't need to know the details of operator overloading and
NotImplemented and so on to get what X + Y means for numbers, or even
for lists.

I feel like "a += b" is sort of implicitly understood by most
programmers to be the same as "a = a + b". If you asked someone what
it meant, their first answer would be "Oh, it means a = a + b"[*].
That is why it's confusing -- even to people that weren't already
exposed to that idea that these are equivalent, they get infected
fast. And then expectations get broken, because they're only *usually*
equivalent.

[*] Before posting this, I actually tried this on a Python IRC channel
-- and it happened exactly as so.

-- Devin
 

Neil Cerutti

That isn't what I was referring to. Specifically, it confuses
almost everyone the first time they encounter it that "a += b"
is not the same as "a = a + b".

If you've ever implemented operator=, operator+, and operator+=
in C++ you'll know how and why they are different. A C++
programmer would be wondering how either can work on immutable
objects, and that's where Python's magical rebinding semantics
come into play.
 

Grant Edwards

If you've ever implemented operator=, operator+, and operator+=
in C++ you'll know how and why they are different.

That assumes that C++ programmers understand C++.

;)
 

Ethan Furman

Steven said:
Normally this is harmless, but there is one interesting little glitch you
can get:
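>>> t = ('a', [23])
>>> t[1] += [42]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t
('a', [23, 42])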


There is one other glitch, and possibly my only complaint:

--> a = [1, 2, 3]
--> b = 'hello, world'
--> a = a + b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "str") to list
--> a += b
--> a
[1, 2, 3, 'h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']

IMO, either both + and += should succeed, or both should fail.
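
The asymmetry can be reproduced in a few lines. For lists, += behaves like extend() and accepts any iterable, while + only accepts another list. A short sketch in Python 3 syntax, with a shorter string than the one above:

```python
a = [1, 2, 3]
b = 'hi'

try:
    a + b                  # list + str is rejected
except TypeError as exc:
    print("TypeError:", exc)

a += b                     # in-place add accepts any iterable, like extend()
print(a)                   # [1, 2, 3, 'h', 'i']

c = [1, 2, 3]
c.extend(b)
print(a == c)              # True
```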

~Ethan~
 

Evan Driscoll

If you've ever implemented operator=, operator+, and operator+=
in C++ you'll know how and why they are different.

At the same time, you'd also know that implementing them in such a
way that 'a += b' does *not* perform the same action as 'a = a + b' is
considered very bad-mannered.

In fact, it's often suggested (e.g. in "More Effective C++"'s Item 22,
though this is not the main thrust of that section) to implement
operator+ in terms of += to ensure that this is the case:
MyType operator+ (MyType left, MyType right) {
    MyType copy = left; copy += right; return copy;
}

A C++ programmer would be wondering how either can work on immutable
objects, and that's where Python's magical rebinding semantics
come into play.

IMO a C++ programmer wouldn't be likely to wonder that much at all
because he or she wouldn't view the objects as immutable to begin with.
:) 'x = 5; x += 1;' makes perfect sense in C++, just for a somewhat
different reason.

Evan
 

Grant Edwards

I understand C++ very well. That's why I use Python or Pike.

(With apologies to Larry Wall)

Were one inclined to troll a bit, one might be tempted to claim that
using C++ is prima facie evidence of not understanding C++.

Not that I would ever claim something inflammatory like that...
 

Neil Cerutti

Were one inclined to troll a bit, one might be tempted to claim
that using C++ is prima facie evidence of not understanding
C++.

Not that I would ever claim something inflammatory like that...

On the Python newsgroup, it's funny. ;)
 

Neil Cerutti

At the same time, you'd also know that implementing them
in such a way that 'a += b' does *not* perform the same action
as 'a = a + b' is considered very bad-mannered.

In fact, it's often suggested (e.g. in "More Effective C++"'s Item 22,
though this is not the main thrust of that section) to implement
operator+ in terms of += to ensure that this is the case:
MyType operator+ (MyType left, MyType right) {
    MyType copy = left; copy += right; return copy;
}

They perform the same action, but their semantics are different.
operator+ will always return a new object, thanks to its
signature, and operator+= shall never do so. That's the main
difference I was getting at.

IMO a C++ programmer wouldn't be likely to wonder that much at
all because he or she wouldn't view the objects as immutable to
begin with. :) 'x = 5; x += 1;' makes perfect sense in C++,
just for a somewhat different reason.

I was thinking of const objects, but you are correct that
immutable isn't really a C++ concept.
 

88888 Dihedral

On Saturday, January 14, 2012 at 2:40:47 AM UTC+8, Ethan Furman wrote:
Steven said:
Normally this is harmless, but there is one interesting little glitch you
can get:
>>> t = ('a', [23])
>>> t[1] += [42]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t
('a', [23, 42])


There is one other glitch, and possibly my only complaint:

--> a = [1, 2, 3]
--> b = 'hello, world'
--> a = a + b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "str") to list
--> a += b
--> a
[1, 2, 3, 'h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']

IMO, either both + and += should succeed, or both should fail.

~Ethan~

The += operator is not only for value types in the above example.

An operator of two operands and an operator of three operands of
general object types are two different operators.
 

Evan Driscoll

They perform the same action, but their semantics are different.
operator+ will always return a new object, thanks to its
signature, and operator+= shall never do so. That's the main
difference I was getting at.

I was talking about the combination of + and =, since the discussion is
about 'a = a + b' vs 'a += b', not 'a + b' vs 'a += b' (where the
differences are obvious).

And I stand by my statement. In 'a = a + b', operator+ obviously returns
a new object, but operator= should then go and assign the result to and
return a reference to 'a', just like how 'a += b' will return a
reference to 'a'.

If you're working in C++ and overload your operators so that 'a += b'
and 'a = a + b' have different observable behaviors (besides perhaps
time), then either your implementation is buggy or your design is very
bad-mannered.
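
For comparison, the same convention can be sketched in Python: define __add__ in terms of __iadd__ so that the two spellings always produce equal values and differ only in whether a new object is created. Bag is a hypothetical class invented for this illustration, in Python 3 syntax.

```python
class Bag:
    """Hypothetical container used only for illustration."""

    def __init__(self, items=()):
        self.items = list(items)

    def __iadd__(self, other):
        # In place: mutate self and return it, so `a += b` keeps the same object.
        self.items.extend(other.items)
        return self

    def __add__(self, other):
        # Built on +=: copy the left operand, then add to the copy in place.
        result = Bag(self.items)
        result += other
        return result


a = Bag([1, 2])
alias = a
a += Bag([3])
print(alias.items)    # [1, 2, 3]      (alias sees the in-place change)

a = a + Bag([4])
print(a.items)        # [1, 2, 3, 4]   (a now names a new Bag)
print(alias.items)    # [1, 2, 3]      (the old object is untouched)
```

The values agree in both spellings. What differs is aliasing, which is the observable difference the thread is arguing about.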

Evan
 

John O'Hagan

Steven said:
Normally this is harmless, but there is one interesting little
glitch you can get:
>>> t = ('a', [23])
>>> t[1] += [42]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> t
('a', [23, 42])

IMHO, this is worthy of bug-hood: shouldn't we be able to conclude from the TypeError that the assignment failed?

There is one other glitch, and possibly my only complaint:

--> a = [1, 2, 3]
--> b = 'hello, world'
--> a = a + b
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only concatenate list (not "str") to list
--> a += b
--> a
[1, 2, 3, 'h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']

IMO, either both + and += should succeed, or both should fail.

~Ethan~


This also happens for tuples, sets, generators and range objects (probably any iterable), AFAIK only when the left operand is a list. Do lists get special treatment in terms of implicitly converting the right-hand operand?

The behaviour of the "in-place" operator could be more consistent across types:
>>> a=[1,2]
>>> a+=(3,4)
>>> a
[1, 2, 3, 4]
>>> a=(1,2)
>>> a+=(3,4)
>>> a
(1, 2, 3, 4)
>>> a=(1,2)
>>> a+=[3,4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "list") to tuple
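
A likely explanation for the difference, sketched in Python 3 syntax: list defines its own in-place add, while tuple does not, so a tuple's += falls back to ordinary + and inherits its stricter type check.

```python
print(hasattr(list, '__iadd__'))    # True
print(hasattr(tuple, '__iadd__'))   # False

a = (1, 2)
old = a
a += (3, 4)           # no __iadd__, so this is a = a + (3, 4): a new tuple
print(a)              # (1, 2, 3, 4)
print(a is old)       # False
print(old)            # (1, 2)

try:
    a += [5, 6]       # falls back to tuple + list, which is rejected
except TypeError as exc:
    print("TypeError:", exc)
```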


John
 

Rick Johnson

There's a bit of a feeling
that code should "do what it looks like" and be sort of understandable
without exactly understanding everything.

Yeah, there's a word for that: INTUITIVE. And I've been preaching its
virtues (sadly in vain it seems!) to these folks for some time now.
 
