copy on write

Discussion in 'Python' started by Eduardo Suarez-Santana, Jan 13, 2012.

  1. I wonder whether this is normal behaviour.

    I would expect equal sign to copy values from right to left. However, it
    seems there is a copy-on-write mechanism that is not working.

    Can anyone explain and provide a working example?

    Thanks,
    -Eduardo

    $ python
    Python 2.7.2 (default, Oct 31 2011, 11:54:55)
    [GCC 4.5.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class n:
    ...     def __init__(self, id, cont):
    ...         self.id = id
    ...         self.cont = cont
    ...
    >>> r = {'a': 1}
    >>> d = {}
    >>> d['x'] = r
    >>> d['y'] = r
    >>> x1 = n('x', d['x'])
    >>> y1 = n('y', d['y'])
    >>> x1.cont['a'] = 2
    >>> y1.cont
    {'a': 2}
    >>>
    Eduardo Suarez-Santana, Jan 13, 2012
    #1

  2. On Fri, 13 Jan 2012 11:33:24 +0000, Eduardo Suarez-Santana wrote:

    > I wonder whether this is normal behaviour.
    >
    > I would expect equal sign to copy values from right to left.


    Assignment in Python never copies values.

    > However, it
    > seems there is a copy-on-write mechanism that is not working.


    There is no copy-on-write.

    Assignment in Python is name binding: the name on the left hand side is
    bound to the object on the right. An object can have zero, one or many
    names. If the object is mutable, changes to the object will be visible
    via any name:

    >>> x = [] # lists are mutable objects
    >>> y = x # not a copy of x, but x and y point to the same object
    >>> x.append(42) # mutates the object in place
    >>> print y
    [42]

    The same rules apply not just to names, but also to list items and dict
    items, as well as attributes, and any other reference:

    >>> z = [x, y] # z is a list containing the same sublist twice
    >>> z[0].append(23)
    >>> print z
    [[42, 23], [42, 23]]

    When you work with floats, ints or strings, you don't notice this because
    those types are immutable: you can't modify those objects in place. So
    for example:

    >>> a = 42 # binds the name 'a' to the object 42
    >>> b = a # a and b point to the same object
    >>> a += 1 # creates a new object, and binds it to a
    >>> print b # leaving b still pointing to the old object

    42
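
Since the original question asked for a working example, here is a minimal sketch of one way to get independent values: copy explicitly at the point where the object is handed over. The class name `Node` is a stand-in for the original `n`, and the choice between `copy.copy` and `copy.deepcopy` is an assumption about what the poster wanted; a shallow copy is enough for a flat dict like this one.

```python
import copy

class Node:
    """Stand-in for the class n from the original post."""
    def __init__(self, id, cont):
        self.id = id
        self.cont = cont

r = {'a': 1}
d = {'x': r, 'y': r}                    # both values refer to the same dict

shared_x = Node('x', d['x'])
shared_y = Node('y', d['y'])
assert shared_x.cont is shared_y.cont   # one object reached through two references

# Copy explicitly so that each node owns its own dict.
x1 = Node('x', copy.copy(d['x']))       # shallow copy; dict(d['x']) would also work
y1 = Node('y', copy.deepcopy(d['y']))   # deep copy; only matters for nested containers
x1.cont['a'] = 2

print(y1.cont)   # {'a': 1}
print(r)         # {'a': 1}
```

Nothing here is copy-on-write: the copy is made eagerly, once, when the node is constructed.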


    --
    Steven
    Steven D'Aprano, Jan 13, 2012
    #2

  3. On Fri, Jan 13, 2012 at 11:10 PM, Steven D'Aprano wrote:
    >>>> z = [x, y]  # z is a list containing the same sublist twice
    >>>> z[0].append(23)
    >>>> print z

    > [[42, 23], [42, 23]]
    >
    > When you work with floats, ints or strings, you don't notice this because
    > those types are immutable: you can't modify those objects in place. So
    > for example:
    >
    >>>> a = 42  # binds the name 'a' to the object 42
    >>>> b = a  # a and b point to the same object
    >>>> a += 1  # creates a new object, and binds it to a
    >>>> print b  # leaving b still pointing to the old object

    > 42


    I was about to say that it's a difference between ".append()" which is
    a method on the object, and "+=" which is normally a rebinding, but
    unfortunately:

    >>> a=[]
    >>> b=a
    >>> a+=[1]
    >>> a

    [1]
    >>> b

    [1]
    >>> b+=[2]
    >>> a

    [1, 2]
    >>> a

    [1, 2]
    >>> a=a+[3]
    >>> a

    [1, 2, 3]
    >>> b

    [1, 2]

    (tested in Python 3.2 on Windows)

    It seems there's a distinct difference between a+=b (in-place
    addition/concatenation) and a=a+b (always rebinding), which is sorely
    confusing to C programmers. But then, there's a lot about Python
    that's sorely confusing to C programmers.

    ChrisA
    Chris Angelico, Jan 13, 2012
    #3
  4. On Fri, 13 Jan 2012 23:30:56 +1100, Chris Angelico wrote:

    > It seems there's a distinct difference between a+=b (in-place
    > addition/concatenation) and a=a+b (always rebinding),


    Actually, both are always rebinding. It just happens that sometimes a+=b
    rebinds to the same object that it was originally bound to.

    In the case of ints, a+=b creates a new object (a+b) and rebinds a to it.
    In the case of lists, a+=b nominally creates a list a+b, but in fact it
    implements that as an in-place operation a.extend(b), and then rebinds
    the name a to the list already bound to a.

    It does that because the Python VM doesn't know at compile time whether
    a+=b will be in-place or not, and so it has to do the rebinding in order
    to support the fall-back case of a+=b => a=a+b. Or something -- go read
    the PEP if you really care :)
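
The "always rebinds" rule can be illustrated with two small classes. This is only a sketch: `InPlace` and `NoInPlace` are hypothetical names invented for the illustration, and the mechanism shown (try `__iadd__` first, otherwise fall back to `__add__`, then bind the name to whatever came back) is the documented behaviour of augmented assignment.

```python
class InPlace:
    """Hypothetical type that defines __iadd__ and mutates itself."""
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return InPlace(self.items + other.items)

    def __iadd__(self, other):
        self.items.extend(other.items)
        return self              # the name on the left is rebound to this value


class NoInPlace:
    """Hypothetical type with only __add__, so += falls back to a = a + b."""
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return NoInPlace(self.items + other.items)


a = InPlace([1])
a_alias = a
a += InPlace([2])                # rebinds a to the same object it already named
assert a is a_alias
assert a_alias.items == [1, 2]

b = NoInPlace([1])
b_alias = b
b += NoInPlace([2])              # rebinds b to a brand-new object
assert b is not b_alias
assert b_alias.items == [1]
assert b.items == [1, 2]
```

In both cases the name is rebound; the only difference is whether the object returned is the old one or a new one.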

    Normally this is harmless, but there is one interesting little glitch you
    can get:

    >>> t = ('a', [23])
    >>> t[1] += [42]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'tuple' object does not support item assignment
    >>> t

    ('a', [23, 42])
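
Spelling out the steps makes the glitch less mysterious. The expansion below is an approximation of what `t[1] += [42]` does, not the interpreter's literal implementation: the list is extended in place first, and only afterwards does the store back into the tuple fail.

```python
t = ('a', [23])

# Roughly what t[1] += [42] does, step by step:
item = t[1]                    # 1. fetch the list from the tuple
item = item.__iadd__([42])     # 2. extend it in place; returns the same list
try:
    t[1] = item                # 3. store it back: tuples reject item assignment
except TypeError as exc:
    error = str(exc)

print(t)   # ('a', [23, 42])

# Mutating the list directly never attempts step 3, so nothing is raised.
t[1].extend([99])
assert t == ('a', [23, 42, 99])
```

So the TypeError reports that the final assignment failed, which is true; the mutation in step 2 had already happened by then.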




    > which is sorely
    > confusing to C programmers. But then, there's a lot about Python
    > that's sorely confusing to C programmers.


    I prefer to think of it as "there's a lot about C that is sorely
    confusing to anyone who isn't a C programmer" <wink>



    --
    Steven
    Steven D'Aprano, Jan 13, 2012
    #4
  5. On Fri, Jan 13, 2012 at 7:30 AM, Chris Angelico wrote:
    > It seems there's a distinct difference between a+=b (in-place
    > addition/concatenation) and a=a+b (always rebinding), which is sorely
    > confusing to C programmers. But then, there's a lot about Python
    > that's sorely confusing to C programmers.


    I think this is confusing to just about everyone, when they first encounter it.

    -- Devin
    Devin Jeanpierre, Jan 13, 2012
    #5
  6. On 2012-01-13, Devin Jeanpierre wrote:
    > On Fri, Jan 13, 2012 at 7:30 AM, Chris Angelico wrote:
    >> It seems there's a distinct difference between a+=b (in-place
    >> addition/concatenation) and a=a+b (always rebinding), which is sorely
    >> confusing to C programmers. But then, there's a lot about Python
    >> that's sorely confusing to C programmers.

    >
    > I think this is confusing to just about everyone, when they first
    > encounter it.


    That depends on what languages they've used in the past and whether
    they skip reading any documentation and just assume that all languages
    work the same way.

    I would agree that the majority of new users have previously used
    only languages where an assignment operator does a "copy value", and
    that 90+ percent of the time those new users assume all languages
    work that way.

    I'm not sure what we can do about that -- Python's semantics are well
    documented.

    --
    Grant Edwards               grant.b.edwards at gmail.com
    Yow! If our behavior is strict, we do not need fun!
    Grant Edwards, Jan 13, 2012
    #6
  7. On Fri, Jan 13, 2012 at 10:13 AM, Grant Edwards wrote:
    > On 2012-01-13, Devin Jeanpierre wrote:
    >> On Fri, Jan 13, 2012 at 7:30 AM, Chris Angelico wrote:
    >>> It seems there's a distinct difference between a+=b (in-place
    >>> addition/concatenation) and a=a+b (always rebinding), which is sorely
    >>> confusing to C programmers. But then, there's a lot about Python
    >>> that's sorely confusing to C programmers.

    >>
    >> I think this is confusing to just about everyone, when they first
    >> encounter it.

    >
    > That depends on what languages they've used in the past and whether
    > they skip reading any documentation and just assume that all languages
    > work the same way.
    >
    > I would agree that for the majority of new users, they previously used
    > only languages where an assignment operator does a "copy value", and
    > that 90+ percent of the time those new users assume all languages
    > work that way.


    That isn't what I was referring to. Specifically, it confuses almost
    everyone the first time they encounter it that "a += b" is not the
    same as "a = a + b".

    And sure, it's documented. That's a bit of a cop-out though... it
    isn't in the tutorial, and even if it were, it's not as if people
    remember everything they read. It's not about whether you _can_ know
    it as much as whether it is "obvious". There's a bit of a feeling
    that code should "do what it looks like" and be sort of understandable
    without exactly understanding everything. Maybe this idea is wrong if
    taken to an extreme (since it's really impossible to do completely),
    but the feeling of it is probably decent. It's why we use "+" for
    addition and "-" for subtraction, and not the other way around. You
    don't need to know the details of operator overloading and
    NotImplemented and so on to get what X + Y means for numbers, or even
    for lists.

    I feel like "a += b" is sort of implicitly understood by most
    programmers to be the same as "a = a + b". If you asked someone what
    it meant, their first answer would be "Oh, it means a = a + b"[*].
    That is why it's confusing -- even to people that weren't already
    exposed to that idea that these are equivalent, they get infected
    fast. And then expectations get broken, because they're only *usually*
    equivalent.

    [*] Before posting this, I actually tried this on a Python IRC channel
    -- and it happened exactly as so.

    -- Devin
    Devin Jeanpierre, Jan 13, 2012
    #7
  8. On 2012-01-13, Devin Jeanpierre wrote:
    > On Fri, Jan 13, 2012 at 10:13 AM, Grant Edwards wrote:
    >> On 2012-01-13, Devin Jeanpierre wrote:
    >>> On Fri, Jan 13, 2012 at 7:30 AM, Chris Angelico wrote:
    >>>> It seems there's a distinct difference between a+=b (in-place
    >>>> addition/concatenation) and a=a+b (always rebinding), which is sorely
    >>>> confusing to C programmers. But then, there's a lot about Python
    >>>> that's sorely confusing to C programmers.
    >>>
    >>> I think this is confusing to just about everyone, when they first
    >>> encounter it.

    >>
    >> That depends on what languages they've used in the past and whether
    >> they skip reading any documentation and just assume that all languages
    >> work the same way.
    >>
    >> I would agree that for the majority of new users, they previously used
    >> only languages where an assignment operator does a "copy value", and
    >> that 90+ percent of the time those new users assume all languages
    >> work that way.

    >
    > That isn't what I was referring to. Specifically, it confuses
    > almost everyone the first time they encounter it that "a += b"
    > is not the same as "a = a + b".


    If you've ever implemented operator=, operator+, and operator+=
    in C++ you'll know how and why they are different. A C++
    programmer would be wondering how either can work on immutable
    objects, and that's where Python's magical rebinding semantics
    come into play.

    --
    Neil Cerutti
    Neil Cerutti, Jan 13, 2012
    #8
  9. On 2012-01-13, Neil Cerutti wrote:

    > If you've ever implemented operator=, operator+, and operator+=
    > in C++ you'll know how and why they are different.


    That assumes that C++ programmers understand C++.

    ;)

    > A C++ programmer would be wondering how either can work on immutable
    > objects, and that's where Python's magical rebinding semantics come
    > into play.


    --
    Grant Edwards               grant.b.edwards at gmail.com
    Yow! Thousands of days of civilians ... have produced a ... feeling for the aesthetic modules --
    Grant Edwards, Jan 13, 2012
    #9
  10. On Sat, Jan 14, 2012 at 5:15 AM, Grant Edwards wrote:
    > That assumes that C++ programmers understand C++.


    I understand C++ very well. That's why I use Python or Pike.

    (With apologies to Larry Wall)

    ChrisA
    Chris Angelico, Jan 13, 2012
    #10
  11. Steven D'Aprano wrote:
    > Normally this is harmless, but there is one interesting little glitch you
    > can get:
    >
    >>>> t = ('a', [23])
    >>>> t[1] += [42]

    > Traceback (most recent call last):
    > File "<stdin>", line 1, in <module>
    > TypeError: 'tuple' object does not support item assignment
    >>>> t

    > ('a', [23, 42])



    There is one other glitch, and possibly my only complaint:

    --> a = [1, 2, 3]
    --> b = 'hello, world'
    --> a = a + b
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: can only concatenate list (not "str") to list
    --> a += b
    --> a
    [1, 2, 3, 'h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']

    IMO, either both + and += should succeed, or both should fail.
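
The asymmetry comes from the two methods involved: for lists, `+=` behaves like `extend` and accepts any iterable, while `+` insists on another list. A short sketch, using a shorter string than the post above so the result is easy to read:

```python
a = [1, 2, 3]
b = 'hi'

a += b                         # list += behaves like a.extend(b): any iterable works
assert a == [1, 2, 3, 'h', 'i']

try:
    a + b                      # list + requires the right operand to be a list
except TypeError:
    plus_failed = True
else:
    plus_failed = False
assert plus_failed

# Converting explicitly makes + work and keeps the intent visible.
c = [1, 2, 3] + list(b)
assert c == [1, 2, 3, 'h', 'i']
```

Whether the two operators should agree is a design question; the sketch only shows where the difference comes from.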

    ~Ethan~
    Ethan Furman, Jan 13, 2012
    #11
  12. On 01/13/2012 10:54 AM, Neil Cerutti wrote:
    > If you've ever implemented operator=, operator+, and operator+=
    > in C++ you'll know how and why they are different.


    At the same time, you'd also know that implementing them in such a
    way that 'a += b' does *not* perform the same action as 'a = a + b' is
    considered very bad-mannered.

    In fact, it's often suggested (e.g. in "More Effective C++"'s Item 22,
    though this is not the main thrust of that section) to implement
    operator+ in terms of += to ensure that this is the case:
    MyType operator+ (MyType left, MyType right) {
        MyType copy = left; copy += right; return copy;
    }

    > A C++
    > programmer would be wondering how either can work on immutable
    > objects, and that's where Python's magical rebinding semantics
    > come into play.


    IMO a C++ programmer wouldn't be likely to wonder that much at all
    because he or she wouldn't view the objects as immutable to begin with.
    :) 'x = 5; x += 1;' makes perfect sense in C++, just for a somewhat
    different reason.

    Evan
    Evan Driscoll, Jan 13, 2012
    #12
  13. On 2012-01-13, Chris Angelico wrote:
    > On Sat, Jan 14, 2012 at 5:15 AM, Grant Edwards wrote:
    >> That assumes that C++ programmers understand C++.

    >
    > I understand C++ very well. That's why I use Python or Pike.
    >
    > (With apologies to Larry Wall)


    Were one inclined to troll a bit, one might be tempted to claim that
    using C++ is prima facie evidence of not understanding C++.

    Not that I would ever claim something inflammatory like that...

    --
    Grant Edwards               grant.b.edwards at gmail.com
    Yow! Thousands of days of civilians ... have produced a ... feeling for the aesthetic modules --
    Grant Edwards, Jan 13, 2012
    #13
  14. On 2012-01-13, Grant Edwards wrote:
    > On 2012-01-13, Chris Angelico wrote:
    >> On Sat, Jan 14, 2012 at 5:15 AM, Grant Edwards wrote:
    >>> That assumes that C++ programmers understand C++.

    >>
    >> I understand C++ very well. That's why I use Python or Pike.
    >>
    >> (With apologies to Larry Wall)

    >
    > Were one inclined to troll a bit, one might be tempted to claim
    > that using C++ is prima facie evidence of not understanding
    > C++.
    >
    > Not that I would ever claim something inflammatory like that...


    On the Python newsgroup, it's funny. ;)

    --
    Neil Cerutti
    Neil Cerutti, Jan 13, 2012
    #14
  15. On 2012-01-13, Evan Driscoll wrote:
    > On 01/13/2012 10:54 AM, Neil Cerutti wrote:
    >> If you've ever implemented operator=, operator+, and operator+=
    >> in C++ you'll know how and why they are different.

    >
    > At the same time, you'd also know that implementing them
    > in such a way that 'a += b' does *not* perform the same action
    > as 'a = a + b' is considered very bad-mannered.
    >
    > In fact, it's often suggested (e.g. in "More Effective C++"'s Item 22,
    > though this is not the main thrust of that section) to implement
    > operator+ in terms of += to ensure that this is the case:
    > MyType operator+ (MyType left, MyType right) {
    >     MyType copy = left; copy += right; return copy;
    > }


    They perform the same action, but their semantics are different.
    operator+ will always return a new object, thanks to its
    signature, and operator+= shall never do so. That's the main
    difference I was getting at.

    >> A C++ programmer would be wondering how either can work on
    >> immutable objects, and that's where Python's magical rebinding
    >> semantics come into play.

    >
    > IMO a C++ programmer wouldn't be likely to wonder that much at
    > all because he or she wouldn't view the objects as immutable to
    > begin with. :) 'x = 5; x += 1;' makes perfect sense in C++,
    > just for a somewhat different reason.


    I was thinking of const objects, but you are correct that
    immutable isn't really a C++ concept.

    --
    Neil Cerutti
    Neil Cerutti, Jan 13, 2012
    #15
  16. On Saturday, January 14, 2012 at 2:40:47 AM UTC+8, Ethan Furman wrote:
    > Steven D'Aprano wrote:
    > > Normally this is harmless, but there is one interesting little glitch you
    > > can get:
    > >
    > >>>> t = ('a', [23])
    > >>>> t[1] += [42]

    > > Traceback (most recent call last):
    > > File "<stdin>", line 1, in <module>
    > > TypeError: 'tuple' object does not support item assignment
    > >>>> t

    > > ('a', [23, 42])

    >
    >
    > There is one other glitch, and possibly my only complaint:
    >
    > --> a = [1, 2, 3]
    > --> b = 'hello, world'
    > --> a = a + b
    > Traceback (most recent call last):
    > File "<stdin>", line 1, in <module>
    > TypeError: can only concatenate list (not "str") to list
    > --> a += b
    > --> a
    > [1, 2, 3, 'h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
    >
    > IMO, either both + and += should succeed, or both should fail.
    >
    > ~Ethan~


    The += operator is not only for value types in the above example.

    An operator of two operands and an operator of three operands of
    general object types are two different operators.
    88888 Dihedral, Jan 13, 2012
    #16
  18. On 01/13/2012 03:20 PM, Neil Cerutti wrote:
    > They perform the same action, but their semantics are different.
    > operator+ will always return a new object, thanks to its
    > signature, and operator+= shall never do so. That's the main
    > difference I was getting at.


    I was talking about the combination of + and =, since the discussion is
    about 'a = a + b' vs 'a += b', not 'a + b' vs 'a += b' (where the
    differences are obvious).

    And I stand by my statement. In 'a = a + b', operator+ obviously returns
    a new object, but operator= should then go and assign the result to and
    return a reference to 'a', just like how 'a += b' will return a
    reference to 'a'.

    If you're working in C++ and overload your operators so that 'a += b'
    and 'a = a + b' have different observable behaviors (besides perhaps
    time), then either your implementation is buggy or your design is very
    bad-mannered.

    Evan
    Evan Driscoll, Jan 13, 2012
    #18
  19. On Fri, 13 Jan 2012 10:40:47 -0800, Ethan Furman wrote:

    > Steven D'Aprano wrote:
    > > Normally this is harmless, but there is one interesting little
    > > glitch you can get:
    > >
    > >>>> t = ('a', [23])
    > >>>> t[1] += [42]

    > > Traceback (most recent call last):
    > > File "<stdin>", line 1, in <module>
    > > TypeError: 'tuple' object does not support item assignment
    > >>>> t

    > > ('a', [23, 42])


    IMHO, this is worthy of bug-hood: shouldn't we be able to conclude from the TypeError that the assignment failed?

    > There is one other glitch, and possibly my only complaint:
    >
    > --> a = [1, 2, 3]
    > --> b = 'hello, world'
    > --> a = a + b
    > Traceback (most recent call last):
    > File "<stdin>", line 1, in <module>
    > TypeError: can only concatenate list (not "str") to list
    > --> a += b
    > --> a
    > [1, 2, 3, 'h', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd']
    >
    > IMO, either both + and += should succeed, or both should fail.
    >
    > ~Ethan~



    This also happens for tuples, sets, generators and range objects (probably any iterable), AFAIK only when the left operand is a list. Do lists get special treatment in terms of implicitly converting the right-hand operand?

    The behaviour of the "in-place" operator could be more consistent across types:

    >>> a=[1,2]
    >>> a+=(3,4)
    >>> a

    [1, 2, 3, 4]
    >>> a=(1,2)
    >>> a+=(3,4)
    >>> a

    (1, 2, 3, 4)
    >>> a=(1,2)
    >>> a+=[3,4]

    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: can only concatenate tuple (not "list") to tuple
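
One way to see why the types differ is to check which of them define an in-place method at all. The sketch below assumes CPython's built-in types: `list` defines `__iadd__`, whereas `tuple` does not, so `+=` on a tuple falls back to ordinary `+` and inherits its stricter type check.

```python
has_list_iadd = hasattr(list, '__iadd__')      # list has its own in-place add
has_tuple_iadd = hasattr(tuple, '__iadd__')    # tuple does not
assert has_list_iadd
assert not has_tuple_iadd

a = (1, 2)
alias = a
a += (3, 4)                    # no __iadd__, so this is a = a + (3, 4)
assert a == (1, 2, 3, 4)
assert alias == (1, 2)         # the original tuple is untouched
assert a is not alias

try:
    a += [5, 6]                # falls back to tuple + list, which is rejected
except TypeError:
    mixed_failed = True
else:
    mixed_failed = False
assert mixed_failed
```

So lists are not converting the right-hand operand; they simply iterate over it, as `extend` would.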


    John
    John O'Hagan, Feb 2, 2012
    #19
  20. On Jan 13, 10:48 am, Devin Jeanpierre wrote:
    > On Fri, Jan 13, 2012 at 10:13 AM, Grant Edwards wrote:
    > > On 2012-01-13, Devin Jeanpierre wrote:
    > >> On Fri, Jan 13, 2012 at 7:30 AM, Chris Angelico wrote:

    > There's a bit of a feeling
    > that code should "do what it looks like" and be sort of understandable
    > without exactly understanding everything.


    Yeah, there's a word for that: INTUITIVE. And I've been preaching its
    virtues (sadly in vain, it seems!) to these folks for some time now.
    Rick Johnson, Feb 2, 2012
    #20