confused about Python assignment

Haoyu Zhang

Dear Friends,
Python assignment is a reference assignment. However, I really can't
explain the difference in the following example. When the object is
a list, the assignment seems to be a reference. However, when the
object is a number, the story is so different. I don't understand
for what reason Python deals with them differently. The example is
as follows:
>>> a = [3, 2, 1]
>>> b = a
>>> a, b
([3, 2, 1], [3, 2, 1])
>>> a[1] = 0
>>> a, b
([3, 0, 1], [3, 0, 1])
>>> c = 1
>>> d = c
>>> c, d
(1, 1)
>>> c = 0
>>> c, d
(0, 1)

Thanks a lot.

Best,
Haoyu
 
Emile van Sebille

Haoyu Zhang said:
Dear Friends,
Python assignment is a reference assignment. However, I really can't
explain the difference in the following example. When the object is
a list, the assignment seems to be a reference. However, when the
object is a number, the story is so different. I don't understand
for what reason Python deals with them differently. The example is
as follows:

>>> a = [3, 2, 1]

bind the label 'a' to the mutable list [3,2,1]

>>> b = a

bind 'b' to what 'a' is bound to

>>> a, b
([3, 2, 1], [3, 2, 1])
>>> a[1] = 0

replace index 1 in what 'a' is bound to with '0'

>>> a, b
([3, 0, 1], [3, 0, 1])
>>> c = 1

bind c to the immutable integer 1

>>> d = c

bind 'd' to what 'c' is bound to

>>> c = 0

now bind c to the immutable integer 0


HTH,

Emile van Sebille
(e-mail address removed)
 
Stephen Horne

Dear Friends,
Python assignment is a reference assignment. However, I really can't
explain the difference in the following example. When the object is
a list, the assignment seems to be a reference. However, when the
object is a number, the story is so different. I don't understand
for what reason Python deals with them differently. The example is
as follows:

I've been confused by this in the past, but it turns out that there
isn't much to be confused about once you understand what is happening.

Python assignment binds the right-hand-side value to the
left-hand-side placeholder (variable, slot within a list, or whatever).
It doesn't affect whatever value used to be referenced by that
placeholder, except in terms of Python's internal garbage-collection
housekeeping.

This assignment is *ALWAYS* by reference, including with integers, but
with immutable values such as integers the by-reference thing isn't so
obvious.

It is hard to make this clear with integers (at least for simple
examples, there seems only ever to be one object with the value '1'
for instance) but it can be made obvious with floats, which are also
immutable...
True

At this stage, x and y don't just have equivalent values - they refer
to exactly the same object.
True

At this stage, although the value of y is equal to the value of x,
these two values are held in different objects, at different locations
in memory.

Unlike a list, however, there is no way to modify the internals of a
float in-place - that is the whole point of being immutable - and so
the fact that the assignment does not make copies is unlikely to be
noticed unless you specifically look for it using the 'is' operator.
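
The float case described above can be sketched roughly as follows. This is an illustration, not the session from the original post: the value 1.5 is an arbitrary choice, and float("1.5") is used on the assumption that building the value at run time yields a separate object instead of a shared constant. Whether two equal floats share one object is an implementation detail, so no output is claimed for the last line.

```python
# Illustrative sketch; the value 1.5 and the names x, y, z are assumptions.
x = float("1.5")   # build a float object at run time
y = x              # bind a second name to the same object; no copy is made

print(x is y)      # identity: both names refer to one object
print(x == y)      # equality of values

z = float("1.5")   # build another float with an equal value
print(x == z)      # the values compare equal
print(x is z)      # identity of separately built floats is implementation-defined
```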

There is a significant error in the logic of your example...
>>> a = [3, 2, 1]
>>> b = a
>>> a, b
([3, 2, 1], [3, 2, 1])
>>> a[1] = 0

The line above is an in-place modifier of the object.
>>> a, b
([3, 0, 1], [3, 0, 1])
>>> c = 1
>>> d = c
>>> c, d
(1, 1)
>>> c = 0

The line above is an assignment to the variable, which doesn't care
about the object that variable previously referenced.

In other words, this code is comparing chalk with cheese. The
equivalent code using lists to compare to your integer example would
be...
>>> a = [3, 2, 1]
>>> b = a
>>> a, b
([3, 2, 1], [3, 2, 1])
>>> a = [3, 0, 1]
>>> a, b
([3, 0, 1], [3, 2, 1])

This comparison is fair because the same assignment method (simple
assignment of value to variable) is used for the second assignment to
a, just as is done in your integer code. An in-place modification
cannot be done to an integer because it is immutable, and so its use
in the list case is very much apples compared to oranges.
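
To put the two cases side by side, here is a small sketch that uses lists for both, so that the only difference is in-place modification versus rebinding. The names a2, b2, shared and rebound are made up for this illustration.

```python
# Case 1: in-place modification of the one list that both names refer to.
a = [3, 2, 1]
b = a
a[1] = 0                 # changes the object, not the binding
shared = (a, b)

# Case 2: rebinding, the same kind of operation as c = 0 with integers.
a2 = [3, 2, 1]
b2 = a2
a2 = [3, 0, 1]           # a2 now refers to a brand-new list; b2 is untouched
rebound = (a2, b2)

print(shared)            # ([3, 0, 1], [3, 0, 1])
print(rebound)           # ([3, 0, 1], [3, 2, 1])
print(a is b, a2 is b2)  # True False
```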

I'm not a big fan of this myself, and I've ended up making these same
mistakes a number of times in the past - though I think the point has
now been beaten in well enough for a lifetime. It can be difficult for
those of us who find copy semantics more intuitive. But really, I
suppose this is something most programmers will need to get used to -
it's not just Python that works like this, and the pattern seems to be
catching on as higher level languages get more popular. For example,
C# and Java - both supposed to be higher level replacements for C++ -
have similar semantics for assignments with all class instances and
most non-basic types.
 
Peter Hansen

Stephen said:
[snip]
This assignment is *ALWAYS* by reference, including with integers, but
with immutable values such as integers the by-reference thing isn't so
obvious.

It is hard to make this clear with integers (at least for simple
examples, there seems only ever to be one object with the value '1'
for instance) but it can be made obvious with floats, which are also
immutable...

It's equally "obvious" with integers, if you pick values above 99 or
below -5 (found by experimentation this time, as my memory is always
poor, and using Python 2.3.1).

Try

a = 99
b = 99
a is b (returns True)

a = 100
b = 100
a is b (returns False)

Values between -5 and 99 inclusive are basically pre-created and
"interned" so that you never get multiple integer instances with the
same values in this range. Those outside this range are not treated
specially like this.
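
The exact range is an implementation detail and has changed since 2.3.1 (current CPython releases pre-create -5 through 256), so it is safer to probe the interpreter at hand than to rely on a number. A rough sketch, where is_shared is a hypothetical helper and int(str(n)) is used on the assumption that building the values at run time stops the compiler from merging equal literals:

```python
def is_shared(n):
    """Return True if two ints built at run time with value n are one object."""
    first = int(str(n))
    second = int(str(n))
    return first is second

# Results depend on the interpreter and version, so none are claimed here.
for n in (-6, -5, 0, 99, 100, 256, 257, 10**6):
    print(n, is_shared(n))
```

Either way, 'is' on numbers only tells you about object identity; use == to compare values.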

-Peter
 
Francis Avila

Haoyu Zhang said:
Dear Friends,
Python assignment is a reference assignment.

Not strictly.

In Python, there are names and there are objects.

Rule1: Names refer to objects and ONLY to objects.
Rule2: Names are not objects.

This doesn't mean that names aren't 'things', though.
Making a name refer to an object is called "binding" the name.

(Note that everything that follows may be a simplification or in some way
wildly inaccurate. I'm not an internals hacker, so I don't know much
beyond what I need to use Python itself. I'm only explaining phenomena
here.)

Ramifications of the rules:

When you have an assignment statement like this:

A = B

You are NOT saying, "name A refers to name B", with the implication that
whenever you say "A", Python looks at B, and so on until it resolves to a
"real" object. This breaks Rule1, because you are assuming that Rule2 is
false.

What this assignment says is "A refers to that object, to which name B also
refers." So the names 'A' and 'B' never have any long-standing relationship
to one-another, or any relationship at all.
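
A tiny sketch of that point, with lowercase a and b standing in for the names above: after the assignment, rebinding one name has no effect on the other, because the assignment never linked the two names.

```python
b = [1, 2]
a = b          # "a refers to the object that b refers to", not "a refers to b"
b = [9, 9]     # rebind b to a different object

print(a)       # [1, 2]  (a still refers to the original list)
print(b)       # [9, 9]
```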

So, you're juggling around names, but strictly speaking, you rarely *see*
the objects, except for those built-in types that have literals, like 1, [],
{}, etc. (See the Language Reference.) Even then, there is a certain way in
which you never truly see those objects either. Objects are invisible: you
only know they exist because they act and are acted upon--their behavior
can be observed, *through names that point to them*.

And although names are not objects, names can be *in* objects. (In fact,
it's impossible for there to be a name outside an object.) A name is merely a
thing that refers to an object--nothing prevents it from being contained in
an object. Sometimes these names are exposed--we call them "attributes" of
the object in that case. Sometimes they aren't exposed in any direct way.
I'll fudge a bit and call these "anonymous names".
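
As a rough illustration of names living inside objects, the sketch below stores one list in two list slots and in one attribute. The Dog class and the toys attribute are invented for this example.

```python
class Dog:
    """Hypothetical class used only for this illustration."""
    pass

inner = []
outer = [inner, inner]   # two list slots ("anonymous names") for one object
spot = Dog()
spot.toys = inner        # an attribute is a name stored inside an object

inner.append("ball")     # modify the object through the name 'inner'

print(outer)             # [['ball'], ['ball']]
print(spot.toys)         # ['ball']
print(outer[0] is outer[1] is spot.toys)  # True
```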

>>> a = [3, 2, 1]

Create a 3, 2, and 1 object.
Create a list, whose members contain three "anonymous names" which refer to
3, 2, 1, respectively.
Make name 'a' refer to the list object.

>>> b = a
>>> a, b
([3, 2, 1], [3, 2, 1])

Create a tuple object, with two members, each of which refers to the list
above.
But don't make any name refer to this new tuple object. (This object is now
a candidate for removal by the garbage collector, because nothing refers to
it, and hence it is impossible for a Python program to use it.)

>>> a[1] = 0

This is a slight simplification (the list's own item-assignment method,
__setitem__, does the actual work here), but:
Create a 0 object.
Modify the list object which 'a' represents (and also 'b', if you remember),
making its second anonymous name refer to that 0 object.

>>> a, b
([3, 0, 1], [3, 0, 1])

Create a tuple object, with two anonymous names, which refer to the list
referred to by 'a' and by 'b', respectively.

But 'a' and 'b' refer to the same object. When that object was modified two
commands previous, the assignment may have used the 'a' *name* to refer to
that object, but it was the *object* that was modified.

Thus we see, an object can have several different names point to it.
Suppose someone has a dog. He calls him 'spot'. His neighbor knows that
same dog. He calls him 'that damn dog'. One day the neighbor shoots and
kills the dog. When the owner questions his neighbor about it, and the
neighbor says, 'yeah, I shot that damn dog', will the owner say, 'oh, well,
at least you didn't shoot spot'?

>>> c = 1

Create a 1 object. (As an optimization, Python will sometimes not recreate
immutable objects if one like it already exists. Such objects are said to be
'interned'. But, since such objects are immutable, it makes absolutely no
difference, except to 'is' and id(), which compare object *identities*
(i.e., addresses in memory) and not mere equivalence. You can intern
strings yourself with intern(), which is sys.intern() in Python 3.)
(Again, that is an oversimplification. Interned objects are *pre*-created,
before any names refer to them. They are thus immortal.)
Make name 'c' refer to that 1 object.

>>> d = c

Make name 'd' refer to that 1 object that 'c' refers to.

>>> c, d
(1, 1)

Create a tuple, whose members are anonymous names, each pointing to the 1
object that c and d refer to.

>>> c = 0

Create a 0. (Again, not usually.)
Make name 'c' refer to that 0 object.

>>> c, d
(0, 1)

Make a tuple...well, you know the rest. :)
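
For the interning aside in the walkthrough above, here is a minimal sketch using strings. It assumes Python 3, where the built-in intern() of this thread's era lives on as sys.intern() and accepts only strings; the string contents are arbitrary.

```python
import sys

# Build two equal strings at run time; in typical implementations they
# start out as separate objects, though that is not guaranteed.
s1 = "".join(["python", " ", "names"])
s2 = "".join(["python", " ", "names"])
print(s1 == s2)      # True: the values are equal

i1 = sys.intern(s1)
i2 = sys.intern(s2)
print(i1 is i2)      # True: interning hands back one canonical object
```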


Now, see, that's easy stuff. It's once you realize that functions, classes,
types, *everything*, even *bytecode*, is an object, that things get a bit
hairy.

Here's a common slip-up:

...     return a
...

This creates a function object, which refers to a code object, which the
Python compiler created. Also in the function object is the name 'a', which
refers to a dictionary object in which the singleton None object is the key
for an integer object. (But 'None' is just a name for that None object. In
the Python 2.3 of this thread you can rebind it like any other name; later
versions reject assignment to None.)

Calling g runs the code object (creating a new, derived object, called an
'execution frame', which is code object + state) to which 'g' refers, and
gives back the object that the code object spits out--namely, that to which
'a' refers, which is the dictionary object.

Now, let's try this:

...     a[None] = a[None] + 1
...     return a
...
{None: 3}

The function g *changes* the dictionary to which 'a' refers, because 'a' is
in the *function* object (which persists) and *not* in the execution frame
derived from the function's code object, which is created anew each time the
function is called.

...     a = {None:0}
...     a[None] = a[None] + 1
...     return a
...
{None: 1}

See the difference?
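
The function definitions did not survive in the post as shown here, so the following is only a guess at their shape, not the original code. The names g_shared and g_fresh, the starting value 0, and the three calls each are all assumptions chosen to reproduce the {None: 3} and {None: 1} results.

```python
def g_shared(a={None: 0}):
    """The default dict is created once, when the def statement runs."""
    a[None] = a[None] + 1
    return a

def g_fresh(a=None):
    """A new dict is created inside the body on every call."""
    a = {None: 0}
    a[None] = a[None] + 1
    return a

g_shared()
g_shared()
result_shared = g_shared()     # third call on the same default dict
g_fresh()
g_fresh()
result_fresh = g_fresh()       # third call, but a fresh dict each time

print(result_shared)           # {None: 3}
print(result_fresh)            # {None: 1}
print(g_shared.__defaults__)   # ({None: 3},)
```

The last line shows where the persistent dictionary lives: on the function object itself, not in any one call.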

The same thing with class objects. Attributes defined in the class object,
instead of in code blocks of methods, are shared among all instances of the
class. Further, instances of class objects don't typically call their own
methods, but call the methods of their class. That's what 'self' is.
Python is making "instance.method(arg)" into "classobj.method(instance,
arg)" behind your back. To make an instance is to make a new object filled
with names that all refer to the class, and *then* to call
classobj.__init__(newobject). So, strictly speaking, 'init' isn't a
constructor, because it doesn't *make* the instance, but only changes it,
*from the outside*.
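
A short sketch of those three points (shared class attributes, the method-call rewrite, and creation versus initialisation). The Counter class and its attribute names are invented for the example, and calling __new__ by hand is only done here to make the two steps visible.

```python
class Counter:
    """Hypothetical class for illustration."""
    shared = []                    # defined in the class object: one list for all instances

    def __init__(self, label):
        self.label = label         # set on an instance that already exists

    def add(self, item):
        self.shared.append(item)
        return self.label

x = Counter("x")
y = Counter("y")

x.add(1)                           # instance.method(arg) ...
Counter.add(y, 2)                  # ... behaves like classobj.method(instance, arg)
print(x.shared, y.shared)          # [1, 2] [1, 2]
print(x.shared is Counter.shared)  # True

# Making the instance and initialising it, as two separate steps:
z = Counter.__new__(Counter)       # make the new object
Counter.__init__(z, "z")           # then change it from the outside
print(z.label)                     # z
```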

Now, fill in all my hand-waving and myth by reading the Language Reference,
if you want the whole story. Otherwise, you know everything you need to
know for 90% of anything you'll ever do in Python.
 
