Explanation of list reference

dave em · Feb 14, 2014

Hello,

Background: My twelve y/o son and I are still working our way through Invent Your Own Computer Games with Python, 2nd Edition.
(Finishing the Khan Academy JavaScript tutorials is the extent of our experience.)

He is asking a question I am having trouble answering, which is how a variable containing a value differs from a variable containing a list or, more specifically, a list reference.

What I tried to explain, as best I can remember, is that a variable is assigned to a specific memory location with a value inside of it. Therefore, the variable is kind of self-contained, and if you change the variable, you change the value in that specific memory location.

However, when a variable contains a list reference, the memory location of the variable points to a separate memory location that stores the list. It is also possible to have multiple variables that point to the memory location of the list reference. And all of those variables can act upon the list reference.

Question: Is my explanation correct? If not, please set me straight.

And does anyone have an easier to digest explanation?

Thanks in advance,
Dave

Jussi Piitulainen · Feb 14, 2014

dave said:
He is asking a question I am having trouble answering which is how a
variable containing a value differs from a variable containing a
list or more specifically a list reference.

My quite serious answer is: not at all. In particular, a list is a
value.

All those pointers to references to locations are implementation
details. The user of the language needs to understand that an object
keeps its identity when it's passed around: passed as an argument,
returned by a function, stored in whatever location, retrieved from
whatever location.
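
A minimal sketch of what "keeps its identity" means in practice. The helper `hand_back` and the names `shopping` and `storage` are made up for illustration; the `is` operator tests whether two expressions refer to the very same object.

```python
def hand_back(obj):
    # Passing and returning does not copy: the same object comes back.
    return obj

shopping = ['eggs', 'milk']
storage = {'list': shopping}      # stored in some location
returned = hand_back(shopping)    # passed as an argument, then returned
retrieved = storage['list']       # retrieved from that location

print(returned is shopping)       # True
print(retrieved is shopping)      # True

# A change made through any of these names is visible through all of
# them, because there is only one list.
retrieved.append('bread')
print(shopping)                   # ['eggs', 'milk', 'bread']
```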

Ryan Gonzalez · Feb 14, 2014

You've got it backwards. In Python, /everything/ is a reference. The
variable is just a "pointer" to the actual value. When you assign to a
variable, you're just changing which object it points to.

Strings, ints, tuples, and floats behave differently because they're
/immutable/. That means that they CANNOT be modified in place. That's why
all of the string methods return a new string. It also means that, when
you pass one to a function, the function cannot change the caller's
value; it behaves as if a /copy/ had been passed, although no copy is
actually made.

So, back to the original subject. Everything is a reference. When you do
this:

x = [1,2,3]
x = [4,5,6]

x now points to a different memory location. And, when you do this:

x[0] = 99000
x[0] = 100

you're just changing the memory location that x[0] points to.
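
A small sketch of the two operations side by side, assuming a second name is kept around so the effect can be observed. The names `old` and `alias` are made up for illustration.

```python
x = [1, 2, 3]
old = x               # a second name for the first list

x = [4, 5, 6]         # rebinding: x now names a brand-new list
print(old)            # [1, 2, 3]  (the first list is untouched)
print(x is old)       # False

alias = x             # a second name for the new list
x[0] = 100            # item assignment: changes the list itself
print(alias)          # [100, 5, 6]  (same list, seen through alias)
print(x is alias)     # True
```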

Ned Batchelder · Feb 14, 2014

Names in Python refer to values. Thinking in terms of memory locations
might just confuse things.

This is my best explanation of the details:
http://nedbatchelder.com/text/names.html

dave em · Feb 14, 2014

Jussi Piitulainen said:
My quite serious answer is: not at all. In particular, a list is a
value.

All those pointers to references to locations are implementation
details. The user of the language needs to understand that an object
keeps its identity when it's passed around: passed as an argument,
returned by a function, stored in whatever location, retrieved from
whatever location.

Jussi,

Thanks for your quick response. I'm still not sure we understand. The code below illustrates the concept we are trying to understand.

Case 1: Example of variable with a specific value from p. 170 of IYOCGWP

>>> spam = 42
>>> cheese = spam
>>> spam = 100
>>> spam
100
>>> cheese
42

Case 2: Example of variable with a list reference from p. 170

>>> spam = [0, 1, 2, 3, 4, 5]
>>> cheese = spam
>>> cheese[1] = 'Hello!'
>>> spam
[0, 'Hello!', 2, 3, 4, 5]
>>> cheese
[0, 'Hello!', 2, 3, 4, 5]

What I am trying to explain is this: why, in case 1, acting on spam (changing the value from 42 to 100) only affects spam and not cheese, while in case 2, acting on cheese also affects spam.

Thanks and v/r,
Dave

Denis McMahon · Feb 14, 2014

dave em writes:

What I am trying to explain is this, why in case 1 when acting on spam
(changing the value from 42 to 100) only affects spam and not cheese.
Meanwhile, in case two acting on cheese also affects spam.

A list is a container for multiple values. When you do:

cheese = spam

you're pointing cheese and spam at the same container. Now anything you
do to the container (whether by referencing it as cheese or spam) will
affect the one container that both names refer to.

If you want cheese and spam to start out as separate copies of the same
list that you can manipulate independently, then you can use:

cheese = [ x for x in spam ]
eggs = spam[:]
ham = list( spam )

>>> spam = [1,2,3,4,5]
>>> cheese = [ x for x in spam ]
>>> ham = list( spam )
>>> eggs = spam[:]
>>> spam
[1, 2, 3, 4, 5]
>>> cheese
[1, 2, 3, 4, 5]
>>> ham
[1, 2, 3, 4, 5]
>>> eggs
[1, 2, 3, 4, 5]
>>> cheese[3] = "fred"
>>> ham[4] = 'ham'
>>> eggs[4] = 'eggs'
>>> spam
[1, 2, 3, 4, 5]
>>> cheese
[1, 2, 3, 'fred', 5]
>>> ham
[1, 2, 3, 4, 'ham']
>>> eggs
[1, 2, 3, 4, 'eggs']
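
One caveat, sketched here as a supplement: the slice, `list()` and comprehension idioms copy only the outer list (a shallow copy), so any lists nested inside are still shared. The standard library's `copy.deepcopy` copies the nested objects as well. The nested data below is made up for illustration.

```python
import copy

spam = [[1, 2], [3, 4]]
eggs = spam[:]                 # shallow copy: new outer list, same inner lists
ham = copy.deepcopy(spam)      # deep copy: the inner lists are copied too

eggs.append([5, 6])            # only changes the outer list of eggs
eggs[0][0] = 'shared'          # changes an inner list that spam also holds
ham[1][1] = 'private'          # ham's inner lists are its own

print(spam)   # [['shared', 2], [3, 4]]
print(eggs)   # [['shared', 2], [3, 4], [5, 6]]
print(ham)    # [[1, 2], [3, 'private']]
```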

Marko Rauhamaa · Feb 14, 2014

dave em said:
What I am trying to explain is this, why in case 1 when acting on spam
(changing the value from 42 to 100) only affects spam and not cheese.
Meanwhile, in case two acting on cheese also affects spam.

A very good question! Elementary and advanced at the same time.

There are two fundamentally different kinds of values in Python: "small"
values and "big" values. A variable can only hold a small value. A list
element can only hold a small value. A dictionary entry can only hold a
small value. The same is true for an object member (aka field).

So we have four kinds of (memory) slots: variables, list elements,
dictionary entries and fields. Any slot can only hold a small value.

The small values include numbers, booleans (True or False) and
references. All other values are big, too big to fit in a slot. They
have to be stored in a "vault" big enough to hold them. This vault is
called the heap. Big values cannot be stored in slots directly; instead,
references to big values are used.

Let me now annotate your excellent example:

>>> spam = 42        # put the small value 42 (number) in a memory slot,
                     # namely a variable named "spam"

>>> cheese = spam    # copy the contents of the variable "spam" into
                     # another memory slot, a variable named "cheese";
                     # now both variables contain the same small value 42

>>> spam = 100       # replace the contents of the variable "spam" with the
                     # small value 100; leave the contents of the variable
                     # "cheese" intact

>>> spam
100                  # as expected
>>> cheese
42                   # ditto

>>> spam = [0, 1, 2, 3, 4, 5]
                     # a list is a "big" value; the statement creates a
                     # list of six slots in the heap and puts a number in
                     # each slot; then, a reference to the list is placed
                     # in the variable "spam"

>>> cheese = spam    # copy the reference to the six-element list from the
                     # variable "spam" into the variable "cheese"; the heap
                     # still contains only one list, and the two variables
                     # refer to the same one

                     # (rationale: big values take time and space to copy
                     # in full, and almost always copying references is
                     # good for the problem at hand; if a full copy is
                     # needed, Python has ways to do that, too)

>>> cheese[1] = 'Hello!'
                     # a character string (text snippet) is a "big" value;
                     # the statement creates the six-character string
                     # 'Hello!' in the heap; then, a reference to the
                     # string is placed in the second element of the list
                     # referred to by the variable "cheese"

                     # (that's a complicated sentence with lots to chew
                     # even though the Python statement looks so innocently
                     # simple)

                     # there still is a single list in the heap; the list
                     # is still referred to by both variables; however the
                     # second slot of the list, which used to hold the
                     # number 1, has been replaced with a reference to the
                     # "big" string 'Hello!'

>>> spam
[0, 'Hello!', 2, 3, 4, 5]
                     # as expected, right?

>>> cheese
[0, 'Hello!', 2, 3, 4, 5]
                     # right?

The final situation is represented by this picture of Python's memory:

   spam        cheese
 +-----+       +-----+
 |  .  |       |  .  |
 +--+--+       +--+--+
    |             |
    |             |                         VARIABLES
= = | = = = = = = | = = = = = = = = = = = = = = = = = = =
    |            /                          THE HEAP
    |   ---------
    |  /
    |  |
    v  v
  +-----+-----+-----+-----+-----+-----+
  |  0  |  .  |  2  |  3  |  4  |  5  |     a list
  +-----+--+--+-----+-----+-----+-----+
           |
           |
           v
       +--------+
       | Hello! |                           a string
       +--------+

Marko

Ian Kelly · Feb 14, 2014

What I am trying to explain is this, why in case 1 when acting on spam (changing the value from 42 to 100) only affects spam and not cheese. Meanwhile, in case two acting on cheese also affects spam.

In the first case, after the assignment "cheese = spam", the names
spam and cheese are bound to the same object (42). If you were to
modify the object 42 (which you cannot do in this case, because ints
are immutable) then you would see the change reflected in the object
regardless of which name you used to access it. You then rebind the
name "spam" to a different object (100), which does not affect the
binding of the name "cheese" at all; the names end up referring to
different objects.

In the second case, after the assignment "cheese = spam", the names
again are bound to the same object, a list. The assignment "cheese[1]
= 'Hello!'" then *modifies* that list, without rebinding cheese.
cheese and spam continue to refer to the same object, and since it was
modified you can see the change in that object regardless of which
name you used to access it.

If in the second case, you were to explicitly copy the list (e.g.
"cheese = list(spam)") prior to modifying it, then the two names would
instead be bound to different objects, and so subsequently modifying
one would not affect the other.

So the short answer is that there is no difference at all between the
way that names are bound to ints and the way they are bound to lists.
There only superficially appears to be a difference because ints are
immutable and lists are not.
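
A short sketch of that superficial difference, using augmented assignment, where the contrast is easy to trip over. The values are made up; the names follow the book's example.

```python
spam = 42
cheese = spam
spam += 1              # ints are immutable: this builds 43 and rebinds spam
print(spam, cheese)    # 43 42

spam = [0, 1, 2]
cheese = spam
spam += [3]            # lists are mutable: this extends the one list in place
print(spam, cheese)    # [0, 1, 2, 3] [0, 1, 2, 3]
print(spam is cheese)  # True

spam = spam + [4]      # plain + builds a new list and rebinds spam
print(spam, cheese)    # [0, 1, 2, 3, 4] [0, 1, 2, 3]
print(spam is cheese)  # False
```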

Ian Kelly · Feb 14, 2014

There are two fundamentally different kinds of values in Python: "small"
values and "big" values. A variable can only hold a small value. A list
element can only hold a small value. A dictionary entry can only hold a
small value. The same is true for an object member (aka field).

So we have four kinds of (memory) slots: variables, list elements,
dictionary entries and fields. Any slot can only hold a small value.

The small values include numbers, booleans (True or False) and
references. All other values are big, too big to fit in a slot. They
have to be stored in a "vault" big enough to hold them. This vault is
called the heap. Big values cannot be stored in slots directly; instead,
references to big values are used.

This is nonsense. Python the language makes no such distinction
between "big" and "small" values. *All* objects in CPython are stored
internally on the heap. Other implementations may use different
memory management schemes.

Jussi Piitulainen · Feb 14, 2014

dave said:
Thanks for your quick response. I'm still not sure we understand.
The code below illustrates the concept we are trying to understand.

Case 1: Example of variable with a specific value from P 170 of IYOCGWP

>>> spam = 42
>>> cheese = spam
>>> spam = 100
>>> spam
100
>>> cheese
42

In case 1, you have only assignments to variables. After spam = 100,
the value of spam is another number. The previous number 42 itself is
still 42 - numbers are not mutable, and no attempt was made to change
the number.

In cheese = spam, cheese is the variable while spam is a variable
reference and stands for 42.

Case 2: Example of variable with a list reference from p 170

>>> spam = [0, 1, 2, 3, 4, 5]
>>> cheese = spam
>>> cheese[1] = 'Hello!'
>>> spam
[0, 'Hello!', 2, 3, 4, 5]
>>> cheese
[0, 'Hello!', 2, 3, 4, 5]

The first two statements in case 2 are assignments to variables, just
like in case 1, but the third statement is different: it doesn't
change the value of the variable (the value is still the same object)
but it does change the value (replaces one element of the list with
another).

You don't need to mention the variable cheese here. Note how the value
of spam is now different (though still the same object), even though
you didn't mention spam at all when you changed it.

Python syntax to replace an element of a list looks much like an
assignment - many languages do this - but it's not. Behind the scenes
it's a method call

cheese.__setitem__(1, 'Hello!')

where cheese is a variable reference. You are calling a method of the
list that is the value of the variable.
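
One way to watch this happen is to subclass `list` and override `__setitem__`, the special method that Python's data model uses for item assignment. `LoudList` is a hypothetical class written only for this illustration.

```python
class LoudList(list):
    """A list subclass (hypothetical) that reports each item assignment."""

    def __setitem__(self, index, value):
        print(f'__setitem__({index!r}, {value!r}) called')
        super().__setitem__(index, value)

spam = LoudList([0, 1, 2, 3, 4, 5])
cheese = spam                    # plain assignment: no method is called

cheese[1] = 'Hello!'             # prints: __setitem__(1, 'Hello!') called
cheese.__setitem__(2, 'again')   # the same operation, spelled as a method call

print(spam)                      # [0, 'Hello!', 'again', 3, 4, 5]
```

Note that `cheese = spam` prints nothing: binding a name is not an operation on the list, so no method of the list is involved.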

What I am trying to explain is this, why in case 1 when acting on
spam (changing the value from 42 to 100) only affects spam and not
cheese. Meanwhile, in case two acting on cheese also affects spam.

Would it help to say that in case 1 the relevant statement acts on the
variable while in case 2 it acts on the value of the variable? This is
accurate, I just don't know if it happens to be the thing that helps.

One last thing: a variable is not an object.

Ned Batchelder · Feb 14, 2014

This is nonsense. Python the language makes no such distinction
between "big" and "small" values. *All* objects in CPython are stored
internally on the heap. Other implementations may use different
memory management schemes.

Marko, I have to agree with Ian. While I really like the picture you
drew, the distinction between big and small values is pure fiction.
CPython does not store ints directly in list elements, for example.

All names are references to values. All list elements (and dictionary
values, dictionary keys, set elements, etc) are references to values. A
value can be another container like a list, or it can be something
"simple" like an int.

This covers all the details, including pictures like Marko's, but with
an explanation why we draw the ints inside the boxes:
http://nedbatchelder.com/text/names.html
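
A brief sketch of list elements and dictionary values behaving as references, just as names do. The names `inner`, `outer` and `settings` are made up for illustration.

```python
inner = [1, 2]
outer = [inner, inner, 42]     # two elements refer to the same inner list

inner.append(3)
print(outer)                   # [[1, 2, 3], [1, 2, 3], 42]
print(outer[0] is outer[1])    # True

settings = {'numbers': inner}  # a dictionary value is a reference as well
settings['numbers'].append(4)
print(outer[0])                # [1, 2, 3, 4]
```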

Marko Rauhamaa · Feb 14, 2014

Ian Kelly said:
This is nonsense. Python the language makes no such distinction
between "big" and "small" values. *All* objects in CPython are stored
internally on the heap. Other implementations may use different memory
management schemes.

You're right, of course. Conceptually, the "everything is a reference"
and the "small"/"big" distinction are equivalent (produce the same
outcomes). The question is, which model is easier for a beginner to
grasp.

Say you write:

1 + 2

You may not find it most intuitive to follow through the object
instantiation and reference manipulation implicit in the "everything is
a reference" model when you think you understand numbers but have little
idea of memory, objects, heap, allocation etc.

Marko

Chris Angelico · Feb 14, 2014

Say you write:

1 + 2

You may not find it most intuitive to follow through the object
instantiation and reference manipulation implicit in the "everything is
a reference" model when you think you understand numbers but have little
idea of memory, objects, heap, allocation etc.

I don't object to a bit of handwaving where it doesn't matter
(especially as regards language design versus language interpreter
design - I'll happily talk about "storing an object on the heap"
without going into the details of allocating memory, managing
reference counts, and so on; the details of how CPython goes about
storing stuff on the heap isn't particularly significant), but be
careful of simplifications that will cause problems down the line.
Distinguishing "small values" from "big values" leads to the obvious
question: Which is which? And why doesn't this work?
True

Seems legit... you set z equal to x, and then z is the same as x.
Okay, let's try that slightly differently.
False

What's different? How come I can do comparisons with 'is' sometimes
but not other times? (And just to make things more confusing, if you
do this in CPython with small numbers, it'll *seem* to work.)

The only way to explain it thoroughly is to fully distinguish between
names and objects, and explain what assignment actually means. Then
it's obvious that, in the first case, the identity check passes, while
in the second case, it doesn't.
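
A sketch of the kind of experiment described here, with made-up values (these are not the original snippets). The results marked "typically" depend on CPython implementation details and are not promised by the language.

```python
x = 1000
same = x                 # another name for the object x refers to
print(same is x)         # True: one object, two names

equal = int('1000')      # an equal number, built separately at run time
print(equal == x)        # True: the values are equal
print(equal is x)        # typically False in CPython: two distinct objects

a = 7
b = int('7')
print(b is a)            # typically True in CPython, which caches small ints
```

The practical rule that falls out of this: use `==` to compare values, and keep `is` for questions about object identity, such as `x is None`.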

ChrisA

Marko Rauhamaa · Feb 14, 2014

Marko Rauhamaa said:
You're right, of course. Conceptually, the "everything is a reference"
and the "small"/"big" distinction are equivalent (produce the same
outcomes). The question is, which model is easier for a beginner to
grasp.

In fact, if you adjust my annotations to the given example from the
"everything is a reference" point of view, you'll see that the
explanations become a good deal longer and probably more confusing.
Also, the picture becomes much messier with more strings attached.

Marko

Terry Reedy · Feb 14, 2014

He is asking a question I am having trouble answering which is how a
variable containing a value differs from a variable containing a list
or more specifically a list reference.

I tried the to explain as best I can remember is that a variable is
assigned to a specific memory location with a value inside of it.

The data model of C and other memory-oriented languages is *different*
from the object model of Python and similar languages. The concept of
'memory address' is fundamental to C and absent from Python. To
understand Python, one must think in terms of its referenced object
model and not C's memory-address model.

Consider name-binding statements, also called assignment statements.
The simplest form is
<name> = <expression>

As always, the expression evaluates to a Python object. It may be an
existing object or it may be a newly created one. Either way, the name is
associated with the object. Subsequent to the name binding, the name, as
an expression, evaluates to the object it points to.

An object can have 0 to many names associated with it. That set of names
can change. A name can be associated with only one object at a time, but
can be rebound to other objects.

All objects have a class (type) and a value. The value of some objects can
be changed (lists, sets, dicts, most user-defined classes), others are
fixed (numbers, strings, tuples, frozensets).

a = 1 # a points to int object with value 1; a == 1
b = a # b points to the same int object; b == 1 and b is a
a = '1' # a now points to a str object with value '1'; a == '1'

c = [1,2,3] # c points to a list object whose value is the
# given sequence of int objects; c == [1,2,3]
d = c # d is the same list object as c
d[1] = '2' # This associates the indicated 'slot' in the
# collection object d (which is also c) with a '2' string object.
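
A quick sketch of the difference between objects whose value can change and objects whose value is fixed. All names and values are made up for illustration.

```python
text = 'spam'
shout = text.upper()       # str methods return a new object
print(text, shout)         # spam SPAM

point = (1, 2)
try:
    point[0] = 99          # tuples do not support item assignment
except TypeError as error:
    print('TypeError:', error)

values = [1, 2]
values[0] = 99             # lists do
print(values)              # [99, 2]

frozen = frozenset({1, 2})
# A frozenset has no add method at all; a regular set does.
print(hasattr(frozen, 'add'), hasattr(set(), 'add'))   # False True
```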

'Variable' in C (a fixed memory location with a mutable value) is quite
different from 'variable' in Python. Most people in Python mean a fixed
name, potentially associated with a sequence of objects. In CPython,
each object would usually be at a different C location. 'Variable' in
math has multiple meanings.

Marko Rauhamaa · Feb 14, 2014

Chris Angelico said:
be careful of simplifications that will cause problems down the line.

Sure. Let it be said, though, that sometimes you learn through
inaccuracies, a technique used intentionally by Knuth's TeXbook, for
example. In fact, you get through high school mathematics successfully
without knowing what numbers and variables actually are.

Distinguishing "small values" from "big values" leads to the obvious
question: Which is which? And why doesn't this work?

This is related to the recent id(string) question on this forum.

Unfortunately neither the "everything is a reference" model nor the
"small/big" model help you predict the value of an "is" operator in the
ambiguous cases.

Back to the original question, though. Python, I think, is a great
introductory programming language to a complete newbie. Explaining
Python's memory model at some level is necessary right off the bat.
However, it is far from easy to understand. I'm not sure the small/big
way is the best approach, but it seeks to bridge the gap from the naive
understanding of tutorial day one to the presented question (tutorial
day two).

Marko

pecore · Feb 14, 2014

dave em said:
He is asking a question I am having trouble answering which is how a
variable containing a value differs from a variable containing a
list or more specifically a list reference.

s/list/mutable object/

# SecretAgent is a minimal stand-in (assumed here) so the example runs
class SecretAgent:
    def __init__(self):
        self.location = 'London'

    def move_to(self, place):
        self.location = place

# Mr Bond and Mr Tont are two different ob^H^H persons
james_bond = SecretAgent()
james_tont = SecretAgent()

# in some circles, Mr Bond is known as agent 007
agent_007 = james_bond

# Mr Bond, aka 007, is sent to the Caribbean to crush Spectre
agent_007.move_to('Barbados')
print(agent_007.location)     # Barbados
print(james_bond.location)    # Barbados

# Mr Bond, alas, retires and his place in the Mi5 is taken, alas, by Mr Tont
agent_007 = james_tont

# Mr Tont, aka 007, is sent to Hong Kong to, to, whatever...
agent_007.move_to('Hong Kong')
print(agent_007.location)     # Hong Kong
print(james_bond.location)    # Barbados

Ned Batchelder · Feb 15, 2014

Sure. Let it be said, though, that sometimes you learn through
inaccuracies, a technique used intentionally by Knuth's TeXbook, for
example. In fact, you get through high school mathematics successfully
without knowing what numbers and variables actually are.

Yes, sometimes for teaching reasons, you have to over-simplify or even
introduce artificial constructs. I'd recommend acknowledging them as such.

When you say, "There are two fundamentally different kinds of values in
Python," or "So we have four kinds of (memory) slots," you aren't
letting on that this is a teaching construct. It sounds like you mean
that this is how Python actually works.

I'd use words like, "This is an oversimplification, but might help...",
or "You can think of it like ...".

dave em · Feb 15, 2014

All,

Thanks for the excellent explanations and for sharing your knowledge. I definitely have a better understanding than I did this morning.

Best regards,
Dave

Chris Angelico · Feb 15, 2014

Unfortunately neither the "everything is a reference" model nor the
"small/big" model help you predict the value of an "is" operator in the
ambiguous cases.

Can you give an example of an ambiguous case? Fundamentally, the 'is'
operator tells you whether its two operands are exactly the same
object, nothing more and nothing less, so I assume your "ambiguous
cases" are ones where it's possible for two things to be either the
same object or two indistinguishable ones.

The only situation I can think of is that immutables are allowed to be
interned, which is why this comes up True (in CPython) when it would
come up False with larger values (as I demonstrated earlier):
True

ChrisA
