Finding the instance reference of an object

Steve Holden · Nov 7, 2008

Steven said:

On Thu, 06 Nov 2008 09:59:37 -0700, Joe Strout wrote: [...]
And by definition, "call by value" means that the parameter is a copy.
So if you pass a ten megabyte data structure to a function using
call-by-value semantics, the entire ten megabyte structure is copied.

Since this does not happen in Python, Python is not a call-by-value
language. End of story.

Without knowing that, you don't know what assignments to the formal
parameter will do, or even what sort of arguments are valid. Answer:
it's a copy of it.

Lies, all lies. Python doesn't copy variables unless you explicitly ask
for a copy. That some implementations of Python choose to copy pointers
rather than move around arbitrarily large blocks of memory instead is
an implementation detail. It's an optimization and irrelevant to the
semantics of argument passing in Python.

[...]

Are you sure you meant to write this?

I'm not sure I understand what you're getting at. The only thing I can
imagine is that you think there's some sort of contradiction between my
two statements. I don't see why you think so. In principle one could
create an implementation of Python with no stack at all, where everything
lives in the heap, and all argument passing is via moving the actual
objects ("large blocks of memory") rather than references to the objects.

I don't mean to imply that this is a practical way to go about
implementing Python, particularly given current computer hardware where
registers are expensive and the call stack is ubiquitous. But there are
other computing models, such as the "register machine" model which uses a
hypothetically-infinite number of registers, no stack and no heap. With
sufficient effort, one could implement a Python compiler using only a
Turing Machine. Turing Machines don't have call-by-anything semantics.
What would that say about Python?

Alternatively, you're commenting about my comment about copying pointers.
I thought I had sufficiently distinguished between what takes place at
the level of the Python VM (nothing is copied without an explicit request
to copy it) and what takes place in the VM's implementation (the
implementation is free to copy whatever pointers it wants as an
optimization). If I failed to make that clear, I hope I have done so now.

Pointers do not exist at the level of the Python VM, but they clearly
exist in the C implementation. We agree that C is call-by-value. CPython
is (obviously) built on top of C's call-by-value semantics, by passing
pointers (or references if you prefer) to objects. But that doesn't imply
that Python's calling semantics are the same as C's.

Algol is call-by-name. Does anyone doubt that we could develop an Algol
implementation of Python that used nothing but call-by-name semantics at
the implementation level? Would that mean that Python was call-by-name?

The point is that Joe's argument is based on a confusion between the C
implementation (calling a function with argument x makes a copy of a
pointer to x and puts it into the stack for the function) and what
happens at the level of the Python language.

I am probably egregiously misunderstanding. The practical difficulty
with the "moving huge blocks of data" approach would appear to emerge
when a function that gets passed an instance of some container then
calls another function that references the same container as a global,
for example.

Again, I don't think we disagree about how Python actually works, but it
seemed to me that your statement obscured rather than helped.

regards
Steve

Steven D'Aprano · Nov 7, 2008

I think we're saying the same thing. What's a name? It's a string of
characters used to refer to something. That which refers to something
is a reference.

In some sense, I have to agree with that. "Reference" as a plain English
word is very abstract.

The thing it refers to is a referent. I was trying to
avoid saying "the value of an object reference is a reference to an
object" since that seems tautological and you don't like my use of the
word "value," but I see you don't like "contains" either.

I'm happier with the idea that a name in Python "refers to" an object
than your claim that "variables" contain a reference to an object. Let me
explain:

In languages such as C and Pascal, a variable is a named memory location
with an implied size. For the sake of the argument, let's assume
variables are all two bytes in size, i.e. they can hold a single short
integer. So, if the name 'x' refers to location 0x23A782, and the two
bytes at that location are 0x0001, then we can legitimately say that the
location 0x23A782 (otherwise known as 'x') _contains_ 1 because the byte
pattern representing 1 is at that memory location.

But in Python, what you've been calling "variables" is explicitly a
*mapping* between a name and a value. Unlike variables above, the
compiler can't map a name to a memory location. At run time, the VM has
to search a namespace for the name. If you disassemble Python byte-code,
you will see things like:

LOAD_NAME 1 (x)

If you want to talk about something containing the value, that something
would be the namespace, not the name: the name is the key in the hash
table, and is a separate piece of data to the value. The key and the
value are at different locations, you can't meaningfully say that the
value is contained by the key, for the same reason that given a list

[10, 11, 12, 13, 14, 15]

you wouldn't say that the int 12 was contained by the number 2.
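
To make the "namespace as a mapping" point concrete, here is a minimal sketch in Python 3 syntax (an assumption; this thread predates it). The dict called `namespace` is a hypothetical stand-in for a module namespace, and `exec` is used only so that the mapping is visible as an ordinary dict.

```python
# A namespace behaves like a dict: names are keys, objects are the values
# stored under those keys. `namespace` is a hypothetical name for this sketch.
namespace = {}

# Run an assignment statement with `namespace` as its global namespace.
exec("x = [10, 11, 12]", namespace)

# The name 'x' is now a key in the mapping (exec also adds '__builtins__').
user_names = sorted(k for k in namespace if not k.startswith("__"))

# A second assignment binds another key to the very same object.
exec("y = x", namespace)
same_object = namespace["y"] is namespace["x"]
```

The key `'x'` and the list it maps to are separate pieces of data, which is why "the name contains the value" is an awkward way to put it.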

Maybe we can try something even wordier: a variable in Python is, by
some means we're not specifying, associated with an object. (This is
what I mean when I say it "refers" to the object.) Can we agree that
far?

So far.

Now, when you pass a variable into a method, the formal parameter gets
associated with the same object the actual parameter was associated with.

Agreed.

I like to say that the object reference gets copied into the formal
parameter, since that's a nice, simple, clear, and standard way of
describing it. I think you object to this way of saying it. But are we
at least in agreement that this is what happens?

At the implementation level of CPython, yes. In abstract, no. In
abstract, we can't make *any* claims about what happens beyond the Python
code, because we don't know how the VM is implemented. Perhaps it is a
giant pattern in Conway's cellular automaton "Life", which is Turing
complete.

In practice, any reasonable implementation of Python on existing computer
hardware is going to more-or-less do what the CPython implementation
does. But just because every existing implementation does something
doesn't mean it's not an implementation detail.

But putting that aside, consider the Python code "x = 1". Which
statement would you agree with?

(A) The value of x is 1.

Only speaking loosely (which we can get away with because numbers are
immutable types, as pointed out in the last section of [1]).

Why "speaking loosely"?

At the level of Python code, the object you have access to is nothing
more or less than 1. There is a concrete representation of the abstract
Platonic number ONE, and that concrete representation is written as 1.

The fact that the object 1 is immutable rather than mutable is
irrelevant. After x = [] the value of x is the empty list.

As far as I am concerned, this is one place where the plain English
definition of the word "value" is the only meaningful definition: "what
is denoted by a symbol". Or if you prefer, "what the symbol represents".
At the language level, x=1 means that x represents the object 1, nothing
more and nothing less, regardless of how the mechanics of that
representation are implemented.

The value of a variable is whatever thing you assign to that variable. If
that thing is the int 1, then the value is the int 1. If the thing is a
list, the value is that list. If the thing is a pointer, the value is a
pointer. (Python doesn't give you access to pointers, but other languages
do.) Whatever mechanism is used to implement that occurs at a deeper
level. In Pascal and C, bytes are copied into memory locations (which is
actually implemented by flipping bits at one location to match the state
of bits at another location). In CPython and Java, pointers or references
are created and pointed at complex data structures called objects. That's
an implementation detail, just like flipping bits is an implementation
detail.

If you don't agree with me on this, I'm afraid that your understanding of
value is so different from mine that I fear we will never find any common
ground. I'm afraid that in this context I consider any other definition
of "value" to be obtuse and obfuscatory and out-and-out harmful.

Hmm, this might be true to somebody working at the implementation level,

Okay, we agree on that.

but I think we're all agreed that that's not the level of this
discussion. What's relevant here is how the language actually behaves,
as observable by tests written in that language.

As far as I can see, that implementation level _is_ the level you are
talking at. You keep arguing that the value of x is a reference to the
object, a reference which is implementation specific and determined at
runtime.

Correct again.

Well. I'm not sure what else I can say to that other than, "Why on earth
would you prefer a pointless, obtuse and obfuscatory claim over one which
is equally true but far more useful and simple?"

My answer is:

(C) The value of x is a reference to an immutable object with the value
of 1. (That's too wordy for casual conversation so we might casually
reduce this to (A), as long as we all understand that (A) is not
actually true. It's a harmless fiction as long as the object is
immutable; it becomes important when we're dealing with mutable
objects.)

But it doesn't matter. And that's important. I've seen this before, in
other people. They get hung up about the difference between mutable and
immutable objects and start assuming a difference in Python's behaviour
that simply isn't there. When assigning to a name, Python makes no
distinction between mutable and immutable objects:

dis.dis( compile('x=set([1]); y=frozenset([1])', '', 'exec') )

  1           0 LOAD_NAME                0 (set)
              3 LOAD_CONST               0 (1)
              6 BUILD_LIST               1
              9 CALL_FUNCTION            1
             12 STORE_NAME               1 (x)
             15 LOAD_NAME                2 (frozenset)
             18 LOAD_CONST               0 (1)
             21 BUILD_LIST               1
             24 CALL_FUNCTION            1
             27 STORE_NAME               3 (y)
             30 LOAD_CONST               1 (None)
             33 RETURN_VALUE
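
The listing above comes from a 2008-era CPython, and the exact opcodes and offsets differ between versions. As a hedged sketch for a current Python 3, the same comparison can be made without relying on offsets by collecting only the store instructions. The helper name `store_ops` is hypothetical.

```python
import dis

def store_ops(source):
    """Return the names of the STORE_* opcodes in module-level source."""
    code = compile(source, "<demo>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith("STORE_")]

# Binding a name to a mutable set and to an immutable frozenset
# compiles to the same kind of store instruction.
mutable_ops = store_ops("x = set([1])")
immutable_ops = store_ops("y = frozenset([1])")
```

On CPython both lists should come out identical, which is the point being made: name binding does not care whether the object is mutable.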

This is explicitly stated in
the Python docs [1], yet many here seem to want to deny it.

[1] http://www.python.org/doc/2.5.2/ext/refcounts.html

You have a mysterious and strange meaning of the word "explicitly".
Would you care to quote what you imagine is this explicit claim?

A few samples: "The chosen method is called reference counting. The
principle is simple: every object contains a counter, which is
incremented when a reference to the object is stored somewhere, and
which is decremented when a reference to it is deleted. When the counter
reaches zero, the last reference to the object has been deleted and the
object is freed. ...Python uses the traditional reference counting
implementation..."

Implementation details again. It says nothing about what is visible at
the level of Python code.

This seems like a point we really shouldn't need to argue. Do you
really want to defend the claim that Python does not use references?

Python does not use references. Python uses names and objects. The
CPython implementation implements such names and objects using references
(pointers). Other implementations are free to make other choices at the
implementation level.

At the Python level, the programmer has access to objects:

(1, 3, [], None)

is a tuple (an object) consisting of four objects 1, 3, an empty list and
None. There's no capacity to request a reference to an object: if there
was, it would be like Pascal's var parameters.

At the implementation level, the above tuple is implemented (in part) by
four pointers. But that's invisible at the Python level. It doesn't exist
at the Python level: you can't access those pointers, you can't do
anything with them, except indirectly by manipulating names and objects.

Here's an analogy: at the Python level we say that strings are immutable:
they can't be changed. But at the implementation level that's clearly
nonsense: strings are merely bytes no different from any other bytes, and
they are as mutable as any others. But from Python code, the programmer
has no way to mutate a string. Such behaviour isn't part of Python. We
can rightly say that Python has no mutable strings, even though at the
implementation level strings are mutable.
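
A small sketch of what "no way to mutate a string from Python code" looks like in practice (Python 3 syntax assumed; the variable names are illustrative only):

```python
s = "spam"

# Item assignment on a str is rejected at the language level.
try:
    s[0] = "S"
    mutated = True
except TypeError:
    mutated = False

# Methods that look like they change a string build a new object instead.
t = s.upper()
```

Whatever the bytes underneath are capable of, the language offers no operation that changes `s` in place.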

Whew! That's a relief. A week ago (or more?), it certainly sounded
like some here were claiming that Python is c-b-r (usually followed by
some extended hemming and hawing and except-for-ing to explain why you
couldn't do the above).

Yes. Such confusion is very common, because people discover that Python
isn't call-by-value since mutations to arguments in a function are
visible outside of the function, and assume that therefore Python must be
call-by-reference.

The canonical test for "call by value" semantics is if you can write a
function like this:

x = [1] # an object that supports mutation
mutate(x)
assert x == [1]

If mutations to an argument in a function are *not* reflected in the
caller's scope, then your language may be call-by-value. But if
mutations are visible to the caller, then your language is definitely
not c-b-v.
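
For what it's worth, here is how that test plays out in Python itself. This is a minimal sketch in Python 3 syntax; the body of `mutate` is an assumption, since the test above leaves it unspecified.

```python
def mutate(seq):
    # Mutate the object that was passed in; no rebinding of the parameter.
    seq.append(2)

x = [1]          # an object that supports mutation
mutate(x)

# In Python the mutation is visible to the caller, so `x == [1]` no longer
# holds and the language fails the call-by-value test described above.
visible_to_caller = (x == [1, 2])
```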

Aha. So, in your view, neither C, nor C++, nor Java, nor VB.NET are
c-b-v, since all of those support passing an object reference into a
function, and using that reference to mutate the object.

Take C out of that list. C is explicitly call-by-value, since it always
does copying of values. If you want to avoid copying the value, you have
to write your function to accept a pointer to the value you care about
and then dereference the pointer inside the function.

As for C++, Java and VB.NET, I would argue that using the term
call-by-value for them is misleading. I'm not the only person who
believes so:

"As in Java, the calling semantics are call-by-sharing: the formal
argument variable and the actual argument share the same value, at least
until the argument variable is assigned to. Assignments to the argument
variable do not affect the value passed; however, if the value passed was
an array, assignments to elements of that array will be visible from the
calling context as well, since it shares the same array object."

http://www.cs.cornell.edu/courses/cs412/2001sp/iota/iota.html

Trust me, I didn't write that.

Your view is at odds with the standard definition, though; in fact I'm
pretty sure we could dig up C and Java specs that explicitly spell out
their c-b-v semantics, and RB and VB.NET pretty clearly mean "ByVal" to
indicate by-value in those languages.

I accept that C (like Pascal) is c-b-v. Given the following C code:

int count;
int x[1000];
for( count = 0; count < 1000; count++ )
x[count] = count;

the value of the variable 'x' is an array of ints 0,1,2,...999. When you
call a function with argument x, the entire array is copied.
Call-by-value: the variable's value is copied into the function's scope.

For Java to be considered c-b-v, we have to agree that the value of x
following that assignment is not the array of ints that the source code
suggests it is, but some arbitrary pointer to that array. If we agree on
that definition, we can agree that Java is c-b-v, but I maintain it is a
foolish definition.

I can't imagine what reason people had for tossing out the commonsense
meaning of the word "value". What benefit does it give? For those whose
first language was Pascal or Fortran or C and had only heard of two
calling conventions it avoids the need to learn the name for a third
convention, but the cost is that c-b-v no longer has a single meaning. It
now has at least two meanings: C call-by-value is different from Java
call-by-value, because when you call a C function with an array the
entire array is copied, and when you call a Java function with an array
the entire array is *not* copied. Different program behaviour with the
same name.

That's bad enough when it happens with an element of language syntax but
it is unforgivable when it happens to something which is supposed to be
generic to all languages.

Not only do you lose the regular dictionary meaning of the word value,
but you introduce a second meaning to call-by-value. When you say
"Language Foo is call-by-value", you have no way of telling what
listeners will understand by that. If they come from a Pascal or C
background, they will understand one thing ("values are copied"), and if
they come from a Java or VB.NET background they will understand something
very different ("sometimes values are copied, and sometimes pointers to
the value are copied").

The canonical test of c-b-v is whether a *reassignment* of the formal
parameter is visible to the caller.

That's just the same test for c-b-r: it's a variation of swap(x, y) but
using only one variable. That's equivalent to assuming that c-b-r and
c-b-v are a dichotomy: if a language isn't one, it must be the other. Wrong,
false, harmful!
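
To illustrate why the reassignment test only rules call-by-reference in or out, here is a hedged sketch in Python 3 syntax (the function name `reassign` is hypothetical):

```python
def reassign(param):
    # Rebinding the local name does not touch the caller's name.
    param = ["replaced"]
    return param

x = ["original"]
result = reassign(x)

# The caller still sees the original object, so Python is not
# call-by-reference. That alone says nothing about whether a copy was made.
caller_unchanged = (x == ["original"])
```

Passing this test shows the language is not c-b-r; it takes the separate mutation test to show that it is not c-b-v either.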

Simply using the parameter for
something (such as dereferencing it to find and change data that lives
on the heap) doesn't prove anything at all about how the parameter was
passed.

But it does: if a mutation to the argument is visible to the caller, then
you know you haven't mutated a copy. If no copy was made, then it isn't
call-by-value.

That can be true only if at least one of the following is true:

1. Python's semantics are different from C/C++ (restricted to pointers),
Java, and RB/VB.NET; or

Why would you restrict C/C++ to pointers? Are you now going to argue that
these languages have different calling conventions depending on the type
of the argument?

In C, the calling convention is precisely the same whether the argument
is a pointer or an int. In both cases, the value is copied.

Python's semantics are different from C since values are not copied when
you pass them to a function, but (as far as I know) the same as that of
the others.

2. C/C++ (restricted to pointers), Java, and RB/VB.NET are not
call-by-value.

Take C out of that list and I will agree.

I asked you before which of these you believed to be the case, so we
could focus on that, but I must have missed your reply. Can you please
clarify?

From your above c-b-v test, I guess you would argue point 2, that
none of those languages are c-b-v. If so, then we can proceed to
examine that in more detail. Is that right?

Sure.

That would indeed be nonsense. But it's also not what I'm saying. See
[2] again for a detailed discussion and examples. Call-by-value and
call-by-reference are quite distinct.

And also a false dichotomy.

I've never claimed these are the only options; just that they're the
only ones actually used in any of the languages under discussion. If
you think Python uses call by name, call by need, call by macro
expansion,

No, none of those.

or something else at [2], please do say which one. "Call by
object", as far as I can tell, is just a made-up term for call-by-value
when the value is an object reference.

And you think that "call-by-reference" or "call-by-value" is anything
other than "made-up"? What, you think that these terms have always
existed, as far back as language? They're made-up terms too. I would say
that the value is the object, full stop. The reference is an
implementation detail.

As we've repeatedly said, "call-by-sharing" has a good pedigree: it goes
back at least to Barbara Liskov and CLU in 1974, in the dawn of
object-oriented programming.

http://en.wikipedia.org/wiki/Barbara_Liskov

The real question isn't why Python should use the term "call-by-sharing",
but why the Java and VB people didn't use it.

(And I'm reasonably OK with that
as long as we're all agreed that that is what it means.)

False dichotomy again. The correct answer is, "Neither. It _is_ the
actual parameter." We can prove this by using Python's scoping rules:

>>> def func(y):
...     print x is y
...
>>> x = ["something arbitrary"]
>>> func(x)
True

In case you think this is an artifact of mutable objects:
True

The value of x and the value of y are the same object. But the names are
different. (They would be different even if the function parameter was
called 'x', because it is in a different namespace.) Nothing we do to the
name y can affect the name x. But things that we do to the object bound
to y *can* affect the object bound to x, because they are the same object.
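
The same identity check can be sketched in Python 3 syntax, using `id()` so that the comparison does not depend on a global lookup. The function name `received_id` and the choice of a frozenset as the immutable example are assumptions made for this illustration.

```python
def received_id(y):
    # Report the identity of whatever object the parameter is bound to.
    return id(y)

# A mutable argument: the function receives the caller's object itself.
x = ["something arbitrary"]
mutable_is_same = (received_id(x) == id(x))

# An immutable argument behaves exactly the same way.
z = frozenset([1])
immutable_is_same = (received_id(z) == id(z))
```

In both cases the parameter and the caller's name are bound to one and the same object for the duration of the call.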

Right. And if (as is more sensible) you pass a reference to a ten MB
data structure to a function using call-by-value, then the reference is
copied.

Sure. I agree. But in this case, the reference is the value. In Pascal,
you would do something like this:

x := enormous_array(); {returns a 10MB array}
y := func(^x);

if you couldn't or didn't write func to use a var parameter.

So your claim is that any language that includes references (which is
all OOP languages, as far as I'm aware), is not call-by-value?

No, I'm not making that claim. Calling conventions and the existence of
references in the implementation of an OO language are orthogonal: one
does not imply anything about the other.

Consider a hypothetical OO language where variables are memory locations.
x=Foo would set the value of the variable x to some object Foo. As an
implementation detail, this might be implemented just as you say: the
variable (memory location) x contains a reference to the object Foo, just
like Java, except without primitive types.

Now we call a function func(x). What happens next?

Well, knowing that the language is OO doesn't tell us *anything* about
what happens next. We can probably make an educated guess that, if the
language is running on a von Neumann machine (a safe bet in the real
world!), there's probably a stack involved, and the implementation will
probably work by pushing references on the stack. What else can we
predict? Nothing. Here are a couple of alternative calling behaviours:

- The implementation makes a copy of the object Foo, creates a new
reference to it, and assigns that reference to the formal parameter of
the function. I would call that "call-by-value". What would you call it?

- The implementation makes a copy of the reference to object Foo, and
assigns it to the formal parameter of the function. Following Barbara
Liskov, I would call that "call-by-sharing". Following the effbot, I
would also accept "call-by-object".

- The implementation makes a reference to the location of x, rather than
a reference to the object, and passes that to the function. Reassignments
to the formal parameter are reflected in the caller's scope. That would
be call-by-reference.

Any of these would be reasonable choices for a designer to make, although
the first one has all the disadvantages of call-by-value in languages
like C and Pascal. Nevertheless, if you want that behaviour in your OOP
language, you can have it.
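
The three behaviours above can be imitated from within Python, purely as an illustration. This sketch uses Python 3 syntax; only the second case is what Python itself does. The first is simulated with an explicit `copy.deepcopy`, and the third with a dict standing in for the caller's scope. All names here are hypothetical.

```python
import copy

def callee_mutate(param):
    param.append("changed")

def callee_rebind(scope, name):
    # Receives the *location* of the caller's variable, not the object.
    scope[name] = ["rebound"]

# 1. Copy the object before the call: the caller never sees the mutation.
by_value = ["foo"]
callee_mutate(copy.deepcopy(by_value))

# 2. Call-by-sharing (Python's actual behaviour): mutation is visible.
shared = ["foo"]
callee_mutate(shared)

# 3. Simulated call-by-reference: the callee rebinds the caller's name.
caller_scope = {"x": ["foo"]}
callee_rebind(caller_scope, "x")
```

After running it, `by_value` is unchanged, `shared` has grown by one element, and `caller_scope["x"]` refers to a different list altogether.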

Hmm, I'm struggling to understand why you would say this. Perhaps you
mean that Python doesn't copy *objects* unless you explicitly ask for a
copy. That's certainly true.

We can agree on this.

But it does copy references in many circumstances, including in
assignment statements, and parameter passing.

At the implementation level, not at the Python level.

I agree that under the hood, there are probably other ways to get the
same behavior. What's important is to know whether the formal parameter
is an alias of the actual parameter, or its own independent local
variable that (let me try to say it more like your way here) happens to
be initially associated with the same referent as the actual parameter.
This obviously has behavioral consequences, as you showed above.

A concise way to describe the behavior of Python and other languages is
to simply say: the object reference is copied into the formal parameter.

Hence the alias for "call-by-sharing", "call-by-object-reference".
Personally I find that term too long and unwieldy and prefer call-by
either sharing or object.

Well I can trivially refute that by counterexample: I come from a
background in C, Pascal, and FORTRAN, and I agree with it.

Well there you go. Serves me right for making a sweeping generalization.

As for where I get my definitions from, I draw from several sources:

1. Dead-tree textbooks
2. Wikipedia [2] (and yes, I know that has to be taken with a grain of
salt, but it's so darned convenient)
3. My wife, who is a computer science professor and does compiler
research
4. http://javadude.com/articles/passbyvalue.htm (a brief but excellent
article)
5. Observations of the "ByVal" (default) mode in RB and VB.NET
6. My own experience implementing the RB compiler (not that
implementation details matter, but it forced me to think very carefully
about references and parameter passing for a very long time)

Which makes you so close to the trees that you can't see the forest.
You're not thinking at the level of Python code. You're not even thinking at
the level of the Python virtual machine. You're thinking about what
happens to make the Python virtual machine work.

Not that I'm trying to argue from authority; I'm trying to argue from
logic. I suspect, though, that your last comment gets to the crux of
the matter, and reinforces my guess above: you don't think c-b-v means
what most people think it means.

No, I make no claim about who is in a majority. What I argue is that the
Java and VB communities have taken a term with an established meaning,
and are using it for something which is at best pedantically true at a
deeper implementation level. I think they are foolish to have done so,
and their actions have led to imprecision in language: "call-by-value" is
no longer a single strategy with a single observable behaviour, but now
refers to at least two incompatible behaviours: Pascal/C style, and
Java/VB style. I don't accept that alternative meaning, because I insist that
at the level of Python code, after x=1 the value of x is 1 and not an
arbitrary, artificial, implementation-dependent reference to 1.

Steven D'Aprano · Nov 7, 2008

I am probably egregiously misunderstanding. The practical difficulty
with the "moving huge blocks of data" approach would appear to emerge
when a function that gets passed an instance of some container then
calls another function that references the same container as a global,
for example.

I have no doubt whatsoever that such an implementation would be fragile,
complicated, convoluted and slow. In other words, it would be terrible.
But that's merely a Quality of Implementation issue. It would still be
Python.

Steve Holden · Nov 7, 2008

Steven said:
I have no doubt whatsoever that such an implementation would be fragile,
complicated, convoluted and slow. In other words, it would be terrible.
But that's merely a Quality of Implementation issue. It would still be
Python.

OK, more specifically: I don't see how changes to the copy of the
(structure referenced by) the argument would be reflected in the global
structure. In other words, it seems to involve a change of semantics to
me, so I think I am misunderstanding you.

regards
Steve

Steven D'Aprano · Nov 7, 2008

I think of it this way: every variable is an object reference; no
special syntax needed for it because that's the only type of variable
there is. (Just as with Java or .NET, when dealing with any class type;
Python is just a little more extreme in that even simple things like
numbers are wrapped in objects.)

Note: I tried to say "name" above instead of "variable" but I couldn't
bring myself to do it -- "name" seems too generic to do that job. Lots
of things have names that are not variables: modules have names, classes
have names, methods have names, and so do variables.

But modules, classes and methods are also objects, and they can be bound
to names.

What's the name of the module after this piece of code?

import math
foo = math
del math

Unfortunately, the term "name" is *slightly* ambiguous in Python. There
are names, and then there are objects which have a name attribute, which
holds a string. This attribute is usually called __name__ but sometimes
it's called other things, like func_name.

The __name__ attribute of objects is an arbitrary label that the object
uses for display purposes. But names are the entities that Python code
usually uses to refer to objects.
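
Running the `import math` snippet above answers the question. A minimal sketch in Python 3 syntax (the names `label` and `math_still_bound` are illustrative only):

```python
import math

foo = math      # bind a second name to the same module object
del math        # remove the first name; the module object survives

# The object still carries its own label...
label = foo.__name__

# ...but the name 'math' no longer refers to anything in this namespace.
try:
    math
    math_still_bound = True
except NameError:
    math_still_bound = False
```

So the module's `__name__` is still `'math'`, while the only name bound to it is `foo`.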

[...]

what
argument is there that the Python community should use its own unique
terminology for concepts that apply equally well to other languages?
Wouldn't communication be easier and smoother if we adopted standard
terms for standard behavior?

You're assuming that call-by-sharing isn't a standard term. That's not
true. It's a standard term that is sadly not well known, I believe
because of the ignorance of most coders to the history of their own
discipline. (I include myself in that.)

You're also assuming that the use of "call-by-value" to refer to very
different behaviours in C and Java somehow makes communication easier and
smoother. I don't think it does.

[...]

In a language that supports
integers and doubles as simple types, stored directly in a variable,
then it is an obvious generalization that in the case of an object
type, the value is a reference to an object.

How is it a generalization?

Using Python syntax instead of Java, but let's pretend that Python ints
are primitives just like in Java, but floats are not. I do this to avoid
any confusion over mutable/immutable, or container types. Nice simple
values, except one is an object and one is a primitive type.

x = 1 # the value of x is 1
x = 1.0 # the value of x is 0x34a5f0

Where is the generalization that you speak of? In a generalization, you
extend the same rule to cover slightly different circumstances. But
that's not what happens here. In the first place, you start with the rule
that the value of x is the thing you assign to x. In the second place,
you change the rule to say that the value of x is an arbitrary
implementation-dependent pointer to the thing you assign to x. That's not
a generalization, it's an anti-generalization.

But Python doesn't have those simple types, so there is a temptation
to try to skip this generalization and say that references are not
values, but rather the values are the objects themselves (despite the
dereferencing step that is still required to get any data out of
them).

What dereferencing step? There's no such thing in Python code. You just
use the name, as normal.

Oh sure, at the deep implementation level there's a dereferencing step,
but that applies for primitive types in C too. When you refer to a
variable x in C, the CPU has to look into a memory location to find out
what the value of x is. That applies whether x is an int on the stack or
an object in the heap. But it's not something that the C programmer has to
worry about, it's an implementation detail handled by the compiler and
the runtime environment. It's not part of your C program.

In the same way the dereferencing of Python names is not part of your
Python program. It's an implementation detail.

However, it seems to me that when you start denying that the value of
an object reference is a reference to an object, this is when you get
led into a quagmire of contradictions.

Let's hear some of these contradictions.

Personally, they would have to be pretty big to make me give up saying
that the value of x after x=1 is 1.

[...]

But continuing to attempt to gloss over that fact, when you come to
parameter passing, you're then stuck trying to avoid describing it as
call by value, since if you claim that what a variable contains is the
object itself, then that doesn't fit (since clearly the object itself
is not copied).

Your reasoning is backwards. Call-by-value by definition implies that the
value is copied. If the value isn't copied, it can't be call-by-value.
You shouldn't change the definition of "value" in order to hammer the
square peg of your language into the round hole of "call-by-value",
especially since there has been a perfectly fine alternative name for the
behaviour since at least 1974.

What the Java and VB.NET communities have essentially done is redefine
"horse" to mean "internal combustion engine" simply to avoid accepting
that there are such things as horseless carriages.

You also have to describe the assignment operator as
different from all other languages, since clearly that's not copying
the object either.

A slight pedantic note: Python doesn't have an assignment operator, it is
a statement. You can't override assignment in Python.

*If* somebody reliably told me that assignment in "all other
languages" (what, Forth, Lisp, Brainf*ck, Intercal, Algol-60, Ruby, *all*
of them???) meant copying, then I'd quite happily accept that Python
assignment is different, since it clearly doesn't copy the value.
Assignment in Python operates on *names*, not objects. "x=y" means
"change the name 'x' so that it refers to the current value of 'y'", not
copy the value of y to x.
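
A minimal sketch of that reading of "x=y", using a list as a stand-in object (the variable names are only illustrative):

```python
y = [1, 2, 3]
x = y                     # bind the name x to the object y refers to
same_object = x is y      # one object, two names: nothing was copied

x = [1, 2, 3]             # rebind x to a new, equal list
still_same = x is y       # x now names a different object
y_unchanged = (y == [1, 2, 3])   # y was never touched by rebinding x
```

After the first assignment `same_object` is true; after the second, `still_same` is false and `y` still holds its original contents.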

So, while I'm trying this path on for size (and will continue to mull
it over further), please try on this approach: boldly admit that
they're [variables] references, and embrace that fact.

Why on earth would I want to give up saying that the value of x after x=1
is 1 just to satisfy people who can't cope with the existence of
horseless carriages?

Perhaps we could
round up 20 newbies, divide them into two groups of 10, give each one
a 1-page explanation either based on passing object references
by-value, or passing values sort-of-kind-of-by-reference, and then check
their comprehension by predicting the output of some code snippets.
That'd be very interesting. It's hard for me to believe that the
glossing-over-references approach really is easier for anybody, but
maybe I'm wrong.

Well Joe, you've seen for yourself at least one person in this thread
read YOUR explanation of Python's behaviour and conclude from that that
Python is call-by-reference.

Then there's these:

"I was under the assumption that everything in python was a reference.
So if I code this: ... I though the contents of lst would be modified."

http://mail.python.org/pipermail/python-list/2006-January/360222.html

"Python passes references to objects by value (like Java), and everything
in Python is an object. This sounds simple, but then you will notice that
some data types seem to exhibit pass-by-value characteristics, while
others seem to act like pass-by-reference... what's the deal?"

http://www.goldb.org/goldblog/CommentView,guid,4eb92070-c279-44b3-ac2a-5d1c4f3e8115.aspx

(The deal is, p-b-v and p-b-r are not the only two choices.)

And here:

"Python passes all arguments using 'pass by reference'."

http://www.penzilla.net/tutorials/python/functions/

And from the one thread:

"Python uses bog standard call-by-reference" -- Nick Maclaren,
University of Cambridge Computing Service:
http://mail.python.org/pipermail/python-list/2000-April/030077.html

"I would describe Python parameter passing as call-by-value, with the
wrinkle that the value being passed is always a reference to an object."
-- Greg Ewing, Computer Science Dept, University of Canterbury (NZ)
http://mail.python.org/pipermail/python-list/2000-April/030315.html

Steven D'Aprano · Nov 7, 2008

OK, more specifically: I don't see how changes to the copy of the
(structure referenced by) the argument would be reflected in the global
structure. In other words, it seems to involve a change of semantics to
me, so I think I am misunderstanding you.

What, you want me to come up with an implementation? Okay, fine.

The VM keeps a list of the namespaces that contain such "cloned" objects.
After each Python statement is executed, before the next one gets to run,
the VM runs through that list and synchronizes each object in the
caller's scope with the value of its clone in the function scope.

Marc 'BlackJack' Rintsch · Nov 7, 2008

OK, more specifically: I don't see how changes to the copy of the
(structure referenced by) the argument would be reflected in the global
structure. In other words, it seems to involve a change of semantics to
me, so I think I am misunderstanding you.

Of course every copy has a list of references to other copies and time
stamps associated with every attribute. So when you access an attribute
the implementation first chases all the links to the copies, and checks
if the attribute has a newer time stamp there, so the value has to be
copied back. =)

More serious: If you start to design distributed object oriented systems
with call-by-sharing semantics you may end up with actually copying the
value and propagating changes or checking for them in other copies.

Ciao,
Marc 'BlackJack' Rintsch

Joe Strout · Nov 7, 2008

But modules, classes and methods are also objects, and they can be
bound to names.

OK, that's a good point. It strikes me as a generally Bad Idea to
actually take advantage of that (i.e., to reassign to a class or
module name), but I guess Python allows it.

Unfortunately, the term "name" is *slightly* ambiguous in Python. There
are names, and then there are objects which have a name attribute, which
holds a string. This attribute is usually called __name__ but sometimes
it's called other things, like func_name.

The __name__ attribute of objects is an arbitrary label that the object
uses for display purposes. But names are the entities that Python code
usually uses to refer to objects.

Right. This may lead to the confusion that started this thread, i.e.
someone asking how to find the "name" of an arbitrary object (by which
they meant the entry in some namespace that refers to it, what I like
to call a variable name). If you think of objects as "having" names
(which as you point out is sort of true in some cases but not in
general), rather than names as referring to objects, then you get into
this trouble.
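
To make the distinction concrete, here is a rough sketch; `names_for` is a hypothetical helper written for this post, not anything from the standard library. It searches a namespace for every name bound to a given object, which is about as close as you can get to "the name of an object":

```python
def names_for(obj, namespace):
    """Return the names in `namespace` bound to `obj` (by identity, not equality)."""
    return sorted(name for name, value in namespace.items() if value is obj)


def greet():
    return "hello"


salute = greet  # a second name for the same function object

ns = {"greet": greet, "salute": salute, "other": [1, 2]}
found = names_for(greet, ns)   # both names refer to the one function
label = salute.__name__        # the label attribute is still 'greet'
nameless = names_for([1, 2], ns)   # an equal but distinct list matches nothing
```

The object carries at most one label in `__name__`, while any number of names (including none) may refer to it.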

You're also assuming that the use of "call-by-value" to refer to very
different behaviours in C and Java somehow makes communication easier and
smoother. I don't think it does.

OK, so now we're getting down to it: you think that Python's behavior
is like Java, but Java's behavior is different from C, right?

What would it take to convince you that Java and C have exactly the
same semantics, and differ only in syntax? Would equivalent code
snippets that do the same thing do the job?

How is it a generalization?

Because you start with, say, integers, and make such observations as:

1. x = y copies the integer value from y into x.
2. foo(x), where foo's parameter is by-value, copies x into the formal
parameter.
3. foo(x), where foo's parameter is by-reference (e.g. using & in C++,
or using ByRef in RB/VB.NET), makes the formal parameter an alias of x.

Then you look at a declaration of (speaking loosely) object type, such
as Java's

SomeClass x;

or C++'s

SomeClassPtr x; // where typedef SomeClass* SomeClassPtr;

Then you ask, how do the above situations 1-3 apply to this? Well,
they apply just fine, except that what is being copied or aliased is
the reference to an object rather than an integer.

That's the obvious generalization I was thinking of.

Using Python syntax instead of Java, but let's pretend that Python ints
are primitives just like in Java, but floats are not. I do this to avoid
any confusion over mutable/immutable, or container types. Nice simple
values, except one is an object and one is a primitive type.

x = 1 # the value of x is 1
x = 1.0 # the value of x is 0x34a5f0

Strictly true, but irrelevant if float objects are immutable.
Immutable objects can be treated as values; it's only by mutating an
object that you can tell that you're dealing with references to shared
data.
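
A small sketch of that point, assuming ordinary Python floats and lists (names are illustrative). The same augmented assignment rebinds the name for an immutable float but mutates the shared object for a list:

```python
a = 1.5
b = a
b += 1.0        # floats are immutable: b is rebound to a new float object
# a is untouched, so sharing was never observable

m = [1]
n = m
n += [2]        # lists are mutable: the one shared list is extended in place
# m sees the change, which reveals that m and n refer to the same object
shared = n is m
```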

What dereferencing step? There's no such thing in Python code. You just
use the name, as normal.

If you just use the name, then you're accessing the reference. To get
to the actual data, you have to dereference it with ".". (Granted,
this dereferencing is often done within operator methods so that it's
mostly hidden from you in many cases, especially when dealing with
number- and string-like objects.)
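
Roughly what I mean by "hidden in operator methods", as a sketch with a made-up `Money` class (not from any library):

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # The "." lookups on both operands happen here, out of the caller's sight.
        return Money(self.cents + other.cents)


a = Money(150)
b = Money(275)
total = a + b                   # shorthand for Money.__add__(a, b)
explicit = Money.__add__(a, b)  # the same call, spelled out
builtin_sum = (1).__add__(2)    # built-in ints go through the same protocol
```

The caller writes only names and an operator; the attribute access is done inside `__add__`.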

Oh sure, at the deep implementation level there's a dereferencing step,
but that applies for primitive types in C too. When you refer to a
variable x in C, the CPU has to look into a memory location to find out
what the value of x is.

That's not what I'm talking about. I'm talking about "person.age" in
Python, Java, or .NET, or "person->age" in C/C++.

Personally, they would have to be pretty big to make me give up saying
that the value of x after x=1 is 1.

I've stated repeatedly that this is fine shorthand. You only get into
trouble when, instead of assigning 1, you are assigning (a reference
to) some mutable object. Then you have to think about whether you are
copying the data or just copying a reference to it.

Your reasoning is backwards. Call-by-value by definition implies that the
value is copied. If the value isn't copied, it can't be call-by-value.

Quite right. We just disagree on what "value" means. I think this is
because Python is so restricted: everything is an object reference,
and these are always passed by value. So you try to gloss over the
details which may be more obvious in languages that have other data
types and evaluation strategies.

I'd be fine with that if it actually simplified things for Python
newbies, but I haven't seen that it does. Pretending that Python
variables actually contain their objects then requires you to launch
into long explanations of such things as why assignment doesn't make a
copy of the data, whether objects have names, and so on.

You shouldn't change the definition of "value" in order to hammer the
square peg of your language into the round hole of "call-by-value"

I'm not changing the definition of "value," I'm merely being precise
about it. You want to be loose about it -- and keep trying to support
that practice by citing examples where such looseness works fine --
but when dealing with mutable types, it is NOT fine, and leads people
into trouble. If the "value" of x is a person named Sam of species
Hobbit, then

y = x
y.species = 'Elf'

would not change Sam into an elf. But in Python, it does. At this
point, you will launch into your explanation of how, not only is
Python's calling convention different from other languages, but its
assignment operator is quite different too, so the above doesn't do
what you would expect it to do when dealing with object values.

But all that extra explanation becomes unnecessary if you just admit
that x and y are mere references, and assignment statements (or
parameter calls) copy the reference, not the object.
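
Here is the Sam example made runnable, as a sketch: the `Person` class is invented for illustration, and `copy.copy` is shown only to contrast what an actual copy of the object would do.

```python
import copy


class Person:
    def __init__(self, name, species):
        self.name = name
        self.species = species


x = Person("Sam", "Hobbit")
y = x                    # copies the reference; there is still one Person
y.species = "Elf"
sam_species = x.species  # 'Elf': the change made through y shows up through x

z = copy.copy(x)         # an explicit copy makes a second Person
z.species = "Dwarf"
sam_after_copy = x.species   # still 'Elf': z is a separate object
```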

especially since there has been a perfectly fine alternative name for the
behaviour since at least 1974.

Where, in the LISP community? Why can't I find this venerated name in
any of my CS references?

What the Java and VB.NET communities have essentially done is redefine
"horse" to mean "internal combustion engine" simply to avoid accepting
that there are such things as horseless carriages.

No, they've realized that no new term is needed. "int foo;" declares
an integer. "Person foo;" declares a person reference. That there is
no explicit syntax needed to make this a reference (unlike C++) is
mere streamlining of the syntax, since it was realized that you ALWAYS
want to handle objects via references. And, once you have such
references, you can copy them or alias them, just like any other
type. Nothing new here.

*If* somebody reliably told me that assignment in "all other
languages" (what, Forth, Lisp, Brainf*ck, Intercal, Algol-60, Ruby, *all*
of them???) meant copying, then I'd quite happily accept that Python
assignment is different, since it clearly doesn't copy the value.

Well, I can't vouch for all of them. But I can vouch for quite a few.

But here you go again: you're forced to claim that Python's parameter
passing is different, AND its assignment is different, with the net
result that the behavior is *exactly the same* as Java, .NET, and so
on. Doesn't that strike you as odd? Why is it so different in so
many ways, that just happen to cancel out and result in
indistinguishable semantics?

Well Joe, you've seen for yourself at least one person in this thread
read YOUR explanation of Python's behaviour and conclude from that that
Python is call-by-reference.

Who was that?

Then there's these:

"I was under the assumption that everything in python was a reference.
So if I code this: ... I though the contents of lst would be modified."

http://mail.python.org/pipermail/python-list/2006-January/360222.html

That guy's confused all right, but he'll do no better with your
unusual definition of what "assignment" means. Either way, we have to
explain that "i = 4" makes i refer to something new, and does not
affect whatever it referred to before.
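
The code in the linked post is elided above, so the loop below is only my guess at the kind of thing that trips people up. It is a sketch, not the original poster's code:

```python
lst = [1, 2, 3]

for i in lst:
    i = 4                      # rebinds the loop name i; the list is untouched
after_rebinding = list(lst)    # snapshot of the unchanged contents

for index in range(len(lst)):
    lst[index] = 4             # item assignment changes the list object itself
after_item_assignment = list(lst)
```

The first loop leaves `lst` as `[1, 2, 3]`; only the second, which assigns into the list rather than to a name, produces `[4, 4, 4]`.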

"Python passes references to objects by value (like Java), and
everything in Python is an object. This sounds simple, but then you will
notice that some data types seem to exhibit pass-by-value characteristics,
while others seem to act like pass-by-reference... what's the deal?"

http://www.goldb.org/goldblog/CommentView,guid,4eb92070-c279-44b3-ac2a-5d1c4f3e8115.aspx

This guy is right. He's just pointing out that mutating an object
tells you NOTHING about how the reference to it was passed. That's a
red herring that I believe you have brought up a few times too.

And here:

"Python passes all arguments using 'pass by reference'."
http://www.penzilla.net/tutorials/python/functions/

Well this guy's just wrong, as can be easily demonstrated. And I know
you don't think that, but I do think he may have gotten that
impression from listening to your explanation -- it's certainly what I
thought you thought for a while. (But I no longer think that, so I
guess that's another sign of progress!)

"Python uses bog standard call-by-reference" -- Nick Maclaren,
University of Cambridge Computing Service:
http://mail.python.org/pipermail/python-list/2000-April/030077.html

That guy's sure confused, isn't he?

"I would describe Python parameter passing as call-by-value, with the
wrinkle that the value being passed is always a reference to an
object."
-- Greg Ewing, Computer Science Dept, University of Canterbury (NZ)
http://mail.python.org/pipermail/python-list/2000-April/030315.html

And this guy's spot on (and his comments apply to every other modern
OOP language, too, except for those like VB.NET which have an option
to pass parameters, including references, by reference if you really
want to).

But if nothing else, you've shown that there is a lot of confusion on
this point, and it'd be great if we could come to some consensus and
promote that. I'm sure it doesn't help that we're calling it
different things. (And, IMHO, it also doesn't help if we call it
something different from what the exact same behavior is called in
other languages.)

Best,
- Joe

Terry Reedy · Nov 7, 2008

Joe said:
On Nov 6, 2008, at 10:35 PM, Steve Holden wrote:

Note: I tried to say "name" above instead of "variable" but I couldn't
bring myself to do it -- "name" seems too generic to do that job.

Python has two types of names. Some complex objects -- modules,
classes, and functions, and wrappers and subclasses thereof, have
'definition names' that are used instead of a 'value' to print a
representation. Otherwise, names are identifiers, the term used in the
grammar.

But I agree even two meanings is one too many. Maybe 'label' would be a
better short form for 'identifier'. 'Labeling' might be clearer for
many beginners than 'name-binding'.

Lots of things have names that are not variables: modules have names,
classes have names, methods have names, and so do variables. If I say
"name," an astute listener would reasonably say "name of what"

Common nouns, when instantiated, become the name of whatever particular
object they are associated with.

-- and I don't
want to have to say "name of some thing in a name space which can be
flexibly associated with an object" when the simple term "variable"
seems to work as well.

'Variable' has many more meanings than 'name' or 'label' and overall
seems to be more confusing, not less. I say this as an empirical
observation of c.l.p postings. You yourself switched back and forth
between two different meanings.

I'm with you there. To me, the consistent simplicity is exactly this:
all variables are object references, and these are always passed by value.

To me, this implies that the corresponding parameter should become a
reference to that reference value. In any case, the last 10 years of
this newsgroup show that describing Python calling as 'by value'
confuses people so that they are surprised that mutating a list inside a
function results in the list being mutated outside the function.

But Python doesn't have those simple types, so there is a temptation to
try to skip this generalization and say that references are not values,

The Python Language Reference uses the word 'reference' but does not
define it. I take it to be 'whatever information an interpreter uses to
associate a name or collection slot with an object and to retrieve the
object (and possibly its value) when requested'.

Well of course. I'm pretty sure I've said repeatedly that Python
variables refer to objects on the heap.

Built-in objects are not allocated on the heap.

(Please replace "heap" with "object space" if you prefer.)

They are in abstract object space, although CPython occasionally leaks
the abstraction in error messages about not being able to modify
non-heap objects.

I'm only saying that Python variables
don't contain any other type of value than references.

One problem with the word 'variable' is that variables are sometimes
thought of as 'containing', as you did above. So you switch back and
forth between 'variables *are* references' and 'variables *contain*
references'. Whereas a name (noun) definitely *is* a reference and not
a container.

Ditto right back at you. So maybe here's the trouble: since all
Python variables are references,

Back to *is*.

But continuing to attempt to gloss over that fact, when you come to
parameter passing,

I believe most of us here try to follow Knuth's lead and use 'parameter'
for the local names and 'argument' for the value/object that gets
associated with the name.

not copied). You also have to describe the assignment operator as

In Python, assignment is not an operator by intentional design.

different from all other languages, since clearly that's not copying the
object either.

Python's assignment *statement* does what we routinely do in everyday
life when we assign new labels to things, whether permanently or
temporarily.

An assignment copies the RHS reference into the LHS variable, nothing
more or less.

Back to variable as container.
An assignment associates the RHS object(s) with the LHS target(s).

A parameter copies the argument reference into the formal parameter,

A function call associates objects derived from the argument expressions
and stored defaults with the function parameters.
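
The "stored defaults" part is easy to see in a sketch (the function name `append_to` is made up for illustration): the default object is created once, when the `def` statement runs, and each call that omits the argument is associated with that same object.

```python
def append_to(item, bucket=[]):
    # `bucket` is associated with the stored default list unless the
    # caller supplies an argument expression of its own.
    bucket.append(item)
    return bucket


first = append_to(1)
second = append_to(2)
shared = first is second                      # both calls got the stored default
stored = append_to.__defaults__[0] is first   # it lives on the function object
fresh = append_to(3, [])                      # a supplied argument is used instead
```

After these calls the stored default holds `[1, 2]`, while `fresh` is a separate list holding `[3]`.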

Isn't that simple, clear, and far easier to explain?

Applied to my version, I agree ;-).

I wonder if that could be tested systematically. Perhaps we could round
up 20 newbies, divide them into two groups of 10, give each one a 1-page
explanation either based on passing object references by-value, or
passing values sort-of-kind-of-by-reference, and then check their
comprehension by predicting the output of some code snippets. That'd be
very interesting.

Except for your garbling of the alternative to your version, I agree.
I suspect that different people might do better with different
explanations, depending on background.

In my case, my understanding of Python became clear only once I stopped
listening to all the confusing descriptions here, and realized that
Python is no different from other OOP languages I already knew.

Whereas I learned Python without any OOP experience but enough knowledge
of C++ to know I did not want to go there.

Terry Jan Reedy

Steve Holden · Nov 7, 2008

Steven said:
What, you want me to come up with an implementation? Okay, fine.

The VM keeps a list of the namespaces that contain such "cloned" objects.
After each Python statement is executed, before the next one gets to run,
the VM runs through that list and synchronizes each object in the
caller's scope with the value of its clone in the function scope.

Right, so we replace references to values with references to namespaces?

Anyway, thanks. I'm really glad you stated up-front that this was not a
practical execution scheme.

regards
Steve

Terry Reedy · Nov 7, 2008

Unfortunately, the term "name" is *slightly* ambiguous in Python. There
are names, and then there are objects which have a name attribute, which
holds a string. This attribute is usually called __name__ but sometimes
it's called other things, like func_name.

3.0 corrects that divergence.

'foo'
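
A sketch of what that looks like, assuming a Python 3 interpreter (the function `foo` is just a placeholder):

```python
def foo():
    pass


function_label = foo.__name__                   # the label is under __name__
has_old_attribute = hasattr(foo, "func_name")   # the 2.x spelling is gone in 3.x
```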

You're assuming that call-by-sharing isn't a standard term. That's not
true. It's a standard term that is sadly not well known, I believe
because of the ignorance of most coders to the history of their own
discipline. (I include myself in that.)

-------------------------
http://en.wikipedia.org/wiki/Call_by_something#Call_by_sharing

Call by sharing

Also known as "call by object" or "call by object-sharing" is an
evaluation strategy first named by Barbara Liskov et al for the language
CLU in 1974[1]. It is used by languages such as Python[2] and Iota and
(as argued by some[3]) Java, although the term is not in common use by
the Java community. Call-by-sharing implies that values in the language
are based on objects rather than primitive types.

The semantics of call-by-sharing differ from call-by-reference in that
assignments to function arguments within the function aren't visible to
the caller (unlike by-reference semantics). However since the function
has access to the same object as the caller (no copy is made), mutations
to those objects within the function are visible to the caller, which
differs from call-by-value semantics.

Although this term has widespread usage in the Python community,
identical semantics in other languages such as Java and Visual Basic are
often described as call-by-value, where the value is implied to be a
reference to the object.
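
Both halves of that description can be checked in a few lines of Python. This is only a sketch, with function names invented for illustration:

```python
def rebind(param):
    param = [99]          # assignment to the parameter: invisible to the caller


def mutate(param):
    param.append(99)      # mutation of the shared object: visible to the caller


def identity_of(param):
    return id(param)      # identity of the object the function actually received


data = [1, 2]

rebind(data)
after_rebind = list(data)     # unchanged, so this is not call-by-reference

mutate(data)
after_mutate = list(data)     # changed, so no copy was handed to the function

no_copy_made = identity_of(data) == id(data)
```

`after_rebind` is `[1, 2]` and `after_mutate` is `[1, 2, 99]`, which matches the two properties listed in the entry.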

---------------------------------
http://www.pmg.csail.mit.edu/papers/thetaref/node34.html

Call by Sharing
The caller and called routine communicate only through the argument and
result objects; routines do not have access to any variables of the caller.

After the assignments of actual arguments to formal arguments, the
caller and the called routine share objects. If the called routine
modifies a shared object, the modification is visible to the caller on
return. The names used to denote the shared objects are distinct in the
caller and called routine; if a routine assigns an object to a formal
argument variable, there is no effect on the caller. From the point of
view of the invoked routine, the only difference between its formal
argument variables and its other local variables is that the formals are
initialized by its caller.

==========================
Python objects *do* have access to surrounding scopes, but the second
paragraph exactly describes Python, as does the corresponding Wikipedia entry.
==================================
myweb.lmu.edu/dondi/fall2004/cmsi585/subroutines-in-depth.pdf

• Call by sharing
– Used by languages where variables are already references to objects
– Parameters are references to objects, but assignments to those parameters
don’t change them at the level of the caller — e.g. Java uses call-by-value
for primitives (int, char) and uses call-by-sharing for Objects

Joe Strout · Nov 7, 2008

Python has two types of names. Some complex objects -- modules,
classes, and functions, and wrappers and subclasses thereof, have
'definition names' that are used instead of a 'value' to print a
representation. Otherwise, names are identifiers, the term used in
the grammar.

But I agree even two meanings is one too many. Maybe 'label' would
be a better short form for 'identifier'. 'Labeling' might be
clearer for many beginners than 'name-binding'.

I actually rather like "identifier," though it seems a little too
technical for casual use. On the other hand, "label" seems too
arbitrary -- "variable" is a common term, and accurate enough to me.

'Variable' has many more meanings than 'name' or 'label' and overall
seems to be more confusing, not less. I say this as an empirical
observation of c.l.p postings. You yourself switched back and forth
between two different meanings.

Did I? What were they? I thought I had a clear idea what I mean by
it (and, as far as I know, it's consistent with what everybody else
means by it too). The only oddity in Python is that things you might
not expect to be variables (such as module names) actually are. But
that's a succinct way to express that observation; saying "module
names are variables too" seems to imply all the right things (e.g.
that you can assign new values to them, or assign them to other
variables).

To me, this implies that the corresponding parameter should become a
reference to that reference value.

Really? You've just described passing an object reference by
reference. If "passing a reference by value" implies adding an
additional reference to it, then what would "passing a reference by
reference" mean to you? And, examining your thought process, can you
explain where that implication came from?

In any case, the last 10 years of this newsgroup show that
describing Python calling as 'by value' confuses people so that they
are surprised that mutating a list inside a function results in the
list being mutated outside the function.

Yes, but that would happen ONLY when they also think that a variable
actually contains the object data. And since there are many here
trying to claim exactly that, I think that is the root cause of the
problem, not calling parameter passing "by value."

And this belief (that the value of a variable is an object) leads to
other mistaken beliefs too, such as that assignment should make a copy:

x = y
x.foo = bar

Anyone surprised by the list mutation inside a function should also be
surprised to find that y.foo = bar. I think we should simply address
the root cause of that confusion, not try to redefine their
understanding of how parameters are passed, and ALSO redefine what
assignment means.

The Python Language Reference uses the word 'reference' but does not
define it. I take it to be 'whatever information an interpreter
uses to associate a name or collection slot with an object and to
retrieve the object (and possibly its value) when requested'.

Sure, that's fine. It doesn't really matter how it's implemented.
What matters is that a name or collection slot somehow refers to an
object, and whatever the form of that reference is, it is copied (*)
to the LHS of an assignment or to the formal parameter of a function
call.

(*) I was going to say "transferred" but when something is transferred
from A to B, it implies that B gains it and A loses it. "Copied" has
the right implication: that afterwards, both A and B now have it.

One problem with the word 'variable' is that variables are
sometimes thought of as 'containing', as you did above. So you
switch back and forth between 'variables *are* references' and
'variables *contain* references'.

Perhaps this is the two meanings you mentioned above. I'm not sure I
see any useful dichotomy here, though. In C, we say that this
variable is an integer, that variable is a character, the one over
there is a double. This is unambiguous shorthand for "the declared
type of this variable is such-and-so", and each such variable contains
data of that type. So there's no harm in saying that a variable is an
integer, or that a variable contains an integer, as the context
requires.

In Python, AFAICT, there is only one type, the object reference. So,
the type of every variable is 'reference', and each one contains a
reference.

Whereas a name (noun) definitely *is* a reference and not a
container.

Yes, I'll grant you that about it. "Label" and "identifier" have that
connotation as well. But as long as we recognize that what a Python
variable contains is a reference, I don't see that it matters. And
this is a very useful recognition, since it implies the correct things
about assignment and parameter-passing.

I believe most of us here try to follow Knuth's lead and use
'parameter' for the local names and 'argument' for the value/object
that gets associated with the name.

Fine with me. (I've also seen "formal parameter" and "actual
parameter" respectively, but hey, I'm flexible.)

Python's assignment *statement* does what we routinely do in
everyday life when we assign new labels to things, whether
permanently or temporarily.

Super, that's great, but not very helpful, at least to someone who
knows any other programming language. For them, it's better to point
out that it does what those other languages routinely do when they
assign an expression to an identifier.

Except for your garbling of the alternative to your version, I agree.
I suspect that different people might do better with different
explanations, depending on background.

Could be.

Maybe we could run such an experiment at PyCon, pulling in
non-attendees from the hallway and getting them to take the test in
exchange for a free donut or coffee.

Best,
- Joe

Joe Strout · Nov 7, 2008

http://en.wikipedia.org/wiki/Call_by_something#Call_by_sharing

Call by sharing

Also known as "call by object" or "call by object-sharing" is an
evaluation strategy first named by Barbara Liskov et al for the
language CLU in 1974[1]. It is used by languages such as Python[2]
and Iota and (as argued by some[3]) Java, although the term is not
in common use by the Java community. Call-by-sharing implies that
values in the language are based on objects rather than primitive
types.

The semantics of call-by-sharing differ from call-by-reference in
that assignments to function arguments within the function aren't
visible to the caller (unlike by-reference semantics). However since
the function has access to the same object as the caller (no copy is
made), mutations to those objects within the function are visible to
the caller, which differs from call-by-value semantics.

Although this term has widespread usage in the Python community,
identical semantics in other languages such as Java and Visual Basic
are often described as call-by-value, where the value is implied to
be a reference to the object.

You know, people rip on Wikipedia, but I'm often surprised at how
frequently it succinctly and accurately describes a topic, even when
it is a topic of much controversy. The above summary seems spot on.
My difficulty, I guess, is that I'm coming from the Java/VB/RB
community, where the semantics are exactly the same as Python, and
where it makes perfect sense to call it call-by-value (since it is a
trivial generalization of the very same evaluation strategy used on
other types). But I get here, and there is widespread use of this
call-by-sharing term (though my initial impressions were much more
disparate and confusing than the above concise summary makes it sound).

Thanks for pointing this out. I'd only found the entry on evaluation
strategies, which makes no mention of call-by-sharing.

http://www.pmg.csail.mit.edu/papers/thetaref/node34.html

Call by Sharing
The caller and called routine communicate only through the argument
and result objects; routines do not have access to any variables of
the caller.

After the assignments of actual arguments to formal arguments, the
caller and the called routine share objects. If the called routine
modifies a shared object, the modification is visible to the caller
on return. The names used to denote the shared objects are distinct
in the caller and called routine; if a routine assigns an object to
a formal argument variable, there is no effect on the caller. From
the point of view of the invoked routine, the only difference
between its formal argument variables and its other local variables
is that the formals are initialized by its caller.
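
The last sentence of that definition can be checked directly in Python. This is only a sketch under the assumption that CPython-style `id()` and `locals()` are acceptable stand-ins for "same object" and "local variables"; the names `routine`, `formal` and `actual` are hypothetical.

```python
def routine(formal):
    # At this point the only local variable is the formal argument,
    # already initialised by the caller.
    local_names = sorted(locals())
    # id() identifies the object the formal currently refers to.
    seen_id = id(formal)
    # Rebinding the formal is an ordinary local assignment and has
    # no effect on the caller.
    formal = None
    return local_names, seen_id


actual = {"colour": "black"}
names, seen_id = routine(actual)

print(names)                   # ['formal']
print(seen_id == id(actual))   # True
print(actual)                  # {'colour': 'black'}
```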

==========================
Python objects *do* have access to surrounding scopes, but the second
paragraph exactly describes Python, as does the corresponding
Wikipedia entry.

Well yes, but it exactly describes VB.NET, RB, Java, and C++ as well
(when talking about object types rather than primitive types).

==================================
myweb.lmu.edu/dondi/fall2004/cmsi585/subroutines-in-depth.pdf

• Call by sharing
– Used by languages where variables are already references to objects
– Parameters are references to objects, but assignments to those
  parameters don’t change them at the level of the caller; e.g. Java
  uses call-by-value for primitives (int, char) and uses
  call-by-sharing for Objects

Well, not according to the Java specification, it doesn't. But these
sources finally make it clear that "call-by-sharing" is just a special
case of call-by-value when the value is an object reference. So, it's
accurate to say that Java (and all the other languages under
discussion) always use call-by-value, but some of those calls
(depending on the parameter data type) may also be described as
call-by-sharing.

I suppose it's reasonable that if you use a language where some types
are references and some types are not, the simplest thing is to
describe all calls as call-by-value (except for those with an optional
by-reference syntax, like C++ and RB/.NET). But if you come from a
language where all types are references, then it's reasonable to
describe all calls as call-by-sharing (provided the language doesn't
also support call-by-reference).

So. How about this for a summary?

"Python uses call-by-sharing. That's a special case of call-by-value
where the variables are references to objects; it is these references
that are copied to the parameters, not the objects themselves. For
users of other languages, this is the same semantics used for objects
in Java, RB/VB.NET, and C++ when dealing with objects."

Best,
- Joe

Arnaud Delobelle · Nov 7, 2008

Joe Strout said:
So. How about this for a summary?

"Python uses call-by-sharing. That's a special case of call-by-value
where the variables are references to objects; it is these references
that are copied to the parameters, not the objects themselves. For
users of other languages, this is the same semantics used for objects
in Java, RB/VB.NET, and C++ when dealing with objects."

Here's a story about call by sharing:

One day a black cat came strolling into our garden. It seemed quite
hungry so we gave it some milk and food remains. The cat drank the milk
and ate the food, stayed for a bit then walked away. A couple of days
later it came back, miaowing, so we fed it again. It started coming to
see us almost every day, sometimes sleeping in the house, sometimes
disappearing for a day or two. We started thinking of it as our cat and
we *named* it Napoleon. Napoleon became very popular and stayed at home
more and more often, and our very young son was very fond of it, calling it
something like 'Polion'.

Unfortunately Napoleon got infested with fleas and we had to take him to
the vet. The vet scanned Napoleon with a little machine and discovered
that it had an ID chip. This revealed that Napoleon was really called
Nelson (ironically!) and belonged to a house down our road. We
contacted them and they were happy to 'share' the cat with us. By this
time the cat answered to the name of Napoleon so we carried on calling
it this name.

This story ends quite sadly. One day Napoleon escaped out the front
door and got run over by a passing van. Our son kept asking for Polion
so we decided to get a new cat and called him Napoleon as well (we
decided it would be easier for him!).

Now some questions about the story:

1. Is Napoleon a copy of Nelson or are they the same cat?

2. Is Polion a copy of Napoleon or are they the same cat?

3. When we got rid of Napoleon's fleas, was Nelson deflea-ed as well?

4. When Napoleon died, did Nelson die as well?

5. When we got a new Napoleon, does this mean that our neighbours got a
new Nelson?

Now a question about the questions about the story:

To be able to understand the story and answer questions 1-5, do we
need to think of Napoleon, Nelson and Polion as variables containing
references to cat objects, or is it enough to think of them as three
names for cats?
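
For anyone who wants to try the questions out, here is one way the story might be modelled in Python. It is only a sketch: the `Cat` class and its two attributes are invented for illustration, and the variable names simply follow the story.

```python
class Cat:
    """A hypothetical cat with just enough state for the story."""

    def __init__(self):
        self.has_fleas = False
        self.alive = True


nelson = Cat()       # the neighbours' name for the cat
napoleon = nelson    # our name for the same cat; nothing is copied
polion = napoleon    # our son's name for it

# Questions 1 and 2: three names, one cat.
same_cat = (napoleon is nelson) and (polion is napoleon)

# Question 3: the fleas and the treatment belong to the cat, not the name.
napoleon.has_fleas = True
nelson_had_fleas = nelson.has_fleas
napoleon.has_fleas = False
nelson_defleaed = not nelson.has_fleas

# Question 4: what happens to the cat is visible through every name.
napoleon.alive = False
nelson_died = not nelson.alive

# Question 5: reusing the name for a new cat rebinds 'napoleon' only.
napoleon = Cat()
neighbours_got_new_cat = nelson is napoleon

print(same_cat, nelson_defleaed, nelson_died, neighbours_got_new_cat)
# True True True False
```

Nothing in the sketch needs the word "reference"; plain assignment of names to objects is enough to reproduce every answer.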

Steve Holden · Nov 7, 2008

Joe said:
On Nov 7, 2008, at 12:13 PM, Terry Reedy wrote: [...]

Except for your garbling of the alternative to your version, I agree.
I suspect that different people might do better with different
explanations, depending on background.

Could be.

Maybe we could run such an experiment at PyCon, pulling in non-attendees
from the hallway and getting them to take the test in exchange for a
free donut or coffee.

+1: great idea! It would make a good video too ...

regards
Steve

Aaron Brady · Nov 7, 2008

Here's a story about call by sharing:

One day a black cat came strolling into our garden. It seemed quite
hungry so we gave it some milk and food remains. The cat drank the milk
and ate the food, stayed for a bit then walked away. A couple of days
later it came back, miaowing, so we fed it again. It started coming to
see us almost every day, sometimes sleeping in the house, sometimes
disappearing for a day or two. We started thinking of it as our cat and
we *named* it Napoleon. Napoleon became very popular and stayed at home
more and more often, and our very young son was very fond of it, calling it
something like 'Polion'.

Unfortunately Napoleon got infested with fleas and we had to take him to
the vet. The vet scanned Napoleon with a little machine and discovered
that it had an ID chip. This revealed that Napoleon was really called
Nelson (ironically!) and belonged to a house down our road. We
contacted them and they were happy to 'share' the cat with us. By this
time the cat answered to the name of Napoleon so we carried on calling
it this name.

This story ends quite sadly. One day Napoleon escaped out the front
door and got run over by a passing van. Our son kept asking for Polion
so we decided to get a new cat and called him Napoleon as well (we
decided it would be easier for him!).

Now some questions about the story:

Sorry didn't read above yet.

1. Is Napoleon a copy of Nelson or are they the same cat?

Same cat.

2. Is Polion a copy of Napoleon or are they the same cat?

Same cat.

3. When we got rid of Napoleon's fleas, was Nelson deflea-ed as well?

Yes.

4. When Napoleon died, did Nelson die as well?

No, honey. He lives on in all of us.

5. When we got a new Napoleon, does this mean that our neighbours got a
new Nelson?

No, darling, Nelson is just sleeping. When we got a New Orleans,
where did the old one go?

Now a question about the questions about the story:

To be able to understand the story and answer questions 1-5, do we
need to think of Napoleon, Nelson and Polion as variables containing
references to cat objects, or is it enough to think of them as three
names for cats?

I think they are names, implying that Python has a different "variable
model" than C++.

However, a[0] isn't exactly a name, per se, and if you say that 'b'
and 'a[0]' are names of an object, then 'a[1-1]', 'a[2*0]', etc. are
all names of it. Furthermore, some class models variables like this:

a.b= 'abc'
a.c= 'def'
a.d= 'ghi'

It also allows index access: a[0], a[1], a[2], respectively. 'abc'
has two names: 'a.b', and 'a[0]'. Correct?
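
A class of the kind described might look like the sketch below. The name `Record` and the `_fields` tuple are assumptions made for illustration; the post does not say which class is meant.

```python
class Record:
    # Hypothetical fixed field order, used to map an index to an attribute.
    _fields = ("b", "c", "d")

    def __getitem__(self, index):
        # a[0] -> a.b, a[1] -> a.c, a[2] -> a.d
        return getattr(self, self._fields[index])


a = Record()
a.b = 'abc'
a.c = 'def'
a.d = 'ghi'

print(a[0], a[1], a[2])      # abc def ghi
print(a.b is a[0])           # True
print(a[1 - 1] is a[2 * 0])  # True
```

Both spellings reach the same string object, which is what the question about "two names" is getting at.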

Arnaud Delobelle · Nov 7, 2008

Aaron Brady said:
Furthermore, some class models variables like this:

a.b= 'abc'
a.c= 'def'
a.d= 'ghi'

It also allows index access: a[0], a[1], a[2], respectively. 'abc'
has two names: 'a.b', and 'a[0]'. Correct?

You know very well that a.b and a[0] aren't names; they are function
calls written in shorthand:

a.b is getattr(a, 'b')
a[0] is getattr(a, '__getitem__')(0)

So they just return an object, which happens to be the same.
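
The two spellings can be compared in a short sketch. The `Record` class here is hypothetical. One caveat: for the `a[0]` syntax Python looks `__getitem__` up on the type rather than on the instance, so `type(a).__getitem__(a, 0)` is the closer equivalent; for an ordinary class like this one the results are the same.

```python
class Record:
    # Hypothetical class that maps index 0 to the attribute 'b'.
    def __getitem__(self, index):
        return getattr(self, ("b",)[index])


a = Record()
a.b = 'abc'

via_dot = a.b
via_getattr = getattr(a, 'b')

via_index = a[0]
via_bound_method = getattr(a, '__getitem__')(0)
via_type_lookup = type(a).__getitem__(a, 0)

# Every expression evaluates to the very same string object.
all_same = all(
    obj is via_dot
    for obj in (via_getattr, via_index, via_bound_method, via_type_lookup)
)
print(all_same)  # True
```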

Douglas Alan · Nov 8, 2008

Joe Strout said:
As for where I get my definitions from, I draw from several sources:

1. Dead-tree textbooks

You've been reading the wrong textbooks. Read Liskov -- she's called
CLU's (and hence Python's) calling strategy "call-by-sharing" since the
70s.

2. Wikipedia [2] (and yes, I know that has to be taken with a grain of
salt, but it's so darned convenient)
3. My wife, who is a computer science professor and does compiler
research
4. http://javadude.com/articles/passbyvalue.htm (a brief but excellent
article)
5. Observations of the "ByVal" (default) mode in RB and VB.NET
6. My own experience implementing the RB compiler (not that
implementation details matter, but it forced me to think very
carefully about references and parameter passing for a very long time)

Not that I'm trying to argue from authority; I'm trying to argue from
logic. I suspect, though, that your last comment gets to the crux of
the matter, and reinforces my guess above: you don't think c-b-v means
what most people think it means. Indeed, you don't think any of the
languages shown at [1] are, in fact, c-b-v languages. If so, then we
should focus on that and see if we can find a definitive answer.

I'll give you the definitive answer from a position of authority,
then. I took Barbara Liskov's graduate-level class in programming
language theory at MIT, and she called what Python does
"call-by-sharing".

|>oug

Douglas Alan · Nov 8, 2008

Joe Strout said:
Yes, OK, that's great. But there are several standard pass-by-
somethings that are defined by the CS community, and which are simple
and clear and apply to a wide variety of languages. "Pass by object"
isn't one of them.

"Call-by-sharing" *is* one of them, and the term has been around since
the 70s:

http://hopl.murdoch.edu.au/showlanguage.prx?exp=637

I guess if you want to campaign for it as a shorthand for "object
reference passed by value," you could do that, and it's not
outrageous.

There's no need for a campaign. The term has already been used in the
academic literature for 34 years.

But to anybody new to the term, you should explain it as exactly
that, rather than try to claim that Python is somehow different from
other OOP languages where everybody calls it simply pass by value.

It's not true that "everybody calls it simply pass by value".

OK, if there were such a thing as "pass-by-object" in the standard
lexicon of evaluation strategies, I would be perfectly happy saying
that a system has it if it behaves as though it has it, regardless of
the underpinnings.

There is "call-by-sharing" in the standard lexicon of evaluation
strategies, and it's been in the lexicon since 1974.

However, if you really think the term is that handy, and we want to
agree to say "Python uses pass by object" and answer the inevitable
"huh?" question with "that's shorthand for object references passed by
value," then I'd be OK with that.

Excellent. We can all agree to get along then!

|>oug

Aaron Brady · Nov 8, 2008

Aaron Brady said:

Furthermore, some class models variables like this:

a.b= 'abc'
a.c= 'def'
a.d= 'ghi'

It also allows index access: a[0], a[1], a[2], respectively. 'abc'
has two names: 'a.b', and 'a[0]'. Correct?

You know very well that a.b and a[0] aren't names; they are function
calls written in shorthand:

a.b is getattr(a, 'b')
a[0] is getattr(a, '__getitem__')(0)

So they just return an object, which happens to be the same.

Therefore objects don't need names to exist. Having a name is
sufficient but not necessary to exist. Being in a container is
neither necessary -nor- sufficient.

a is the name of an object. The object is associated with a
dictionary you can usually access. 'b' is a key in the dictionary.
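
A small sketch of that dictionary, using the built-in `vars()`. The class names `Thing` and `Slotted` are made up for illustration. The `__slots__` case is included as one example of why the dictionary is only "usually" accessible.

```python
class Thing:
    pass


a = Thing()
a.b = 'abc'

namespace = vars(a)                  # the instance dictionary, a.__dict__
has_key = 'b' in namespace           # 'b' is a key in that dictionary
same_object = namespace['b'] is a.b  # the key maps to the same object

print(namespace)    # {'b': 'abc'}
print(same_object)  # True


class Slotted:
    # Instances of a class with __slots__ have no per-instance dictionary.
    __slots__ = ('b',)


s = Slotted()
s.b = 'abc'
slotted_has_dict = hasattr(s, '__dict__')
print(slotted_has_dict)  # False
```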
