Finding the instance reference of an object

Terry Reedy · Nov 5, 2008

Dennis said:
I'm sure all the programmers using FORTRAN for the last 50 years
will be very surprised to hear that it uses call-by-value!

That should be 'last 31 years'. Fortran IV *was* call-by-value, at
least for scalars. I remember spending a couple of hours tracking down
an obscure 'bug' in a once-working program when I ran it with F77. (I
had not noticed the call-by-reference switch.) The reason was, as I
remember, 'x = -x' in a function. I never forgot the difference.

I've used it for 30 years

Then you apparently missed the switch ;-).

tjr

Lie · Nov 5, 2008

There's no such thing. Those are just terms made up by the Python
community to use in place of the more standard "call-by-value" terminology
to make Python seem more mysterious than it really is. I guess you
can call it "purple bananas" if you want, but the behavior is exactly
the same as what every other language calls call-by-value.

But I really am trying not to continue this debate. So that's my last
reply about it for tonight, I promise.

Cheers,
- Joe
<http://www.strout.net/info/coding/valref/>

I'm fed up with you.

In Von Neumann Architecture computers, only pass-by-value is natively
supported. Other idioms are made on top of pass-by-value.

That's why exclusively pass-by-value languages like C are more flexible
than other exclusively pass-by-<insertanythinghere> languages.

BUT the difference is clear, if you want to do pass-by-reference in C
(I prefer to call it faux pass-by-reference), you'd have to manually
find a variable's address, pass it by-value, then dereference the
address. Compare with VB's ByRef or C++'s &, which does all that
_automatically_ for you behind the scene.

Another example: pass-by-object. I disagree with the term pass-by-
object-reference, since the notion of object reference is unnecessary
in a true pass-by-object mechanism (i.e. we should travel outside VNA
realm for a true pass-by-object mechanism). Problem is: I'm not aware
of any computer architecture that can accommodate pass-by-object
natively or whether such architecture is feasible to be made.
Solution: Emulate such architecture on a VNA computer. Limitations:
pass-by-object implementation on a VNA computer requires the notion of
object reference/pointer because VNA computers cannot pass object, it
can only pass numbers.

In pass-by-object, the whole object is passed around, not a copy of
the object like in pass-by-value or a copy of the pointer to the
object like in faux pass-by-reference. This makes pass-by-object
actually closer to pass-by-reference than to pass-by-value, since the
notion of VNA's requirement for Object Reference means passing
pointers around. But don't let it confuse you, the pass-by-object
mechanism itself does not recognize Object Reference, unlike pass-by-
reference recognition of reference/pointer.

Now that I have clearly defined the line between pass-by-object and
pass-by-value, I'm left with you thinking what's the difference
between pass-by-object and pass-by-reference. In pass-by-reference,
names are _mnemonic to location in memory_. In pass-by-object, names
are _mnemonic to objects_. In pass-by-reference, when we assign a
value to the name, we'd assign a value _to the location in memory_. In
pass-by-object, when we assign a value to the name, we assign an
object _to the name_. When passing parameters, pass-by-reference
systems passed the memory address and alias that memory address to a
local name. In pass-by-object system, since the object itself has no
idea of its own name, so it is rebound to a local name. You can say
that pass-by-reference system is memory-location-centered, while pass-
by-object system is name-centered.

To summarize, the key bit you're missing is built-in _automatic_
abstraction from pass-by-value. All pass-by-<insertanything> except
pass-by-value is an emulation over an architecture that can only
natively pass-by-value. pass-by-reference is emulated in C, C++, and
VB by passing memory address/pointer but C isn't a by-ref system
because C doesn't provide the automatization. Python's parameter
passing is called pass-by-object because Python provides the
automatization that makes it seem as if you can do true
pass-by-object.

Joe Strout · Nov 6, 2008

Please show us the type definition of "PersonPtr"

Sorry, that'd be obvious to anyone experienced in C++, but I shouldn't
assume. It would be:

typedef Person* PersonPtr;

It's a pretty standard idiom, precisely because it is so much more
common to need a variable of the pointer type than of the class type
itself.

Please show us the type definition of "Person"

No. It's a class, and you can see that it defines a "zipcode" member;
nothing more is needed for the sake of this example. (The Java
designers realized that needing object reference variables is so
overwhelmingly more common than needing non-pointer variables, that
they eliminated the pointer syntax and made every variable of object
type actually a pointer to the object data, just as in RB, .NET, and
Python.)

Note that YOU, as the programmer, explicitly told the language to
use a call-by-value

Correct; these languages support both call-by-value and call-by-
reference. "ByVal" is the default, but you can still specify it (as
I've done here) if you want to be explicit.

-- and you still have not shown us the definition of "Person"

Right, again, there is nothing you need to know about it not already
evident from the example.

Note that there is NO PROGRAMMER CONTROL over the passing mechanism.

Quite right. (Same too in Java.) Some languages offer both call-by-
value and call-by-reference; but Java and Python offer only call-by-
value.

I do not... I see the programmer telling the language that a
reference is to be taken of some object and that this reference is
/then/ to be passed.

Quite right (and that's what I said). The reference is being passed.
By value. And then that reference is used to mutate an object.

AND the programmer is also (C++) explicitly dereferencing (by use of
->)!

Yep, "->" is the dereference operator in C++; in Java, RB, .NET, and
Python, it's ".". Different characters, same meaning.

The syntax, in C/C++ is different for
accessing the parameter by value vs by reference (pointer -- C++ is
smarter in that it allows declaring the parameter as a &ref and the
language automatically takes the address on the caller's side, and
dereferences on the called side -- but that notation is STILL a
programmer responsibility).

Careful here -- a &ref parameter really is passed by reference; that's
not the same as passing a pointer by value, which is what's shown
above. It's the C++ equivalent of the "ByRef" keyword in RB/VB.NET.

Java passes /objects/ by reference, but passes "native" types (base
numerics) by value

No, it passes everything by value. "who" in the Java example is not
an object; it is an object reference. It was passed by value. If it
were passed by reference, you'd be able to make a "swap" function that
exchanges the objects referred to by two variables, but you can't.
This is explained more fully here:

-- can you create a Java example where you pass a base numeric by
reference?

No, because Java (like Python) has only call-by-value.

By your argument, even FORTRAN is call-by-value.

No, FORTRAN's an oddball: it passes everything by reference.

You obfuscate the mechanics used at the machine language level with
the semantics of the language itself.

No, I haven't said anything at all about what's happening at the
machine language level, and I frankly don't care. This is about how
the language behaves.

FORTRAN is the language commonly used to explain call-by-reference!

Quite right. So, please try to convert any of those FORTRAN
explanations into Python or Java. Can't be done. In C++ or RB/
VB.NET, it can be done only by using the special by-reference syntax
(& or ByRef).

I'm not sure where you got the idea that I thought FORTRAN is call-by-
value. I never said or implied any such thing. And the examples
above aren't meant to prove that those languages are using by-value;
they're meant to show that mutating an object via a reference passed
in proves NOTHING about how that reference was passed. This is to
invalidate the argument (frequently heard around here) that Python
must use call-by-reference since you can mutate an object passed to a
function.
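
A minimal Python sketch of that point, assuming a hypothetical Person class with only the "zipcode" member mentioned earlier (the function names are made up for illustration). Mutating through the parameter is visible to the caller, while rebinding the parameter is not:

```python
class Person:
    """Hypothetical stand-in for the Person class in the linked examples."""
    def __init__(self, zipcode):
        self.zipcode = zipcode

def change_zip(who):
    # Mutates the one object that caller and callee both refer to.
    who.zipcode = "12345"

def replace_person(who):
    # Rebinds only the local name; the caller's variable is untouched.
    who = Person("99999")

p = Person("00000")
change_zip(p)
zip_after_mutation = p.zipcode       # the mutation is visible here

original = p
replace_person(p)
still_same_object = p is original    # the rebinding is not
```

So the first call says nothing about how the reference was passed; only the second call distinguishes the two mechanisms.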

As should be painfully clear by now, that argument is total bunk. You
can pass an object reference by value, and use it to mutate the object
it refers to just fine. To test whether you're passing by reference
or by value, you need to instead assign a new value to the formal
parameter, and see whether that affects the actual parameter.

Cheers,
- Joe

Joe Strout · Nov 6, 2008

This is dialectic nit picking - WTF makes "passing a reference by
value" different from "passing a reference" - the salient point is that
it's a reference that is passed

I know it seems nit-picky on the surface, but it is important. It is
the distinction that lets you answer whether:

def foo(x):
    x = Foo()

x = Bar()
foo(x)

...results in x (after the call) now referring to a Foo, or still
referring to a Bar.

If x is a reference passed by value, then the assignment within the
foo method can't affect the x that was passed in. But if it is passed
by reference, then it does. (Also, if it is passed by reference, then
you probably wouldn't be able to pass in a literal or computed value,
but only a simple variable.)
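
For anyone who wants to run it, here is a self-contained version of that snippet. The empty Foo and Bar classes are placeholders assumed for the sketch; nothing else is added beyond the checks at the end:

```python
class Foo:
    pass

class Bar:
    pass

def foo(x):
    x = Foo()    # rebinds the local name only

x = Bar()
foo(x)
still_a_bar = isinstance(x, Bar)    # the caller's x was not rebound

# A computed value or a literal is accepted as an argument as well.
foo(Bar())
foo(42)
```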

- would you expect another level of indirection - a reference to
the reference, or what, before you admit that the thing that is
passed is a reference and not a copied value of the OBJECT that is of
interest.

Um, no, I've admitted that it's a reference all along. Indeed, that's
pretty much the whole point: that variables in Python don't contain
objects, but merely contain references to objects that are actually
stored somewhere else (i.e. on the heap). This is explicitly stated
in the Python docs [1], yet many here seem to want to deny it.

Looks to me that even if there were ten levels of indirection you
would still insist that it's a "pass by value" because in the end, it's
the actual memory address of the first pointer in the queue that is
passed.

No, it's entirely possible for an OOP language to have pass by
reference, and, to put it your way, this adds one more level of
indirection. Look at the C++ and RB/VB.NET examples at [2]. But
Python and Java do not offer this option.

If that is what you mean, it is obviously trivially true - but then
ALL calling can only be described as "call by value" - which makes
nonsense of what the CS people have been doing all these years.

That would indeed be nonsense. But it's also not what I'm saying.
See [2] again for a detailed discussion and examples. Call-by-value
and call-by-reference are quite distinct.

"Calling by value" is not a useful definition of Pythons behaviour.

It really is, though. You have to know how the formal parameter
relates to the actual parameter. Is it a copy of it, or an alias of
it? Without knowing that, you don't know what assignments to the
formal parameter will do, or even what sort of arguments are valid.
Answer: it's a copy of it. Assignments don't affect the actual
parameter at all. This is exactly what "call by value" means.

Best,
- Joe

[1] http://www.python.org/doc/2.5.2/ext/refcounts.html
[2] http://www.strout.net/info/coding/valref/

Arnaud Delobelle · Nov 6, 2008

I know this thread has grown quite personal for some of its
participants. I am posting in a spirit of peace and understanding

Joe Strout said:
Um, no, I've admitted that it's a reference all along. Indeed, that's
pretty much the whole point: that variables in Python don't contain
objects, but merely contain references to objects that are actually
stored somewhere else (i.e. on the heap). This is explicitly stated
in the Python docs [1], yet many here seem to want to deny it.

You refer to docs about the *implementation* of Python in C. This is
irrelevant.

Also, you talk about variables 'containing' something. In Python,
variables don't contain anything, they're simply names for objects.
'Pass by value' is not relevant to Python as variables do not contain
anything. 'Pass by reference' is not relevant to Python as the language
doesn't have the concept of object reference (in the sense of e.g. C++
reference).

[...]

It really is, though. You have to know how the formal parameter
relates to the actual parameter. Is it a copy of it, or an alias of
it? Without knowing that, you don't know what assignments to the
formal parameter will do, or even what sort of arguments are valid.
Answer: it's a copy of it. Assignments don't affect the actual
parameter at all. This is exactly what "call by value" means.

Here lies, IMHO, the reason why you think you need Python to 'pass by
value'. As you believe that variables must contain something, you think
that assignment is about copying the content of a variable. Assignment
in Python is simply giving a new name to an object.

To understand variables (which I prefer to call 'names') and function
calls in Python you need simply to understand that:

- a variable is a name for an object
- assignment is naming an object
- the parameters of a function are local names for the call arguments
(I guess 'pass by object' is a good name).
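
The three points above can be checked with the built-in "is" operator and id(). This is only a sketch, and the names data, alias and identity_of are invented for the illustration:

```python
def identity_of(arg):
    # 'arg' is a local name for whatever object the caller supplied.
    return id(arg)

data = [1, 2, 3]      # 'data' is a name for a new list object
alias = data          # assignment: a second name for the same object

same_object = alias is data
same_inside_call = identity_of(data) == id(data)
```

Both checks come out true: no second list is created by the assignment or by the call.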

Now quoting the start of your post:

I would say that an object is passed, not a reference.

I know it seems nit-picky on the surface, but it is important. It is
the distinction that lets you answer whether:

def foo(x):
    x = Foo()

x = Bar()
foo(x)

...results in x (after the call) now referring to a Foo, or still
referring to a Bar.

You don't need this to decide. This is what happens:

x = Bar() # Call this new Bar object 'x'
foo(x) # call function foo with argument the object known as 'x'

# Now, in foo:
def foo(x): # Call 'x' locally the object passed to foo
    x = Foo() # Call 'x' locally this new Foo object.

Obviously after all this, 'x' is still the name of the Bar object
created at the start.

To sum up: for 'pass by value' to make sense in Python you need to
create an unnecessarily complex model of how Python works. By letting
go of 'pass by value' you can simplify your model of the language
(keeping it correct of course) and it fits in your brain more easily.

Of course your own model is valid but there is a better one which is
easier to grasp for people without a background in C/C++-like
languages.

Aaron Brady · Nov 6, 2008

I know this thread has grown quite personal for some of its
participants. I am posting in a spirit of peace and understanding

Hear, hear.

You refer to docs about the *implementation* of Python in C. This is
irrelevant.

Also, you talk about variables 'containing' something. In Python,
variables don't contain anything, they're simply names for objects.
'Pass by value' is not relevant to Python as variables do not contain
anything. 'Pass by reference' is not relevant to Python as the language
doesn't have the concept of object reference (in the sense of e.g. C++
reference). ....
I would say that an object is passed, not a reference. ....
To sum up: for 'pass by value' to make sense in Python you need to
create an unnecessarily complex model of how Python works. By letting
go of 'pass by value' you can simplify your model of the language
(keeping it correct of course) and it fits in your brain more easily.

Of course your own model is valid but there is a better one which is
easier to grasp for people without a background in C/C++-like
languages.

I agree, and I don't think we're giving Joe the proper credit for what
he knows and has worked on. Furthermore, his understanding of the
implementation of languages is thorough, and you can't have languages
without implementations. Though, you do not need to understand the
implementation to understand the language.

I haven't thought it through completely, but now that Joe mentions it,
it appears Python behaves the same as C++, if variables can't be
anything but pointers, parameters are all c-b-v, and you can't
dereference the l-h-s of an assignment ( '*a= SomeClass()' ).

When you're explaining Python to a beginner, you have to introduce a
new term either way. You'll either have to explain pointers, which
there are chapters and chapters on in introductory textbooks; or,
you'll have to explain a new calling mechanism, and give it a name.

I think pointer-by-value would be accurate, but by-value wouldn't be.

I would say that an object is passed, not a reference.

I agree, and anything else would be overcomplicated.

Joe Strout · Nov 6, 2008

I'm fed up with you.

I'm sorry -- I'm really not trying to be difficult. And it's odd that
you're fed up with me, yet you seem to be agreeing with me on at least
most points.

In Von Neumann Architecture computers, only pass-by-value is natively
supported. Other idioms are made on top of pass-by-value.

Well, sure. But a language may or may not naturally support the others.

That's why exclusively pass-by-value languages like C are more flexible
than other exclusively pass-by-<insertanythinghere> languages.

BUT the difference is clear, if you want to do pass-by-reference in C
(I prefer to call it faux pass-by-reference), you'd have to manually
find a variable's address, pass it by-value, then dereference the
address.

Agreed.

Compare with VB's ByRef or C++'s &, which does all that
_automatically_ for you behind the scene.

Agreed again.

Another example: pass-by-object.

Here's where we depart, I guess. I think there's no such thing (see
<http://en.wikipedia.org/wiki/Evaluation_strategy> for example, and the
dead-tree references I have on hand agree).

I disagree with the term pass-by-object-reference, since the notion
of object reference is unnecessary
in a true pass-by-object mechanism (i.e. we should travel outside VNA
realm for a true pass-by-object mechanism). Problem is: I'm not aware
of any computer architecture that can accommodate pass-by-object
natively or whether such architecture is feasible to be made.

I don't think it's necessary (or helpful) to reach down into the
architecture or implementation details. What matters is the behavior
as seen by the language user. Variables in a language could hold
actual object data (in which case "a = b" would copy the data from b
into a), or they could hold just references to object data that lives
elsewhere (in which case "a = b" would copy that reference, giving you
two references to the same object). In Python, they are of course
just references.
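
A short sketch of what "a = b" does in Python, using a dict with a made-up "zipcode" key as the object, and copy.copy from the standard library for contrast:

```python
import copy

b = {"zipcode": "12345"}
a = b                          # copies the reference, not the data
a["zipcode"] = "54321"
seen_through_b = b["zipcode"]  # the change made via 'a' shows up via 'b'

c = copy.copy(b)               # an explicit copy is a separate object
c["zipcode"] = "00000"
b_unchanged = b["zipcode"]     # 'b' is unaffected by changes to 'c'
```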

Then, and completely independent of that, variables could be passed
into a method by value or by reference (or by several other more
esoteric evaluation strategies that are rarely used, but "call by
object" isn't one of them). Again, it's easy to tell these apart with
a simple behavioral test. Python passes its object references by
value; changes to the formal parameter have no effect on the actual
parameter.

In pass-by-object, the whole object is passed around, not a copy of
the object like in pass-by-value or a copy of the pointer to the
object like in faux pass-by-reference.

I can't understand what that would mean, unless we imagine the
variable actually containing the object data; but it doesn't, or "a =
b" would copy that data, which clearly it does not.

Understanding that a name in Python is merely a reference to the
object, and not the object itself, seems to me to be one of the
fundamental truths that any beginning Python programmer must know.

This makes pass-by-object
actually closer to pass-by-reference than to pass-by-value, since the
notion of VNA's requirement for Object Reference means passing
pointers around. But don't let it confuse you, the pass-by-object
mechanism itself does not recognize Object Reference, unlike pass-by-
reference recognition of reference/pointer.

Sorry, it's confusing me whether I want it to or not.

And here's what I don't understand: object references are so simple,
why do people go to such great lengths to pretend that Python doesn't
have them, resulting in a far more complex and convoluted explanation
that is much harder to understand?

Now that I have clearly defined the line between pass-by-object and
pass-by-value, I'm left with you thinking what's the difference
between pass-by-object and pass-by-reference.

If they're that hard to tell apart, then Python doesn't have either,
since it is very clear and obvious that Python does not have pass-by-
reference (since you can't write a function to swap two parameters,
for example).
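
To illustrate the swap test in Python (the function names are invented for this sketch): a function that exchanges its parameters has no effect on the caller's variables, and the usual workaround is to return the values and rebind them at the call site.

```python
def swap(a, b):
    a, b = b, a        # exchanges the two local names only

def swapped(a, b):
    return b, a        # hand the values back in the other order

x, y = "first", "second"
swap(x, y)
after_swap = (x, y)            # unchanged by the call

x, y = swapped(x, y)           # the caller does the rebinding itself
after_swapped = (x, y)
```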

In pass-by-reference, names are _mnemonic to location in memory_.
In pass-by-object, names are _mnemonic to objects_.

These would be equivalent if objects live at some location in memory,
wouldn't they?

In pass-by-reference, when we assign a
value to the name, we'd assign a value _to the location in memory_. In
pass-by-object, when we assign a value to the name, we assign an
object _to the name_. When passing parameters, pass-by-reference
systems passed the memory address and alias that memory address to a
local name. In pass-by-object system, since the object itself has no
idea of its own name, so it is rebound to a local name. You can say
that pass-by-reference system is memory-location-centered, while pass-
by-object system is name-centered.

Hmm. "The object is rebound to a local name" sounds the same to me as
"a local variable is assigned a reference to the object." Can you
explain how they're different?

I see how what you're describing is different from pass by reference;
I just don't see how it's any different from pass by value.

To summarize, the key bit you're missing is built-in _automatic_
abstraction from pass-by-value. All pass-by-<insertanything> except
pass-by-value is an emulation over an architecture that can only
natively pass-by-value.

Yes, OK, that's great. But there are several standard pass-by-
somethings that are defined by the CS community, and which are simple
and clear and apply to a wide variety of languages. "Pass by object"
isn't one of them. I guess if you want to campaign for it as a
shorthand for "object reference passed by value," you could do that,
and it's not outrageous. But to anybody new to the term, you should
explain it as exactly that, rather than try to claim that Python is
somehow different from other OOP languages where everybody calls it
simply pass by value.

pass-by-reference is emulated in C, C++, and
VB by passing memory address/pointer but C isn't a by-ref system
because C doesn't provide the automatization.

Yes, that's true of C. I never argued that C had a by-ref mode. C++,
VB, and REALbasic all do, however. Python and Java do not.

Python's parameter passing is called pass-by-object because Python
provides the automatization that makes it seem as if you can do true
pass-by-object.

OK, if there were such a thing as "pass-by-object" in the standard
lexicon of evaluation strategies, I would be perfectly happy saying
that a system has it if it behaves as though it has it, regardless of
the underpinnings.

My objection to this term is simply that it is made up and
nonstandard, and adds no new value to the discussion. To me, "object
references passed by value" is simple and clear, whereas "pass by
object" means nothing (until you explain that it means the former).

However, if you really think the term is that handy, and we want to
agree to say "Python uses pass by object" and answer the inevitable
"huh?" question with "that's shorthand for object references passed by
value," then I'd be OK with that.

Best,
- Joe

Joe Strout · Nov 6, 2008

I know this thread has grown quite personal for some of its
participants. I am posting in a spirit of peace and understanding

Thanks, I'll do the same.

Um, no, I've admitted that it's a reference all along. Indeed, that's
pretty much the whole point: that variables in Python don't contain
objects, but merely contain references to objects that are actually
stored somewhere else (i.e. on the heap). This is explicitly stated
in the Python docs [1], yet many here seem to want to deny it.

You refer to docs about the *implementation* of Python in C. This is
irrelevant.

It's supportive. I don't understand how/why anybody would deny that
Python names are references -- it's all over the place, from any
discussion of "reference counting" (necessary to understand the life
cycle of Python objects) to understanding the basics of what "a = b"
does. It seems absurd to argue that Python does NOT use references.
So the official documentation calmly discussing Python references,
with no caveats about it being internal implementation detail, seemed
relevant.
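
As a rough illustration of the reference counting mentioned here: sys.getrefcount is a real standard-library function, but what it reports is specific to CPython, so treat this as an implementation-level sketch rather than a statement about the language. The variable names are made up.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

another_name = obj             # bind a second name to the same object
after = sys.getrefcount(obj)

gained = after - before        # on CPython, the count goes up
```

The absolute numbers include temporary references held during the call itself, so only the difference is meaningful.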

Also, you talk about variables 'containing' something. In Python,
variables don't contain anything, they're simply names for objects.

You say "names for", I say "references to". We're saying the same
thing (though I'm saying it with terminology that is more standard, at
least in the wider OOP world).

'Pass by value' is not relevant to Python as variables do not contain
anything. 'Pass by reference' is not relevant to Python as the language
doesn't have the concept of object reference (in the sense of e.g. C++
reference).

Both are relevant to answering simple questions, like what happens to
x in this case:

def foo(spam):
    spam = 5

foo(x)

This is a basic and fundamental thing that a programmer of a language
should know. If it's call-by-reference, then x becomes 5. If it's
call-by-value, it does not.
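
Here is that test as a runnable sketch. The second function is not from the thread; it is an assumed workaround showing how a by-reference effect is commonly emulated in Python by mutating a shared one-element list:

```python
def foo(spam):
    spam = 5          # rebinds the local name only

def foo_by_box(box):
    box[0] = 5        # hypothetical emulation: mutate a shared container

x = 1
foo(x)
x_after_foo = x       # x does not become 5

box = [1]
foo_by_box(box)
boxed_after = box[0]  # the container's content did change
```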

Why the resistance to these simple and basic terms that apply to any
OOP language?

Here lies, IMHO, the reason why you think you need Python to 'pass by
value'. As you believe that variables must contain something, you think
that assignment is about copying the content of a variable. Assignment
in Python is simply giving a new name to an object.

What does "give a new name to an object" mean? I submit that it means
exactly the same thing as "assigns the name to refer to the object".
There certainly is no difference in behavior that anyone has been able
to point out between what assignment does in Python, and what
assignment does in RB, VB.NET, Java, or C++ (in the context of object
pointers, of course). If the behavior is the same, why should we make
up our own unique and different terminology for it?

To understand variables (which I prefer to call 'names') and function
calls in Python you need simply to understand that:

- a variable is a name for an object

A reference to an object, got it.

- assignment is naming an object

Assigning the reference to the object, yes.

- the parameters of a function are local names for the call arguments

Agreed; they're not aliases of the call arguments.

(I guess 'pass by object' is a good name).

Well, I'm not sure why that would be. What you've just described is
called "pass by value" in every other language.

I would say that an object is passed, not a reference.

That seems to contradict the actual behavior, as well as what you said
yourself above. The only way I know how to interpret "an object is
passed" is "the data of that object is copied onto the stack". But of
course that's not what happens. What actually happens is what you
said above: a name (reference) is assigned to the object. The name is
a reference; it is made to refer to the same thing that the argument
(actual parameter) referred to. This is exactly what "the reference
is passed" means, nothing more or less.

You don't need this to decide. This is what happens:

x = Bar() # Call this new Bar object 'x'

In other words, make 'x' refer to this new object.

foo(x) # call function foo with argument the object known as 'x'

Yes. But what does that mean? Does the parameter within 'foo' become
an alias of x, or a copy of it? That's what we DO need to decide.

# Now, in foo:
def foo(x): # Call 'x' locally the object passed to foo

I think you mean here that local 'x' is made to refer to the object
passed to foo. Agreed. It is NOT an alias of the actual parameter.
And that's what we need to know. So it's not call-by-reference, it's
call-by-value; the value of x (a reference to whatever object Bar()
returned) is copied from the value of the parameter (a reference to
that same object, of course).

x = Foo() # Call 'x' locally this new Foo object.

Make 'x' refer to the new Foo() result, yes.

Obviously after all this, 'x' is still the name of the Bar object
created at the start.

Obvious only once you've determined that Python is call-by-value. If
it were like FORTRAN, where everything is call-by-reference, then that
wouldn't be the case at all; the assignment within the function would
affect the variable (or name, if you prefer) passed in.

To sum up: for 'pass by value' to make sense in Python you need to
create an unnecessarily complex model of how Python works.

I have to disagree. The model is simple, self-consistent, and
consistent with all other languages. It's making up terms and trying
to justify them and get out of the logical knots (such as your claim
above that the object itself is passed to a method) that is
unnecessarily complex.

By letting go of 'pass by value' you can simplify your model of the
language (keeping it correct of course) and it fits in your brain more
easily.

I'm afraid I don't see that.

Of course your own model is valid but there is a better one which is
easier to grasp for people without a background in C/C++-like
languages.

Well, if by C/C++-like languages, you mean also Java, VB.NET, and so
on, then maybe you're right -- perhaps my view is colored by my
experience with those. But alternatively, perhaps there are enough
Python users without experience in other OOP languages, that the
standard terminology was unfamiliar to them, and they made up their
own, resulting in the current linguistic mess.

Best,
- Joe

Terry Reedy · Nov 6, 2008

Aaron said:
and you can't have languages without implementations.

This is either a claim that languages are their own implementations, or
an admission that human brains are the implementation of all languages
thought of by human brains, coupled with a claim that there cannot be
languages not thought of by humans, or just wrong in terms of the common
meaning of algorithmic implementation of an interpreter. There are many
algorithm languages without such implementations. They were once called
'pseudocode'. (I don't know if that term is still used much.) Hence my
oxymoronic definition, a decade ago, of Python as 'executable pseudocode'.

Though, you do not need to understand the
implementation to understand the language.

Languages are typically defined before there is any implementation.
They are implemented if they are understood as being worth the effort.

I haven't thought it through completely, but now that Joe mentions it,
it appears Python behaves the same as C++,

Python and C/C++ have completely different object and data models. You
and Joe are completely welcome to understand/model Python in terms of
the C implementation, if you wish, but it is not necessary and, I believe,
a disservice to push such a model on beginners as *the* way to model Python.

if variables can't be
anything but pointers, parameters are all c-b-v, and you can't
dereference the l-h-s of an assignment ( '*a= SomeClass()' ).

I don't understand this.

When you're explaining Python to a beginner, you have to introduce a
new term either way. You'll either have to explain pointers, which

Python has no concept of pointers.

there are chapters and chapters on in introductory textbooks; or,
you'll have to explain a new calling mechanism, and give it a name.

I disagree. In everyday life, we have multiple names for objects and
classes of objects. We nearly always define procedures in terms of
generic object classes (common nouns) rather than particular objects.
We apply procedures by binding generic names to particular objects, by
associating them with objects, by filling in the named blanks. This is
what Python does. Except for the particular formalized syntax, there is
nothing new, not even the idea of a specialized procedure language.

def make_cookies(flour, oil, sugar, egg, bowl, spoon, cookie_pan, oven):
    """flour, oil, and sugar are containers with respectively at least 2
    cups, 1 tablespoon, and 1/4 cup of the indicated material"""
    mix(bowl, spoon, measure(flour, '2 cups'), measure(sugar, '1/4 cup'))
    # <etc>

OK, there is one important difference. The concrete referents for the
parameters are non-physical information objects rather than physical
objects.

I agree, and anything else would be overcomplicated.

To me, 'pass' implies movement, but in Python, objects don't really have
a position and hence cannot move. So I would say that an object is
associated (with a parameter). 'Associating' is a primitive operation
for Python interpreters. If and when all parameters get associated with
something, the function code is activated. Then the return object gets
associated with the call expression in the computation graph.

Terry Jan Reedy

Aaron Brady · Nov 6, 2008

This is either a claim that languages are their own implementations, or
an admission that human brains are the implementation of all languages
thought of by human brains, coupled with a claim that there cannot be
languages not thought of by humans, or just wrong in terms of the common
meaning of algorithmic implementation of an interpreter. There are many
algorithm languages without such implementations. They were once called
'pseudocode'. (I don't know if that term is still used much.) Hence my
oxymoronic definition, a decade ago, of Python as 'executable pseudocode'.

I was just saying that Joe has a good idea for one way to implement
Python.

Python and C/C++ have completely different object and data models. You
and Joe are completely welcome to understand/model Python in terms of
the C implementation, if you wish, but it is not necessary and, I believe,
a disservice to push such a model on beginners as *the* way to model Python.

You're asking about people's thought processes when they write a
program. In chess, I sometimes think, "If I move here, he moves
there," etc., but not always; sometimes I just visualize pieces in
places. In Python, if I want a function that, say, duplicates the
first and last elements of a sequence type, I visualize one parameter,
and some code that operates on it.

def duper( a ):
    a.insert( 0, a[ 0 ] )
    a.append( a[ -1 ] )

Someone else with no experience, or certain incompatible experience,
might visualize this:

def duper( a ):
    a= a[ 0 ] + a + a[ -1 ]

As you can see, they know a lot of Python already. What are they
missing?
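One possible answer, as a minimal sketch: the second version only rebinds the local name `a`, so the caller's list is never touched. The function names `duper_in_place` and `duper_rebind` are made up for illustration, and the rebinding version wraps the end elements in lists so that the concatenation is valid for a list argument.

```python
def duper_in_place(a):
    # Mutates the object that the caller's name also refers to.
    a.insert(0, a[0])
    a.append(a[-1])


def duper_rebind(a):
    # Builds a new list and rebinds only the local name `a`.
    a = [a[0]] + a + [a[-1]]
    return a


xs = [1, 2, 3]
duper_in_place(xs)          # xs itself is changed

ys = [1, 2, 3]
result = duper_rebind(ys)   # ys is unchanged; the new list is only in `result`
```

Under this reading, what the newcomer is missing is that assignment to a parameter name never reaches back into the caller's namespace.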

I don't understand this.

If you restricted a C program as follows:
- variables can only be pointers
- no pointers to pointers
- no dereferencing l-h-s of assignment statements

It would look at the very least a lot like Python, or as Joe holds,
exactly like Python.

snip.

I disagree. In everyday life, we have multiple names for objects and
classes of objects. We nearly always define procedures in terms of
generic object classes (common nouns) rather than particular objects.
We apply procedures by binding generic names to particular objects, by
associating them with objects, by filling in the named blanks. This is
what Python does. Except for the particular formalized syntax, there is
nothing new, not even the idea of a specialized procedure language.

To examine the natural language (NL) equivalents of these issues a
little.

Here's a question I posed to Joe:

a) 'Fido' is a reference to a dog
b) Fido is a reference to a dog
c) 'Fido' is the name of a dog
d) Fido is the name of a dog
e) 'Fido' is a dog
f) Fido is a dog

Which are true?

I have no problems with (c) and (f), and many natural languages
(possibly being strictly spoken) do not distinguish (c) from (d).
Similarly,

Old McDonald had a dog and Bingo was its name.
(*) Old McDonald had a dog and Bingo was it. (* marks non-standard)
(*) Old McDonald had a dog and it was Bingo.

I want to give a code name to Fido: Rex. What language do I use to
inform my audience ("spies"), and what statements are subsequently
true? Then, I want to write instructions to friends on how to walk
dogs in my neighborhood, still in natural language.

_How to walk a dog_
Take the dog out the door
Turn right at the end of the drive
etc.

Nothing mysterious. Same thing with instructions for vaccinating
them. However, if I want to model dogs, simulate them, etc., I'll
keep some running state information about them: think its chart at the
vet.

Dog:
Fido:
Last walked: 10 a.m.
Needs to be walked: False

At 4 p.m., I'm going through the charts, and I notice Fido hasn't been
walked, so I change the 'Needs to be walked' entry to true.

Then some members of "spies" walk into the vet and ask if Rex needs
to be walked.

What things in this story are names, variables, dogs, references,
objects, pointers, types, classes, procedures, functions, arguments,
parameters, formal parameters, by-value parameters, by-reference
parameters, by-share parameters, etc.? How many names does Fido have,
what are they, what is Fido, what is Rex, etc.? Keep in mind there
are seldom quotes in spoken language; people have to make gestures or
be explicit in such cases as they need to indicate their presence.

"Quote Fido quote is the name of a dog, but Fido is not." (Actual
transcription.)

Keep answers to less than 500 words. Also, reevaluate the uses of
'the' and 'a' in the multiple choice question: Is 'Fido' the name of a
dog, a name of a dog, or a name of the dog?

def make_cookies(flour, oil, sugar, egg, bowl, spoon, cookie_pan, oven):
    """flour, oil, and sugar are containers with respectively at least 2
    cups, 1 tablespoon, and 1/4 cup of the indicated material"""
    mix(bowl, spoon, measure(flour, '2 cups'), measure(sugar, '1/4 cup'))

    yield cookies #yum

OK, there is one important difference. The concrete referents for the
parameters are non-physical information objects rather than physical
objects.

This is actually underestimated. Equality and identity as defined on
mathematical ("ideal") / logical objects isn't the same as our "real"
notions of them. Words are sort of in between, giving rise to e.g.
the difference between numbers and numerals.

To me, 'pass' implies movement, but in Python, objects don't really have
a position and hence cannot move. So I would say that an object is
associated (with a parameter).

Further complicating matters, the function, its parameters, variables,
code, etc. remain in existence between calls to it. Argh!

'Associating' is a primitive operation for Python interpreters.

Interesting. If I may be so bold as to ask, is it for C code, C
compilers, and/or C programs?

snip.

All due respect.

Steven D'Aprano · Nov 6, 2008

that's pretty much the whole point: that variables in Python don't contain
objects, but merely contain references to objects that are actually
stored somewhere else (i.e. on the heap).

You're wrong, Python variables don't contain *anything*. Python variables
are names in a namespace. But putting that aside, consider the Python
code "x = 1". Which statement would you agree with?

(A) The value of x is 1.

(B) The value of x is an implementation-specific thing which is
determined at runtime. At the level of the Python virtual machine, the
value of x is arbitrary and can't be determined.

If you answer (A), then your claim that Python is call-by-value is false.
If you answer (B), then your claim that Python is call-by-value is true
but pointless, obtuse and obfuscatory.

This is explicitly stated in
the Python docs [1], yet many here seem to want to deny it.

[1] http://www.python.org/doc/2.5.2/ext/refcounts.html

You have a mysterious and strange meaning of the word "explicitly". Would
you care to quote what you imagine is this explicit claim?

Looks to me that even if there were ten levels of indirection you would
still insist that it's a "pass by value" because in the end, it's the
actual memory address of the first pointer in the queue that is passed.

No, it's entirely possible for an OOP language to have pass by
reference, and, to put it your way, this adds one more level of
indirection. Look at the C++ and RB/VB.NET examples at [2]. But Python
and Java do not offer this option.

Yes, you are right, Python does not offer pass by reference. The
canonical test for "call by reference" behaviour is to write a function
that does this:

x = 1
y = 2
swap(x, y)
assert x == 2 and y == 1

If you can write such a function, your language may be call-by-reference.
If you can't, it definitely isn't c-b-r. You can't write such a function
in standard Python, so Python isn't c-b-r.
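A quick sketch of the attempt in Python. The body of `swap` is an assumption about what a newcomer might write; it rebinds only the function's own local names.

```python
def swap(a, b):
    # Rebinds the local names a and b; the caller's x and y are not involved.
    a, b = b, a


x = 1
y = 2
swap(x, y)
# x and y still refer to the objects they referred to before the call.
```

The idiomatic way to exchange two names in Python is the tuple assignment `x, y = y, x` in the caller's own scope, which is not a function call at all.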

The canonical test for "call by value" semantics is if you can write a
function like this:

x = [1] # an object that supports mutation
mutate(x)
assert x == [1]

If mutations to an argument in a function are *not* reflected in the
caller's scope, then your language may be call-by-value. But if mutations
are visible to the caller, then your language is definitely not c-b-v.

Python is neither call-by-reference nor call-by-value.

If that is what you mean, it is obviously trivially true - but then ALL
calling can only be described as "call by value" - which makes nonsense
of what the CS people have been doing all these years.

That would indeed be nonsense. But it's also not what I'm saying. See
[2] again for a detailed discussion and examples. Call-by-value and
call-by-reference are quite distinct.

And also a false dichotomy.

It really is, though. You have to know how the formal parameter relates
to the actual parameter. Is it a copy of it, or an alias of it?

And by definition, "call by value" means that the parameter is a copy. So
if you pass a ten megabyte data structure to a function using call-by-
value semantics, the entire ten megabyte structure is copied.

Since this does not happen in Python, Python is not a call-by-value
language. End of story.

Without knowing that, you don't know what assignments to the formal
parameter will do, or even what sort of arguments are valid. Answer:
it's a copy of it.

Lies, all lies. Python doesn't copy variables unless you explicitly ask
for a copy. That some implementations of Python choose to copy pointers
rather than move around arbitrarily large blocks of memory instead is an
implementation detail. It's an optimization and irrelevant to the
semantics of argument passing in Python.

Assignments don't affect the actual parameter at
all. This is exactly what "call by value" means.

Nonsense. I don't know where you get your definitions from, but it isn't
a definition anyone coming from a background in C, Pascal or Fortran
would agree with.

Joe Strout · Nov 6, 2008

First, I want to thank everyone for your patience -- I think we're
making progress towards a consensus.

You're wrong, Python variables don't contain *anything*. Python variables
are names in a namespace.

I think we're saying the same thing. What's a name? It's a string of
characters used to refer to something. That which refers to something
is a reference. The thing it refers to is a referent. I was trying
to avoid saying "the value of an object reference is a reference to an
object" since that seems tautological and you don't like my use of the
word "value," but I see you don't like "contains" either.

Maybe we can try something even wordier: a variable in Python is, by
some means we're not specifying, associated with an object. (This is
what I mean when I say it "refers" to the object.) Can we agree that
far?

Now, when you pass a variable into a method, the formal parameter gets
associated with the same object the actual parameter was associated with.
I like to say that the object reference gets copied into the formal
parameter, since that's a nice, simple, clear, and standard way of
describing it. I think you object to this way of saying it. But are
we at least in agreement that this is what happens?

But putting that aside, consider the Python
code "x = 1". Which statement would you agree with?

(A) The value of x is 1.

Only speaking loosely (which we can get away with because numbers are
immutable types, as pointed out in the last section of [1]).

(B) The value of x is an implementation-specific thing which is
determined at runtime. At the level of the Python virtual machine, the
value of x is arbitrary and can't be determined.

Hmm, this might be true to somebody working at the implementation
level, but I think we're all agreed that that's not the level of this
discussion. What's relevant here is how the language actually
behaves, as observable by tests written in that language.

If you answer (A), then your claim that Python is call-by-value is false.

Correct.

If you answer (B), then your claim that Python is call-by-value is true
but pointless, obtuse and obfuscatory.

Correct again. My answer is:

(C) The value of x is a reference to an immutable object with the
value of 1. (That's too wordy for casual conversation so we might
casually reduce this to (A), as long as we all understand that (A) is
not actually true. It's a harmless fiction as long as the object is
immutable; it becomes important when we're dealing with mutable
objects.)

This is explicitly stated in
the Python docs [1], yet many here seem to want to deny it.

[1] http://www.python.org/doc/2.5.2/ext/refcounts.html

You have a mysterious and strange meaning of the word "explicitly". Would
you care to quote what you imagine is this explicit claim?

A few samples: "The chosen method is called reference counting. The
principle is simple: every object contains a counter, which is
incremented when a reference to the object is stored somewhere, and
which is decremented when a reference to it is deleted. When the
counter reaches zero, the last reference to the object has been
deleted and the object is freed. ...Python uses the traditional
reference counting implementation..."

This seems like a point we really shouldn't need to argue. Do you
really want to defend the claim that Python does not use references?

Yes, you are right, Python does not offer pass by reference. The
canonical test for "call by reference" behaviour is to write a function
that does this:

x = 1
y = 2
swap(x, y)
assert x == 2 and y == 1

If you can write such a function, your language may be call-by-reference.
If you can't, it definitely isn't c-b-r. You can't write such a function
in standard Python, so Python isn't c-b-r.

Whew! That's a relief. A week ago (or more?), it certainly sounded
like some here were claiming that Python is c-b-r (usually followed by
some extended hemming and hawing and except-for-ing to explain why you
couldn't do the above).

The canonical test for "call by value" semantics is if you can write a
function like this:

x = [1] # an object that supports mutation
mutate(x)
assert x == [1]

If mutations to an argument in a function are *not* reflected in the
caller's scope, then your language may be call-by-value. But if mutations
are visible to the caller, then your language is definitely not c-b-v.

Aha. So, in your view, neither C, nor C++, nor Java, nor VB.NET are
c-b-v, since all of those support passing an object reference into a
function, and using that reference to mutate the object.

Your view is at odds with the standard definition, though; in fact I'm
pretty sure we could dig up C and Java specs that explicitly spell out
their c-b-v semantics, and RB and VB.NET pretty clearly mean "ByVal"
to indicate by-value in those languages.

The canonical test of c-b-v is whether a *reassignment* of the formal
parameter is visible to the caller. Simply using the parameter for
something (such as dereferencing it to find and change data that lives
on the heap) doesn't prove anything at all about how the parameter was
passed.
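A small Python sketch of that distinction; `reassign` and `mutate` are illustrative names, not taken from the thread.

```python
def reassign(param):
    # Rebinding the formal parameter: invisible to the caller.
    param = [99]


def mutate(param):
    # Using the parameter to reach the shared object: visible to the caller.
    param.append(2)


x = [1]
reassign(x)
after_reassign = list(x)   # snapshot taken before any mutation

mutate(x)
after_mutate = list(x)
```

Whether one then calls this behaviour "call by value", "call by object" or "call by sharing" is exactly the terminology question being argued here; the sketch only pins down what is observable.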

Python is neither call-by-reference nor call-by-value.

That can be true only if at least one of the following is true:

1. Python's semantics are different from C/C++ (restricted to
pointers), Java, and RB/VB.NET; or
2. C/C++ (restricted to pointers), Java, and RB/VB.NET are not call-by-
value.

I asked you before which of these you believed to be the case, so we
could focus on that, but I must have missed your reply. Can you
please clarify?

From your above c-b-v test, I guess you would argue point 2, that
none of those languages are c-b-v. If so, then we can proceed to
examine that in more detail. Is that right?

That would indeed be nonsense. But it's also not what I'm saying. See
[2] again for a detailed discussion and examples. Call-by-value and
call-by-reference are quite distinct.

And also a false dichotomy.

I've never claimed these are the only options; just that they're the
only ones actually used in any of the languages under discussion. If
you think Python uses call by name, call by need, call by macro
expansion, or something else at [2], please do say which one. "Call
by object", as far as I can tell, is just a made-up term for call-by-
value when the value is an object reference. (And I'm reasonably OK
with that as long as we're all agreed that that is what it means.)

And by definition, "call by value" means that the parameter is a copy. So
if you pass a ten megabyte data structure to a function using
call-by-value semantics, the entire ten megabyte structure is copied.

Right. And if (as is more sensible) you pass a reference to a ten MB
data structure to a function using call-by-value, then the reference
is copied.

Since this does not happen in Python, Python is not a call-by-value
language. End of story.

So your claim is that any language that includes references (which is
all OOP languages, as far as I'm aware), is not call-by-value?

Lies, all lies. Python doesn't copy variables unless you explicitly ask
for a copy.

Hmm, I'm struggling to understand why you would say this. Perhaps you
mean that Python doesn't copy *objects* unless you explicitly ask for
a copy. That's certainly true. But it does copy references in many
circumstances, including in assignment statements, and parameter
passing.
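As a hedged illustration of that claim, the sketch below checks object identity with `is` and `id()`. The helper `append_and_report` is hypothetical, and the code says nothing about how any particular interpreter implements the binding.

```python
def append_and_report(param):
    # The parameter name refers to the very object the caller passed in.
    param.append(4)
    return id(param)


a = [1, 2, 3]
b = a                          # assignment: no new list is created
seen_inside = append_and_report(a)

same_after_assignment = b is a
same_inside_call = seen_inside == id(a)
```

Both names and the parameter end up associated with one list, so the appended element is visible through `a` and `b` alike.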

That some implementations of Python choose to copy pointers
rather than move around arbitrarily large blocks of memory instead is an
implementation detail. It's an optimization and irrelevant to the
semantics of argument passing in Python.

I agree that under the hood, there are probably other ways to get the
same behavior. What's important is to know whether the formal
parameter is an alias of the actual parameter, or its own independent
local variable that (let me try to say it more like your way here)
happens to be initially associated with the same referent as the
actual parameter. This obviously has behavioral consequences, as you
showed above.

A concise way to describe the behavior of Python and other languages
is to simply say: the object reference is copied into the formal
parameter.

Nonsense. I don't know where you get your definitions from, but it isn't
a definition anyone coming from a background in C, Pascal or Fortran
would agree with.

Well I can trivially refute that by counterexample: I come from a
background in C, Pascal, and FORTRAN, and I agree with it.

As for where I get my definitions from, I draw from several sources:

1. Dead-tree textbooks
2. Wikipedia [2] (and yes, I know that has to be taken with a grain of
salt, but it's so darned convenient)
3. My wife, who is a computer science professor and does compiler
research
4. http://javadude.com/articles/passbyvalue.htm (a brief but excellent
article)
5. Observations of the "ByVal" (default) mode in RB and VB.NET
6. My own experience implementing the RB compiler (not that
implementation details matter, but it forced me to think very
carefully about references and parameter passing for a very long time)

Not that I'm trying to argue from authority; I'm trying to argue from
logic. I suspect, though, that your last comment gets to the crux of
the matter, and reinforces my guess above: you don't think c-b-v means
what most people think it means. Indeed, you don't think any of the
languages shown at [1] are, in fact, c-b-v languages. If so, then we
should focus on that and see if we can find a definitive answer.

Best,
- Joe

[1] http://www.strout.net/info/coding/valref/
[2] http://en.wikipedia.org/wiki/Evaluation_strategy

Steve Holden · Nov 7, 2008

Joe said:
Thanks, I'll do the same.

That's good to hear. Your arguments are sometimes pretty good, and
usually well made, but there's been far too much insistence on all sides
about being right and not enough on reaching agreement about how
Python's well-defined semantics for assignment and function calling
should best be described.

In other words, it's a classic communication problem.

Um, no, I've admitted that it's a reference all along. Indeed, that's
pretty much the whole point: that variables in Python don't contain
objects, but merely contain references to objects that are actually
stored somewhere else (i.e. on the heap). This is explicitly stated
in the Python docs [1], yet many here seem to want to deny it.

You refer to docs about the *implementation* of Python in C. This is
irrelevant.

It's supportive. I don't understand how/why anybody would deny that
Python names are references -- it's all over the place, from any
discussion of "reference counting" (necessary to understand the life
cycle of Python objects) to understanding the basics of what "a = b"
does. It seems absurd to argue that Python does NOT use references. So
the official documentation calmly discussing Python references, with no
caveats about it being internal implementation detail, seemed relevant.

I must say I find it strange when people try to contradict my assertion
that Python names are references to objects, when the (no pun intended)
reference implementation of the language uses "reference counting" to
track how many assignments have been made.
Though there is an equally vociferous faction who will happily jump up
and down all day shouting "objects don't have names", a tendency I have
myself been known to indulge from time to time (but usually only when
some novitiate asks how they can find out "what the name of an object
is"). Being of the old school, I do tend to think of Python names as
being reference variables in the sense of Algol 68. Thus they are
fixed-size and frequently of limited lifetime. Since assignment (whether
by name binding or to a container element) copies the reference, and
since strong references keep objects alive, this is one way to explain
why Python doesn't suffer from C++'s dangling pointer issue.
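A minimal sketch of the reference-counting point, assuming the CPython implementation (other implementations may not provide `sys.getrefcount` or may not count references at all). The reported number includes a temporary reference held by the call itself, so only the difference between the two readings is meaningful.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

alias = obj                      # binding another name copies the reference
after = sys.getrefcount(obj)

extra_references = after - before
```

On CPython the second reading is one higher than the first, which matches the description of assignment as copying a reference rather than copying an object.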

You say "names for", I say "references to". We're saying the same thing
(though I'm saying it with terminology that is more standard, at least
in the wider OOP world).

Naughty, naughty, there's that little "I'm right, you're wrong" thing
sneaking in again. I don't want to have to get the clue stick out here ...

"Variables do not contain anything" seems to be a little extreme here.
They must store information of some sort, or no Python program could
ever produce a useful output. And while the concept of "object
reference" may not exist in the language, it is definitely valid for
implementers.

Interestingly, while "variable" isn't an indexed term in the (2.6)
documentation, "reference count" appears in the glossary and the
Language Reference Manual (again, no pun intended) explicitly states in
its discussion of Python's data model (vis a vis the exact meaning of
immutability) that container objects contain references to other
objects. It shortly thereafter mentions the reference-counting technique
of the CPython implementation, but does not claim it as part of the
language.

The same section also mentions "reference to 'external' resources such
as files or windows ..." and "references to other objects".

Interestingly it is also made explicit that "for immutable types,
operations that compute new values may return a reference to any
existing object with the same type and value, while for mutable objects
this is not allowed" (and if any reader disagrees that the reasons for
this are obvious their part in this thread was long since over).
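A short sketch of what that rule permits. The guarantee for mutable objects can be checked directly; the immutable case is deliberately left unasserted because the outcome depends on the implementation.

```python
# Two list displays must produce two distinct objects,
# even though the lists compare equal.
a = []
b = []
lists_are_distinct = a is not b
lists_are_equal = a == b

# For immutable values the interpreter may or may not reuse one object,
# so the result of this identity test is implementation-dependent.
x = 1000
y = 1000
ints_may_be_shared = x is y
```

Sharing is harmless for immutable objects because nothing done through one reference can be observed through another; for mutable objects it would change program behaviour.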

There's even a built-in type called a "weak reference".
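For illustration, a sketch using the standard `weakref` module. The class `Thing` is a placeholder, and the prompt disappearance of the object after `del` relies on CPython's reference counting; the `gc.collect()` call is there as a nudge for other implementations.

```python
import gc
import weakref


class Thing:
    """Placeholder class; plain object() instances cannot be weakly referenced."""


t = Thing()
r = weakref.ref(t)          # a reference that does not keep the object alive

alive_before = r() is t     # calling the weak reference returns the object

del t                       # drop the only strong reference
gc.collect()
dead_after = r() is None    # the referent is gone, so the call returns None
```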

So any argument that the language "doesn't have the concept of object
reference (in the sense of e.g. C++ reference)" is simply stating the
obvious: that Python has no way to declare reference variables. I would
argue myself that it has no need of such a mechanism precisely because
names are object references, and I'd like to hear counter-arguments.
Consider my memory short -- I have a large dose of crotchety to go with
that if you'd like.

Both are relevant to answering simple questions, like what happens to x
in this case:

def foo(spam):
    spam = 5
foo(x)

This is a basic and fundamental thing that a programmer of a language
should know. If it's call-by-reference, then x becomes 5. If it's
call-by-value, it does not.

Well that's not true either. If I remember all the way back to my
computational science degree I seem to remember being taught that there
was call by *simple reference*, which is what I understand you to mean.
Suppose I write the following in some not-quite-Python language:

lst = ['one', 'two', 'three']

index = 1

def foo(item, i):
    i = 2
    item = "ouch"

foo(lst[index], index)

index == 2
lst == ['one', 'two', 'ouch']

With call by simple reference, after the call I would expect the
following conditions to be true:

index == 2
lst == ['one', 'ouch', 'three']

With full call by reference, however, arguably the change to the value
of index would induce the post-conditions

index == 2
lst == ['one', 'two', 'ouch']

because the reference made by the first argument depends on the value of
a variable mutated inside the function call.
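Real Python offers neither mechanism, but the two sets of post-conditions can be emulated. The sketch below is an assumption-laden model, not anything from the post: `Cell` is a hypothetical stand-in for a variable that can be passed by reference, `foo_simple` receives an element location fixed at call time, and `foo_full` receives a setter that re-evaluates `lst[index]` when it is used.

```python
class Cell:
    """Hypothetical mutable slot standing in for a by-reference variable."""

    def __init__(self, value):
        self.value = value


def foo_simple(lst, pos, index_cell):
    # Simple reference: the caller resolved lst[index] to (lst, pos)
    # before the call, so changing the index later has no effect on it.
    index_cell.value = 2
    lst[pos] = "ouch"


def foo_full(set_item, index_cell):
    # Full reference: the first argument is re-evaluated on use,
    # so it sees the index value that was changed inside the function.
    index_cell.value = 2
    set_item("ouch")


lst1 = ['one', 'two', 'three']
index1 = Cell(1)
foo_simple(lst1, index1.value, index1)

lst2 = ['one', 'two', 'three']
index2 = Cell(1)


def set_item(value):
    lst2[index2.value] = value


foo_full(set_item, index2)
```

In this model `lst1` ends up with "ouch" in the middle position and `lst2` with "ouch" in the last position, matching the two sets of post-conditions given above.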

Why the resistance to these simple and basic terms that apply to any OOP
language?

Ideally I'd like to see this discussion concluded without resorting to
democratic appeals. Otherwise, after all, we should all eat shit: sixty
billion flies can't possibly be wrong.

What does "give a new name to an object" mean? I submit that it means
exactly the same thing as "assigns the name to refer to the object".

I normally internalize "x = 3" as meaning "store a reference to the
object 3 in the slot named x", and when I see "x" in an expression I
understand it to be a reference to some object, and that the value will
be used after dereferencing has taken place.

I've seen various descriptions of Python's name binding behavior in
terms of attaching Post-it notes bearing names to the objects referenced
by the names, and I have never found them convincing. The reason for
this is that names live in namespaces, whereas values live in some other
universe altogether (that I normally describe as "object space" to
beginners, though this is not a term you will come across in the Python
literature). So I see the Post-it as being attached to a portion of some
namespace, and that little fixed-size piece of object space being
attached by a piece of string to a specific object. Of course any object
can have many pieces of string attached, and not all of them come from
names -- some of them come from container elements, for example.

There certainly is no difference in behavior that anyone has been able
to point out between what assignment does in Python, and what assignment
does in RB, VB.NET, Java, or C++ (in the context of object pointers, of
course). If the behavior is the same, why should we make up our own
unique and different terminology for it?

One reason would be that in the other languages you have other choices
as well, so you need to distinguish between them. Python is simpler, and
so I don't see us needing the terminological complexity required in the
other contexts you name, for a start. Java messed up the whole deal by
having different kinds of objects as a sacrifice to run-time speed,
thereby breeding a whole generation of programmers with little clue
about these matters, and the .NET environment also has to resort to
"boxing" and "unboxing" from time to time. I say away with comparisons
to such horrendously complex issues. One of the reasons for Python's
continued march towards world domination (allow me my fantasies) is its
consistent simplicity. Those last two words would be my candidate for
the definition of "Pythonicity".

A reference to an object, got it.

Assigning the reference to the object, yes.

Nope, storing the reference "against" the name (more exactly, in the
memory area associated with name, though I can hear hackles rising
throughout Pythonland as I type those words).

Agreed; they're not aliases of the call arguments.

They are actually names local to the function namespace, containing
references to the arguments. Some of those arguments were provided as
names, in which case the local name contains a copy of the reference
bound to the name provided as an argument. This is, however, merely a
degenerate case of the general instance, in which an expression is
provided as an argument and evaluated, yielding (a reference to) an
object which is then bound to the parameter name in the local namespace.

Well, I'm not sure why that would be. What you've just described is
called "pass by value" in every other language.

Sigh. This surely can only be true if you insist that references are
themselves values. I hold that they are not. It seems so transparent to
me that the parameters are copies of the references passed as arguments
I find it difficult to understand how, or why, anyone would
conceptualize it differently.

That seems to contradict the actual behavior, as well as what you said
yourself above. The only way I know how to interpret "an object is
passed" is "the data of that object is copied onto the stack". But of
course that's not what happens. What actually happens is what you said
above: a name (reference) is assigned to the object. The name is a
reference; it is made to refer to the same thing that the argument
(actual parameter) referred to. This is exactly what "the reference is
passed" means, nothing more or less.

OK, so above you argue quite cogently that Python uses a
reference-passing mechanism. This makes your insistence in the preceding
paragraph on calling it "pass by value" a little stubborn.

In other words, make 'x' refer to this new object.

So far so good.

Yes. But what does that mean? Does the parameter within 'foo' become
an alias of x, or a copy of it? That's what we DO need to decide.

I think you mean here that local 'x' is made to refer to the object
passed to foo. Agreed. It is NOT an alias of the actual parameter.
And that's what we need to know. So it's not call-by-reference, it's
call-by-value; the value of x (a reference to whatever object Bar()
returned) is copied from the value of the parameter (a reference to that
same object, of course).

Sigh again. You appear to want to have your cake and eat it. You are, in
effect, saying "there are no values in Python, only references",
completely ignoring the fact that it is semantically impossible to have
a reference without having something to *refer to* (which we in the
Python world, in our usual sloppy way, often call "a value").

In Algol 68 terms what you are saying is "they are refs, not ref refs".

I suspect this may be at the root of our equally stubborn insistence
that calling this mechanism "pass by value" is inviting
misunderstanding. If we didn't want to eliminate misunderstanding we
would all have stopped replying to you long ago.

Make 'x' refer to the new Foo() result, yes.

Obvious only once you've determined that Python is call-by-value. If it
were like FORTRAN, where everything is call-by-reference, then that
wouldn't be the case at all; the assignment within the function would
affect the variable (or name, if you prefer) passed in.

Sorry, I'd need to be a Fortran expert to decipher that or make a
judgment on its validity, so your appeal to the masses goes out of the
window.

I have to disagree. The model is simple, self-consistent, and
consistent with all other languages. It's making up terms and trying to
justify them and get out of the logical knots (such as your claim above
that the object itself is passed to a method) that is unnecessarily
complex.

I have to disagree. The model is clearly based on a wrong-headed
interpretation of a fairly exact understanding of Python's semantics.

I'm afraid I don't see that.

Well, if by C/C++-like languages, you mean also Java, VB.NET, and so on,
then maybe you're right -- perhaps my view is colored by my experience
with those. But alternatively, perhaps there are enough Python users
without experience in other OOP languages, that the standard terminology
was unfamiliar to them, and they made up their own, resulting in the
current linguistic mess.

Well, I started with Simula and SmallTalk back in 1973, so my experience
may be a bit light. Sorry about that. This terminology wasn't made up by
Python beginners, but by the people who invented Python. I believe they
did so on the grounds that it's easier for beginners to understand
Python's semantics without having to reference too many similar in
theory but confusingly different in practice other environments.

I would even argue that your confusion supports this argument. Your
understanding of Python is perfectly adequate, so get with the program
for Pete's sake!

regards
Steve

Terry Reedy · Nov 7, 2008

Aaron said:
Interesting. If I may be so bold as to ask, is it for C code, C
compilers, and/or C programs?

Sorry, I should have specified for the abstract definition of a Python
interpreter, and perhaps for people. Computer programs need an
implementation in terms of computer memory, etc.

Steve Holden · Nov 7, 2008

Steven said:
On Thu, 06 Nov 2008 09:59:37 -0700, Joe Strout wrote: [...]
And by definition, "call by value" means that the parameter is a copy. So
if you pass a ten megabyte data structure to a function using call-by-
value semantics, the entire ten megabyte structure is copied.

Since this does not happen in Python, Python is not a call-by-value
language. End of story.

Without knowing that, you don't know what assignments to the formal
parameter will do, or even what sort of arguments are valid. Answer:
it's a copy of it.

Lies, all lies. Python doesn't copy variables unless you explicitly ask
for a copy. That some implementations of Python choose to copy pointers
rather than move around arbitrarily large blocks of memory instead is an
implementation detail. It's an optimization and irrelevant to the
semantics of argument passing in Python.
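
The claim that nothing is copied can be checked from within the language. This is a small sketch; the function names and the list size are arbitrary choices, not part of the original posts.

```python
def identity_of(param):
    # Report the identity of whatever object the parameter refers to.
    return id(param)


def append_marker(param):
    # Mutate the object the parameter refers to.
    param.append("marker")


big = list(range(1_000_000))  # stands in for the "ten megabyte" structure

# The function sees the very same object, not a copy of it.
same_object = identity_of(big) == id(big)

# A mutation made through the parameter is visible to the caller.
append_marker(big)
caller_sees_change = big[-1] == "marker"
```

Whether this observation is best labelled "call by value" is exactly what the thread disputes; the sketch only shows the behaviour.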

[...]

Are you sure you meant to write this?

regards
Steve

Steve Holden · Nov 7, 2008

Joe said:
First, I want to thank everyone for your patience -- I think we're
making progress towards a consensus.
Phew.

But putting that aside, consider the Python
code "x = 1". Which statement would you agree with?

(A) The value of x is 1.

Only speaking loosely (which we can get away with because numbers are
immutable types, as pointed out in the last section of [1]).

I'd agree this is a valid way to describe the post-condition of the
assignment.

Hmm, this might be true to somebody working at the implementation level,
but I think we're all agreed that that's not the level of this
discussion. What's relevant here is how the language actually behaves,
as observable by tests written in that language.

I'd actually say "x contains a reference to 1", claim that was
equivalent to (A) above and cite Humpty Dumpty as my justification [1].

Correct again. My answer is:

(C) The value of x is a reference to an immutable object with the value
of 1. (That's too wordy for casual conversation so we might casually
reduce this to (A), as long as we all understand that (A) is not
actually true. It's a harmless fiction as long as the object is
immutable; it becomes important when we're dealing with mutable objects.)
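
A short sketch of why the shorthand in (A) is harmless for immutable objects and matters for mutable ones. The variable names are chosen purely for illustration.

```python
# Immutable case: saying "the value of x is 1" causes no trouble.
x = 1
y = x
y += 1      # builds a new int and rebinds y; x is untouched

# Mutable case: two names refer to one list.
a = [1]
b = a
b += [2]    # extends the list in place; the change is visible through a
```

After this runs, `x` is still `1` while `a` has become `[1, 2]`, and `a is b` remains true.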

I am unsure here as to why you find it necessary to refer to 1's
immutability. How does this alter the essence of the situation?

[...]

Not that I'm trying to argue from authority; I'm trying to argue from
logic. I suspect, though, that your last comment gets to the crux of
the matter, and reinforces my guess above: you don't think c-b-v means
what most people think it means. Indeed, you don't think any of the
languages shown at [1] are, in fact, c-b-v languages. If so, then we
should focus on that and see if we can find a definitive answer.

You're right. Observable progress, I think.

regards
Steve

[1] http://www.sundials.org/about/humpty.htm

Steven D'Aprano · Nov 7, 2008

Steven said:
Steven said:

On Thu, 06 Nov 2008 09:59:37 -0700, Joe Strout wrote: [...]
And by definition, "call by value" means that the parameter is a copy.
So if you pass a ten megabyte data structure to a function using
call-by-value semantics, the entire ten megabyte structure is copied.

Since this does not happen in Python, Python is not a call-by-value
language. End of story.

Without knowing that, you don't know what assignments to the formal
parameter will do, or even what sort of arguments are valid. Answer:
it's a copy of it.

Lies, all lies. Python doesn't copy variables unless you explicitly ask
for a copy. That some implementations of Python choose to copy pointers
rather than move around arbitrarily large blocks of memory instead is
an implementation detail. It's an optimization and irrelevant to the
semantics of argument passing in Python.

[...]

Are you sure you meant to write this?

I'm not sure I understand what you're getting at. The only thing I can
imagine is that you think there's some sort of contradiction between my
two statements. I don't see why you think so. In principle one could
create an implementation of Python with no stack at all, where everything
lives in the heap, and all argument passing is via moving the actual
objects ("large blocks of memory") rather than references to the objects.

I don't mean to imply that this is a practical way to go about
implementing Python, particularly given current computer hardware where
registers are expensive and the call stack is ubiquitous. But there are
other computing models, such as the "register machine" model which uses a
hypothetically-infinite number of registers, no stack and no heap. With
sufficient effort, one could implement a Python compiler using only a
Turing Machine. Turing Machines don't have call-by-anything semantics.
What would that say about Python?

Alternatively, you're commenting about my comment about copying pointers.
I thought I had sufficiently distinguished between what takes place at
the level of the Python VM (nothing is copied without an explicit request
to copy it) and what takes place in the VM's implementation (the
implementation is free to copy whatever pointers it wants as an
optimization). If I failed to make that clear, I hope I have done so now.

Pointers do not exist at the level of the Python VM, but they clearly
exist in the C implementation. We agree that C is call-by-value. CPython
is (obviously) built on top of C's call-by-value semantics, by passing
pointers (or references if you prefer) to objects. But that doesn't imply
that Python's calling semantics are the same as C's.

Algol is call-by-name. Does anyone doubt that we could develop an Algol
implementation of Python that used nothing but call-by-name semantics at
the implementation level? Would that mean that Python was call-by-name?

The point is that Joe's argument is based on a confusion between the C
implementation (calling a function with argument x makes a copy of a
pointer to x and puts it into the stack for the function) and what
happens at the level of the Python language.

Joe Strout · Nov 7, 2008

That's good to hear. Your arguments are sometimes pretty good, and
usually well made, but there's been far too much insistence on all sides
about being right and not enough on reaching agreement about how
Python's well-defined semantics for assignment and function calling
should best be described.

In other words, it's a classic communication problem.

That's a fair point. I'll try to do better.

I must say I find it strange when people try to contradict my assertion
that Python names are references to objects, when the (no pun intended)
reference implementation of the language uses "reference counting" to
track how many assignments have been made.

I agree. It seems like we should be able to take that as a given.
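
For readers who want to see the reference counting mentioned above, here is a hedged sketch. It assumes CPython: `sys.getrefcount` is documented as implementation-specific, and the absolute numbers it returns vary between versions, so only the change between the two readings is meaningful.

```python
import sys

obj = object()

before = sys.getrefcount(obj)
alias = obj                     # bind a second name to the same object
after = sys.getrefcount(obj)

# On CPython the count goes up when another name is bound to the object.
count_increased = after > before
```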

So any argument that the language "doesn't have the concept of object
reference (in the sense of e.g. C++ reference)" is simply stating the
obvious: that Python has no way to declare reference variables. I would
argue myself that it has no need of such a mechanism precisely because
names are object references, and I'd like to hear counter-arguments.

Right. I think of it this way: every variable is an object reference;
no special syntax needed for it because that's the only type of
variable there is. (Just as with Java or .NET, when dealing with any
class type; Python is just a little more extreme in that even simple
things like numbers are wrapped in objects.)

Note: I tried to say "name" above instead of "variable" but I couldn't
bring myself to do it -- "name" seems too generic to do that job. Lots
of things have names that are not variables: modules have names,
classes have names, methods have names, and so do variables. If I say
"name," an astute listener would reasonably say "name of what" -- and
I don't want to have to say "name of some thing in a name space which
can be flexibly associated with an object" when the simple term
"variable" seems to work as well.

Well that's not true either. If I remember all the way back to my
computational science degree I seem to remember being taught that there
was call by *simple reference*, which is what I understand you to mean.
Suppose I write the following in some not-quite-Python language:

lst = ['one', 'two', 'three']

index = 1

def foo(item, i):
    i = 2
    item = "ouch"

foo(lst[index], index)
...
With call by simple reference, after the call I would expect the
following conditions to be true:

index == 2
lst == ['one', 'ouch', 'three']

Yes, I guess so, though it would require that lst[index] evaluate to
an lvalue to which the 'item' parameter could be an alias. (With the
second parameter, 'i', the situation is more straightforward because
you're passing in a simple variable rather than a more complex
expression.)

With full call by reference, however, arguably the change to the value
of index would induce the post-conditions

index == 2
lst == ['one', 'two', 'ouch']

because the reference made by the first argument depends on the value of
a variable mutated inside the function call.

I confess that I've never heard of "call by simple reference" or "call
by full reference" before. What you're describing in the second case
sounds more like call by name to me.

But I think we can agree that neither of these behaviors describes
Python.
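
For comparison, this is what real Python does with the same example. It is the snippet from above run as ordinary Python, with nothing added except comments.

```python
lst = ['one', 'two', 'three']
index = 1


def foo(item, i):
    # Both assignments rebind names local to foo and nothing else.
    i = 2
    item = "ouch"


foo(lst[index], index)

# Neither post-condition from the hypothetical languages holds here:
# index is still 1 and lst is still ['one', 'two', 'three'].
```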

Ideally I'd like to see this discussion concluded without resorting to
democratic appeals. Otherwise, after all, we should all eat shit: sixty
billion flies can't possibly be wrong.

I think I could make a good argument that the nutritional needs of
flies are different from those of humans. On the other hand, what
argument is there that the Python community should use its own unique
terminology for concepts that apply equally well to other languages?
Wouldn't communication be easier and smoother if we adopted standard
terms for standard behavior?

I normally internalize "x = 3" as meaning "store a reference to the
object 3 in the slot named x", and when I see "x" in an expression I
understand it to be a reference to some object, and that the value will
be used after dereferencing has taken place.

Works for me.

I've seen various descriptions of Python's name binding behavior in
terms of attaching Post-it notes bearing names to the objects referenced
by the names, and I have never found them convincing. The reason for
this is that names live in namespaces, whereas values live in some other
universe altogether (that I normally describe as "object space" to
beginners, though this is not a term you will come across in the Python
literature).

Agreed. That model implies that all names are global, and completely
fails to explain how one object might be named "x" and a completely
different object might also be "x" (albeit in a different namespace).
I suppose your post-its could be color-coded by namespace, and then
you could add additional warts and caveats and addendums to explain
recursion, or explain why you don't have to search all objects in
existence to find the right one every time a name is dereferenced, but
the whole thing seems like a house of cards to me.
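
A small sketch of the namespace point: the same name can be bound to different objects in different namespaces. The function name `f` and the string values are hypothetical, chosen only for illustration.

```python
x = "module-level object"


def f():
    x = "object local to f"   # a separate binding in f's own namespace
    # Return the local value together with a snapshot of f's namespace.
    return x, locals()


local_value, f_namespace = f()

# The module-level binding of 'x' is untouched by the call.
module_value = x
```

Here `f_namespace` is a dictionary whose only key is `'x'`, which makes the "name lives in a namespace" picture fairly concrete.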

So I see the Post-it as being attached to a portion of some
namespace, and that little fixed-size piece of object space being
attached by a piece of string to a specific object. Of course any object
can have many pieces of string attached, and not all of them come from
names -- some of them come from container elements, for example.

Right.

One reason would be that in the other languages you have other choices
as well, so you need to distinguish between them. Python is simpler, and
so I don't see us needing the terminological complexity required in the
other contexts you name, for a start.

OK, that's a fair argument, and I do suspect this is a big part of it
-- when your language clearly supports passing object references and
other types by-ref and by-val, and you can easily demonstrate the
difference, then there is little temptation to claim that it doesn't
do either one. But if your language supports only one of these, and
you have no choices about it and can't (within the language itself)
compare and contrast that one against another, then it is easy to make
all sorts of claims about what that one is.

But getting back to your point: is the standard terminology really
more complex than whatever else we can come up with?

Java messed up the whole deal by
having different kinds of objects as a sacrifice to run-time speed,
thereby breeding a whole generation of programmers with little clue
about these matters, and the .NET environment also has to resort to
"boxing" and "unboxing" from time to time. I say away with comparisons
to such horrendously complex issues. One of the reasons for Python's
continued march towards world domination (allow me my fantasies) is its
consistent simplicity. Those last two words would be my candidate for
the definition of "Pythonicity".

I'm with you there. To me, the consistent simplicity is exactly this:
all variables are object references, and these are always passed by
value.

They are actually names local to the function namespace, containing
references to the arguments. Some of those arguments were provided as
names, in which case the local name contains a copy of the reference
bound to the name provided as an argument. This is, however, merely a
degenerate case of the general instance, in which an expression is
provided as an argument and evaluated, yielding (a reference to) an
object which is then bound to the parameter name in the local
namespace.
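
The two cases described above, an argument given as a name and an argument given as an expression, can be sketched as follows. The function name `bind` is an assumption for illustration.

```python
def bind(param):
    # Hand back whatever object the parameter name was bound to.
    return param


source = [1, 2]

# Argument given as a name: the parameter refers to the same object.
same = bind(source) is source

# Argument given as an expression: it is evaluated first, and the
# resulting (new) object is what gets bound to the parameter.
result = bind(source + [3])
fresh = result is not source
```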

Quite right.

Sigh. This surely can only be true if you insist that references are
themselves values. I hold that they are not.

Here's an example of the above, I guess. In a language that supports
integers and doubles as simple types, stored directly in a variable,
then it is an obvious generalization that in the case of an object
type, the value is a reference to an object. (Then you can
"dereference" such a value to get to the values stored within the
object.) It is the only simple and consistent description of such a
language (which includes Java, RB, and .NET, as well as C++ if you
consider an object pointer equivalent to a reference in more modern
languages.)

But Python doesn't have those simple types, so there is a temptation
to try to skip this generalization and say that references are not
values, but rather the values are the objects themselves (despite the
dereferencing step that is still required to get any data out of
them). Well, and of course in the case of immutable objects, there is
very little observable difference between references and values.

However, it seems to me that when you start denying that the value of
an object reference is a reference to an object, this is when you get
led into a quagmire of contradictions. Perhaps I'm wrong and I just
haven't explored that path far enough, because it appears dark and
cobwebby to my eyes. I will try to give it a chance.

It seems so transparent to me that the parameters are copies of the
references passed as arguments that I find it difficult to understand
how, or why, anyone would conceptualize it differently.

Now you seem to be saying the same thing I've been saying all along.
But this really is called "pass by value" in at least RB, VB.NET, and
Java. And that makes sense to me.

OK, so above you argue quite cogently that Python uses a reference-
passing mechanism.

Yes, of course.

This makes your insistence in the preceding paragraph on calling it
"pass by value" a little stubborn.

Why? Are you really meaning to insist that the RB/VB.NET example:

Function GetAgeInDogYears(ByVal whom As Person) As Integer
    return whom.age * 7
End Function

is not actually using a by-value parameter? Or that it's not passing
an object reference?
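
A rough Python counterpart of that function, as a sketch only. The `Person` class and the two extra helper functions are assumptions added for illustration; they are not from the RB/VB.NET original.

```python
class Person:
    def __init__(self, age):
        self.age = age


def get_age_in_dog_years(whom):
    # Reads through the reference; no Person is copied.
    return whom.age * 7


def birthday(whom):
    whom.age += 1       # mutates the object shared with the caller


def replace(whom):
    whom = Person(0)    # rebinds the local name only


p = Person(6)
dog_years = get_age_in_dog_years(p)
birthday(p)
replace(p)
age_after = p.age
```

The mutation in `birthday` is visible to the caller and the rebinding in `replace` is not, so `age_after` ends up as 7.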

Sigh again. You appear to want to have your cake and eat it. You are, in
effect, saying "there are no values in Python, only references",
completely ignoring the fact that it is semantically impossible to have
a reference without having something to *refer to*

Well of course. I'm pretty sure I've said repeatedly that Python
variables refer to objects on the heap. (Please replace "heap" with
"object space" if you prefer.) I'm only saying that Python variables
don't contain any other type of value than references -- no integers
or doubles, for example. This is unlike the other languages under
discussion (and may be at the root of the confusion).

(which we in the Python world, in our usual sloppy way, often call
"a value").

Yes, and as long as we're agreed that this is only a sloppy shorthand,
I'm OK with it (especially in the case of immutable objects, where the
distinction is irrelevant).

I suspect this may be at the root of our equally stubborn insistence
that calling this mechanism "pass by value" is inviting
misunderstanding. If we didn't want to eliminate misunderstanding we
would all have stopped replying to you long ago.

Ditto right back at you.

So maybe here's the trouble: since all
Python variables are references, there is no need to distinguish
reference types from any other types (there aren't any other types).
So, with the distinction gone, there is a strong temptation to gloss
over the fact that they are references at all, and try to say that the
variables directly contain their objects.

But it seems to me that this claim quickly breaks down -- even as you
said yourself; you need instead some mental model that shows the
variables as pointing to (tied to via strings, associated via a lookup
table, or whatever) the objects, which exist in object space. In
other words, they're references.

But continuing to attempt to gloss over that fact, when you come to
parameter passing, you're then stuck trying to avoid describing it as
call by value, since if you claim that what a variable contains is the
object itself, then that doesn't fit (since clearly the object itself
is not copied). You also have to describe the assignment operator as
different from all other languages, since clearly that's not copying
the object either.

So you end up in this (to me, very strange) state where you're making
up new terms to describe the parameter behavior, and the assignment
behavior, which behavior is exactly the same as any other modern OOP
language. It makes it (again, IMHO) all seem very much more complex
and mysterious than it really is. And this all results inevitably
from trying to gloss over the fact that Python variables are references.

So, while I'm trying this path on for size (and will continue to mull
it over further), please try on this approach: boldly admit that
they're references, and embrace that fact. An assignment copies the
RHS reference into the LHS variable, nothing more or less. A
parameter copies the argument reference into the formal parameter,
nothing more or less. And all this is exactly the same as in any
other OOP language the reader is likely to know. Isn't that simple,
clear, and far easier to explain?

Well, I started with Simula and SmallTalk back in 1973, so my experience
may be a bit light. Sorry about that. This terminology wasn't made up by
Python beginners, but by the people who invented Python.

Was it? Has our BDFL weighed in on this terminology issue anywhere?
So far, the only "official" words I've found related to this
discussion are the ones plainly admitting that Python uses references
(which some in this thread seem to want to deny, though not you Steve).

I believe they did so on the grounds that it's easier for beginners to
understand Python's semantics without having to reference too many
similar in theory but confusingly different in practice other
environments.

I wonder if that could be tested systematically. Perhaps we could
round up 20 newbies, divide them into two groups of 10, give each one
a 1-page explanation either based on passing object references by-
value, or passing values sort-of-kind-of-by-reference, and then check
their comprehension by predicting the output of some code snippets.
That'd be very interesting. It's hard for me to believe that the
glossing-over-references approach really is easier for anybody, but
maybe I'm wrong.

I would even argue that your confusion supports this argument. Your
understanding of Python is perfectly adequate, so get with the program
for Pete's sake!

In my case, my understanding of Python became clear only once I
stopped listening to all the confusing descriptions here, and realized
that Python is no different from other OOP languages I already knew.

Best,
- Joe

Steve Holden · Nov 7, 2008

Steven said:
Steven said:

On Thu, 06 Nov 2008 09:59:37 -0700, Joe Strout wrote: [...]
And by definition, "call by value" means that the parameter is a copy.
So if you pass a ten megabyte data structure to a function using
call-by-value semantics, the entire ten megabyte structure is copied.

Since this does not happen in Python, Python is not a call-by-value
language. End of story.

Without knowing that, you don't know what assignments to the formal
parameter will do, or even what sort of arguments are valid. Answer:
it's a copy of it.

Lies, all lies. Python doesn't copy variables unless you explicitly ask
for a copy. That some implementations of Python choose to copy pointers
rather than move around arbitrarily large blocks of memory instead is
an implementation detail. It's an optimization and irrelevant to the
semantics of argument passing in Python.

[...]

Are you sure you meant to write this?

I'm not sure I understand what you're getting at. The only thing I can
imagine is that you think there's some sort of contradiction between my
two statements. I don't see why you think so. In principle one could
create an implementation of Python with no stack at all, where everything
lives in the heap, and all argument passing is via moving the actual
objects ("large blocks of memory") rather than references to the objects.

I don't mean to imply that this is a practical way to go about
implementing Python, particularly given current computer hardware where
registers are expensive and the call stack is ubiquitous. But there are
other computing models, such as the "register machine" model which uses a
hypothetically-infinite number of registers, no stack and no heap. With
sufficient effort, one could implement a Python compiler using only a
Turing Machine. Turing Machines don't have call-by-anything semantics.
What would that say about Python?

Alternatively, you're commenting about my comment about copying pointers.
I thought I had sufficiently distinguished between what takes place at
the level of the Python VM (nothing is copied without an explicit request
to copy it) and what takes place in the VM's implementation (the
implementation is free to copy whatever pointers it wants as an
optimization). If I failed to make that clear, I hope I have done so now.

Pointers do not exist at the level of the Python VM, but they clearly
exist in the C implementation. We agree that C is call-by-value. CPython
is (obviously) built on top of C's call-by-value semantics, by passing
pointers (or references if you prefer) to objects. But that doesn't imply
that Python's calling semantics are the same as C's.

Algol is call-by-name. Does anyone doubt that we could develop an Algol
implementation of Python that used nothing but call-by-name semantics at
the implementation level? Would that mean that Python was call-by-name?

The point is that Joe's argument is based on a confusion between the C
implementation (calling a function with argument x makes a copy of a
pointer to x and puts it into the stack for the function) and what
happens at the level of the Python language.

I am probably egregiously misunderstanding. The practical difficulty
with the "moving huge blocks of data" approach would appear to emerge
when a function that gets passed an instance of some container then
calls another function that references the same container as a global,
for example.
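
The situation described here can be sketched in a few lines. The names `registry`, `outer`, and `inner` are hypothetical. In Python as it actually behaves, the parameter and the global name refer to one and the same container, which is the property a "move the whole block" implementation would have to preserve somehow.

```python
registry = []   # a module-level container


def inner():
    # Reaches the container through the global name.
    registry.append("from inner")


def outer(container):
    # Reaches the same container through a parameter.
    container.append("from outer")
    inner()
    return container is registry


same = outer(registry)
```

Both appends land in the same list, in call order, and `same` is true.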

Again, I don't think we disagree about how Python actually works, but it
seemed to me that your statement obscured rather than helped.

regards
Steve
