The meaning of "=" (Was: tough-to-explain Python)

kj

In an earlier message, kj said:
I had not realized how *profoundly* different the meaning of the
"=" in Python's
spam = ham
is from the "=" in its
spam[3] = ham[3]


To clarify, this comes from my reading of Fredrik Lundh's pages
"Python Objects" (http://effbot.org/zone/python-objects.htm) and
"Call By Object" (http://effbot.org/zone/call-by-object.htm).
(Thanks to Chris Rebert for the pointer!)

In the first one of these pages, Lundh writes:

[START OF LENGTHY QUOTE]

Assignment statements modify namespaces, not objects.

In other words,

name = 10

means that you're adding the name 'name' to your local namespace,
and making it refer to an integer object containing the value
10.

If the name is already present, the assignment replaces the
original name:

name = 10
name = 20

means that you're first adding the name 'name' to the local
namespace, and making it refer to an integer object containing
the value 10. You're then replacing the name, making it point to
an integer object containing the value 20. The original '10'
object isn't affected by this operation, and it doesn't care.

In contrast, if you do:

name = []
name.append(1)

you're first adding the name 'name' to the local namespace, making
it refer to an empty list object. This modifies the namespace.
You're then calling a method on that object, telling it to append
an integer object to itself. This modifies the content of the
list object, but it doesn't touch the namespace, and it doesn't
touch the integer object.

Things like name.attr and name[index] are just syntactic sugar
for method calls. The first corresponds to __setattr__/__getattr__,
the second to __setitem__/__getitem__ (depending on which side
of the assignment they appear).

[END OF LENGTHY QUOTE]

Therefore, extending just a bit beyond Lundh's explanation, if we
did:

name = []
name.append(1)
name[0] = 3

...the second assignment would amount to a method call on the object
called 'name', an operation of a very different nature (according
to Lundh) from the first assignment, which is a modification of a
namespace.
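
A minimal sketch of the distinction as Lundh describes it, for illustration only (the names `other` and `before` are not from his pages):

```python
name = []
other = name            # a second name bound to the same list object
name.append(1)

# Subscript assignment is dispatched to the object's __setitem__ method,
# so the list itself changes and every name bound to it sees the change.
name[0] = 3
assert other == [3]
name.__setitem__(0, 4)  # the explicit spelling of name[0] = 4
assert other == [4]

# Plain-name assignment rebinds the name; the list object is untouched.
before = id(other)
name = [99]
assert other == [4] and id(other) == before
```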

In the second one of these pages, Lundh makes a very similar point
(third post from the bottom).

But note that Lundh appears to be contradicting himself somewhat
when he writes "Assignment statements modify namespaces, not
objects." If by "assignment statements" he means ones consisting
of a left operand, a "=", and a right operand, then according to
the rest of what he writes on this subject, this assertion applies
only to *some* assignment statements, namely those of the form

<identifier> = <expression>

and not to those like, for example,

<identifier>[<expression>] = <expression>

or

<identifier>.<identifier> = <expression>

The former are syntactic sugar for certain namespace modifications
that leave objects unchanged. The latter are syntactic sugar for
certain object-modifying method calls that leave namespaces unchanged.

At least this is how I interpret what Lundh writes.

kj
 
Paul Boddie

  <identifier> = <expression>

and not to those like, for example,

  <identifier>[<expression>] = <expression>

or

  <identifier>.<identifier> = <expression>

The former are syntactic sugar for certain namespace modifications
that leave objects unchanged.  The latter are syntactic sugar for
certain object-modifying method calls that leave namespaces unchanged.

Almost. The latter can modify namespaces - the objects themselves -
but through properties or dynamic attribute access, they may choose
not to modify such a namespace. Really, we can phrase assignment (=)
as follows:

<thing> = <expression>  # make <thing> refer to the result of <expression>

Here, <thing> has to provide something that can be made to refer to
something else, such as a name within a namespace - the first and last
of your cases - or an item or slice within a sequence - the special
second case which is actually handled differently from the other
cases.

Meanwhile, the <expression> will always provide an object to refer to,
never anything of the nature of <thing> referring to something else.
In other words, if you have this...

x[1] = y[2]

...then the <expression> which is y[2] will yield an object which is
then assigned to x[1]. The concept of y[2] is not assignable - it must
be fully evaluated and produce the object at location #2 in the
sequence for assignment.
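
A short sketch of that evaluation order, with made-up list contents for illustration:

```python
y = [["a"], ["b"], ["c"]]
x = [None, None]

x[1] = y[2]             # y[2] is evaluated first, yielding the list ["c"]

assert x[1] is y[2]     # both slots now refer to the same object

# Re-pointing the y[2] slot afterwards does not affect x[1]:
# x[1] refers to the object, not to "whatever y[2] currently is".
y[2] = ["z"]
assert x[1] == ["c"]
```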

I suppose you could say that the left-hand side <thing> is like a sign
on a signpost which always points to a real place, not another sign on
a signpost. You could stretch this analogy by treating sequences as
signposts holding many signs, each adjustable to point to something
different. Since signposts (not the individual signs) are located in
real places, they would naturally be acceptable as targets of
assignments: where the signs are allowed to point to. Indeed, this
would be a world of signposts with the occasional primitive value
mixed in to keep weary travellers interested. ;-)

Paul
 
kj

Paul Boddie said:
  <identifier> = <expression>

and not to those like, for example,

  <identifier>[<expression>] = <expression>

or

  <identifier>.<identifier> = <expression>

The former are syntactic sugar for certain namespace modifications
that leave objects unchanged.  The latter are syntactic sugar for
certain object-modifying method calls that leave namespaces unchanged.

Almost. The latter can modify namespaces - the objects themselves -
but through properties or dynamic attribute access, they may choose
not to modify such a namespace. Really, we can phrase assignment (=)
as follows:

<thing> = <expression>  # make <thing> refer to the result of <expression>

Here, <thing> has to provide something that can be made to refer to
something else, such as a name within a namespace - the first and last
of your cases - or an item or slice within a sequence - the special
second case which is actually handled differently from the other
cases.

Thanks for this correction.

OK, so, scratching from my original post the case

<identifier>.<identifier> = <expression>

(as being a special case of <identifier> = <expression>), still,
to the extent that I understand your post, the "=" in

x = 1

means something fundamentally different (in terms of Python's
underlying implementation) from the "=" in

y[0] = 1

No?

Paul Boddie said:
You could stretch this analogy by treating sequences as
signposts holding many signs, each adjustable to point to something
different.

Notionally, yes, I can see that, but there's no counterpart of this
analogy at the level of Python's implementation. The "x" above is
a sign, as you put it, i.e. an entry in a namespace, but "y[0]"
is, in essence, a method call.

kj
 
Aahz

OK, so, scratching from my original post the case

<identifier>.<identifier> = <expression>

(as being a special case of <identifier> = <expression>), still,
to the extent that I understand your post, the "=" in

x = 1

means something fundamentally different (in terms of Python's
underlying implementation) from the "=" in

y[0] = 1

No?

No. ;-) What's different is not the ``=`` but the construction of the
assignment target before ``=`` gets executed. Consider also this:

x, y = y, x
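
A small sketch of that point, for illustration: a single assignment statement can mix a plain name and a subscription as targets, and the same "=" serves both.

```python
x = 1
y = [2]

# One assignment statement, two kinds of target: a plain name and a
# subscription. The right-hand side is evaluated in full first, then each
# target is bound in turn, left to right.
x, y[0] = y[0], x

assert x == 2
assert y == [1]
```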
 
kj

Aahz said:
OK, so, scratching from my original post the case

<identifier>.<identifier> = <expression>

(as being a special case of <identifier> = <expression>), still,
to the extent that I understand your post, the "=" in

x = 1

means something fundamentally different (in terms of Python's
underlying implementation) from the "=" in

y[0] = 1

No?

No. ;-)

No??? Just when I thought I finally understood all this!

Aahz said:
What's different is not the ``=`` but the construction of the
assignment target before ``=`` gets executed.

Hmm. OK, I went to the link you posted in your other message
(http://docs.python.org/reference/simple_stmts.html#assignment-statements)
and I find this (my emphasis):

Assignment of an object to a single target is recursively defined as follows.

* If the target is an identifier (name):

  o If the name does not occur in a global statement in
    the current code block: the name is bound to the object
                            ^^^^^^^^^^^^^^^^^
    in the current local namespace.

  o Otherwise: the name is bound to the object in the
               ^^^^^^^^^^^^^^^^^
    current global namespace.

  The name is rebound if it was already bound. This may cause
  the reference count for the object previously bound to the
  name to reach zero, causing the object to be deallocated and
  its destructor (if it has one) to be called.

* If the target is a target list enclosed in parentheses or in
  square brackets... (I'LL IGNORE THIS FOR NOW)

* If the target is an attribute reference: The primary expression
  in the reference is evaluated. It should yield an object with
  assignable attributes; if this is not the case, TypeError is
  raised. That object is then asked to assign the assigned
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  object to the given attribute; if it cannot perform the
  ^^^^^^
  assignment, it raises an exception (usually but not necessarily
  AttributeError).

* If the target is a subscription: The primary expression in
  the reference is evaluated. It should yield either a mutable
  sequence object (such as a list) or a mapping object (such
  as a dictionary). Next, the subscript expression is evaluated.

  If the primary is a mutable sequence object (such as a
  list),... [CONDITIONS ON THE INDEX EXPRESSION OMITTED]...
  the sequence is asked to assign the assigned object to its
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  item with that index....

  If the primary is a mapping object (such as a dictionary),...
  [CONDITIONS ON THE SUBSCRIPT EXPRESSION OMITTED]... the
                                                      ^^^
  mapping is then asked to create a key/datum pair which maps
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  the subscript to the assigned object.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* If the target is a slicing: [INDEX STUFF OMITTED]... the
                                                       ^^^
  sequence object is asked to replace the slice with the items
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  of the assigned sequence...
  ^^^^^^^^^^^^^^^^^^^^^^^^

OK, I originally interpreted what Lundh wrote in his two articles
as saying that the "binding" described at the very beginning (i.e. when the
target is an identifier), which I take to make an entry or modify
such an entry in a namespace, is a *categorically different*
operation from the remaining operations underlined above.

I interpreted Paul Boddie's correction in his response to me as
saying that the "assignment" mentioned for the case when the target
is an attribute reference is actually a special case of the
"assignment" to simple identifiers (i.e. it also means "binding").

But that still leaves all the other "assignments" (or the like)
underlined above. I don't think that the full definitions of these
remaining cases are covered by the same rule, even though the rule
is described as "recursive." I think that the writer has something
else in mind, and in particular, something *other* than binding,
but the author remains vague on exactly what this means.

Clearly, both Lundh and the documentation draw some distinction
between "binding" and some other forms of "assignment" (which remain
ill-defined throughout). This distinction is what I was referring
to when I said that "=" means different things in different contexts.
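
To make "the object is asked to assign" concrete, here is a sketch using a hypothetical `Recorder` class (not from the documentation) that logs each request instead of storing anything. It suggests that attribute, subscription and slicing targets all end up as a method call on the primary object:

```python
class Recorder:
    """Hypothetical class that logs every request to store something."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the log.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append(("setattr", name, value))

    def __setitem__(self, key, value):
        self.log.append(("setitem", key, value))


r = Recorder()
r.colour = "red"       # attribute reference target
r["size"] = 10         # subscription target
r[1:3] = ["a", "b"]    # slicing target: the key is a slice object

assert r.log == [
    ("setattr", "colour", "red"),
    ("setitem", "size", 10),
    ("setitem", slice(1, 3, None), ["a", "b"]),
]
```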

kj
 
Terry Reedy

kj said:
To clarify, this comes from my reading of Fredrik Lundh's pages
"Python Objects" (http://effbot.org/zone/python-objects.htm) and
"Call By Object" (http://effbot.org/zone/call-by-object.htm). [snip]
[END OF LENGTHY QUOTE]

Therefore, extending just a bit beyond Lundh's explanation, if we
did:

name = []
name.append(1)
name[0] = 3

...the second assignment would amount to a method call on the object
called 'name', an operation of a very different nature (according
to Lundh) from the first assignment, which is a modification of a
namespace.

I disagree. Assignment creates an association. Modification of a
namespace, when implemented, amounts to a method call on the concrete
object, whether a Python object or not, that implements the abstraction
of a namespace. At module scope,
    name = ob
is the same as
    globals()['name'] = ob
Within a class statement, substitute '<class-dict>' for 'globals'.
Within functions, CPython uses an internal array, so
    name = ob
becomes
    <locals_array>[name-number] = ob

Or, to put it another way, Python dicts and lists are, considered
abstractly, associations also, just like namespaces. Dicts are more
general than namespaces, sequences are 'number-spaces' instead of
name-spaces.
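
A rough sketch of the three cases above, assuming CPython; the names `module_ns`, `Spam` and `f` are invented for the example:

```python
# Module (global) scope: running "name = 10" against a dict we supply
# stores the binding in that dict.
module_ns = {}
exec("name = 10", module_ns)
assert module_ns["name"] == 10

# Class scope: names bound in the class body end up in the class dict.
class Spam:
    ham = 1

assert Spam.__dict__["ham"] == 1

# Function scope: CPython compiles local names to slots in a fixed array;
# the code object lists which names were given slots.
def f():
    eggs = 2
    return eggs

assert "eggs" in f.__code__.co_varnames
```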

Terry Jan Reedy
 
Aahz

kj said:
[snip]
Clearly, both Lundh and the documentation draw some distinction
between "binding" and some other forms of "assignment" (which remain
ill-defined throughout). This distinction is what I was referring
to when I said that "=" means different things in different contexts.

Consider this:

x = 1
globals()['x'] = 1
locals()[1] = 1

What's the difference between the three? Although there's a lot of
machinery amenable to manipulation, with the corresponding potential for
breaking the standard model, fundamentally it all comes down to the
intention that *some* piece of code establishes a target on the left-hand
side and binds it to the object returned by the right-hand side. IME,
except when going down to the very lowest-level details, Python is
easiest to understand when you treat all assignment statements as working
the same. It helps to remember that names and namespaces are in many
ways syntactic sugar for dicts or lists.

(Obviously, the example with locals() doesn't do what it purports to do,
but it shows how function/method namespaces work under the covers.)
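
A brief sketch of why the locals() line is only notional, assuming CPython (the function `f` and the name `z` are invented for the example): inside a function, writing to the dict returned by locals() does not rebind the real local, whereas globals() is the live module namespace.

```python
def f():
    x = 1
    locals()["x"] = 2   # writes to a dict, not to the function's real local slot
    return x

assert f() == 1

# At module scope, by contrast, globals() is the real namespace.
globals()["z"] = 3
assert z == 3
```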

Slicing is the one real exception, but again I think that it's easier to
understand by wedging your mental model into conformance with the simple
target/binding metaphor.
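
A short sketch of how slicing differs, with made-up values: an item target re-points one slot, while a slice target can replace a run of slots with a different number of items.

```python
a = [0, 1, 2, 3]
b = a

a[1] = "one"              # item target: one slot is re-pointed
assert b == [0, "one", 2, 3]

a[1:3] = ["x", "y", "z"]  # slice target: two slots replaced by three items
assert b == [0, "x", "y", "z", 3]
assert len(b) == 5
```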
 
Lawrence D'Oliveiro

It helps to remember that names and namespaces are in many
ways syntactic sugar for dicts or lists.

Interesting, though, that Python insists on maintaining a distinction
between c["x"] and c.x, whereas JavaScript doesn't bother.
 
greg

Lawrence said:
Interesting, though, that Python insists on maintaining a distinction
between c["x"] and c.x, whereas JavaScript doesn't bother.

And that distinction is a good thing. It means, for
example, that dictionaries can have methods without
colliding with the key space of the items put into
them.
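
For instance, a small sketch with a made-up dictionary whose only key happens to share its name with a dict method:

```python
d = {"keys": "a value stored under the key 'keys'"}

# Item access looks in the dictionary's contents...
assert d["keys"].startswith("a value")

# ...while attribute access still finds the dict method of the same name.
assert callable(d.keys)
assert list(d.keys()) == ["keys"]
```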
 
Aahz

It helps to remember that names and namespaces are in many
ways syntactic sugar for dicts or lists.

Interesting, though, that Python insists on maintaining a distinction
between c["x"] and c.x, whereas JavaScript doesn't bother.

Why do you say "insists"?

class AttrDict:
    def __getitem__(self, key):
        return getattr(self, key)
 
Lawrence D'Oliveiro

It helps to remember that names and namespaces are in many
ways syntactic sugar for dicts or lists.

Interesting, though, that Python insists on maintaining a distinction
between c["x"] and c.x, whereas JavaScript doesn't bother.

Why do you say "insists"?

class AttrDict:
    def __getitem__(self, key):
        return getattr(self, key)

OK, let's try it:
>>> c = {}
>>> c["x"] = 3
>>> c.x = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'x'
>>> class AttrDict:
...     def __getitem__(self, key):
...         return getattr(self, key)
...
>>> c.x = 4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'x'

Nope, still doesn't work...
 
Piet van Oostrum

Lawrence D'Oliveiro said:
LD> Aahz wrote:
Aahz> class AttrDict:
Aahz>     def __getitem__(self, key):
Aahz>         return getattr(self, key)
LD> OK, let's try it:
LD> >>> c = {}
LD> >>> c["x"] = 3
LD> >>> c.x = 4
LD> Traceback (most recent call last):
LD>   File "<stdin>", line 1, in <module>
LD> AttributeError: 'dict' object has no attribute 'x'
LD> >>> class AttrDict:
LD> ...     def __getitem__(self, key):
LD> ...         return getattr(self, key)
LD> ...
LD> >>> c.x = 4
LD> Traceback (most recent call last):
LD>   File "<stdin>", line 1, in <module>
LD> AttributeError: 'dict' object has no attribute 'x'
LD> Nope, still doesn't work...

Of course you need c = AttrDict()

And to get c.x = 4 working you also need a __setitem__.
And to get c["x"] working AttrDict should subclass dict:

but these are only minor details :=)
 
Steven D'Aprano

On Mon, 13 Jul 2009 23:22:36 +1200, Lawrence D'Oliveiro wrote:

[stupidity omitted]
Nope, still doesn't work...


Are we supposed to interpret that post as Dumb Insolence or just Dumb?
 
Aahz

Piet van Oostrum said:
LD> Aahz wrote:
Aahz> class AttrDict:
Aahz>     def __getitem__(self, key):
Aahz>         return getattr(self, key)
LD> OK, let's try it:
LD> >>> c = {}
LD> >>> c["x"] = 3
LD> >>> c.x = 4
LD> Traceback (most recent call last):
LD>   File "<stdin>", line 1, in <module>
LD> AttributeError: 'dict' object has no attribute 'x'
LD> >>> class AttrDict:
LD> ...     def __getitem__(self, key):
LD> ...         return getattr(self, key)
LD> ...
LD> >>> c.x = 4
LD> Traceback (most recent call last):
LD>   File "<stdin>", line 1, in <module>
LD> AttributeError: 'dict' object has no attribute 'x'
LD> Nope, still doesn't work...

Of course you need c = AttrDict()

Absolutely -- Lawrence really needs to learn to do his own debugging.

Piet van Oostrum said:
And to get c.x = 4 working you also need a __setitem__.

Nope. You do need __setitem__ so that this works:

c['x'] = 4

Piet van Oostrum said:
And to get c["x"] working AttrDict should subclass dict:

Definitely not. There's a perfectly good dict inside a regular class
instance already. The only reason to subclass from dict is so that you
get all the dict methods for free; however, the cost is that you get
ugly bugs because of e.g.

c['update'] = 'foo'

Overall, I think it's much better/safer to explicitly pull the dict
methods you want to use.
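
One way such a bug could show up, as a sketch only: the post does not show the class it has in mind, so the `DictSubclass` below is a guess at a dict subclass that routes item access through attributes.

```python
class DictSubclass(dict):
    """Hypothetical: a dict subclass that maps item access onto attributes."""

    def __getitem__(self, key):
        return getattr(self, key)

    def __setitem__(self, key, value):
        setattr(self, key, value)


c = DictSubclass()
assert callable(c["update"])   # an inherited method leaks into the key space
c["update"] = "foo"            # ...and this now shadows dict.update on c
assert c.update == "foo"
```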
 
Aahz

Piet van Oostrum said:
A> Nope. You do need __setitem__ so that this works:
A> c['x'] = 4

Sorry, I meant such that c.x = 4 does the same as c['x'] = 4 because
that was what the OP wanted (I think).

c.x = 4
already updates the instance dict, so there's no need to change any class
methods to support it. That is, IME it's much better to add methods to
a regular class to make it more dict-like using the built-in instance
dict rather than changing any of the attribute mechanisms. If you're
really curious, I recommend trying several approaches yourself to see
what works better. ;-)
 
Piet van Oostrum

Aahz (A) wrote:
Piet van Oostrum said:
And to get c.x = 4 working you also need a __setitem__.
A> Nope. You do need __setitem__ so that this works:
A> c['x'] = 4

Sorry, I meant such that c.x = 4 does the same as c['x'] = 4 because
that was what the OP wanted (I think).
A> c.x = 4
A> already updates the instance dict, so there's no need to change any class
A> methods to support it. That is, IME it's much better to add methods to
A> a regular class to make it more dict-like using the built-in instance
A> dict rather than changing any of the attribute mechanisms. If you're
A> really curious, I recommend trying several approaches yourself to see
A> what works better. ;-)

Yes, that's why I mentioned __setitem__. I just mixed up the motivation.

In [28]: class AttrDict:
   ....:     def __getitem__(self, key):
   ....:         return getattr(self, key)
   ....:
   ....:     def __setitem__(self, key, value):
   ....:         setattr(self, key, value)
   ....:
   ....:

In [29]: c = AttrDict()

In [30]: c["y"] = 3

In [31]: c.y
Out[31]: 3

In [32]: c.x = 4

In [33]: c['x']
Out[33]: 4
 
