Too many 'self' in Python. That's a big flaw in this language.

A.T.Hofkamp · Jun 28, 2007

It's very common and practical (though not ideologically pure!) to want
each instance of a class to "stand for itself", be equal only to itself:
this lets me place instances in a set, etc, without fuss.

Convenience is the big counter argument, and I have thought about that.
I concluded that the convenience advantage is not big enough, and the problem
seems to be what "itself" exactly means.

In object oriented programming, objects are representations of values, and the
system shouldn't care about how many instances there are of some value, just
like numbers in math. Every instance with a certain value is the same as every
other instance with the same value.

You can also see this in the singleton concept. The fact that it is a pattern
implies that it is special, something not delivered by default in object
oriented programming.

This object-oriented notion of "itself" is not what Python delivers.

Python 2.4.4 (#1, Dec 15 2006, 13:51:44)
[GCC 3.4.4 20050721 (Red Hat 3.4.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class Car(object):
...     def __init__(self, number):
...         self.number = number
...     def __repr__(self):
...         return "Car(%r)" % self.number
...
>>> Car(123) == Car(123)
False

So in Python, the default equivalence notion for numbers is based on values,
and the default equivalence notion for objects assumes singleton objects which
is weird from an object oriented point of view.

Therefore, I concluded that we are better off without a default __eq__ .

The default existence of __hash__ gives other nasty surprises:
>>> class Car2(object):
...     def __init__(self, number):
...         self.number = number
...     def __repr__(self):
...         return "Car2(%r)" % self.number
...     def __eq__(self, other):
...         return self.number == other.number
...

Above I have fixed Car to use value equivalence (albeit not very robust).
Now if I throw these objects naively in a set:

>>> a = Car2(123)
>>> b = Car2(123)
>>> a == b
True
>>> set([a,b])
set([Car2(123), Car2(123)])

I get a set with two equal cars, something that never happens with a set
my math teacher once told me.

Of course, I should have defined an appropriate __hash__ method together with
the __eq__ method. Unfortunately, not every Python programmer has always had
enough coffee to think about that when he is programming a class. Even worse, I
may get a class such as the above from somebody else and decide that I need a
set of such objects, something the original designer never intended.
The problem is then that something like "set([Car2(123), Car2(124)])" does the
right thing for the wrong reason without telling me.
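
For reference, a minimal sketch of what defining both methods together could look like. It assumes Python 3 syntax rather than the 2.4 session above, and the name Car3 is hypothetical, chosen only to keep it apart from Car and Car2:

```python
class Car3:
    """Hypothetical value-based car: equal numbers mean equal cars."""

    def __init__(self, number):
        self.number = number

    def __repr__(self):
        return "Car3(%r)" % self.number

    def __eq__(self, other):
        # Returning NotImplemented lets Python fall back to its default
        # handling when 'other' is not a Car3.
        if not isinstance(other, Car3):
            return NotImplemented
        return self.number == other.number

    def __hash__(self):
        # Hash the same data that __eq__ compares, so that
        # a == b implies hash(a) == hash(b).
        return hash(self.number)


a = Car3(123)
b = Car3(123)
cars = set([a, b])
print(a == b, len(cars))
```

With both methods in place the set keeps a single car. The number should not be changed while the object sits in a set or serves as a dictionary key, because the hash is derived from it.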

Without a default __hash__ I'd get at least an error that I cannot put Car2
objects in a set. In that setup, I can still construct a broken set, but I'd
have to write a broken __hash__ function explicitly rather than implicitly
inheriting it from object.

I don't want, in order to get that often-useful behavior, to have to
code a lot of boilerplate such as
def __hash__(self): return hash(id(self))
and the like -- so, I like the fact that object does it for me. I'd

I understand that you'd like to have less typing to do. I'd like that too if
only it would work without major accidents by simple omission such as
demonstrated in the set example.

Another question can be whether your coding style would be correct here.

Since you apparently want to have singleton objects (since that is what you get
and you are happy with them), shouldn't you be using "is" rather than "=="?
Then you get the equivalence notion you want, you don't need __eq__, and you
write explicitly that you have singleton objects.

In the same way, sets have very little value for singleton objects, you may as
well use lists instead of sets since duplicate **values** are not filtered.
For lists, you don't need __hash__ either.

The only exception would be to filter multiple inclusions of the same object
(that is what sets are doing by default). I don't know whether that would be
really important for singleton objects **in general**.
(ie wouldn't it be better to explicitly write a __hash__ based on identity for
those cases?)

have no objection if there were two "variants" of object (object itself
and politically_correct_object), inheriting from each other either way
'round, one of which kept the current practical approach while the other
made __hash__ and comparisons abstract.

Or you define your own base object class "class Myobject(object)" and add a
default __eq__ and __hash__ method. This at least gives an explicit definition
of the equivalence notion for your application.
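
A minimal sketch of that idea, assuming Python 3 syntax. The names IdentityObject and Token are made up for illustration. The base class only spells out the identity-based behavior that object already provides, so that the choice is visible in the code:

```python
class IdentityObject:
    """Hypothetical base class: every instance is equal only to itself."""

    def __eq__(self, other):
        # Equality is identity, stated explicitly.
        return self is other

    def __ne__(self, other):
        return self is not other

    def __hash__(self):
        # Consistent with __eq__: the value stays the same for as long
        # as the object lives.
        return hash(id(self))


class Token(IdentityObject):
    """Hypothetical application class that opts in to identity semantics."""

    def __init__(self, label):
        self.label = label


t1 = Token("x")
t2 = Token("x")
tokens = set([t1, t2, t1])
print(t1 == t2, len(tokens))
```

The set drops the repeated inclusion of t1 but keeps t2, even though both tokens carry the same label.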

In Python 3000, ordering comparisons will not exist by default (sigh, a
modest loss of practicality on the altar of purity -- ah well, saw it
coming, ever since complex numbers lost ordering comparisons), but
equality and hashing should remain just like now (yay!).

I didn't try that, but it seems like a good decision. Ordering based on
identity may change with each invocation of the program!

Albert

Alan Isaac · Jun 28, 2007

A.T.Hofkamp said:
>>> a = Car2(123)
>>> b = Car2(123)
>>> a == b
True
>>> set([a,b])
set([Car2(123), Car2(123)])

I get a set with two equal cars, something that never happens with a set
my math teacher once told me.

Then your math teacher misspoke.
You have two different cars in the set,
just as expected. Use `is`.
http://docs.python.org/ref/comparisons.html

This is good behavior.

Cheers,
Alan Isaac

A.T.Hofkamp · Jun 28, 2007

Alan Isaac said:
A.T.Hofkamp said:

>>> a = Car2(123)
>>> b = Car2(123)
>>> a == b
True
>>> set([a,b])
set([Car2(123), Car2(123)])

I get a set with two equal cars, something that never happens with a set
my math teacher once told me.

Then your math teacher misspoke.
You have two different cars in the set,
just as expected. Use `is`.
http://docs.python.org/ref/comparisons.html

This is good behavior.

Hmm, maybe numbers in sets are broken then?

>>> a = 12345
>>> b = 12345
>>> a == b
True
>>> a is b
False
>>> set([a,b])
set([12345])

Numbers and my Car2 objects behave the same w.r.t. '==' and 'is', yet I get a
set with 1 number, and a set with 2 cars.
Something is wrong here imho.

The point I intended to make was that having a default __hash__ method on
objects gives weird results that not everybody may be aware of.
In addition, to get useful behavior of objects in sets one should override
__hash__ anyway, so what is the point of having a default object.__hash__ ?

The "one should override __hash__ anyway" argument is being discussed in my
previous post.

Albert

Roy Smith · Jun 28, 2007

A.T.Hofkamp said:
In object oriented programming, objects are representations of values, and the
system shouldn't care about how many instances there are of some value, just
like numbers in math. Every instance with a certain value is the same as every
other instance with the same value.

Whether two things are equal depends on the context. Is one $10 note equal
to another? It depends.

If the context is a bank teller making change, then yes, they are equal.
What's more, there are lots of sets of smaller notes which would be equally
fungible.

If the context is a district attorney showing a specific $10 note to a jury
as evidence in a drug buy-and-bust case, they're not. It's got to be
exactly that note, as proven by a recorded serial number.

In object oriented programming, objects are representations of the real
world. In one case, the $10 note represents some monetary value. In
another, it represents a piece of physical evidence in a criminal trial.
Without knowing the context of how the objects are going to be used, it's
really not possible to know how __eq__() should be defined.

Let me give you a more realistic example. I've been doing a lot of network
programming lately. We've got a class to represent an IP address, and a
class to represent an address-port pair (a "sockaddr"). Should you be able
to compare an address to a sockaddr? Does 192.168.10.1 == 192.168.10.1:0?
You tell me. This is really just the "does 1 == (1 + 0j)" question in
disguise. There's reasonable arguments to be made on both sides, but there
is no one true answer. It depends on what you're doing.
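
One way to make such a choice explicit, sketched under assumptions: IPAddress and SockAddr below are hypothetical stand-ins written in Python 3, not the classes from the code base mentioned above, and treating an address as never equal to an address-port pair is only one of the defensible answers.

```python
class IPAddress:
    """Hypothetical IP address wrapper, compared by its text form."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if not isinstance(other, IPAddress):
            return NotImplemented  # not comparable: let Python decide
        return self.text == other.text

    def __hash__(self):
        return hash(self.text)


class SockAddr:
    """Hypothetical address-port pair."""

    def __init__(self, address, port):
        self.address = address
        self.port = port

    def __eq__(self, other):
        if not isinstance(other, SockAddr):
            return NotImplemented
        return (self.address, self.port) == (other.address, other.port)

    def __hash__(self):
        return hash((self.address, self.port))


addr = IPAddress("192.168.10.1")
sock = SockAddr(IPAddress("192.168.10.1"), 0)
print(addr == sock)          # both sides decline, so Python answers False
print(addr == sock.address)  # comparing like with like
```

An application that does want 192.168.10.1 == 192.168.10.1:0 to hold would change the two __eq__ methods accordingly, and would then also have to keep the two __hash__ methods in agreement.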

John Nagle · Jun 28, 2007

Alex said:
Bjoern Schliessmann wrote:
...

That would be nice, unfortunately your C++ compiler will refuse that,
and force you to use this->a instead;-).

Yes, as Stroustrup admits, "this" should have been a reference.
Early versions of C++ didn't have references.

One side effect of that mistake was the "delete(this)" idiom,
which does not play well with inheritance. But that's a digression here.

John Nagle

Steve Holden · Jun 29, 2007

A.T.Hofkamp said:
A.T.Hofkamp said:

>>> a = Car2(123)
>>> b = Car2(123)
>>> a == b
True
>>> set([a,b])
set([Car2(123), Car2(123)])

I get a set with two equal cars, something that never happens with a set
my math teacher once told me.

Then your math teacher misspoke.
You have two different cars in the set,
just as expected. Use `is`.
http://docs.python.org/ref/comparisons.html

This is good behavior.

Hmm, maybe numbers in sets are broken then?

>>> a = 12345
>>> b = 12345
>>> a == b
True
>>> a is b
False
>>> set([a,b])
set([12345])

Numbers and my Car2 objects behave the same w.r.t. '==' and 'is', yet I get a
set with 1 number, and a set with 2 cars.
Something is wrong here imho.

The point I intended to make was that having a default __hash__ method on
objects gives weird results that not everybody may be aware of.
In addition, to get useful behavior of objects in sets one should override
__hash__ anyway, so what is the point of having a default object.__hash__ ?

The "one should override __hash__ anyway" argument is being discussed in my
previous post.

Hmm, I suspect you'll like this even less:
set([1.0])

Just the same there are sound reasons for it, so I'd prefer to see you
using "counterintuitive" or "difficult to fathom" rather than "broken"
and "wrong".

Such language implies you have thought about this more deeply than the
developers (which I frankly doubt) and that they made an inappropriate
decision (which is less unlikely, but which in the case you mention I
also rather doubt).

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
--------------- Asciimercial ------------------
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
----------- Thank You for Reading -------------

Gabriel Genellina · Jun 29, 2007

The point I intended to make was that having a default __hash__ method on
objects gives weird results that not everybody may be aware of.
In addition, to get useful behavior of objects in sets one should override
__hash__ anyway, so what is the point of having a default object.__hash__ ?

__hash__ and equality tests are used by the dictionary implementation, and
the default implementation is OK for immutable objects. I like the fact
that I can use almost anything as dictionary keys without much coding.
This must always be true: (a==b) => (hash(a)==hash(b)), and the
documentation for __hash__ and __cmp__ warns about the requisites (but
__eq__ and the other rich-comparison methods are lacking the warning).
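
The rule can be checked directly on the built-in numbers. A small sketch, assuming Python 3 and relying only on built-in behavior:

```python
values = [1, 1.0, 1 + 0j]

# All three compare equal to each other ...
all_equal = all(x == y for x in values for y in values)

# ... so their hashes have to agree as well.
same_hash = len(set(hash(v) for v in values)) == 1

# Equal values with equal hashes collapse to a single set element.
merged = set(values)
print(all_equal, same_hash, len(merged))
```

The int, the float and the complex number end up as one element because they are equal and hash alike, which is exactly the property a user-defined __eq__ and __hash__ pair has to preserve.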

Guest · Jun 29, 2007

__hash__ and equality tests are used by the dictionary
implementation, and the default implementation is OK for immutable
objects.

That is probably why inf == inf yields True.
In this unique case, I do not like the default implementation.

Martin

A.T.Hofkamp · Jun 29, 2007

Just the same there are sound reasons for it, so I'd prefer to see you
using "counterintuitive" or "difficult to fathom" rather than "broken"
and "wrong".

You are quite correct, in the heat of typing an answer, my wording was too
strong, I am sorry.

Albert

A.T.Hofkamp · Jun 29, 2007

Whether two things are equal depends on the context. Is one $10 note equal
to another? It depends.

If the context is a bank teller making change, then yes, they are equal.
What's more, there are lots of sets of smaller notes which would be equally
fungible.

If the context is a district attorney showing a specific $10 note to a jury
as evidence in a drug buy-and-bust case, they're not. It's got to be
exactly that note, as proven by a recorded serial number.

In object oriented programming, objects are representations of the real
world. In one case, the $10 note represents some monetary value. In
another, it represents a piece of physical evidence in a criminal trial.
Without knowing the context of how the objects are going to be used, it's
really not possible to know how __eq__() should be defined.

I can see your point, but am not sure I agree. The problem is that OO uses
models tailored to an application, ie the model changes with each application.

In a bank teller application, one would probably not model the serial number,
just the notion of $10 notes would be enough, as in "Note(value)". The contents
of a cash register would then for example be a dictionary of Note() objects to
a count. You can merge two of such dictionaries, where the 'value' data of the
Note objects would be the equivalence notion.

In an evidence application one **would** record the serial number, since it is
a relevant distinguishing feature between notes, ie one would model Note(value,
serialnumber).
In this application the combination of value and serial number together defines
equivalence.

However, also in this situation we use values of the model for equivalence. If
we have a data base that relates evidence to storage location, and we would
like to know where a particular note was stored, we would compare Note objects
with each other based on the combination of value and serial number, not on
their id()'s.
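
The two models could be sketched as follows, assuming Python 3. TellerNote and EvidenceNote are hypothetical names for the two variants of Note, and the serial number and storage location are made-up sample data. namedtuple is used because it derives == and hash() from the fields, and Counter plays the role of the dictionary from notes to counts:

```python
from collections import Counter, namedtuple

# Bank teller model: only the face value matters.
TellerNote = namedtuple("TellerNote", ["value"])

# Evidence model: value and serial number together identify a note.
EvidenceNote = namedtuple("EvidenceNote", ["value", "serialnumber"])

# Two cash registers, merged by adding the counts of equal notes.
register_a = Counter({TellerNote(10): 3, TellerNote(5): 1})
register_b = Counter({TellerNote(10): 2})
merged = register_a + register_b

# Evidence storage, looked up by value and serial number, not by id().
# "AB123" and "locker 7" are invented sample values.
storage = {EvidenceNote(10, "AB123"): "locker 7"}
location = storage[EvidenceNote(10, "AB123")]
print(merged[TellerNote(10)], location)
```

In both models the lookup succeeds with a freshly created object, which would not be the case with identity-based equality.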

You tell me. This is really just the "does 1 == (1 + 0j)" question in
disguise. There's reasonable arguments to be made on both sides, but there
is no one true answer. It depends on what you're doing.

While we don't agree on how OO programming handles equality (and it may well be
that there are multiple interpretations possible), wouldn't your argument also
lead to the conclusion that it is better not to have a pre-defined __eq__
method?

Albert

Steve Holden · Jun 29, 2007

A.T.Hofkamp said:
You are quite correct, in the heat of typing an answer, my wording was too
strong, I am sorry.

No problem, I do the same thing myself ...

regards
Steve

A.T.Hofkamp · Jun 29, 2007

__hash__ and equality tests are used by the dictionary implementation, and
the default implementation is OK for immutable objects. I like the fact

I don't understand exactly how mutability relates to this.

The default __eq__ and __hash__ implementation for classes is ok if you never
have equivalent objects. In that case, == and 'is' are exactly the same
function in the sense that for each pair of arguments, they deliver the same
value.

This remains the case even if I mutate existing objects without creating
equivalent objects.

As soon as I create two equivalent instances (either by creating a duplicate at
a new address, or by mutating an existing one) the default __eq__ should be
redefined if you want these equivalent objects to announce themselves as
equivalent with the == operator.

that I can use almost anything as dictionary keys without much coding.

Most data-types of Python have their own implementation of __eq__ and __hash__
to make this work. This is good, it makes the language easy to use. However for
home-brewed objects (derived from object) the default implementation of these
functions may easily cause unexpected behavior and we may be better off without
a default implementation for these functions. That would prevent use of such
objects in combination with == or in sets/dictionaries without an explicit
definition of the __eq__ and __hash__ functions, but that is not very bad,
since in many cases one would have to define the proper equivalence notion
anyway.

This must always be true: (a==b) => (hash(a)==hash(b)), and the
documentation for __hash__ and __cmp__ warns about the requisites (but
__eq__ and the other rich-comparison methods are lacking the warning).

I don't know exactly what the current documentation says. One of the problems
is that not everybody is reading those docs. Instead they run a simple test
like "print set([Car(1),Car(2)])". That gives the correct result even if the
"(a==b) => (hash(a)==hash(b))" relation doesn't hold due to re-definition of
__eq__ but not __hash__ (the original designer never expected to use the class
in a set/dictionary for example) , and the conclusion is "it works". Then they
use the incorrect implementation for months until they discover that it doesn't
quite work as expected, followed by a long debugging session to find and
correct the problem.
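
That trap can be reproduced in a few lines. The sketch below assumes Python 3, where a class that defines only __eq__ is no longer hashable, so the old behavior has to be restored by hand with __hash__ = object.__hash__. BrokenCar is a hypothetical name:

```python
class BrokenCar:
    """Hypothetical class: value-based __eq__, identity-based __hash__."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return self.number == other.number

    # Deliberately inconsistent with __eq__, mimicking what a Python 2
    # class inherited from object by default.
    __hash__ = object.__hash__


# The naive test: two different cars, two elements. Looks fine.
naive = set([BrokenCar(1), BrokenCar(2)])

# The real test: two equal cars that are alive at the same time.
a = BrokenCar(123)
b = BrokenCar(123)
equal_cars = set([a, b])
print(len(naive), a == b, len(equal_cars))
```

The naive test yields two elements, as it should, but so does the set of two equal cars: the set looks at the hashes first, finds them different, and never asks __eq__.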

Without default __eq__ and __hash__ implementations for objects, the program
would drop dead on the first experiment. While it may be inconvenient at that
moment (getting the first experiment working takes more effort), I
think it would be preferable to having an incorrect implementation for months
without knowing it. In addition, a developer has to think explicitly about his
notion of equivalence.
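
For comparison, Python 3 behaves this way for the __hash__ half of the problem: a class that defines __eq__ without __hash__ gets its __hash__ set to None, and putting an instance in a set raises TypeError. A short sketch, with EqOnlyCar as a hypothetical name:

```python
class EqOnlyCar:
    """Hypothetical class that defines __eq__ but no __hash__."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        return self.number == other.number


try:
    set([EqOnlyCar(123)])
    error = None
except TypeError as exc:
    # Python 3 refuses to hash the instance.
    error = exc

print(EqOnlyCar.__hash__, error)
```

The first experiment fails immediately, which forces the author of the class to decide on a __hash__ that matches __eq__. The default __eq__ itself is still there in Python 3.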

Last but not least, in the current implementation, you cannot see whether there
is an __eq__ and/or __hash__ equivalence notion. Lack of an explicit definition
does not necessarily imply there is no such notion. Without a default object
implementation this would also be uniquely defined.

Albert

Alan Isaac · Jul 2, 2007

A.T.Hofkamp said:
Hmm, maybe numbers in sets are broken then?

>>> a = 12345
>>> b = 12345
>>> a == b
True
>>> a is b
False
>>> set([a,b])
set([12345])

Numbers and my Car2 objects behave the same w.r.t. '==' and 'is', yet I get a
set with 1 number, and a set with 2 cars.
Something is wrong here imho.

The point I intended to make was that having a default __hash__ method on
objects gives weird results that not everybody may be aware of.
In addition, to get useful behavior of objects in sets one should override
__hash__ anyway, so what is the point of having a default object.__hash__ ?

The point is: let us have good default behavior.
Generally, two equal numbers are two conceptual
references to the same "thing". (Say, the Platonic
form of the number.) So it is good that the hash value
is determined by the number. Similarly for strings.
Two equal numbers or strings are **also** identical,
in the sense of having the same conceptual reference.
In contrast, two "equal" cars are generally not identical
in this sense. Of course you can make them so if you wish,
but it is odd. So *nothing* is wrong here, imo.

Btw:True

Cheers,
Alan Isaac
