__eq__() inconvenience when subclassing set

Jess Austin · Oct 29, 2009

I'm subclassing set, and redefining __eq__(). I'd appreciate any
relevant advice.
>>> class mySet(set):
...     def __eq__(self, other):
...         print "called mySet.__eq__()!"
...         if isinstance(other, (set, frozenset)):
...             return True
...         return set.__eq__(self, other)
...

I stipulate that this is a weird thing to do, but this is a toy class
to avoid the lengthy definition of the class I actually want to
write. Now I want the builtin set and frozenset types to use the new
__eq__() with mySet symmetrically.

>>> mySet() == set([1])
called mySet.__eq__()!
True
>>> mySet() == frozenset([1])
called mySet.__eq__()!
True
>>> set([1]) == mySet()
called mySet.__eq__()!
True
>>> frozenset([1]) == mySet()
False

frozenset doesn't use mySet.__eq__() because mySet is not a subclass
of frozenset as it is for set. I've tried a number of techniques to
mitigate this issue. If I multiple-inherit from both set and
frozenset, I get the instance lay-out conflict error. I have similar
problems setting mySet.__bases__ directly, and hacking mro() in a
metaclass. So far nothing has worked. If it matters, I'm using 2.6,
but I can change versions if it will help.
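
For readers on Python 3, the asymmetry described above can be reproduced with a sketch like the following. It is a port of the toy class, not the original 2.6 session: the `calls` counter stands in for the print statement so the results can be checked, and `Both` is an illustrative name for the multiple-inheritance attempt.

```python
class MySet(set):
    """Toy subclass from the post, ported to Python 3 syntax."""

    calls = 0  # how many times the overridden __eq__ has run

    def __eq__(self, other):
        MySet.calls += 1
        if isinstance(other, (set, frozenset)):
            return True
        return set.__eq__(self, other)


results = {
    "MySet() == set([1])": MySet() == set([1]),
    "MySet() == frozenset([1])": MySet() == frozenset([1]),
    "set([1]) == MySet()": set([1]) == MySet(),
    "frozenset([1]) == MySet()": frozenset([1]) == MySet(),
}

# Inheriting from both built-in types is rejected when the class is created.
layout_error = None
try:
    class Both(set, frozenset):
        pass
except TypeError as exc:
    layout_error = str(exc)

for expression, value in results.items():
    print(expression, "->", value)
print("calls to MySet.__eq__:", MySet.calls)
print("multiple inheritance:", layout_error)
```

Only the last comparison bypasses the override, which is why the counter stops one short of the number of comparisons.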

Should I give up on this, or is there something else I can try? Keep
in mind, I must redefine __eq__(), and I'd like to be able to compare
instances of the class to both set and frozenset instances.

cheers,
Jess

Mick Krippendorf · Oct 29, 2009

Jess said:
>>> frozenset([1]) == mySet()
False

frozenset doesn't use mySet.__eq__() because mySet is not a subclass
of frozenset as it is for set.

You could just overwrite set and frozenset:

class eqmixin(object):
    def __eq__(self, other):
        print "called %s.__eq__()" % self.__class__
        if isinstance(other, (set, frozenset)):
            return True
        return super(eqmixin, self).__eq__(other)

class set(eqmixin, set):
    pass

class frozenset(eqmixin, frozenset):
    pass

class MySet(set):
    pass

Regards,
Mick.

Jess Austin · Oct 29, 2009

That's nice, but it means that everyone who imports my class will have
to import the monkeypatch of frozenset, as well. I'm not sure I want
that. More ruby than python, ne?
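
A rough Python 3 sketch of why the override has to travel with the class: only objects built from the replacement type take part, while a frozenset created by anyone else still uses the built-in comparison. `EqMixin`, `PatchedSet` and `PatchedFrozenSet` are hypothetical names used here; the suggestion above amounts to binding the last two to the names `set` and `frozenset` in one module.

```python
class EqMixin:
    """Mixin carrying the toy __eq__ from this thread."""

    def __eq__(self, other):
        if isinstance(other, (set, frozenset)):
            return True
        return super().__eq__(other)


class PatchedSet(EqMixin, set):
    pass


class PatchedFrozenSet(EqMixin, frozenset):
    # Defining __eq__ clears __hash__, so restore frozenset's hash.
    __hash__ = frozenset.__hash__


class MySet(PatchedSet):
    pass


# A frozenset built from the replacement type goes through the mixin.
patched = PatchedFrozenSet([1]) == MySet()

# A genuine built-in frozenset, e.g. one created in someone else's module,
# still answers the comparison itself by comparing contents.
unpatched = frozenset([1]) == MySet()

print(patched, unpatched)
```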

thanks,
Jess

Mick Krippendorf · Oct 29, 2009

Jess said:
That's nice, but it means that everyone who imports my class will have
to import the monkeypatch of frozenset, as well. I'm not sure I want
that. More ruby than python, ne?

I thought it was only a toy class?

Mick.

Jess Austin · Oct 29, 2009

Well, I posted a toy, but it's a stand-in for something else more
complicated. Trying to conserve bytes, you know.

Gabriel Genellina · Oct 30, 2009

Jess said:
>>> class mySet(set):
...     def __eq__(self, other):
...         print "called mySet.__eq__()!"
...         if isinstance(other, (set, frozenset)):
...             return True
...         return set.__eq__(self, other)
...

Now I want the builtin set and frozenset types to use the new
__eq__() with mySet symmetrically.

>>> mySet() == set([1])
called mySet.__eq__()!
True
>>> mySet() == frozenset([1])
called mySet.__eq__()!
True
>>> set([1]) == mySet()
called mySet.__eq__()!
True
>>> frozenset([1]) == mySet()
False

frozenset doesn't use mySet.__eq__() because mySet is not a subclass
of frozenset as it is for set. [...failed attempts to inherit from both
set and frozenset...]
I must redefine __eq__(), and I'd like to be able to compare
instances of the class to both set and frozenset instances.

We know the last test fails because the == logic fails to recognize mySet
(on the right side) as a "more specialized" object than frozenset (on the
left side), because set and frozenset don't have a common base type
(although they share a lot of implementation).

I think the only way would require modifying tp_richcompare of
set/frozenset objects, so it is aware of subclasses on the right side.
Currently, frozenset() == mySet() effectively ignores the fact that mySet
is a subclass of set.
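
The "more specialized" rule can be written down as a small model. This is a simplified sketch of the ordering only, not the interpreter's actual code; `eq_order` is a hypothetical helper, and the second entry in its result is consulted only if the first operand returns NotImplemented.

```python
def eq_order(left, right):
    """Model which operand's type is asked first to answer `left == right`.

    Simplified: the right operand goes first only when its type is a proper
    subclass of the left operand's type; otherwise the left operand goes first.
    """
    left_type, right_type = type(left), type(right)
    if left_type is not right_type and issubclass(right_type, left_type):
        return [right_type.__name__, left_type.__name__]
    return [left_type.__name__, right_type.__name__]


class MySet(set):
    pass


# MySet derives from set, so it is asked first even on the right-hand side.
print(eq_order(set(), MySet()))        # ['MySet', 'set']

# MySet does not derive from frozenset, so frozenset is asked first.
print(eq_order(frozenset(), MySet()))  # ['frozenset', 'MySet']

# frozenset is not among MySet's ancestors at all.
print(frozenset in MySet.__mro__)      # False
```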

Jess Austin · Oct 30, 2009

I don't think even that would work. By the time set_richcompare() is
called (incidentally, it's used for both set and frozenset), it's too
late. That function is not responsible for calling the subclass's
method. It does call PyAnySet_Check(), but only to short-circuit
equality and inequality for non-set objects. I believe that something
higher-level in the interpreter decides to call the right-side type's
method because it's a subclass of the left-side type, but I'm not
familiar enough with the code to know where that happens. It may be
best not to sully such generalized code with a special case for
this.
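
One way to see that the choice of method is made outside set's own comparison is to bypass the operator and call `set.__eq__` directly. The sketch below is in Python 3 syntax and reuses the toy class; `via_operator` and `via_method` are illustrative names.

```python
class MySet(set):
    def __eq__(self, other):
        if isinstance(other, (set, frozenset)):
            return True
        return set.__eq__(self, other)


# Through the == operator, the subclass on the right is consulted first.
via_operator = set([1]) == MySet()

# Calling set's comparison directly skips that step: set simply compares
# contents and never looks for an override on the other operand.
via_method = set.__eq__(set([1]), MySet())

print(via_operator, via_method)
```

The two results differ even though the operands are the same, so the subclass preference comes from whatever drives `==`, not from the set comparison itself.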

I may do some experiments with bytes, str, and unicode, since that
seems to be an analogous case. There is a basestring type, but at
this point I don't know that it really helps with anything.

cheers,
Jess

Gabriel Genellina · Nov 1, 2009

Looks like in 3.1 this can be done with bytes+str and vice versa, even if
bytes and str don't have a common ancestor (other than object; basestring
doesn't exist in 3.x):

p3> Base = bytes
p3> Other = str
p3>
p3> class Derived(Base):
...     def __eq__(self, other):
...         print('Derived.__eq__')
...         return True
...
p3> Derived()==Base()
Derived.__eq__
True
p3> Base()==Derived()
Derived.__eq__
True
p3> Derived()==Other()
Derived.__eq__
True
p3> Other()==Derived()
Derived.__eq__ # !!!
True
p3> Base.mro()
[<class 'bytes'>, <class 'object'>]
p3> Other.mro()
[<class 'str'>, <class 'object'>]

The same example with set+frozenset (the one you're actually interested
in) doesn't work, unfortunately.
After further analysis, this works for bytes and str because both types
refuse to guess and compare to each other; they return NotImplemented when
the right-side operand is not of the same type. And this gives that other
operand the chance of being called.

set and frozenset, on the other hand, are promiscuous: their
tp_richcompare slot happily accepts any set of any kind, derived or not,
and compares their contents. I think it should be a bit more strict: if
the right hand side is not of the same type, and its tp_richcompare slot
is not the default one, it should return NotImplemented. This way the
other type has a chance to be called.
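
The difference between the two pairs of types can be checked by calling the comparison methods directly in Python 3, which shows what each type hands back before any fallback to the other operand. The variable names are only illustrative.

```python
# str and bytes decline to compare with each other ...
str_vs_bytes = str.__eq__("", b"")
bytes_vs_str = bytes.__eq__(b"", "")

# ... whereas set and frozenset answer for each other directly.
frozenset_vs_set = frozenset.__eq__(frozenset(), set())
set_vs_frozenset = set.__eq__(set(), frozenset())

print(str_vs_bytes, bytes_vs_str)          # NotImplemented NotImplemented
print(frozenset_vs_set, set_vs_frozenset)  # True True
```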

Jess Austin · Nov 3, 2009

Thanks for this, Gabriel! There seems to be a difference between the
two cases, however:

>>> set() == frozenset()
True

I doubt that either of these invariants is amenable to modification,
even for purposes of "consistency". I'm not sure how to resolve this,
but you've definitely helped me here. Perhaps the test in
set_richcompare can return NotImplemented in particular cases but not
in others? I'll think about this; let me know if you come up with
anything more.

thanks,
Jess

Gabriel Genellina · Nov 3, 2009

I think it should return NotImplemented only when the right-hand side
operand has overridden tp_richcompare. That way, set()==frozenset() would
still be True. Only when one inherits from set/frozenset AND overrides
__eq__ should set_richcompare step aside and let the more specific __eq__
be called (by just returning NotImplemented).
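
As a rough illustration of that rule, here is a pure-Python sketch in Python 3 syntax. It does not change the built-in types; `PoliteFrozenSet` is a hypothetical subclass that applies the proposed check in its own `__eq__`, and "has overridden" is approximated by comparing the other type's `__eq__` against the built-in ones.

```python
_BUILTIN_EQ = (set.__eq__, frozenset.__eq__)


class PoliteFrozenSet(frozenset):
    """Hypothetical frozenset that steps aside for an overridden __eq__."""

    def __eq__(self, other):
        other_eq = type(other).__eq__
        overridden = (other_eq not in _BUILTIN_EQ
                      and other_eq is not PoliteFrozenSet.__eq__)
        if type(other) is not type(self) and overridden:
            return NotImplemented  # let the other operand's __eq__ decide
        return frozenset.__eq__(self, other)

    # Defining __eq__ clears __hash__, so restore frozenset's hash.
    __hash__ = frozenset.__hash__


class MySet(set):
    def __eq__(self, other):
        if isinstance(other, (set, frozenset)):
            return True
        return set.__eq__(self, other)


print(PoliteFrozenSet([1]) == MySet())     # MySet.__eq__ gets its turn
print(frozenset([1]) == MySet())           # built-in frozenset answers itself
print(PoliteFrozenSet([1]) == set([1]))    # plain sets still compare by content
```

Under these assumptions the first comparison is decided by `MySet.__eq__`, while comparisons against plain set and frozenset objects keep their usual content-based result.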

What is your goal when overriding __eq__ for your new set class? It may
help build a case for this change; a concrete use case is much better
than an abstract request.
