cmp and sorting non-symmetric types

Adam Olsen

(I've had trouble getting response for collaboration on a PEP.
Perhaps I'm the only interested party?)

Although py3k raises an exception for completely unsortable types, it
continues to silently do the wrong thing for non-symmetric types that
overload comparison operators with special meanings.

>>> a = set([1])
>>> b = set([2, 5])
>>> c = set([1, 2])
>>> sorted([a, c, b])
[{1}, {1, 2}, {2, 5}]
>>> sorted([a, b, c])
[{1}, {2, 5}, {1, 2}]
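
The two results differ because < on sets tests for a proper subset, which is only a partial order: disjoint sets such as {1} and {2, 5} are neither less than, greater than, nor equal to each other. Below is a minimal sketch of that, together with one possible workaround using an explicit sort key. The helper name total_key and its ordering rule (size first, then sorted elements) are assumptions made for illustration only.

```python
a = {1}
b = {2, 5}
c = {1, 2}

# For sets, < means "is a proper subset of", so disjoint sets are unordered.
incomparable = not (a < b) and not (b < a) and a != b

def total_key(s):
    """Hypothetical total order for sets of ints: by size, then by elements."""
    return (len(s), sorted(s))

# With an explicit key, the result no longer depends on the input order.
first = sorted([a, c, b], key=total_key)
second = sorted([a, b, c], key=total_key)
print(incomparable, first == second)
```

Any key that maps each set to a totally ordered value would do; the point is only that the sort stops relying on the subset operators.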

To solve this I propose a revived cmp (as per the previous thread[1]),
which is the preferred path for orderings. The rich comparison
operators will be simple wrappers for cmp() (ensuring an exception is
raised if they're not merely comparing for equality.)

Thus, set would need 7 methods defined (6 rich comparisons plus
__cmp__, although it could skip __eq__ and __ne__), whereas nearly all
other types (int, list, etc) need only __cmp__.

Code which uses <= to compare sets would be assumed to want subset
operations. Generic containers should use cmp() exclusively.
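
As a rough sketch of what "rich comparisons as simple wrappers for cmp()" could look like, the mixin below derives all six operators from a single three-way comparison method. Everything here is hypothetical: CmpOrdered, _cmp and Version are names invented for the example, and Python 3 has no built-in cmp(). functools.cmp_to_key is the real standard-library helper for sorting with a three-way comparison function.

```python
import functools


class CmpOrdered:
    """Hypothetical mixin: every rich comparison defers to one _cmp method."""

    def _cmp(self, other):
        # Subclasses return a negative, zero, or positive int.
        raise NotImplementedError

    def __eq__(self, other): return self._cmp(other) == 0
    def __ne__(self, other): return self._cmp(other) != 0
    def __lt__(self, other): return self._cmp(other) < 0
    def __le__(self, other): return self._cmp(other) <= 0
    def __gt__(self, other): return self._cmp(other) > 0
    def __ge__(self, other): return self._cmp(other) >= 0


class Version(CmpOrdered):
    """Illustrative ordered type that only has to write _cmp."""

    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def _cmp(self, other):
        mine, theirs = (self.major, self.minor), (other.major, other.minor)
        # Classic three-way result: -1, 0, or 1.
        return (mine > theirs) - (mine < theirs)

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"


versions = [Version(1, 2), Version(0, 9), Version(1, 0)]

# A generic container could sort through the three-way comparison directly.
by_cmp = sorted(versions, key=functools.cmp_to_key(lambda x, y: x._cmp(y)))
print(by_cmp)
```

A type like set, which gives <= a different meaning, would override the operators itself and keep _cmp (or raise from it) for ordering purposes.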


[1] http://mail.python.org/pipermail/python-3000/2007-October/011072.html
 
Raymond Hettinger

Adam Olsen said:
Although py3k raises an exception for completely unsortable types, it
continues to silently do the wrong thing for non-symmetric types that
overload comparison operators with special meanings.

To solve this I propose a revived cmp (as per the previous thread[1]),
which is the preferred path for orderings. The rich comparison
operators will be simple wrappers for cmp() (ensuring an exception is
raised if they're not merely comparing for equality.)

Thus, set would need 7 methods defined (6 rich comparisons plus
__cmp__, although it could skip __eq__ and __ne__), whereas nearly all
other types (int, list, etc) need only __cmp__.

Am strongly against this. Please don't re-introduce an atrocity to
solve this non-problem. I realize that it bugs you but AFAICT it never
is an issue in real programs and I expect it to be even less likely in
Py3.0 where we get much farther away from sorts of heterogeneous types.
Also, sort() is no different than hash() or any other function that
can be fooled by a magic method implementation that doesn't do what
the calling function expects (for example, a custom __hash__ method
returning a random number will make it possible to add dict entries
that cannot be retrieved).
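
The __hash__ failure mode described above can be reproduced in a few lines. This sketch assumes CPython's dict behavior. The class name UnstableKey is made up, and it uses a counter instead of a truly random number so that the run is repeatable.

```python
import itertools


class UnstableKey:
    """Hypothetical key whose hash changes on every call."""

    _counter = itertools.count()

    def __hash__(self):
        # Stand-in for "a random number": a fresh value each time.
        return next(self._counter)


key = UnstableKey()
d = {}
d[key] = "value"  # stored under the first hash value

# The lookup hashes the key again, gets a different value, and misses,
# even though the entry is still counted in len(d).
print(len(d), key in d)
```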

Also, keep in mind that cmp died for a reason. Having it exist
alongside rich comparison operations was confusing for users and it
complicated the heck out of the CPython implementation (and checking
for it is slow).


Raymond
 
Carl Banks

Adam Olsen said:
(I've had trouble getting response for collaboration on a PEP.
Perhaps I'm the only interested party?)

Although py3k raises an exception for completely unsortable types, it
continues to silently do the wrong thing for non-symmetric types that
overload comparison operators with special meanings.

>>> a = set([1])
>>> b = set([2, 5])
>>> c = set([1, 2])
>>> sorted([a, c, b])
[{1}, {1, 2}, {2, 5}]
>>> sorted([a, b, c])
[{1}, {2, 5}, {1, 2}]

To solve this I propose a revived cmp (as per the previous thread[1]),
which is the preferred path for orderings. The rich comparison
operators will be simple wrappers for cmp() (ensuring an exception is
raised if they're not merely comparing for equality.)

Thus, set would need 7 methods defined (6 rich comparisons plus
__cmp__, although it could skip __eq__ and __ne__), whereas nearly all
other types (int, list, etc) need only __cmp__.

Code which uses <= to compare sets would be assumed to want subset
operations. Generic containers should use cmp() exclusively.


I second Raymond Hettinger's strong -1 on this.

I would further suggest that it's the wrong solution even insofar as
it's a problem. The right solution is to use comparison operators
only for ordered comparisons, not for subset and superset testing.
Unfortunately, the rogue operators are already there and in use, so
the right solution would probably cause more trouble than it would
save at this point. But it'd still cause less trouble than
resurrecting the __cmp__ operator.


Carl Banks
 
Paul Rubin

Carl Banks said:
resurrecting the __cmp__ operator.

I didn't realize __cmp__ was going. Are we really supposed to
implement all the rich comparison methods to code an ordered class?
Or is there some kind of library superclass we can inherit from that
implements something like __cmp__? What about making __key__?
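
To make the __key__ idea concrete, here is a minimal sketch of a base class whose comparisons all delegate to a single key method. This is purely hypothetical: Python gives __key__ no special meaning, and KeyOrdered and Card are names invented for the example.

```python
class KeyOrdered:
    """Hypothetical base class: comparisons delegate to a __key__ method."""

    def __key__(self):
        # Subclasses return a totally ordered value, such as a tuple.
        raise NotImplementedError

    def __eq__(self, other): return self.__key__() == other.__key__()
    def __ne__(self, other): return self.__key__() != other.__key__()
    def __lt__(self, other): return self.__key__() < other.__key__()
    def __le__(self, other): return self.__key__() <= other.__key__()
    def __gt__(self, other): return self.__key__() > other.__key__()
    def __ge__(self, other): return self.__key__() >= other.__key__()

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash(self.__key__())


class Card(KeyOrdered):
    """Illustrative ordered class that only has to supply its key."""

    def __init__(self, rank, suit):
        self.rank, self.suit = rank, suit

    def __key__(self):
        return (self.rank, self.suit)


hand = [Card(12, "hearts"), Card(3, "spades"), Card(3, "clubs")]
ordered = [card.__key__() for card in sorted(hand)]
print(ordered)
```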
 
Raymond Hettinger

Carl Banks said:
The right solution is to use comparison operators
only for ordered comparisons, not for subset and superset testing.

The whole point of the rich comparisons PEP was to be able to override
the operators for other purposes.



Raymond
 
Raymond Hettinger

Paul Rubin said:
I didn't realize __cmp__ was going. Are we really supposed to
implement all the rich comparison methods to code an ordered class?
Or is there some kind of library superclass we can inherit from that
implements something like __cmp__? What about making __key__?

IIRC, you need only define __le__ and __eq__ and the rest get defined
using a mixin class.
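
The thread does not name the mixin. In later Python versions the standard library offers functools.total_ordering, a class decorator rather than a mixin, which fills in the remaining ordering methods from __eq__ plus one of __lt__, __le__, __gt__ or __ge__. A minimal sketch, with Version and its _key helper as made-up example names:

```python
import functools


@functools.total_ordering
class Version:
    """Illustrative class that defines only __eq__ and __le__."""

    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __le__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() <= other._key()


old, new = Version(1, 0), Version(1, 2)

# __lt__, __gt__ and __ge__ were never written; the decorator supplies them.
print(old < new, old >= new, max(old, new) is new)
```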


Raymond
 
