Q: sort's key and cmp parameters

K

kj

Challenge: to come up with a sorting task that cannot be achieved
by passing to the sort method (or sorted function) suitable values
for its key and reverse parameters, but instead *require* giving
a value to its cmp parameter.

For example,

from random import random
scrambled = some_list.sort(cmp=lambda x, y: cmp(random(), 0.5))

can be achieved (to a very good approximation at least) with

scrambled = some_list.sort(key=lambda x: random())

Is there a real-life sorting task that requires (or is far more
efficient with) cmp and can't be easily achieved with key and
reverse?

Many thanks in advance,

G.

P.S. I guess that, if sort is going to produce a non-trivial,
*consistent* order, a function CMP passed as the value of its cmp
parameter must satisfy the following properties:

0. CMP(x, y) must be non-zero for some elements x, y of the list;
1. anti-symmetry: sign(CMP(x, y)) must be equal to -sign(CMP(y, x));
2. transitivity: if sign(CMP(x, y)) equals sign(CMP(y, z)), then
sign(CMP(x, z)) must be equal to sign(CMP(x, y)).

In (1) and (2), sign() stands for the function

-1 if x < 0
sign(x) = 0 if x == 0
1 if x > 0

I suppose that all bets are off if these properties are not satisfied,
though the documentation does not make this entirely clear, as far
as I can tell. (If I'm wrong about this alleged omission, please
point me to the part of the docs that I missed.)
 
P

Peter Otten

kj said:
Challenge: to come up with a sorting task that cannot be achieved
by passing to the sort method (or sorted function) suitable values
for its key and reverse parameters, but instead *require* giving
a value to its cmp parameter.

For example,

from random import random
scrambled = some_list.sort(cmp=lambda x, y: cmp(random(), 0.5))

The above violates the transitivity requirement as stated below. The result
will have a bias.
can be achieved (to a very good approximation at least) with

scrambled = some_list.sort(key=lambda x: random())

Is there a real-life sorting task that requires (or is far more
efficient with) cmp and can't be easily achieved with key and
reverse?

The core developers don't think there is one. list.sort() no longer supports
the cmp parameter in 3.x.
 
L

Laszlo Nagy

Is this a homework?
Challenge: to come up with a sorting task that cannot be achieved
by passing to the sort method (or sorted function) suitable values
for its key and reverse parameters, but instead *require* giving
a value to its cmp parameter.
Let me put up this question: how do you define the difference between an
object to be sorted in the list, and the passed "suitable value" that is
used for finding the position for the object in the list. If the only
difference is that they need to be different Python objects, then you
can always use [x] instead of x. This value is perfectly suitable. ;-)

If you could give us somewhat stronger requirements regarding keys and
objects, we may give you a different answer.

Best,

Laszlo
 
L

Laszlo Nagy

The core developers don't think there is one. list.sort() no longer supports
the cmp parameter in 3.x.
LOL :-D

BTW if you really want to sort a list then you can. Using keys or
values, it doesn't matter. The op's question has no practical usage.
REALLY looks like a homework.

L
 
E

Ethan Furman

Laszlo said:
LOL :-D

BTW if you really want to sort a list then you can. Using keys or
values, it doesn't matter. The op's question has no practical usage.
REALLY looks like a homework.

L

If it is, it's one he's contemplating giving his students. :-D

~Ethan~
 
C

Carsten Haese

kj said:
Challenge: to come up with a sorting task that cannot be achieved
by passing to the sort method (or sorted function) suitable values
for its key and reverse parameters, but instead *require* giving
a value to its cmp parameter.

Such a task can't exist, because any arbitrary comparison function can
be transformed into a key function. The idea behind the transformation
is to construct wrapper objects that can compare themselves to other
wrapper objects by invoking the given comparison function on their
wrapped originals.

Google for "Hettinger cmp2key" if you want to see the code that does
this transformation.

HTH,
 
P

Paul Rubin

kj said:
Is there a real-life sorting task that requires (or is far more
efficient with) cmp and can't be easily achieved with key and
reverse?

Yes, think of sorting tree structures where you have to recursively
compare them til you find an unequal pair of nodes. To sort with
key=... you have to wrap a class instance around each tree, with
special comparison methods. Blecch.
 
B

Bearophile

Paul Rubin:
Yes, think of sorting tree structures where you have to recursively
compare them til you find an unequal pair of nodes.

That's cute. In what situations do you have to perform such kind of
sort?

Bye,
bearophile
 
R

Raymond Hettinger

Challenge: to come up with a sorting task that cannot be achieved
by passing to the sort method (or sorted function) suitable values
for its key and reverse parameters, but instead *require* giving
a value to its cmp parameter.

If you're assuming a consistent sort-order (transitivity, not
evolving over time, etc), then the cmp method and key method
are mathematically equivalent (you could always do a compare
sort first, record the order produced, and assign the position
number as a key function):

# constructive proof of the existence of a key function
# corresponding to a given cmp function
tmplist = sorted(s, cmp=mycmp)
pos = dict(zip(map(id, tmplist), range(len(tmplist))))
result = sorted(s, key=lambda x: pos[id(x)])
assert tmplist == result

Given equivalence, what is really at stake is convenience.

If you're starting point is a cmp function (for instance,
asking a dating service member whether they prefer
mate x to mate y), then having to convert to a key function
can be inconvenient.

If you need to sort by an ascending primary key and a descending
secondary key, you may find it easiest to sort in two passes
(taking advantage of guaranteed sort stability):

sorted(s, key=secondary, reversed=True)
sorted(s, key=primary)

With a cmp function, the above could be achieved in a single pass:

def mycmp(x, y):
p = cmp(primary(a), primary(b))
return p if p else cmp(secondary(b), secondary(a))

sorted(s, cmp=mycmp)

Also, as Paul pointed-out, there are some structures such as tree
comparisons that may be challenging to express directly as a key
function. As Carsten pointed-out, the cmp2key function allows
cmp functions to trivially (though indirectly) be recast as key
functions.

All that being said, for many everyday uses, a key function is
simpler to write and faster to run than a cmp function approach.


Raymond
 
K

kj

<snip>

Thanks for an extremely helpful reply!
If you need to sort by an ascending primary key and a descending
secondary key, you may find it easiest to sort in two passes
(taking advantage of guaranteed sort stability):
sorted(s, key=secondary, reversed=3DTrue)
sorted(s, key=primary)

In the special case where the value returned by secondary is
numeric, I suppose one could do this in one go with

sorted(s, key=lambda x: (primary(x), -secondary(x)))

....but I can't think of a way to generalize this...

kj
 
K

kj

Yes, think of sorting tree structures where you have to recursively
compare them til you find an unequal pair of nodes. To sort with
key=... you have to wrap a class instance around each tree, with
special comparison methods. Blecch.


Good point. This example convinces me that it was a bad idea to
get rid of cmp in Python 3, even if situations like this one are
rare. With the cmp parameter as an option, coding this type of
sort was accessible even to those with a rudementary knowledge of
Python. Now one needs to be pretty clever and pretty skilled with
Python to figure this one out... Granted, anyone who needs to
perform such sorts is probably smart enough to handle the required
solution, but not necessarily very proficient with Python. Besides,
why make something that was relatively straightforward before become
an exercise in cleverness? I wonder what was gained by eliminating
the cmp option...

kj
 
P

Paul Rubin

Duncan Booth said:
There is no sorting task that *requires* cmp. If all else fails you can
define a key class to wrap the original wrapper such that comparing the
keys simply calls the comparison function that you *would* have used.

I would count that as key being far less efficient, though still
giving the same result.
 
P

Paul Rubin

Raymond Hettinger said:
If you're assuming a consistent sort-order (transitivity, not
evolving over time, etc), then the cmp method and key method
are mathematically equivalent (you could always do a compare
sort first, record the order produced, and assign the position
number as a key function)

But that is an efficiency hit.
If you're starting point is a cmp function (for instance,
asking a dating service member whether they prefer
mate x to mate y), then having to convert to a key function
can be inconvenient.

Well, the issue there is that the comparison function may not be
transitive. Maybe you really want a topological sort for that.
All that being said, for many everyday uses, a key function is
simpler to write and faster to run than a cmp function approach.

I still have never understood why cmp was removed. Sure, key is more
convenient a lot (or maybe most) of the time, but it's not always.
 
A

alex23

kj said:
This example convinces me that it was a bad idea to
get rid of cmp in Python 3, even if situations like this one are
rare.

It sounds like the entire point of this exercise was to get other
people to confirm your bias for you.
 
K

kj

It sounds like the entire point of this exercise was to get other
people to confirm your bias for you.

The only problem with this hypothesis is that my original bias was
exactly the opposite to the one you quote: when I sent my original
post I thought that cmp was in fact useless (after all I could not
think of a situation that required it or even made it preferable),
and was not even aware that it had been dropped in Python 3. Paul
Rubin's post convinced me otherwise.

BTW, what's with this business of ascribing underhanded motives to
me? Earlier some other clown alleged that that my original post
was homework??? WTF?

kj
 
P

Paul Rubin

Bearophile said:
That's cute. In what situations do you have to perform such kind of
sort?

It came up in a search engine application I've been involved with.
 
P

Paul Rubin

Scott David Daniels said:
Most cases are moreeasily done with key, and it is
a good idea to make the most accessible way to a sort be the most
efficient one. In the rare case that you really want each comparison,
the cmp-injection function will do nicely (and can be written as a
recipe.

I don't think wrapping the sorted objects in an otherwise useless
special purpose class is "nicely", either from a performance or from a
code verbosity point of view. I avoid Java and its useless extra
classes for a reason ;-).
In short, make the easy path the fast path, and more will use it;
provide two ways, and the first that springs to mind is the one
used.

I think we are saying the same thing. Python 2.x provides two ways
and you can use whichever one fits the application better. I have
never understood why Python 3.x finds it necessary to break one of
them. Maybe I can migrate to Haskell by the time Python 2.x becomes
deprecated.
 
R

Raymond Hettinger

[Paul Rubin]
Yes, think of sorting tree structures where you have to recursively
compare them til you find an unequal pair of nodes.  

I'm not sure what you mean by this. What are the semantics of
sorting a tree? Can you outline an example of tree that
could be sorted easily with a cmp function but not a key function?

t = [[9,6],[7,1]],[[2,5],[4,3]] # inputs
t.sort(mycmp) # what would the cmp function be?
print t # what would the output be


Raymond
 
P

Paul Rubin

Raymond Hettinger said:
I'm not sure what you mean by this. What are the semantics of
sorting a tree? Can you outline an example of tree that
could be sorted easily with a cmp function but not a key function?

The idea was that you have a list of trees that you want to sort, and
an ordering relation between trees:

def gt(tree1, tree2): ...

where you recursively descend both trees until you find an unequal
pair of nodes. You're not trying to sort the nodes within a single
tree.
 
R

Raymond Hettinger

[Paul Rubin]
The idea was that you have a list of trees that you want to sort, and
an ordering relation between trees:

   def gt(tree1, tree2): ...

where you recursively descend both trees until you find an unequal
pair of nodes.  You're not trying to sort the nodes within a single
tree.

Can you give an example of a list of trees and a cmp function
that recursively compares them?


Raymond
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,901
Latest member
Noble71S45

Latest Threads

Top