# Q: sort's key and cmp parameters

Discussion in 'Python' started by kj, Oct 1, 2009.

1. ### kjGuest

Challenge: to come up with a sorting task that cannot be achieved
by passing to the sort method (or sorted function) suitable values
for its key and reverse parameters, but instead *require* giving
a value to its cmp parameter.

For example,

from random import random
scrambled = some_list.sort(cmp=lambda x, y: cmp(random(), 0.5))

can be achieved (to a very good approximation at least) with

scrambled = some_list.sort(key=lambda x: random())

Is there a real-life sorting task that requires (or is far more
efficient with) cmp and can't be easily achieved with key and
reverse?

G.

P.S. I guess that, if sort is going to produce a non-trivial,
*consistent* order, a function CMP passed as the value of its cmp
parameter must satisfy the following properties:

0. CMP(x, y) must be non-zero for some elements x, y of the list;
1. anti-symmetry: sign(CMP(x, y)) must be equal to -sign(CMP(y, x));
2. transitivity: if sign(CMP(x, y)) equals sign(CMP(y, z)), then
sign(CMP(x, z)) must be equal to sign(CMP(x, y)).

In (1) and (2), sign() stands for the function

-1 if x < 0
sign(x) = 0 if x == 0
1 if x > 0

I suppose that all bets are off if these properties are not satisfied,
though the documentation does not make this entirely clear, as far
point me to the part of the docs that I missed.)

kj, Oct 1, 2009

2. ### Peter OttenGuest

kj wrote:

> Challenge: to come up with a sorting task that cannot be achieved
> by passing to the sort method (or sorted function) suitable values
> for its key and reverse parameters, but instead *require* giving
> a value to its cmp parameter.
>
> For example,
>
> from random import random
> scrambled = some_list.sort(cmp=lambda x, y: cmp(random(), 0.5))

The above violates the transitivity requirement as stated below. The result
will have a bias.

> can be achieved (to a very good approximation at least) with
>
> scrambled = some_list.sort(key=lambda x: random())
>
> Is there a real-life sorting task that requires (or is far more
> efficient with) cmp and can't be easily achieved with key and
> reverse?

The core developers don't think there is one. list.sort() no longer supports
the cmp parameter in 3.x.

>
> G.
>
> P.S. I guess that, if sort is going to produce a non-trivial,
> *consistent* order, a function CMP passed as the value of its cmp
> parameter must satisfy the following properties:
>
> 0. CMP(x, y) must be non-zero for some elements x, y of the list;
> 1. anti-symmetry: sign(CMP(x, y)) must be equal to -sign(CMP(y, x));
> 2. transitivity: if sign(CMP(x, y)) equals sign(CMP(y, z)), then
> sign(CMP(x, z)) must be equal to sign(CMP(x, y)).
>
> In (1) and (2), sign() stands for the function
>
> -1 if x < 0
> sign(x) = 0 if x == 0
> 1 if x > 0
>
> I suppose that all bets are off if these properties are not satisfied,
> though the documentation does not make this entirely clear, as far
> point me to the part of the docs that I missed.)

Peter Otten, Oct 1, 2009

3. ### Laszlo NagyGuest

Is this a homework?

> Challenge: to come up with a sorting task that cannot be achieved
> by passing to the sort method (or sorted function) suitable values
> for its key and reverse parameters, but instead *require* giving
> a value to its cmp parameter.
>

Let me put up this question: how do you define the difference between an
object to be sorted in the list, and the passed "suitable value" that is
used for finding the position for the object in the list. If the only
difference is that they need to be different Python objects, then you
can always use [x] instead of x. This value is perfectly suitable. ;-)

If you could give us somewhat stronger requirements regarding keys and
objects, we may give you a different answer.

Best,

Laszlo

Laszlo Nagy, Oct 1, 2009
4. ### Laszlo NagyGuest

>> can be achieved (to a very good approximation at least) with
>>
>> scrambled = some_list.sort(key=lambda x: random())
>>
>> Is there a real-life sorting task that requires (or is far more
>> efficient with) cmp and can't be easily achieved with key and
>> reverse?
>>

>
> The core developers don't think there is one. list.sort() no longer supports
> the cmp parameter in 3.x.
>

LOL :-D

BTW if you really want to sort a list then you can. Using keys or
values, it doesn't matter. The op's question has no practical usage.
REALLY looks like a homework.

L

Laszlo Nagy, Oct 1, 2009
5. ### Ethan FurmanGuest

Laszlo Nagy wrote:
>
>>> can be achieved (to a very good approximation at least) with
>>>
>>> scrambled = some_list.sort(key=lambda x: random())
>>>
>>> Is there a real-life sorting task that requires (or is far more
>>> efficient with) cmp and can't be easily achieved with key and
>>> reverse?
>>>

>>
>>
>> The core developers don't think there is one. list.sort() no longer
>> supports the cmp parameter in 3.x.
>>

>
> LOL :-D
>
> BTW if you really want to sort a list then you can. Using keys or
> values, it doesn't matter. The op's question has no practical usage.
> REALLY looks like a homework.
>
> L
>

If it is, it's one he's contemplating giving his students. :-D

~Ethan~

Ethan Furman, Oct 1, 2009
6. ### Carsten HaeseGuest

kj wrote:
>
> Challenge: to come up with a sorting task that cannot be achieved
> by passing to the sort method (or sorted function) suitable values
> for its key and reverse parameters, but instead *require* giving
> a value to its cmp parameter.

Such a task can't exist, because any arbitrary comparison function can
be transformed into a key function. The idea behind the transformation
is to construct wrapper objects that can compare themselves to other
wrapper objects by invoking the given comparison function on their
wrapped originals.

Google for "Hettinger cmp2key" if you want to see the code that does
this transformation.

HTH,

--
Carsten Haese
http://informixdb.sourceforge.net

Carsten Haese, Oct 1, 2009
7. ### Paul RubinGuest

kj <> writes:
> Is there a real-life sorting task that requires (or is far more
> efficient with) cmp and can't be easily achieved with key and
> reverse?

Yes, think of sorting tree structures where you have to recursively
compare them til you find an unequal pair of nodes. To sort with
key=... you have to wrap a class instance around each tree, with
special comparison methods. Blecch.

Paul Rubin, Oct 1, 2009
8. ### BearophileGuest

Paul Rubin:

> Yes, think of sorting tree structures where you have to recursively
> compare them til you find an unequal pair of nodes.

That's cute. In what situations do you have to perform such kind of
sort?

Bye,
bearophile

Bearophile, Oct 1, 2009
9. ### Raymond HettingerGuest

On Oct 1, 10:08 am, kj <> wrote:
> Challenge: to come up with a sorting task that cannot be achieved
> by passing to the sort method (or sorted function) suitable values
> for its key and reverse parameters, but instead *require* giving
> a value to its cmp parameter.

If you're assuming a consistent sort-order (transitivity, not
evolving over time, etc), then the cmp method and key method
are mathematically equivalent (you could always do a compare
sort first, record the order produced, and assign the position
number as a key function):

# constructive proof of the existence of a key function
# corresponding to a given cmp function
tmplist = sorted(s, cmp=mycmp)
pos = dict(zip(map(id, tmplist), range(len(tmplist))))
result = sorted(s, key=lambda x: pos[id(x)])
assert tmplist == result

Given equivalence, what is really at stake is convenience.

If you're starting point is a cmp function (for instance,
asking a dating service member whether they prefer
mate x to mate y), then having to convert to a key function
can be inconvenient.

If you need to sort by an ascending primary key and a descending
secondary key, you may find it easiest to sort in two passes
(taking advantage of guaranteed sort stability):

sorted(s, key=secondary, reversed=True)
sorted(s, key=primary)

With a cmp function, the above could be achieved in a single pass:

def mycmp(x, y):
p = cmp(primary(a), primary(b))
return p if p else cmp(secondary(b), secondary(a))

sorted(s, cmp=mycmp)

Also, as Paul pointed-out, there are some structures such as tree
comparisons that may be challenging to express directly as a key
function. As Carsten pointed-out, the cmp2key function allows
cmp functions to trivially (though indirectly) be recast as key
functions.

All that being said, for many everyday uses, a key function is
simpler to write and faster to run than a cmp function approach.

Raymond

Raymond Hettinger, Oct 1, 2009
10. ### kjGuest

In <> Raymond Hettinger <> writes:

<snip>

>If you need to sort by an ascending primary key and a descending
>secondary key, you may find it easiest to sort in two passes
>(taking advantage of guaranteed sort stability):

> sorted(s, key=secondary, reversed=3DTrue)
> sorted(s, key=primary)

In the special case where the value returned by secondary is
numeric, I suppose one could do this in one go with

sorted(s, key=lambda x: (primary(x), -secondary(x)))

....but I can't think of a way to generalize this...

kj

kj, Oct 1, 2009
11. ### kjGuest

In <> Paul Rubin <http://> writes:

>kj <> writes:
>> Is there a real-life sorting task that requires (or is far more
>> efficient with) cmp and can't be easily achieved with key and
>> reverse?

>Yes, think of sorting tree structures where you have to recursively
>compare them til you find an unequal pair of nodes. To sort with
>key=... you have to wrap a class instance around each tree, with
>special comparison methods. Blecch.

Good point. This example convinces me that it was a bad idea to
get rid of cmp in Python 3, even if situations like this one are
rare. With the cmp parameter as an option, coding this type of
sort was accessible even to those with a rudementary knowledge of
Python. Now one needs to be pretty clever and pretty skilled with
Python to figure this one out... Granted, anyone who needs to
perform such sorts is probably smart enough to handle the required
solution, but not necessarily very proficient with Python. Besides,
why make something that was relatively straightforward before become
an exercise in cleverness? I wonder what was gained by eliminating
the cmp option...

kj

kj, Oct 1, 2009
12. ### Paul RubinGuest

Duncan Booth <> writes:
> > Is there a real-life sorting task that requires (or is far more
> > efficient with) cmp and can't be easily achieved with key and reverse?
> >

> There is no sorting task that *requires* cmp. If all else fails you can
> define a key class to wrap the original wrapper such that comparing the
> keys simply calls the comparison function that you *would* have used.

I would count that as key being far less efficient, though still
giving the same result.

Paul Rubin, Oct 2, 2009
13. ### Paul RubinGuest

Raymond Hettinger <> writes:
> If you're assuming a consistent sort-order (transitivity, not
> evolving over time, etc), then the cmp method and key method
> are mathematically equivalent (you could always do a compare
> sort first, record the order produced, and assign the position
> number as a key function)

But that is an efficiency hit.

> If you're starting point is a cmp function (for instance,
> asking a dating service member whether they prefer
> mate x to mate y), then having to convert to a key function
> can be inconvenient.

Well, the issue there is that the comparison function may not be
transitive. Maybe you really want a topological sort for that.

> All that being said, for many everyday uses, a key function is
> simpler to write and faster to run than a cmp function approach.

I still have never understood why cmp was removed. Sure, key is more
convenient a lot (or maybe most) of the time, but it's not always.

Paul Rubin, Oct 2, 2009
14. ### alex23Guest

kj <> wrote:
> This example convinces me that it was a bad idea to
> get rid of cmp in Python 3, even if situations like this one are
> rare.

It sounds like the entire point of this exercise was to get other
people to confirm your bias for you.

alex23, Oct 2, 2009
15. ### kjGuest

In <> alex23 <> writes:

>kj <> wrote:
>> This example convinces me that it was a bad idea to
>> get rid of cmp in Python 3, even if situations like this one are
>> rare.

>It sounds like the entire point of this exercise was to get other
>people to confirm your bias for you.

The only problem with this hypothesis is that my original bias was
exactly the opposite to the one you quote: when I sent my original
post I thought that cmp was in fact useless (after all I could not
think of a situation that required it or even made it preferable),
and was not even aware that it had been dropped in Python 3. Paul
Rubin's post convinced me otherwise.

BTW, what's with this business of ascribing underhanded motives to
me? Earlier some other clown alleged that that my original post
was homework??? WTF?

kj

kj, Oct 2, 2009
16. ### Paul RubinGuest

Bearophile <> writes:
> > Yes, think of sorting tree structures where you have to recursively
> > compare them til you find an unequal pair of nodes.

>
> That's cute. In what situations do you have to perform such kind of
> sort?

It came up in a search engine application I've been involved with.

Paul Rubin, Oct 2, 2009
17. ### Paul RubinGuest

Scott David Daniels <> writes:
> Most cases are moreeasily done with key, and it is
> a good idea to make the most accessible way to a sort be the most
> efficient one. In the rare case that you really want each comparison,
> the cmp-injection function will do nicely (and can be written as a
> recipe.

I don't think wrapping the sorted objects in an otherwise useless
special purpose class is "nicely", either from a performance or from a
code verbosity point of view. I avoid Java and its useless extra
classes for a reason ;-).

> In short, make the easy path the fast path, and more will use it;
> provide two ways, and the first that springs to mind is the one
> used.

I think we are saying the same thing. Python 2.x provides two ways
and you can use whichever one fits the application better. I have
never understood why Python 3.x finds it necessary to break one of
them. Maybe I can migrate to Haskell by the time Python 2.x becomes
deprecated.

Paul Rubin, Oct 2, 2009
18. ### Raymond HettingerGuest

[Paul Rubin]
> Yes, think of sorting tree structures where you have to recursively
> compare them til you find an unequal pair of nodes.

I'm not sure what you mean by this. What are the semantics of
sorting a tree? Can you outline an example of tree that
could be sorted easily with a cmp function but not a key function?

t = [[9,6],[7,1]],[[2,5],[4,3]] # inputs
t.sort(mycmp) # what would the cmp function be?
print t # what would the output be

Raymond

Raymond Hettinger, Oct 3, 2009
19. ### Paul RubinGuest

Raymond Hettinger <> writes:
> I'm not sure what you mean by this. What are the semantics of
> sorting a tree? Can you outline an example of tree that
> could be sorted easily with a cmp function but not a key function?

The idea was that you have a list of trees that you want to sort, and
an ordering relation between trees:

def gt(tree1, tree2): ...

where you recursively descend both trees until you find an unequal
pair of nodes. You're not trying to sort the nodes within a single
tree.

Paul Rubin, Oct 3, 2009
20. ### Raymond HettingerGuest

[Paul Rubin]
> The idea was that you have a list of trees that you want to sort, and
> an ordering relation between trees:
>
>    def gt(tree1, tree2): ...
>
> where you recursively descend both trees until you find an unequal
> pair of nodes.  You're not trying to sort the nodes within a single
> tree.

Can you give an example of a list of trees and a cmp function
that recursively compares them?

Raymond

Raymond Hettinger, Oct 3, 2009