Q: sort's key and cmp parameters

Discussion in 'Python' started by kj, Oct 1, 2009.

  1. kj

    kj Guest

    Challenge: to come up with a sorting task that cannot be achieved
    by passing to the sort method (or sorted function) suitable values
    for its key and reverse parameters, but instead *require* giving
    a value to its cmp parameter.

    For example,

    from random import random
    scrambled = some_list.sort(cmp=lambda x, y: cmp(random(), 0.5))

    can be achieved (to a very good approximation at least) with

    scrambled = some_list.sort(key=lambda x: random())

    Is there a real-life sorting task that requires (or is far more
    efficient with) cmp and can't be easily achieved with key and
    reverse?

    Many thanks in advance,

    G.

    P.S. I guess that, if sort is going to produce a non-trivial,
    *consistent* order, a function CMP passed as the value of its cmp
    parameter must satisfy the following properties:

    0. CMP(x, y) must be non-zero for some elements x, y of the list;
    1. anti-symmetry: sign(CMP(x, y)) must be equal to -sign(CMP(y, x));
    2. transitivity: if sign(CMP(x, y)) equals sign(CMP(y, z)), then
    sign(CMP(x, z)) must be equal to sign(CMP(x, y)).

    In (1) and (2), sign() stands for the function

    -1 if x < 0
    sign(x) = 0 if x == 0
    1 if x > 0

    I suppose that all bets are off if these properties are not satisfied,
    though the documentation does not make this entirely clear, as far
    as I can tell. (If I'm wrong about this alleged omission, please
    point me to the part of the docs that I missed.)
     
    kj, Oct 1, 2009
    #1
    1. Advertising

  2. kj

    Peter Otten Guest

    kj wrote:

    > Challenge: to come up with a sorting task that cannot be achieved
    > by passing to the sort method (or sorted function) suitable values
    > for its key and reverse parameters, but instead *require* giving
    > a value to its cmp parameter.
    >
    > For example,
    >
    > from random import random
    > scrambled = some_list.sort(cmp=lambda x, y: cmp(random(), 0.5))


    The above violates the transitivity requirement as stated below. The result
    will have a bias.

    > can be achieved (to a very good approximation at least) with
    >
    > scrambled = some_list.sort(key=lambda x: random())
    >
    > Is there a real-life sorting task that requires (or is far more
    > efficient with) cmp and can't be easily achieved with key and
    > reverse?


    The core developers don't think there is one. list.sort() no longer supports
    the cmp parameter in 3.x.

    > Many thanks in advance,
    >
    > G.
    >
    > P.S. I guess that, if sort is going to produce a non-trivial,
    > *consistent* order, a function CMP passed as the value of its cmp
    > parameter must satisfy the following properties:
    >
    > 0. CMP(x, y) must be non-zero for some elements x, y of the list;
    > 1. anti-symmetry: sign(CMP(x, y)) must be equal to -sign(CMP(y, x));
    > 2. transitivity: if sign(CMP(x, y)) equals sign(CMP(y, z)), then
    > sign(CMP(x, z)) must be equal to sign(CMP(x, y)).
    >
    > In (1) and (2), sign() stands for the function
    >
    > -1 if x < 0
    > sign(x) = 0 if x == 0
    > 1 if x > 0
    >
    > I suppose that all bets are off if these properties are not satisfied,
    > though the documentation does not make this entirely clear, as far
    > as I can tell. (If I'm wrong about this alleged omission, please
    > point me to the part of the docs that I missed.)
     
    Peter Otten, Oct 1, 2009
    #2
    1. Advertising

  3. kj

    Laszlo Nagy Guest

    Is this a homework?

    > Challenge: to come up with a sorting task that cannot be achieved
    > by passing to the sort method (or sorted function) suitable values
    > for its key and reverse parameters, but instead *require* giving
    > a value to its cmp parameter.
    >

    Let me put up this question: how do you define the difference between an
    object to be sorted in the list, and the passed "suitable value" that is
    used for finding the position for the object in the list. If the only
    difference is that they need to be different Python objects, then you
    can always use [x] instead of x. This value is perfectly suitable. ;-)

    If you could give us somewhat stronger requirements regarding keys and
    objects, we may give you a different answer.

    Best,

    Laszlo
     
    Laszlo Nagy, Oct 1, 2009
    #3
  4. kj

    Laszlo Nagy Guest


    >> can be achieved (to a very good approximation at least) with
    >>
    >> scrambled = some_list.sort(key=lambda x: random())
    >>
    >> Is there a real-life sorting task that requires (or is far more
    >> efficient with) cmp and can't be easily achieved with key and
    >> reverse?
    >>

    >
    > The core developers don't think there is one. list.sort() no longer supports
    > the cmp parameter in 3.x.
    >

    LOL :-D

    BTW if you really want to sort a list then you can. Using keys or
    values, it doesn't matter. The op's question has no practical usage.
    REALLY looks like a homework.

    L
     
    Laszlo Nagy, Oct 1, 2009
    #4
  5. kj

    Ethan Furman Guest

    Laszlo Nagy wrote:
    >
    >>> can be achieved (to a very good approximation at least) with
    >>>
    >>> scrambled = some_list.sort(key=lambda x: random())
    >>>
    >>> Is there a real-life sorting task that requires (or is far more
    >>> efficient with) cmp and can't be easily achieved with key and
    >>> reverse?
    >>>

    >>
    >>
    >> The core developers don't think there is one. list.sort() no longer
    >> supports the cmp parameter in 3.x.
    >>

    >
    > LOL :-D
    >
    > BTW if you really want to sort a list then you can. Using keys or
    > values, it doesn't matter. The op's question has no practical usage.
    > REALLY looks like a homework.
    >
    > L
    >


    If it is, it's one he's contemplating giving his students. :-D

    ~Ethan~
     
    Ethan Furman, Oct 1, 2009
    #5
  6. kj wrote:
    >
    > Challenge: to come up with a sorting task that cannot be achieved
    > by passing to the sort method (or sorted function) suitable values
    > for its key and reverse parameters, but instead *require* giving
    > a value to its cmp parameter.


    Such a task can't exist, because any arbitrary comparison function can
    be transformed into a key function. The idea behind the transformation
    is to construct wrapper objects that can compare themselves to other
    wrapper objects by invoking the given comparison function on their
    wrapped originals.

    Google for "Hettinger cmp2key" if you want to see the code that does
    this transformation.

    HTH,

    --
    Carsten Haese
    http://informixdb.sourceforge.net
     
    Carsten Haese, Oct 1, 2009
    #6
  7. kj

    Paul Rubin Guest

    kj <> writes:
    > Is there a real-life sorting task that requires (or is far more
    > efficient with) cmp and can't be easily achieved with key and
    > reverse?


    Yes, think of sorting tree structures where you have to recursively
    compare them til you find an unequal pair of nodes. To sort with
    key=... you have to wrap a class instance around each tree, with
    special comparison methods. Blecch.
     
    Paul Rubin, Oct 1, 2009
    #7
  8. kj

    Bearophile Guest

    Paul Rubin:

    > Yes, think of sorting tree structures where you have to recursively
    > compare them til you find an unequal pair of nodes.


    That's cute. In what situations do you have to perform such kind of
    sort?

    Bye,
    bearophile
     
    Bearophile, Oct 1, 2009
    #8
  9. On Oct 1, 10:08 am, kj <> wrote:
    > Challenge: to come up with a sorting task that cannot be achieved
    > by passing to the sort method (or sorted function) suitable values
    > for its key and reverse parameters, but instead *require* giving
    > a value to its cmp parameter.


    If you're assuming a consistent sort-order (transitivity, not
    evolving over time, etc), then the cmp method and key method
    are mathematically equivalent (you could always do a compare
    sort first, record the order produced, and assign the position
    number as a key function):

    # constructive proof of the existence of a key function
    # corresponding to a given cmp function
    tmplist = sorted(s, cmp=mycmp)
    pos = dict(zip(map(id, tmplist), range(len(tmplist))))
    result = sorted(s, key=lambda x: pos[id(x)])
    assert tmplist == result

    Given equivalence, what is really at stake is convenience.

    If you're starting point is a cmp function (for instance,
    asking a dating service member whether they prefer
    mate x to mate y), then having to convert to a key function
    can be inconvenient.

    If you need to sort by an ascending primary key and a descending
    secondary key, you may find it easiest to sort in two passes
    (taking advantage of guaranteed sort stability):

    sorted(s, key=secondary, reversed=True)
    sorted(s, key=primary)

    With a cmp function, the above could be achieved in a single pass:

    def mycmp(x, y):
    p = cmp(primary(a), primary(b))
    return p if p else cmp(secondary(b), secondary(a))

    sorted(s, cmp=mycmp)

    Also, as Paul pointed-out, there are some structures such as tree
    comparisons that may be challenging to express directly as a key
    function. As Carsten pointed-out, the cmp2key function allows
    cmp functions to trivially (though indirectly) be recast as key
    functions.

    All that being said, for many everyday uses, a key function is
    simpler to write and faster to run than a cmp function approach.


    Raymond
     
    Raymond Hettinger, Oct 1, 2009
    #9
  10. kj

    kj Guest

    In <> Raymond Hettinger <> writes:

    <snip>

    Thanks for an extremely helpful reply!

    >If you need to sort by an ascending primary key and a descending
    >secondary key, you may find it easiest to sort in two passes
    >(taking advantage of guaranteed sort stability):


    > sorted(s, key=secondary, reversed=3DTrue)
    > sorted(s, key=primary)


    In the special case where the value returned by secondary is
    numeric, I suppose one could do this in one go with

    sorted(s, key=lambda x: (primary(x), -secondary(x)))

    ....but I can't think of a way to generalize this...

    kj
     
    kj, Oct 1, 2009
    #10
  11. kj

    kj Guest

    In <> Paul Rubin <http://> writes:

    >kj <> writes:
    >> Is there a real-life sorting task that requires (or is far more
    >> efficient with) cmp and can't be easily achieved with key and
    >> reverse?


    >Yes, think of sorting tree structures where you have to recursively
    >compare them til you find an unequal pair of nodes. To sort with
    >key=... you have to wrap a class instance around each tree, with
    >special comparison methods. Blecch.



    Good point. This example convinces me that it was a bad idea to
    get rid of cmp in Python 3, even if situations like this one are
    rare. With the cmp parameter as an option, coding this type of
    sort was accessible even to those with a rudementary knowledge of
    Python. Now one needs to be pretty clever and pretty skilled with
    Python to figure this one out... Granted, anyone who needs to
    perform such sorts is probably smart enough to handle the required
    solution, but not necessarily very proficient with Python. Besides,
    why make something that was relatively straightforward before become
    an exercise in cleverness? I wonder what was gained by eliminating
    the cmp option...

    kj
     
    kj, Oct 1, 2009
    #11
  12. kj

    Paul Rubin Guest

    Duncan Booth <> writes:
    > > Is there a real-life sorting task that requires (or is far more
    > > efficient with) cmp and can't be easily achieved with key and reverse?
    > >

    > There is no sorting task that *requires* cmp. If all else fails you can
    > define a key class to wrap the original wrapper such that comparing the
    > keys simply calls the comparison function that you *would* have used.


    I would count that as key being far less efficient, though still
    giving the same result.
     
    Paul Rubin, Oct 2, 2009
    #12
  13. kj

    Paul Rubin Guest

    Raymond Hettinger <> writes:
    > If you're assuming a consistent sort-order (transitivity, not
    > evolving over time, etc), then the cmp method and key method
    > are mathematically equivalent (you could always do a compare
    > sort first, record the order produced, and assign the position
    > number as a key function)


    But that is an efficiency hit.

    > If you're starting point is a cmp function (for instance,
    > asking a dating service member whether they prefer
    > mate x to mate y), then having to convert to a key function
    > can be inconvenient.


    Well, the issue there is that the comparison function may not be
    transitive. Maybe you really want a topological sort for that.

    > All that being said, for many everyday uses, a key function is
    > simpler to write and faster to run than a cmp function approach.


    I still have never understood why cmp was removed. Sure, key is more
    convenient a lot (or maybe most) of the time, but it's not always.
     
    Paul Rubin, Oct 2, 2009
    #13
  14. kj

    alex23 Guest

    kj <> wrote:
    > This example convinces me that it was a bad idea to
    > get rid of cmp in Python 3, even if situations like this one are
    > rare.


    It sounds like the entire point of this exercise was to get other
    people to confirm your bias for you.
     
    alex23, Oct 2, 2009
    #14
  15. kj

    kj Guest

    In <> alex23 <> writes:

    >kj <> wrote:
    >> This example convinces me that it was a bad idea to
    >> get rid of cmp in Python 3, even if situations like this one are
    >> rare.


    >It sounds like the entire point of this exercise was to get other
    >people to confirm your bias for you.


    The only problem with this hypothesis is that my original bias was
    exactly the opposite to the one you quote: when I sent my original
    post I thought that cmp was in fact useless (after all I could not
    think of a situation that required it or even made it preferable),
    and was not even aware that it had been dropped in Python 3. Paul
    Rubin's post convinced me otherwise.

    BTW, what's with this business of ascribing underhanded motives to
    me? Earlier some other clown alleged that that my original post
    was homework??? WTF?

    kj
     
    kj, Oct 2, 2009
    #15
  16. kj

    Paul Rubin Guest

    Bearophile <> writes:
    > > Yes, think of sorting tree structures where you have to recursively
    > > compare them til you find an unequal pair of nodes.

    >
    > That's cute. In what situations do you have to perform such kind of
    > sort?


    It came up in a search engine application I've been involved with.
     
    Paul Rubin, Oct 2, 2009
    #16
  17. kj

    Paul Rubin Guest

    Scott David Daniels <> writes:
    > Most cases are moreeasily done with key, and it is
    > a good idea to make the most accessible way to a sort be the most
    > efficient one. In the rare case that you really want each comparison,
    > the cmp-injection function will do nicely (and can be written as a
    > recipe.


    I don't think wrapping the sorted objects in an otherwise useless
    special purpose class is "nicely", either from a performance or from a
    code verbosity point of view. I avoid Java and its useless extra
    classes for a reason ;-).

    > In short, make the easy path the fast path, and more will use it;
    > provide two ways, and the first that springs to mind is the one
    > used.


    I think we are saying the same thing. Python 2.x provides two ways
    and you can use whichever one fits the application better. I have
    never understood why Python 3.x finds it necessary to break one of
    them. Maybe I can migrate to Haskell by the time Python 2.x becomes
    deprecated.
     
    Paul Rubin, Oct 2, 2009
    #17
  18. [Paul Rubin]
    > Yes, think of sorting tree structures where you have to recursively
    > compare them til you find an unequal pair of nodes.  


    I'm not sure what you mean by this. What are the semantics of
    sorting a tree? Can you outline an example of tree that
    could be sorted easily with a cmp function but not a key function?

    t = [[9,6],[7,1]],[[2,5],[4,3]] # inputs
    t.sort(mycmp) # what would the cmp function be?
    print t # what would the output be


    Raymond
     
    Raymond Hettinger, Oct 3, 2009
    #18
  19. kj

    Paul Rubin Guest

    Raymond Hettinger <> writes:
    > I'm not sure what you mean by this. What are the semantics of
    > sorting a tree? Can you outline an example of tree that
    > could be sorted easily with a cmp function but not a key function?


    The idea was that you have a list of trees that you want to sort, and
    an ordering relation between trees:

    def gt(tree1, tree2): ...

    where you recursively descend both trees until you find an unequal
    pair of nodes. You're not trying to sort the nodes within a single
    tree.
     
    Paul Rubin, Oct 3, 2009
    #19
  20. [Paul Rubin]
    > The idea was that you have a list of trees that you want to sort, and
    > an ordering relation between trees:
    >
    >    def gt(tree1, tree2): ...
    >
    > where you recursively descend both trees until you find an unequal
    > pair of nodes.  You're not trying to sort the nodes within a single
    > tree.


    Can you give an example of a list of trees and a cmp function
    that recursively compares them?


    Raymond
     
    Raymond Hettinger, Oct 3, 2009
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Damir Mikoc

    CMR/CMP and Compound Primary Key

    Damir Mikoc, Jul 3, 2003, in forum: Java
    Replies:
    1
    Views:
    561
    Christopher Blunck
    Jul 4, 2003
  2. Andrea Sansottera

    no cmp field defined in cmp ejb

    Andrea Sansottera, Jul 16, 2004, in forum: Java
    Replies:
    0
    Views:
    390
    Andrea Sansottera
    Jul 16, 2004
  3. =?ISO-8859-1?Q?Sch=FCle_Daniel?=

    basic questions on cmp, < and sort

    =?ISO-8859-1?Q?Sch=FCle_Daniel?=, Oct 25, 2006, in forum: Python
    Replies:
    4
    Views:
    309
    Fredrik Lundh
    Oct 26, 2006
  4. Replies:
    7
    Views:
    745
    Stefan Arentz
    Sep 10, 2007
  5. gb345

    Q: sort's key and cmp parameters

    gb345, Oct 1, 2009, in forum: Perl Misc
    Replies:
    1
    Views:
    100
Loading...

Share This Page