Guido rethinking removal of cmp from sort method

Discussion in 'Python' started by Steven D'Aprano, Mar 13, 2011.

  1. The removal of cmp from the sort method of lists is probably the most
    disliked change in Python 3. On the python-dev mailing list at the
    moment, Guido is considering whether or not it was a mistake.

    If anyone has any use-cases for sorting with a comparison function that
    either can't be written using a key function, or that perform really
    badly when done so, this would be a good time to speak up.



    --
    Steven
    Steven D'Aprano, Mar 13, 2011
    #1
    1. Advertising

  2. Steven D'Aprano <steve+comp.lang.python <at> pearwood.info> writes:

    > If anyone has any use-cases for sorting with a comparison function that
    > either can't be written using a key function, or that perform really
    > badly when done so, this would be a good time to speak up.


    I think it's probably provable that there are no cases in the first category,
    provided you're willing to do something sufficiently contorted. However,
    it also seems self-evident to me that many programmers will rightly chafe
    at the idea of creating and tearing down a bunch of objects just to
    compare things for sorting. Think of the heap churn! Even if it turns out
    that Python 3 contains some magic implementation detail that makes it
    efficient most of the time, it goes against a natural understanding of the
    computation model

    2p for y'all.
    -Dave
    Dave Abrahams, Mar 13, 2011
    #2
    1. Advertising

  3. Steven D'Aprano, 13.03.2011 13:59:
    > The removal of cmp from the sort method of lists is probably the most
    > disliked change in Python 3. On the python-dev mailing list at the
    > moment, Guido is considering whether or not it was a mistake.
    >
    > If anyone has any use-cases for sorting with a comparison function that
    > either can't be written using a key function, or that perform really
    > badly when done so, this would be a good time to speak up.


    As Raymond Hettinger and Daniel Stutzbach pointed out in that thread, the
    memory overhead is much lower in Python 3.2 and also depends on the usage.
    If memory is a problem, it can still be traded for time by sorting in
    multiple (stable) passes with smaller keys. It rarely is a problem in
    practice, though, and Guido's initial post seems to suggest that as well.

    Stefan
    Stefan Behnel, Mar 14, 2011
    #3
  4. Steven D'Aprano wrote:
    > The removal of cmp from the sort method of lists is probably the most
    > disliked change in Python 3. On the python-dev mailing list at the
    > moment, Guido is considering whether or not it was a mistake.
    >
    > If anyone has any use-cases for sorting with a comparison function that
    > either can't be written using a key function, or that perform really
    > badly when done so, this would be a good time to speak up.
    >
    >
    >
    >

    You seem concerned by this removal, do you have any use-case ?

    JM
    Jean-Michel Pichavant, Mar 14, 2011
    #4
  5. On Mon, 14 Mar 2011 12:10:27 +0100, Jean-Michel Pichavant wrote:

    > Steven D'Aprano wrote:
    >> The removal of cmp from the sort method of lists is probably the most
    >> disliked change in Python 3. On the python-dev mailing list at the
    >> moment, Guido is considering whether or not it was a mistake.
    >>
    >> If anyone has any use-cases for sorting with a comparison function that
    >> either can't be written using a key function, or that perform really
    >> badly when done so, this would be a good time to speak up.
    >>
    >>
    >>
    >>

    > You seem concerned by this removal, do you have any use-case ?


    You seem concerned by my concern. Why do you think I am concerned?

    (1) I'm not concerned, but many people are. If you search the archives of
    this newsgroup (mailing list), you'll see that I have defended the
    removal of cmp from sort, e.g. this post:

    http://www.mail-archive.com//msg261728.html


    (2) If I had a good use-case for keeping cmp, I wouldn't need to ask
    others if they had a good use-case.

    As it is, Guido himself has mentioned one such good use for a comparison
    function when sorting. Use of a key function trades off memory for time,
    while sorting with a comparison function makes the opposite trade off,
    using more time for the sake of saving memory.



    --
    Steven
    Steven D'Aprano, Mar 14, 2011
    #5
  6. Steven D'Aprano wrote:
    > On Mon, 14 Mar 2011 12:10:27 +0100, Jean-Michel Pichavant wrote:
    >
    >
    >> Steven D'Aprano wrote:
    >>
    >>> The removal of cmp from the sort method of lists is probably the most
    >>> disliked change in Python 3. On the python-dev mailing list at the
    >>> moment, Guido is considering whether or not it was a mistake.
    >>>
    >>> If anyone has any use-cases for sorting with a comparison function that
    >>> either can't be written using a key function, or that perform really
    >>> badly when done so, this would be a good time to speak up.
    >>>
    >>>
    >>>
    >>>
    >>>

    >> You seem concerned by this removal, do you have any use-case ?
    >>

    >
    > You seem concerned by my concern. Why do you think I am concerned?
    >
    > (1) I'm not concerned, but many people are. If you search the archives of
    > this newsgroup (mailing list), you'll see that I have defended the
    > removal of cmp from sort, e.g. this post:
    >
    > http://www.mail-archive.com//msg261728.html
    >
    >
    > (2) If I had a good use-case for keeping cmp, I wouldn't need to ask
    > others if they had a good use-case.
    >
    > As it is, Guido himself has mentioned one such good use for a comparison
    > function when sorting. Use of a key function trades off memory for time,
    > while sorting with a comparison function makes the opposite trade off,
    > using more time for the sake of saving memory.
    >
    >

    Just to know if currently the use-case count is zero. Nothing evil
    hidden behind my question :p

    JM
    Jean-Michel Pichavant, Mar 14, 2011
    #6
  7. Steven D'Aprano

    Paul Rubin Guest

    Steven D'Aprano <> writes:
    > If anyone has any use-cases for sorting with a comparison function that
    > either can't be written using a key function, or that perform really
    > badly when done so, this would be a good time to speak up.


    We've had this discussion a couple times before. I remember an example
    involving comparing tree structures, that I thought couldn't be done
    with key= in a reasonable way, and that this was a reasonable real-world
    example. Raymond H then gave a way of encoding it with key= which
    looked ok at the time, but I decided sometime later that I think we both
    missed an error in that encoding, so I've been wanting to dig it out
    again and look more closely.

    key= can of course burn a lot more memory because of the DSU pattern
    (say you are sorting file handles according to the contents of the file,
    so with key= you might have to read N multi-megabyte files where with
    cmp you just pairwise-compare them until they differ, which is probably
    in the first few bytes).

    Finally I concocted an "infinite" example which we agreed is artificial:
    you are given a list of generators denoting real numbers, for example
    pi generates the infinite sequence 3,1,4,1,5,9... while e generates
    2,7,1,8,... You can sort these with cmp but not with key.
    Paul Rubin, Mar 14, 2011
    #7
  8. Steven D'Aprano

    Nobody Guest

    On Mon, 14 Mar 2011 12:39:35 -0700, Paul Rubin wrote:

    > Finally I concocted an "infinite" example which we agreed is artificial:
    > you are given a list of generators denoting real numbers, for example
    > pi generates the infinite sequence 3,1,4,1,5,9... while e generates
    > 2,7,1,8,... You can sort these with cmp but not with key.


    Is there some restriction on the return type of the key= function? If not,
    you can define a class with appropriate comparison methods and have key=
    return instances of that class.
    Nobody, Mar 14, 2011
    #8
  9. On Mon, 14 Mar 2011 23:37:06 +0000, Nobody wrote:

    > On Mon, 14 Mar 2011 12:39:35 -0700, Paul Rubin wrote:
    >
    >> Finally I concocted an "infinite" example which we agreed is
    >> artificial: you are given a list of generators denoting real numbers,
    >> for example pi generates the infinite sequence 3,1,4,1,5,9... while e
    >> generates 2,7,1,8,... You can sort these with cmp but not with key.

    >
    > Is there some restriction on the return type of the key= function? If
    > not, you can define a class with appropriate comparison methods and have
    > key= return instances of that class.


    You mean like this?

    http://docs.python.org/dev/library/functools.html#functools.cmp_to_key




    --
    Steven
    Steven D'Aprano, Mar 15, 2011
    #9
  10. Steven D'Aprano

    Paul Rubin Guest

    Nobody <> writes:
    >> 2,7,1,8,... You can sort these with cmp but not with key.

    >
    > Is there some restriction on the return type of the key= function? If not,
    > you can define a class with appropriate comparison methods and have key=
    > return instances of that class.


    Sorry, yeah, of course you can do it that way, but it's bloody ugly and
    you're no longer gaining the supposed benefit of the DSU pattern.
    Paul Rubin, Mar 15, 2011
    #10
  11. Steven D'Aprano

    Ian Kelly Guest

    On Mon, Mar 14, 2011 at 1:39 PM, Paul Rubin <> wrote:
    > Finally I concocted an "infinite" example which we agreed is artificial:
    > you are given a list of generators denoting real numbers, for example
    > pi generates the infinite sequence 3,1,4,1,5,9... while e generates
    > 2,7,1,8,...  You can sort these with cmp but not with key.


    I would think that you can sort them with key as long as none of the
    sequences are equal (which would result in an infinite loop using
    either method). Why not this?

    _MISSING = object()

    @functools.total_ordering
    class IteratorKey(object):

    def __init__(self, iterable):
    self._iterator = iter(iterable)
    self._cache = []

    def __iter__(self):
    # This is not reentrant, but it's
    # good enough for this purpose.
    for x in self._cache:
    yield x
    for x in self._iterator:
    self._cache.append(x)
    yield x

    def __lt__(self, other):
    for x, y in itertools.izip_longest(self, other, fillvalue=_MISSING):
    if x is _MISSING or y is _MISSING:
    return x is _MISSING
    elif x < y or y < x:
    return x < y
    return False

    def __eq__(self, other):
    for x, y in itertools.izip_longest(self, other, fillvalue=_MISSING):
    if x is _MISSING or y is _MISSING or x != y:
    return False
    return True
    Ian Kelly, Mar 15, 2011
    #11
  12. Steven D'Aprano

    Paul Rubin Guest

    Ian Kelly <> writes:
    > I would think that you can sort them with key as long as none of the
    > sequences are equal (which would result in an infinite loop using
    > either method). Why not this?


    Yes you can do something like that, but look how ugly it is compared
    with using cmp.
    Paul Rubin, Mar 15, 2011
    #12
  13. Paul Rubin, 15.03.2011 09:00:
    > Ian Kelly writes:
    >> I would think that you can sort them with key as long as none of the
    >> sequences are equal (which would result in an infinite loop using
    >> either method). Why not this?

    >
    > Yes you can do something like that, but look how ugly it is compared
    > with using cmp.


    I would argue that it's a rare use case, though. In most cases, memory
    doesn't really matter, and the overhead of a decorated sort is acceptable.
    In some cases, where space matters, an external sort is a better choice or
    even the only thing that will work, so all that's left are the cases where
    speed really needs to be traded for space *and* an external sort is not
    applicable, and those where things can be expressed more easily using cmp
    than using a key. For the latter, there's support in the standard library,
    as Steven pointed out. Now, the question that remains is: are the few
    remaining cases worth being supported directly, thus complicating the
    internal source code of the Python runtime?

    Stefan
    Stefan Behnel, Mar 15, 2011
    #13
  14. Steven D'Aprano

    Paul Rubin Guest

    Stefan Behnel <> writes:
    > the question that remains is: are the few remaining cases worth being
    > supported directly, thus complicating the internal source code of the
    > Python runtime?


    The other cases were supported perfectly well already, and the support
    was removed. That's just ludicrous in my opinion.
    Paul Rubin, Mar 15, 2011
    #14
  15. On Sun, Mar 13, 2011 at 12:59:55PM +0000, Steven D'Aprano wrote:
    > The removal of cmp from the sort method of lists is probably the most
    > disliked change in Python 3. On the python-dev mailing list at the
    > moment, Guido is considering whether or not it was a mistake.
    >
    > If anyone has any use-cases for sorting with a comparison function that
    > either can't be written using a key function, or that perform really
    > badly when done so, this would be a good time to speak up.


    How about a list of tuples where you want them sorted first item in ascending
    order en second item in descending order.

    --
    Antoon Pardon
    Antoon Pardon, Mar 23, 2011
    #15
  16. Antoon Pardon, 23.03.2011 14:53:
    > On Sun, Mar 13, 2011 at 12:59:55PM +0000, Steven D'Aprano wrote:
    >> The removal of cmp from the sort method of lists is probably the most
    >> disliked change in Python 3. On the python-dev mailing list at the
    >> moment, Guido is considering whether or not it was a mistake.
    >>
    >> If anyone has any use-cases for sorting with a comparison function that
    >> either can't be written using a key function, or that perform really
    >> badly when done so, this would be a good time to speak up.

    >
    > How about a list of tuples where you want them sorted first item in ascending
    > order en second item in descending order.


    You can use a stable sort in two steps for that.

    Stefan
    Stefan Behnel, Mar 23, 2011
    #16
  17. On Wed, Mar 23, 2011 at 02:59:09PM +0100, Stefan Behnel wrote:
    > Antoon Pardon, 23.03.2011 14:53:
    > >On Sun, Mar 13, 2011 at 12:59:55PM +0000, Steven D'Aprano wrote:
    > >>The removal of cmp from the sort method of lists is probably the most
    > >>disliked change in Python 3. On the python-dev mailing list at the
    > >>moment, Guido is considering whether or not it was a mistake.
    > >>
    > >>If anyone has any use-cases for sorting with a comparison function that
    > >>either can't be written using a key function, or that perform really
    > >>badly when done so, this would be a good time to speak up.

    > >
    > >How about a list of tuples where you want them sorted first item in ascending
    > >order en second item in descending order.

    >
    > You can use a stable sort in two steps for that.


    Which isn't helpfull if where you decide how they have to be sorted is
    not the place where they are actually sorted.

    I have a class that is a priority queue. Elements are added at random but
    are removed highest priority first. The priority queue can have a key or
    a cmp function for deciding which item is the highest priority. It
    can also take a list as an initializor, which will then be sorted.

    So this list is sorted within the class but how it is sorted is decided
    outside the class. So I can't do the sort in multiple steps.

    --
    Antoon Pardon
    Antoon Pardon, Mar 23, 2011
    #17
  18. Steven D'Aprano

    Ian Kelly Guest

    On Wed, Mar 23, 2011 at 9:14 AM, Antoon Pardon
    <> wrote:
    > Which isn't helpfull if where you decide how they have to be sorted is
    > not the place where they are actually sorted.
    >
    > I have a class that is a priority queue. Elements are added at random but
    > are removed highest priority first. The priority queue can have a key or
    > a cmp function for deciding which item is the highest priority. It
    > can also take a list as an initializor, which will then be sorted.
    >
    > So this list is sorted within the class but how it is sorted is decided
    > outside the class. So I can't do the sort in multiple steps.


    You can't do this?

    for (key, reversed) in self.get_multiple_sort_keys():
    self.pq.sort(key=key, reversed=reversed)
    Ian Kelly, Mar 23, 2011
    #18
  19. Antoon Pardon, 23.03.2011 16:14:
    > On Wed, Mar 23, 2011 at 02:59:09PM +0100, Stefan Behnel wrote:
    >> Antoon Pardon, 23.03.2011 14:53:
    >>> On Sun, Mar 13, 2011 at 12:59:55PM +0000, Steven D'Aprano wrote:
    >>>> The removal of cmp from the sort method of lists is probably the most
    >>>> disliked change in Python 3. On the python-dev mailing list at the
    >>>> moment, Guido is considering whether or not it was a mistake.
    >>>>
    >>>> If anyone has any use-cases for sorting with a comparison function that
    >>>> either can't be written using a key function, or that perform really
    >>>> badly when done so, this would be a good time to speak up.
    >>>
    >>> How about a list of tuples where you want them sorted first item in ascending
    >>> order en second item in descending order.

    >>
    >> You can use a stable sort in two steps for that.

    >
    > Which isn't helpfull if where you decide how they have to be sorted is
    > not the place where they are actually sorted.
    >
    > I have a class that is a priority queue. Elements are added at random but
    > are removed highest priority first. The priority queue can have a key or
    > a cmp function for deciding which item is the highest priority. It
    > can also take a list as an initializor, which will then be sorted.
    >
    > So this list is sorted within the class but how it is sorted is decided
    > outside the class. So I can't do the sort in multiple steps.


    That sounds more like a use case for heap sort than for Python's builtin
    list sort. See the heapq module.

    Stefan
    Stefan Behnel, Mar 23, 2011
    #19
  20. Steven D'Aprano

    Carl Banks Guest

    On Mar 23, 6:59 am, Stefan Behnel <> wrote:
    > Antoon Pardon, 23.03.2011 14:53:
    >
    > > On Sun, Mar 13, 2011 at 12:59:55PM +0000, Steven D'Aprano wrote:
    > >> The removal of cmp from the sort method of lists is probably the most
    > >> disliked change in Python 3. On the python-dev mailing list at the
    > >> moment, Guido is considering whether or not it was a mistake.

    >
    > >> If anyone has any use-cases for sorting with a comparison function that
    > >> either can't be written using a key function, or that perform really
    > >> badly when done so, this would be a good time to speak up.

    >
    > > How about a list of tuples where you want them sorted first item in ascending
    > > order en second item in descending order.

    >
    > You can use a stable sort in two steps for that.


    How about this one: you have are given an obscure string collating
    function implented in a C library you don't have the source to.

    Or how about this: I'm sitting at an interactive session and I have a
    convenient cmp function but no convenient key, and I care more about
    the four minutes it'd take to whip up a clever key function or an
    adapter class than the 0.2 seconds I'd save to on sorting time.

    Removing cmp from sort was a mistake; it's the most straightforward
    and natural way to sort in many cases. Reason enough for me to keep
    it.


    Carl Banks
    Carl Banks, Mar 23, 2011
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Andrea Sansottera

    no cmp field defined in cmp ejb

    Andrea Sansottera, Jul 16, 2004, in forum: Java
    Replies:
    0
    Views:
    384
    Andrea Sansottera
    Jul 16, 2004
  2. Magnus Lycka

    Rethinking the Python tutorial

    Magnus Lycka, Feb 9, 2006, in forum: Python
    Replies:
    11
    Views:
    539
    Magnus Lycka
    Feb 15, 2006
  3. stork
    Replies:
    6
    Views:
    353
    Michael Ashton
    Dec 18, 2006
  4. Replies:
    7
    Views:
    740
    Stefan Arentz
    Sep 10, 2007
  5. john_re
    Replies:
    0
    Views:
    263
    john_re
    Apr 4, 2009
Loading...

Share This Page