Multiple "cmp"s chained one after another

Discussion in 'Python' started by Volker Grabsch, May 14, 2005.

  1. Hello!

    Ich just found a very nice 'pythonic' solution for an often appearing
    problem. I didn't find it documented anywhere, so I'm posting it here.
    If this isn't new in any way, I'd really like to get to know it.

    Example problem:
    I have some "datetime" objects and want to sort them, as here:

    birthdays = [d1,d2,d3,d4]
    birthdays.sort()

    However, I don't want to sort them the default way. These are birthdays,
    so only the month and day do matter, not the year. E.g.:

    2003-01-01 should be smaller than 1984-05-01

    So I have to write the comparison on my own, e.g.

    def cmp_birthdays(d1,d2):
    if d1.month > d2.month: return 1
    if d1.month < d2.month: return -1
    if d1.day > d2.day: return 1
    if d1.day < d2.day: return -1
    return 0

    ...
    birthdays.sort(cmp_birthdays)

    This implementation of cmp_birthdays is very ugly. Image you want to
    chain more than 2 values in that "cmp_birthdays". I also want to use the
    builtin "cmp" function, not ">" and "<".

    After thinking some minutes about it, I found a very nice solution:
    I have some "cmp"s one aftter another. If one if them return 1 oder -1,
    it sould be returned. If it returns 0, the next "cmp" is used. In other
    words: I have a sequence of numbers, and want to get the first one that
    is not 0. (or return 0, if all numbers were 0)

    But this is exactly what the "or" operator does, due to short-circuit
    evaluation. In this example, that means:

    def cmp_bithdays(d1,d2):
    return cmp(d1.month,d2.month) or cmp(d1.day,d2.day)

    The generic pattern is:

    return cmp(...) or cmp (...) or cmp(...) or ...

    I'm not sure whether this pattern is already a "common recipe", but
    I found it to be a very nice idea. :)

    Any opinions?


    Greets,

    Volker

    --
    Volker Grabsch
    ---<<(())>>---
    \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}
    Volker Grabsch, May 14, 2005
    #1
    1. Advertising

  2. "Volker Grabsch" <> schrieb im
    Newsbeitrag news:d64io9$s6o$05$-online.com...
    | Hello!
    |
    | Ich just found a very nice 'pythonic' solution for an often appearing
    | problem. I didn't find it documented anywhere, so I'm posting it here.
    | If this isn't new in any way, I'd really like to get to know it.
    |
    | Example problem:
    | I have some "datetime" objects and want to sort them, as here:
    |
    | birthdays = [d1,d2,d3,d4]
    | birthdays.sort()
    |
    | However, I don't want to sort them the default way. These are birthdays,
    | so only the month and day do matter, not the year. E.g.:
    |
    | 2003-01-01 should be smaller than 1984-05-01
    |
    | So I have to write the comparison on my own, e.g.
    |
    | def cmp_birthdays(d1,d2):
    | if d1.month > d2.month: return 1
    | if d1.month < d2.month: return -1
    | if d1.day > d2.day: return 1
    | if d1.day < d2.day: return -1
    | return 0
    |
    | ...
    | birthdays.sort(cmp_birthdays)


    If you don't care about the year, why not just "normalize" the year
    to all be the same using the replace method of the date instance?
    Something like:

    d1 = datetime.date(2004, 12, 2)
    d2 = datetime.date(2001, 12, 3)
    d3 = datetime.date(2002, 12, 6)
    d4 = datetime.date(1977, 12, 7)
    dates =[d1,d2,d3,d4]
    datesNorm = [obj.replace(year=1900) for obj in (dates)]
    datesNorm.sort()
    print datesNorm # etcetera


    HTH,

    ---
    Vincent




    |
    | This implementation of cmp_birthdays is very ugly. Image you want to
    | chain more than 2 values in that "cmp_birthdays". I also want to use the
    | builtin "cmp" function, not ">" and "<".
    |
    | After thinking some minutes about it, I found a very nice solution:
    | I have some "cmp"s one aftter another. If one if them return 1 oder -1,
    | it sould be returned. If it returns 0, the next "cmp" is used. In other
    | words: I have a sequence of numbers, and want to get the first one that
    | is not 0. (or return 0, if all numbers were 0)
    |
    | But this is exactly what the "or" operator does, due to short-circuit
    | evaluation. In this example, that means:
    |
    | def cmp_bithdays(d1,d2):
    | return cmp(d1.month,d2.month) or cmp(d1.day,d2.day)
    |
    | The generic pattern is:
    |
    | return cmp(...) or cmp (...) or cmp(...) or ...
    |
    | I'm not sure whether this pattern is already a "common recipe", but
    | I found it to be a very nice idea. :)
    |
    | Any opinions?
    |
    |
    | Greets,
    |
    | Volker
    |
    | --
    | Volker Grabsch
    | ---<<(())>>---
    | \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    | [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}
    vincent wehren, May 14, 2005
    #2
    1. Advertising

  3. Volker Grabsch

    Robert Kern Guest

    Volker Grabsch wrote:
    > Hello!
    >
    > Ich just found a very nice 'pythonic' solution for an often appearing
    > problem. I didn't find it documented anywhere, so I'm posting it here.
    > If this isn't new in any way, I'd really like to get to know it.
    >
    > Example problem:
    > I have some "datetime" objects and want to sort them, as here:
    >
    > birthdays = [d1,d2,d3,d4]
    > birthdays.sort()
    >
    > However, I don't want to sort them the default way. These are birthdays,
    > so only the month and day do matter, not the year. E.g.:
    >
    > 2003-01-01 should be smaller than 1984-05-01

    [snip]
    > Any opinions?


    I find that using the "key" argument to sort is much nicer than "cmp"
    for these tasks.

    In [5]:L = [datetime.date(2005,5,2), datetime.date(1984,12,15),
    datetime.date(1954,1,1)]

    In [7]:L.sort(key=lambda x: (x.month, x.day))

    In [8]:L
    Out[8]:
    [datetime.date(1954, 1, 1),
    datetime.date(2005, 5, 2),
    datetime.date(1984, 12, 15)]

    --
    Robert Kern


    "In the fields of hell where the grass grows high
    Are the graves of dreams allowed to die."
    -- Richard Harter
    Robert Kern, May 14, 2005
    #3
  4. Volker Grabsch

    Peter Hansen Guest

    vincent wehren wrote:
    > "Volker Grabsch" <> schrieb im
    > Newsbeitrag news:d64io9$s6o$05$-online.com...
    > | However, I don't want to sort them the default way. These are birthdays,
    > | so only the month and day do matter, not the year. E.g.:
    > |

    ....
    > If you don't care about the year, why not just "normalize" the year
    > to all be the same using the replace method of the date instance?
    > Something like:
    >
    > d1 = datetime.date(2004, 12, 2)
    > d2 = datetime.date(2001, 12, 3)
    > d3 = datetime.date(2002, 12, 6)
    > d4 = datetime.date(1977, 12, 7)
    > dates =[d1,d2,d3,d4]
    > datesNorm = [obj.replace(year=1900) for obj in (dates)]
    > datesNorm.sort()
    > print datesNorm # etcetera


    Or just use the .timetuple() method on datetime objects and sort on the
    8th element of the 9-element tuple, which is the day-of-the-year.

    -Peter
    Peter Hansen, May 14, 2005
    #4
  5. Robert Kern wrote:
    > I find that using the "key" argument to sort is much nicer than "cmp"
    > for these tasks.
    >
    > In [5]:L = [datetime.date(2005,5,2), datetime.date(1984,12,15),
    > datetime.date(1954,1,1)]
    >
    > In [7]:L.sort(key=lambda x: (x.month, x.day))
    >
    > In [8]:L
    > Out[8]:
    > [datetime.date(1954, 1, 1),
    > datetime.date(2005, 5, 2),
    > datetime.date(1984, 12, 15)]


    Yes, definitely. Also worth noting in Robert Kern's solution is that
    instead of writing:

    def mycmp(d1, d2):
    return cmp(d1.month,d2.month) or cmp(d1.day,d2.day)

    you can write:

    def mycmp(d1, d2):
    return cmp((d1.month, d1.day), (d2.month, d2.day))

    or if you're using the key= argument (like you probably should):

    def mykey(d):
    return (d.month, d.day)

    The point here is that rather than chaining cmp() calls with ors, you
    should just use a tuple -- the standard comparison order in tuples is
    exactly what you're looking for.

    STeVe
    Steven Bethard, May 14, 2005
    #5
  6. Steven Bethard wrote:
    > Robert Kern wrote:
    > def mykey(d):
    > return (d.month, d.day)
    >
    > The point here is that rather than chaining cmp() calls with ors, you
    > should just use a tuple -- the standard comparison order in tuples is
    > exactly what you're looking for.


    That's an excellent idea! Thanks a lot. I really didn't think of the "key="
    argument.


    Greets,

    --
    Volker Grabsch
    ---<<(())>>---
    \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}
    Volker Grabsch, May 14, 2005
    #6
  7. vincent wehren wrote:
    >
    > If you don't care about the year, why not just "normalize" the year
    > to all be the same using the replace method of the date instance?


    That's a very bad idea. In my example, this would work, but in "reality"
    I don't sort datetime objects, of course! (Is there any real application
    where you want to do that?)

    Instead, I'm sorting "Person" objects using a "birthday" attribute.
    Since I use these Person objects also in other places, they should never
    be modified without just to be sorted. In general, the number side effects
    should always be minimized.

    > datesNorm = [obj.replace(year=1900) for obj in (dates)]
    > datesNorm.sort()


    This code would go bad especially in my situation, where my "Person"
    objects are SQLObjects, thus the "normalisation" would be written into
    the database. Okay, one could use transactions and rollback", but I
    think, my point is clear now.

    Nevertheless, I think you idea is very interesting. Is there any "real"
    application where normalizing just for sorting would be reasonable?


    Greets,

    --
    Volker Grabsch
    ---<<(())>>---
    \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}
    Volker Grabsch, May 14, 2005
    #7
  8. Peter Hansen wrote:
    >
    > Or just use the .timetuple() method on datetime objects and sort on the
    > 8th element of the 9-element tuple, which is the day-of-the-year.


    An interesting idea. But wouldn't sorting by (dd.month,dd.day) be more
    effective?

    In other words: Does .timetuple() create a tuple, or does it just return
    a tuple which is present anyway?


    Greets,

    --
    Volker Grabsch
    ---<<(())>>---
    \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}
    Volker Grabsch, May 14, 2005
    #8
  9. "Volker Grabsch" <> schrieb im
    Newsbeitrag news:d65h73$fgr$02$-online.com...
    | vincent wehren wrote:
    | >
    | > If you don't care about the year, why not just "normalize" the year
    | > to all be the same using the replace method of the date instance?
    |
    | That's a very bad idea. In my example, this would work, but in "reality"
    | I don't sort datetime objects, of course! (Is there any real application
    | where you want to do that?)
    |
    | Instead, I'm sorting "Person" objects using a "birthday" attribute.
    | Since I use these Person objects also in other places, they should never
    | be modified without just to be sorted. In general, the number side effects
    | should always be minimized.

    Can you explain where you see a modification to the orginal object
    happening?
    (or in any of the other solutions proposed for that matter...)

    Not here I hope:

    |
    | > datesNorm = [obj.replace(year=1900) for obj in (dates)]
    | > datesNorm.sort()

    If you print datesNorm, you'll see:

    [datetime.date(1900, 12, 2), datetime.date(1900, 12, 3), datetime.date(1900,
    12, 6), datetime.date(1900, 12, 7)]

    However, dates is still the same:

    [datetime.date(2004, 12, 2), datetime.date(2001, 12, 3), datetime.date(2002,
    12, 6), datetime.date(1977, 12, 7)]

    |
    | This code would go bad especially in my situation, where my "Person"
    | objects are SQLObjects, thus the "normalisation" would be written into
    | the database. Okay, one could use transactions and rollback", but I
    | think, my point is clear now.

    Since you wouldn't need to change the attribute object to
    perform a sort (on the instances) using it (or portions of it) as key, it's
    not.
    Or is there something fundamental I am missing about your particular use
    case?

    | Nevertheless, I think you idea is very interesting. Is there any "real"
    | application where normalizing just for sorting would be reasonable?

    How about a case-insensitive sort of strings? (uppering being the
    normalization step)
    Or getting rid of accented / special characters before sorting.
    These sound like fairly straight-forward use cases to me ;)

    Anyway, I was doing some SQL this morning where getting rid of portions of
    'datetime' fields in
    a WHERE clause is sometimes just the ticket - hence the connection.


    --

    Vincent


    |
    | --
    | Volker Grabsch
    | ---<<(())>>---
    | \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    | [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}
    vincent wehren, May 14, 2005
    #9
  10. vincent wehren wrote:
    >
    >| > If you don't care about the year, why not just "normalize" the year
    >| > to all be the same using the replace method of the date instance?
    >|
    >| That's a very bad idea. In my example, this would work, but in "reality"
    >| I don't sort datetime objects, of course! (Is there any real application
    >| where you want to do that?)
    >|
    >| Instead, I'm sorting "Person" objects using a "birthday" attribute.
    >| Since I use these Person objects also in other places, they should never
    >| be modified without just to be sorted. In general, the number side effects
    >| should always be minimized.
    >
    > Can you explain where you see a modification to the orginal object
    > happening?
    > (or in any of the other solutions proposed for that matter...)


    Sorry, my fault. I didn't read carefully enough. X-)

    > Not here I hope:
    >
    >| > datesNorm = [obj.replace(year=1900) for obj in (dates)]
    >| > datesNorm.sort()


    While you don't change the original objects, there's still a problem since
    you're sorting the normalized values. However, I want to sort the original
    list (i.e. the list of "Person" objects).

    But that's not a real problem if one normalizes in a key function:

    def key_birthday(d):
    return d.replace(year=1900)
    ...
    dates.sort(key=key_birthday)

    ...as suggested in other followups of my posting.

    >| Nevertheless, I think you idea is very interesting. Is there any "real"
    >| application where normalizing just for sorting would be reasonable?
    >
    > How about a case-insensitive sort of strings? (uppering being the
    > normalization step)
    > Or getting rid of accented / special characters before sorting.
    > These sound like fairly straight-forward use cases to me ;)


    For your solution these are good examples. But my question was, whether
    normalizing first, and just sorting the normalized values (not the original
    values) is reasonable.

    I.e., when I sort some strings case-insensitive, I don't want my resulting
    (sorted) list to contain only lowercase string. But that's what I would
    get if I used the algorithm you described above.


    Greets,

    --
    Volker Grabsch
    ---<<(())>>---
    \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}
    Volker Grabsch, May 14, 2005
    #10
  11. Volker Grabsch

    Peter Hansen Guest

    Volker Grabsch wrote:
    > Peter Hansen wrote:
    >
    >>Or just use the .timetuple() method on datetime objects and sort on the
    >>8th element of the 9-element tuple, which is the day-of-the-year.

    >
    > An interesting idea. But wouldn't sorting by (dd.month,dd.day) be more
    > effective?


    Depending on the meaning of "effective", I suppose so.

    > In other words: Does .timetuple() create a tuple, or does it just return
    > a tuple which is present anyway?


    I don't know what it does. Normally one shouldn't care about that kind
    of thing. In this case, you seem to be assuming that .month and .day
    are values that are "present anyway", but we don't really know that
    either, since they could just as well be properties. If someone who
    knows or someone who checks the source (for a given implementation of
    course... it could change) cares to share, we'd know more. What good
    that would do us is still a mystery to me. ;-)

    -Peter
    Peter Hansen, May 15, 2005
    #11
  12. Volker Grabsch

    Glauco Silva Guest

    You can try this :
    >>> l = [(2001, 5, 2),(2111,3,3),(1984, 3, 1), (2001, 1, 1)]
    >>> l.sort(lambda x = l[0],y = l[1] : cmp((x[1],x[2]),(y[1],y[2])))


    ----- Original Message -----
    From: "Volker Grabsch" <>
    To: <>
    Sent: Saturday, May 14, 2005 7:09 AM
    Subject: Multiple "cmp"s chained one after another


    Hello!

    Ich just found a very nice 'pythonic' solution for an often appearing
    problem. I didn't find it documented anywhere, so I'm posting it here.
    If this isn't new in any way, I'd really like to get to know it.

    Example problem:
    I have some "datetime" objects and want to sort them, as here:

    birthdays = [d1,d2,d3,d4]
    birthdays.sort()

    However, I don't want to sort them the default way. These are birthdays,
    so only the month and day do matter, not the year. E.g.:

    2003-01-01 should be smaller than 1984-05-01

    So I have to write the comparison on my own, e.g.

    def cmp_birthdays(d1,d2):
    if d1.month > d2.month: return 1
    if d1.month < d2.month: return -1
    if d1.day > d2.day: return 1
    if d1.day < d2.day: return -1
    return 0

    ....
    birthdays.sort(cmp_birthdays)

    This implementation of cmp_birthdays is very ugly. Image you want to
    chain more than 2 values in that "cmp_birthdays". I also want to use the
    builtin "cmp" function, not ">" and "<".

    After thinking some minutes about it, I found a very nice solution:
    I have some "cmp"s one aftter another. If one if them return 1 oder -1,
    it sould be returned. If it returns 0, the next "cmp" is used. In other
    words: I have a sequence of numbers, and want to get the first one that
    is not 0. (or return 0, if all numbers were 0)

    But this is exactly what the "or" operator does, due to short-circuit
    evaluation. In this example, that means:

    def cmp_bithdays(d1,d2):
    return cmp(d1.month,d2.month) or cmp(d1.day,d2.day)

    The generic pattern is:

    return cmp(...) or cmp (...) or cmp(...) or ...

    I'm not sure whether this pattern is already a "common recipe", but
    I found it to be a very nice idea. :)

    Any opinions?


    Greets,

    Volker

    --
    Volker Grabsch
    ---<<(())>>---
    \frac{\left|\vartheta_0\times\{\ell,\kappa\in\Re\}\right|}{\sqrt
    [G]{-\Gamma(\alpha)\cdot\mathcal{B}^{\left[\oint\!c_\hbar\right]}}}




    --
    No virus found in this outgoing message.
    Checked by AVG Anti-Virus.
    Version: 7.0.308 / Virus Database: 266.11.10 - Release Date: 13/5/2005
    Glauco Silva, May 16, 2005
    #12
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Mike H
    Replies:
    1
    Views:
    686
    Chris Smith
    Feb 27, 2004
  2. Andrea Sansottera

    no cmp field defined in cmp ejb

    Andrea Sansottera, Jul 16, 2004, in forum: Java
    Replies:
    0
    Views:
    376
    Andrea Sansottera
    Jul 16, 2004
  3. loveNUNO
    Replies:
    2
    Views:
    897
    loveNUNO
    Nov 20, 2003
  4. Sathyaish

    Chained Comparisons

    Sathyaish, Mar 20, 2006, in forum: Python
    Replies:
    15
    Views:
    511
    Dennis Lee Bieber
    Mar 20, 2006
  5. Miguel Durán

    problems with multiple (many) chained drop-downs

    Miguel Durán, Jun 22, 2004, in forum: Javascript
    Replies:
    0
    Views:
    72
    Miguel Durán
    Jun 22, 2004
Loading...

Share This Page