Code snippet: Natural string sorting

Discussion in 'Python' started by C. Barnes, Jun 9, 2004.

  1. C. Barnes

    C. Barnes Guest

    Summary:

    Sorts strings in a way that seems natural to humans.
    If the
    strings contain integers, then the integers are
    ordered
    numerically. For example, sorts ['Team 11', 'Team 3',
    'Team 1']
    into the order ['Team 1', 'Team 3', 'Team 11'].

    Code:

    #
    ---------------------------------------------------------
    # natsort.py: Natural string sorting.
    #
    ---------------------------------------------------------

    # By Seo Sanghyeon. Some changes by Connelly Barnes.

    def try_int(s):
    "Convert to integer if possible."
    try: return int(s)
    except: return s

    def natsort_key(s):
    "Used internally to get a tuple by which s is
    sorted."
    import re
    return map(try_int, re.findall(r'(\d+|\D+)', s))

    def natcmp(a, b):
    "Natural string comparison, case sensitive."
    return cmp(natsort_key(a), natsort_key(b))

    def natcasecmp(a, b):
    "Natural string comparison, ignores case."
    return natcmp(a.lower(), b.lower())

    def natsort(seq, cmp=natcmp):
    "In-place natural string sort."
    seq.sort(cmp)

    def natsorted(seq, cmp=natcmp):
    "Returns a copy of seq, sorted by natural string
    sort."
    import copy
    temp = copy.copy(seq)
    natsort(temp, cmp)
    return temp

    Examples:

    You can use this code to sort tarball filenames:

    >>> natsorted(['ver-1.3.12', 'ver-1.3.3', 'ver-1.2.5',

    'ver-1.2.15', 'ver-1.2.3', 'ver-1.2.1'])
    ['ver-1.2.1', 'ver-1.2.3', 'ver-1.2.5', 'ver-1.2.15',
    'ver-1.3.3', 'ver-1.3.12']

    Chemical elements:

    >>> natsorted(['C1H2', 'C1H4', 'C2H2', 'C2H6', 'C2N',

    'C3H6'])
    ['C1H2', 'C1H4', 'C2H2', 'C2H6', 'C2N', 'C3H6']

    Teams:

    >>> natsorted(['Team 101', 'Team 58', 'Team 30', 'Team

    1'])
    ['Team 1', 'Team 30', 'Team 58', 'Team 101']

    Pass natcasecmp as a second argument for
    case-insensitive sorting:

    >>> natsorted(['a5', 'A7', 'a15', 'a9', 'A8'],

    natcasecmp)
    ['a5', 'A7', 'A8', 'a9', 'a15']

    Enjoy!


    Message and source code are in the public domain.




    __________________________________
    Do you Yahoo!?
    Friends. Fun. Try the all-new Yahoo! Messenger.
    http://messenger.yahoo.com/
    C. Barnes, Jun 9, 2004
    #1
    1. Advertising

  2. C. Barnes wrote:
    > Summary:
    >
    > Sorts strings in a way that seems natural to humans.
    > If the
    > strings contain integers, then the integers are
    > ordered
    > numerically. For example, sorts ['Team 11', 'Team 3',
    > 'Team 1']
    > into the order ['Team 1', 'Team 3', 'Team 11'].

    [snip]

    Nice. Can be written (and, hopefully, made a bit more efficient) using
    the Decorate-Sort-Undecorate pattern too:

    import re

    _expr = re.compile(r'(\d+|\D+)')

    def try_int(s):
    if s.isdigit():
    return int(s)
    else:
    return s

    def nat_key(s):
    return [try_int(e) for e in _expr.findall(s)]

    def nat_nocase_key(s):
    return nat_key(s.lower())

    def decsorted(seq, key):
    decorated = [(key(item), item) for item in seq]
    decorated.sort()
    return [item for k, item in decorated]

    Then you could use decsorted(seq, nat_key) or decsorted(seq,
    nat_nocase_key).

    If someone is found of one-liners, he can just write nat_key as

    def nat_key(s):
    return map(lambda s: s.isdigit() and int(s) or s, _expr.findall(s))

    Note that 2.4's sorted() and list.sort() will have built-in support for
    decorate-sort-undecorate, so you'll just have to throw nat_key or
    nat_nocase_key to them, with no need for decsorted.

    --
    Ciao,
    Matteo
    Matteo Dell'Amico, Jun 9, 2004
    #2
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. sks

    Natural string sorting

    sks, Jun 18, 2004, in forum: Java
    Replies:
    6
    Views:
    501
    Roedy Green
    Jun 18, 2004
  2. Paul
    Replies:
    1
    Views:
    13,188
    Rogan Dawes
    Sep 14, 2004
  3. Derek Basch
    Replies:
    3
    Views:
    313
    Derek Basch
    Jun 17, 2004
  4. Terry Reedy
    Replies:
    3
    Views:
    358
    Andrea Griffini
    Jun 16, 2004
  5. Connelly Barnes

    Natural string sorting

    Connelly Barnes, Jun 18, 2004, in forum: Python
    Replies:
    1
    Views:
    326
    Peter Hansen
    Jun 19, 2004
Loading...

Share This Page