outline-style sorting algorithm

Discussion in 'Python' started by jwsacksteder@ramprecision.com, Apr 19, 2004.

  1. Guest

    I have a need to sort a list of elements that represent sections of a
    document in dot-separated notation. The built in sort does the wrong thing.
    This seems a rather complex problem and I was hoping someone smarter than me
    had already worked out the best way to approach this. For example, consider
    the following list-

    >>> foo

    ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
    '1.30']
    >>> foo.sort()
    >>> foo

    ['1.0', '1.0.1', '1.1.1', '1.10', '1.11', '1.2', '1.20', '1.20.1', '1.30',
    '1.9']

    Obviously 1.20.1 should be after 1.9 if we look at this as dot-delimited
    integers, not as decimal numbers.

    Does anyone have pointers to existing code?
     
    , Apr 19, 2004
    #1
    1. Advertising

  2. Max M Guest

    wrote:
    > I have a need to sort a list of elements that represent sections of a
    > document in dot-separated notation. The built in sort does the wrong

    thing.

    Not really. You are giving it a list of strings, and it sort those
    alphabetically. That seems like the right thing to me ;-)

    > This seems a rather complex problem and I was hoping someone smarter

    than me
    > had already worked out the best way to approach this. For example,

    consider
    > the following list-


    >>>>foo

    >
    > ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
    > '1.30']


    You need to convert them to another datatype first. Your best bet here
    would be a list or a tuple, as they can map directly to your data.

    '1.0.1'.split('.') == [1,0,1]

    But list are a bit easier here.

    foo_as_tuples = [f.split('.') for f in foo]
    foo_as_tuples.sort()

    Then you must convert it back to strings again.

    foo = ['.'.join(f) for f in foo_as_tuples]

    There is a standard way of sorting quickly in python, called
    decorate-sort-undecorate. It is allmost the same example as before:


    decorated = [(itm.split('.'),itm) for itm in foo]
    decorated.sort()
    foo = [d[-1] for d in decorated]

    regards Max M
     
    Max M, Apr 19, 2004
    #2
    1. Advertising

  3. Max M Guest

    wrote:
    > I have a need to sort a list of elements that represent sections of a
    > document in dot-separated notation. The built in sort does the wrong

    thing.

    Not really. You are giving it a list of strings, and it sort those
    alphabetically. That seems like the right thing to me ;-)

    > This seems a rather complex problem and I was hoping someone smarter

    than me
    > had already worked out the best way to approach this. For example,

    consider
    > the following list-


    >>>>foo

    >
    > ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
    > '1.30']


    You need to convert them to another datatype first. Your best bet here
    would be a list or a tuple, as they can map directly to your data.

    '1.0.1'.split('.') == [1,0,1]

    But list are a bit easier here.

    foo_as_tuples = [f.split('.') for f in foo]
    foo_as_tuples.sort()

    Then you must convert it back to strings again.

    foo = ['.'.join(f) for f in foo_as_tuples]

    There is a standard way of sorting quickly in python, called
    decorate-sort-undecorate. It is allmost the same example as before:


    decorated = [(itm.split('.'),itm) for itm in foo]
    decorated.sort()
    foo = [d[-1] for d in decorated]

    regards Max M
     
    Max M, Apr 19, 2004
    #3
  4. Eric Brunel Guest

    Max M wrote:
    > wrote:
    > > I have a need to sort a list of elements that represent sections of a
    > > document in dot-separated notation. The built in sort does the wrong

    > thing.
    >
    > Not really. You are giving it a list of strings, and it sort those
    > alphabetically. That seems like the right thing to me ;-)
    >
    > > This seems a rather complex problem and I was hoping someone smarter

    > than me
    > > had already worked out the best way to approach this. For example,

    > consider
    > > the following list-

    >
    > >>>>foo

    > >
    > > ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20',

    > '1.20.1',
    > > '1.30']

    >
    > You need to convert them to another datatype first. Your best bet here
    > would be a list or a tuple, as they can map directly to your data.
    >
    > '1.0.1'.split('.') == [1,0,1]

    [snip]

    Nope:
    '1.0.1.split('.') == ['1', '0', '1']

    So the int's are still represented as strings and it does not solve the OP's
    problem:

    >>> foo = ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20',

    '1.20.1', '1.30']
    >>> bar = [x.split('.') for x in foo]
    >>> bar

    [['1', '0'], ['1', '0', '1'], ['1', '1', '1'], ['1', '2'], ['1', '9'], ['1',
    '10'], ['1', '11'], ['1', '20'], ['1', '20', '1'], ['1', '30']]
    >>> bar.sort()
    >>> bar

    [['1', '0'], ['1', '0', '1'], ['1', '1', '1'], ['1', '10'], ['1', '11'], ['1',
    '2'], ['1', '20'], ['1', '20', '1'], ['1', '30'], ['1', '9']]

    And the 1.20.something are still before 1.9

    What you need to do is explicitely convert to integers the strings in the list
    resulting from the split. Here is the shortest way to do it

    >>> bar = [map(int, x.split('.')) for x in foo]
    >>> bar

    [[1, 0], [1, 0, 1], [1, 1, 1], [1, 2], [1, 9], [1, 10], [1, 11], [1, 20], [1,
    20, 1], [1, 30]]

    Now you can sort:

    >>> bar.sort()
    >>> bar

    [[1, 0], [1, 0, 1], [1, 1, 1], [1, 2], [1, 9], [1, 10], [1, 11], [1, 20], [1,
    20, 1], [1, 30]]

    Hooray! We've got the result we wanted. Now, convert integers back to string and
    re-join everything:

    >>> foo = ['.'.join(map(str, x)) for x in bar]
    >>> foo

    ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1', '1.30']

    That's what we expected...

    HTH
    --
    - Eric Brunel <eric (underscore) brunel (at) despammed (dot) com> -
    PragmaDev : Real Time Software Development Tools - http://www.pragmadev.com
     
    Eric Brunel, Apr 19, 2004
    #4
  5. wes weston Guest

    wrote:
    > I have a need to sort a list of elements that represent sections of a
    > document in dot-separated notation. The built in sort does the wrong thing.
    > This seems a rather complex problem and I was hoping someone smarter than me
    > had already worked out the best way to approach this. For example, consider
    > the following list-
    >
    >
    >>>>foo

    >
    > ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
    > '1.30']
    >
    >>>>foo.sort()
    >>>>foo

    >
    > ['1.0', '1.0.1', '1.1.1', '1.10', '1.11', '1.2', '1.20', '1.20.1', '1.30',
    > '1.9']
    >
    > Obviously 1.20.1 should be after 1.9 if we look at this as dot-delimited
    > integers, not as decimal numbers.
    >
    > Does anyone have pointers to existing code?
    >
    >
    >
    >
    >
    >
    >
    >




    list = ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20',
    '1.20.1','1.30']

    def parse(str):
    list = str.split('.')
    if len(list) > 0:
    x1 = int(list[0])
    else:
    x1 = -1
    if len(list) > 1:
    x2 = int(list[1])
    else:
    x2 = -1
    if len(list) > 2:
    x3 = int(list[2])
    else:
    x3 = -1
    return x1,x2,x3

    def cmp(x1,x2):
    w1 = parse(x1)
    w2 = parse(x2)
    if w1[0] < w2[0]:
    return -1
    if w1[0] > w2[0]:
    return 1

    if w1[1] < w2[1]:
    return -1
    if w1[1] > w2[1]:
    return 1

    if w1[2] < w2[2]:
    return -1
    if w1[2] > w2[2]:
    return 1

    return 0

    #---------------------------
    if __name__ == "__main__":
    list.sort(cmp)
    for x in list:
    print x
     
    wes weston, Apr 19, 2004
    #5
  6. * (2004-04-19 15:08 +0100)
    > I have a need to sort a list of elements that represent sections of a
    > document in dot-separated notation. The built in sort does the wrong thing.
    > This seems a rather complex problem and I was hoping someone smarter than me
    > had already worked out the best way to approach this. For example, consider
    > the following list-
    >
    >>>> foo

    > ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
    > '1.30']
    >>>> foo.sort()
    >>>> foo

    > ['1.0', '1.0.1', '1.1.1', '1.10', '1.11', '1.2', '1.20', '1.20.1', '1.30',
    > '1.9']
    >
    > Obviously 1.20.1 should be after 1.9 if we look at this as dot-delimited
    > integers, not as decimal numbers.


    You need some general approach to avoid the DSU thing:

    def funcsort(seq, func):
    """ sort seq by func(item) """
    seq = seq[:]
    seq.sort(lambda x, y: cmp(func(x), func(y)))
    return seq

    funcsort(foo, lambda x: map(int, x.split('.')))


    Thorsten
     
    Thorsten Kampe, Apr 29, 2004
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. msnews.microsoft.com

    Document Outline in VS.NET! WOW!

    msnews.microsoft.com, Jul 25, 2004, in forum: ASP .Net
    Replies:
    3
    Views:
    1,729
    msnews.microsoft.com
    Jul 26, 2004
  2. King of Red Lions

    Text outline

    King of Red Lions, Apr 8, 2004, in forum: HTML
    Replies:
    9
    Views:
    20,563
    mikewandling
    Oct 25, 2008
  3. Terry Reedy

    Re: outline-style sorting algorithm

    Terry Reedy, Apr 19, 2004, in forum: Python
    Replies:
    5
    Views:
    324
    Fredrik Lundh
    Apr 20, 2004
  4. Delaney, Timothy C (Timothy)

    RE: outline-style sorting algorithm

    Delaney, Timothy C (Timothy), Apr 29, 2004, in forum: Python
    Replies:
    5
    Views:
    448
    Thorsten Kampe
    Apr 29, 2004
  5. Felix Collins

    HELP:sorting list of outline numbers

    Felix Collins, Aug 3, 2005, in forum: Python
    Replies:
    5
    Views:
    304
    Christopher Subich
    Aug 3, 2005
Loading...

Share This Page