outline-style sorting algorithm

jwsacksteder · Apr 19, 2004

I have a need to sort a list of elements that represent sections of a
document in dot-separated notation. The built in sort does the wrong thing.
This seems a rather complex problem and I was hoping someone smarter than me
had already worked out the best way to approach this. For example, consider
the following list-
['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
'1.30']['1.0', '1.0.1', '1.1.1', '1.10', '1.11', '1.2', '1.20', '1.20.1', '1.30',
'1.9']

Obviously 1.20.1 should be after 1.9 if we look at this as dot-delimited
integers, not as decimal numbers.

Does anyone have pointers to existing code?

Max M · Apr 19, 2004

> I have a need to sort a list of elements that represent sections of a
> document in dot-separated notation. The built in sort does the wrong

thing.

Not really. You are giving it a list of strings, and it sort those
alphabetically. That seems like the right thing to me ;-)

> This seems a rather complex problem and I was hoping someone smarter than me
> had already worked out the best way to approach this. For example, consider
> the following list- >
> ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
> '1.30']

You need to convert them to another datatype first. Your best bet here
would be a list or a tuple, as they can map directly to your data.

'1.0.1'.split('.') == [1,0,1]

But list are a bit easier here.

foo_as_tuples = [f.split('.') for f in foo]
foo_as_tuples.sort()

Then you must convert it back to strings again.

foo = ['.'.join(f) for f in foo_as_tuples]

There is a standard way of sorting quickly in python, called
decorate-sort-undecorate. It is allmost the same example as before:

decorated = [(itm.split('.'),itm) for itm in foo]
decorated.sort()
foo = [d[-1] for d in decorated]

regards Max M

Max M · Apr 19, 2004

> I have a need to sort a list of elements that represent sections of a
> document in dot-separated notation. The built in sort does the wrong

thing.

Not really. You are giving it a list of strings, and it sort those
alphabetically. That seems like the right thing to me ;-)

> This seems a rather complex problem and I was hoping someone smarter than me
> had already worked out the best way to approach this. For example, consider
> the following list- >
> ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
> '1.30']

You need to convert them to another datatype first. Your best bet here
would be a list or a tuple, as they can map directly to your data.

'1.0.1'.split('.') == [1,0,1]

But list are a bit easier here.

foo_as_tuples = [f.split('.') for f in foo]
foo_as_tuples.sort()

Then you must convert it back to strings again.

foo = ['.'.join(f) for f in foo_as_tuples]

There is a standard way of sorting quickly in python, called
decorate-sort-undecorate. It is allmost the same example as before:

decorated = [(itm.split('.'),itm) for itm in foo]
decorated.sort()
foo = [d[-1] for d in decorated]

regards Max M

Eric Brunel · Apr 19, 2004

Max said:
I have a need to sort a list of elements that represent sections of a
document in dot-separated notation. The built in sort does the wrong

Click to expand...

thing.

Not really. You are giving it a list of strings, and it sort those
alphabetically. That seems like the right thing to me ;-)

This seems a rather complex problem and I was hoping someone smarter than me
had already worked out the best way to approach this. For example, consider
the following list-

foo

Click to expand...

['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
'1.30']

Click to expand...

You need to convert them to another datatype first. Your best bet here
would be a list or a tuple, as they can map directly to your data.

'1.0.1'.split('.') == [1,0,1]

[snip]

Nope:
'1.0.1.split('.') == ['1', '0', '1']

So the int's are still represented as strings and it does not solve the OP's
problem:

>>> foo = ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1', '1.30']
>>> bar = [x.split('.') for x in foo]
>>> bar

Click to expand...

Click to expand...

[['1', '0'], ['1', '0', '1'], ['1', '1', '1'], ['1', '2'], ['1', '9'], ['1',
'10'], ['1', '11'], ['1', '20'], ['1', '20', '1'], ['1', '30']][['1', '0'], ['1', '0', '1'], ['1', '1', '1'], ['1', '10'], ['1', '11'], ['1',
'2'], ['1', '20'], ['1', '20', '1'], ['1', '30'], ['1', '9']]

And the 1.20.something are still before 1.9

What you need to do is explicitely convert to integers the strings in the list
resulting from the split. Here is the shortest way to do it

>>> bar = [map(int, x.split('.')) for x in foo]
>>> bar

Click to expand...

Click to expand...

[[1, 0], [1, 0, 1], [1, 1, 1], [1, 2], [1, 9], [1, 10], [1, 11], [1, 20], [1,
20, 1], [1, 30]]

Now you can sort:
[[1, 0], [1, 0, 1], [1, 1, 1], [1, 2], [1, 9], [1, 10], [1, 11], [1, 20], [1,
20, 1], [1, 30]]

Hooray! We've got the result we wanted. Now, convert integers back to string and
re-join everything:

>>> foo = ['.'.join(map(str, x)) for x in bar]
>>> foo

Click to expand...

Click to expand...

['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1', '1.30']

That's what we expected...

HTH

wes weston · Apr 19, 2004

I have a need to sort a list of elements that represent sections of a
document in dot-separated notation. The built in sort does the wrong thing.
This seems a rather complex problem and I was hoping someone smarter than me
had already worked out the best way to approach this. For example, consider
the following list-

['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
'1.30']

['1.0', '1.0.1', '1.1.1', '1.10', '1.11', '1.2', '1.20', '1.20.1', '1.30',
'1.9']

Obviously 1.20.1 should be after 1.9 if we look at this as dot-delimited
integers, not as decimal numbers.

Does anyone have pointers to existing code?

list = ['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20',
'1.20.1','1.30']

def parse(str):
list = str.split('.')
if len(list) > 0:
x1 = int(list[0])
else:
x1 = -1
if len(list) > 1:
x2 = int(list[1])
else:
x2 = -1
if len(list) > 2:
x3 = int(list[2])
else:
x3 = -1
return x1,x2,x3

def cmp(x1,x2):
w1 = parse(x1)
w2 = parse(x2)
if w1[0] < w2[0]:
return -1
if w1[0] > w2[0]:
return 1

if w1[1] < w2[1]:
return -1
if w1[1] > w2[1]:
return 1

if w1[2] < w2[2]:
return -1
if w1[2] > w2[2]:
return 1

return 0

#---------------------------
if __name__ == "__main__":
list.sort(cmp)
for x in list:
print x

Thorsten Kampe · Apr 29, 2004

* (e-mail address removed) (2004-04-19 15:08 +0100)

I have a need to sort a list of elements that represent sections of a
document in dot-separated notation. The built in sort does the wrong thing.
This seems a rather complex problem and I was hoping someone smarter than me
had already worked out the best way to approach this. For example, consider
the following list-
['1.0', '1.0.1', '1.1.1', '1.2', '1.9', '1.10', '1.11', '1.20', '1.20.1',
'1.30']['1.0', '1.0.1', '1.1.1', '1.10', '1.11', '1.2', '1.20', '1.20.1', '1.30',
'1.9']

Obviously 1.20.1 should be after 1.9 if we look at this as dot-delimited
integers, not as decimal numbers.

You need some general approach to avoid the DSU thing:

def funcsort(seq, func):
""" sort seq by func(item) """
seq = seq[:]
seq.sort(lambda x, y: cmp(func(x), func(y)))
return seq

funcsort(foo, lambda x: map(int, x.split('.')))

Thorsten

comp.lang.c Answers to Frequently Asked Questions (FAQ List)	15	Apr 1, 2006
The devolution of English language and slothful c.l.p behaviors exposed!	50	Jan 24, 2012
comp.lang.c FAQ list Table of Contents	0	Jan 12, 2008
Announcement: Super 5.00 - a suite of J2EE tools.	0	Jan 7, 2004
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Jan 12, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Mar 1, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Jan 1, 2008
comp.lang.c Answers (Abridged) to Frequently Asked Questions (FAQ)	0	Dec 15, 2007

outline-style sorting algorithm

jwsacksteder

Max M

Max M

Eric Brunel

wes weston

Thorsten Kampe

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads