Is there an easy way to sort a list by two criteria?

neocortex

Hello!
I am a newbie in Python. Recently, I got stuck on the problem of
sorting by two criteria. In brief, I have a two-dimensional list (for
a table or a matrix). Now, I need to sort by two columns, but I cannot
figure out how to do that. I read somewhere that it is possible to do:

>>> table.sort().sort()

but it does not work.
Can anyone help me with this?
PS: I am using Python under Ubuntu 6.06.

Best,
PM
 
Jeff Schwab


You can specify an arbitrary comparison function with the cmp argument
to sort. In other words, use table.sort(cmp=f), where f is defined to
compare table entries (rows?) by whichever criteria are required.
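
A minimal sketch of such a comparison function, in case it helps. The sample rows and the name compare_rows are made up for illustration, and the choice of columns (column 2 first, column 1 on ties) is an assumption. The cmp argument only exists in Python 2; wrapping the function with functools.cmp_to_key gives the same behaviour on Python 2.7 and Python 3:

```python
from functools import cmp_to_key

def compare_rows(a, b):
    """Return a negative, zero, or positive number, like the old cmp().

    Rows are compared by column 2 first, then by column 1 on ties.
    """
    key_a = (a[2], a[1])
    key_b = (b[2], b[1])
    return (key_a > key_b) - (key_a < key_b)

# Hypothetical sample data: each row is a 3-tuple.
table = [(1, 2, 4), (3, 2, 1), (2, 2, 2), (2, 1, 4), (2, 4, 1)]

# Python 2 spelling: table.sort(cmp=compare_rows)
table.sort(key=cmp_to_key(compare_rows))
print(table)
```

The function only has to say which of two rows comes first, so any combination of criteria can go inside it.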
 
Steve Holden

I think your best bet would be to google for "decorate sort undecorate"
or "Schwartzian transform" unless the columns happen to be in the
right order in the rows you want to sort.
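
The "right order" case needs no transform at all, because Python compares tuples and lists element by element from left to right. A small sketch with made-up rows, assuming the first column is the primary criterion and the second one breaks ties:

```python
# Hypothetical rows, already laid out as (primary, secondary, payload).
rows = [(2, "b", 10), (1, "z", 30), (2, "a", 20), (1, "c", 40)]

# A plain sort compares column 0 first and falls back to column 1
# (then column 2) only when the earlier columns are equal.
rows.sort()
print(rows)
```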

regards
Steve
 
Steven D'Aprano

neocortex said:
I have a two-dimensional list (for a table or a matrix). Now, I need
to sort by two columns, but I cannot figure out how to do that.

Can you give a (small, simple) example of your data, and what you expect
if you sort it successfully?

neocortex said:
I read somewhere that it is possible to do:

>>> table.sort().sort()

but it does not work.

No it doesn't, because the sort() method returns None, and None doesn't
have a sort() method of its own.
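
A quick way to see this for yourself. The three-row table here is made up purely for the demonstration:

```python
table = [[3, "c"], [1, "a"], [2, "b"]]

result = table.sort()    # sorts table in place and returns None
print(result)
print(table)

try:
    table.sort().sort()  # the second .sort() is looked up on None
    chained_ok = True
except AttributeError:
    chained_ok = False
print(chained_ok)

# sorted() is the variant that returns a new list, so it can be chained.
descending = sorted(table, reverse=True)
print(descending)
```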

table.sort() will sort table in place, so you need something like this:

>>> table.sort()
>>> table.sort()

except naturally that just does the exact same sort twice in a row, which
is silly. So what you actually need is two sort() calls, each given its
own piece of magic, where the first piece of magic tells Python to sort
by the first column, and the second by the second column. But to do that,
we need to know more about how you're setting up the table.

I'm *guessing* that you probably have something like this:


table = [['fred', 35, 8],  # name, age, score
         ['bill', 29, 8],
         ['betty', 30, 9],
         ['morris', 17, 4],
         ['catherine', 23, 6],
         ['anna', 45, 8],
         ['george', 19, 5],
         ['tanya', 27, 7],
        ]


Now let's sort it:

[['morris', 17, 4],
['george', 19, 5],
['catherine', 23, 6],
['tanya', 27, 7],
['anna', 45, 8],
['bill', 29, 8],
['fred', 35, 8],
['betty', 30, 9]]
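
The output above is ordered by score, with ties broken by name. The calls that produced it are not shown, so the following is only a guess at them: a sketch that reproduces the same result, assuming two stable passes with operator.itemgetter, name first and score second:

```python
from operator import itemgetter
from pprint import pprint

table = [
    ['fred', 35, 8],       # name, age, score
    ['bill', 29, 8],
    ['betty', 30, 9],
    ['morris', 17, 4],
    ['catherine', 23, 6],
    ['anna', 45, 8],
    ['george', 19, 5],
    ['tanya', 27, 7],
]

# list.sort() is stable, so sort by the tie-breaker first...
table.sort(key=itemgetter(0))   # name
# ...and by the main criterion last.
table.sort(key=itemgetter(2))   # score
pprint(table)
```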



Does this help?
 
thebjorn

neocortex said:
Hello!
I am a newbie in Python. Recently, I get stuck with the problem of
sorting by two criteria. In brief, I have a two-dimensional list (for
a table or a matrix). Now, I need to sort by two columns, but I cannot
figure out how to do that. I read somewhere that it is possible to do:

>>> table.sort().sort()

but it does not work.
Can anyone help me with this?
PS: I am using Python under Ubuntu 6.06.

Best,
PM

I'm not sure which Python is default for Ubuntu 6.06, but assuming you
can access a recent one (2.4), the list.sort() function takes a key
argument (that seems to be rather sparsely documented in the tutorial
and the docstring...). E.g.:
>>> lst = [(1,2,4),(3,2,1),(2,2,2),(2,1,4),(2,4,1)]
>>> lst.sort(key=lambda (a,b,c):(c,b))
>>> lst
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]

The "fancy" lambda simply takes a source-tuple and returns a tuple of
the keys to be sorted on, in this case sort on the last element, then
on the middle element.
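
One caveat: the tuple-unpacking lambda above is Python 2 syntax, and Python 3 removed tuple parameters. A sketch of an equivalent key function that runs on both, indexing into the row instead of unpacking it (the parameter name row is just illustrative):

```python
lst = [(1, 2, 4), (3, 2, 1), (2, 2, 2), (2, 1, 4), (2, 4, 1)]

# Build the same (last element, middle element) key by indexing.
lst.sort(key=lambda row: (row[2], row[1]))
print(lst)
```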

You can use the cmp argument to get similar results (with the same
list as above):
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]

The lambda in this case is starting to get complicated though, so
probably better to write a separate function. Using the cmp argument
is slower than using the key argument since key is only called once
for each element in the list while cmp is called for each comparison
of elements (it's not the number of times the function is called
that's the big deal, but rather that the highly optimized sort needs
to continually call back to Python in the cmp case).
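
To make the difference in call counts visible, here is a rough sketch that counts how often each function runs. The counter dictionary and the function names are made up for the demonstration, and functools.cmp_to_key stands in for the cmp argument so that the code also runs on Python 3:

```python
from functools import cmp_to_key

lst = [(1, 2, 4), (3, 2, 1), (2, 2, 2), (2, 1, 4), (2, 4, 1)]
calls = {"key": 0, "cmp": 0}

def key_func(row):
    calls["key"] += 1
    return (row[2], row[1])

def cmp_func(a, b):
    calls["cmp"] += 1
    ka, kb = (a[2], a[1]), (b[2], b[1])
    return (ka > kb) - (ka < kb)

by_key = sorted(lst, key=key_func)
# Python 2 spelling: sorted(lst, cmp=cmp_func)
by_cmp = sorted(lst, key=cmp_to_key(cmp_func))

print(by_key == by_cmp)  # both approaches give the same order
print(calls)             # key runs once per element; cmp once per comparison
```

On a five-element list the two counts are close; the gap widens as the list grows, since the number of comparisons grows faster than the number of elements.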

The old-school way of doing this is using a Schwartzian transform
(a.k.a. decorate-sort-undecorate) which creates an auxiliary list
with the keys in correct sorting order so that sort() can work
directly:
>>> lst
[(1, 2, 4), (3, 2, 1), (2, 2, 2), (2, 1, 4), (2, 4, 1)]
>>> decorate = [(x[2],x[1],x) for x in lst]
>>> decorate.sort()
>>> decorate
[(1, 2, (3, 2, 1)), (1, 4, (2, 4, 1)), (2, 2, (2, 2, 2)), (4, 1, (2,
1, 4)), (4, 2, (1, 2, 4))]
>>> lst = [x[2] for x in decorate] # undecorate
>>> lst
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]

hth,
-- bjorn
 
Duncan Booth

thebjorn said:
I'm not sure which Python is default for Ubuntu 6.06, but assuming you
can access a recent one (2.4), the list.sort() function takes a key
argument (that seems to be rather sparsely documented in the tutorial
and the docstring...). E.g.:
>>> lst = [(1,2,4),(3,2,1),(2,2,2),(2,1,4),(2,4,1)]
>>> lst.sort(key=lambda (a,b,c):(c,b))
>>> lst
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]

It may be simpler just to use the key argument multiple times (not
forgetting to specify the keys in reverse order, i.e. the most significant
comes last). So with this example, sorting by column 2 then column 1 and
ignoring column 0 can be done by:
>>> from operator import itemgetter
>>> lst = [(1,2,4),(3,2,1),(2,2,2),(2,1,4),(2,4,1)]
>>> lst.sort(key=itemgetter(1))
>>> lst.sort(key=itemgetter(2))
>>> lst
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]


or even:
>>> from operator import itemgetter
>>> lst = [(1,2,4),(3,2,1),(2,2,2),(2,1,4),(2,4,1)]
>>> for keycolumn in reversed([2,1]):
...     lst.sort(key=itemgetter(keycolumn))
...
>>> lst
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]

The important point here is to remember that the sort is stable (so you can
do multiple sorts without disrupting earlier results).
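
Stability also helps when the two criteria run in opposite directions, which a single tuple key cannot express directly for non-numeric columns. A small sketch under that assumption, with rows borrowed from the name/age/score table earlier in the thread: highest score first, names alphabetical within equal scores.

```python
from operator import itemgetter

# Rows are (name, age, score).
table = [
    ("fred", 35, 8),
    ("bill", 29, 8),
    ("betty", 30, 9),
    ("anna", 45, 8),
]

table.sort(key=itemgetter(0))                # least significant key first: name
table.sort(key=itemgetter(2), reverse=True)  # most significant key last: score
print(table)
```

The reverse=True pass keeps rows with equal scores in the order the first pass left them, so the names stay alphabetical.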
 
neocortex

Hello!
Thank you all, so much! Now I can do a double-criteria sort in at least
three ways. More than I expected.

Best,
PM
 
bearophileHUGS

[repost]

Duncan Booth:
>>> from operator import itemgetter
>>> lst = [(1,2,4),(3,2,1),(2,2,2),(2,1,4),(2,4,1)]
>>> lst.sort(key=itemgetter(1))
>>> lst.sort(key=itemgetter(2))
>>> lst
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]

A little known thing from Python 2.5:
>>> from operator import itemgetter
>>> lst = [(1,2,4),(3,2,1),(2,2,2),(2,1,4),(2,4,1)]
>>> sorted(lst, key=itemgetter(2, 1))
[(3, 2, 1), (2, 4, 1), (2, 2, 2), (2, 1, 4), (1, 2, 4)]
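
This works because itemgetter with several indices returns a tuple of the picked items, and tuples compare element by element. A short sketch on rows shaped like the name/age/score table from earlier in the thread (the variable name get_score_then_name is made up):

```python
from operator import itemgetter

row = ("fred", 35, 8)                    # name, age, score
get_score_then_name = itemgetter(2, 0)
print(get_score_then_name(row))          # (8, 'fred')

table = [("fred", 35, 8), ("bill", 29, 8), ("betty", 30, 9), ("morris", 17, 4)]
by_score_then_name = sorted(table, key=get_score_then_name)
print(by_score_then_name)
```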

Bye,
bearophile
 
