Sorting a multidimensional array by multiple keys

R

Rehceb Rotkiv

Hello everyone,

can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!

Thx,
Rehceb
 
G

Guest

Rehceb said:
can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!

You can pass a function as argument to the sort method of a list.
The function should take two arguments and return -1, 0 or 1 as
comparison result. Just like the cmp function.

This will objects in list obj_lst by their id attributes:

def sorter(a, b):
return cmp(a.id, b.id)

obj_lst.sort(sorter)

Thomas
 
B

bearophileHUGS

Rehceb Rotkiv:
can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!

If you want a good answer you have to give me/us more details, and an
example too.

Bye,
bearophile
 
R

Rehceb Rotkiv

If you want a good answer you have to give me/us more details, and an
example too.

OK, here is some example data:

reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is

As each line consists of 5 words, I would break up the data into an array
of five-field-arrays (Would you use lists or tuples or a combination in
Python?). The word "BUT" would be in the middle, with two fields/words
left and two fields/words right of it. I then want to sort this list by

- field 3
- field 4
- field 1
- field 0

in this hierarchy. This is the desired result:

pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
reaction is BUT by the
this manner BUT the dissolved

The first 4 lines all could not be sorted by fields 3 & 4, as they are
identical ("it", "is"), so they have been sorted first by field 1 (which
is also identical: ",") and then by field 0:

pattern
rapid
sea
sodium

I hope I have explained this in an understandable way. It would be cool
if you could show me how this can be done in Python!

Regards,
Rehceb
 
R

Rehceb Rotkiv

Wait, I made a mistake. The correct result would be

reaction is BUT by the
pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
this manner BUT the dissolved

because "by the" comes before and "the dissolved" after "it is". Sorry
for the confusion.
 
S

Steven Bethard

Rehceb said:
OK, here is some example data:

reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is

As each line consists of 5 words, I would break up the data into an array
of five-field-arrays (Would you use lists or tuples or a combination in
Python?). The word "BUT" would be in the middle, with two fields/words
left and two fields/words right of it. I then want to sort this list by

- field 3
- field 4
- field 1
- field 0

You're probably looking for the key= argument to list.sort(). If your
function simply returns the fields in the order above, I believe you get
the right thing::
.... reaction is BUT by the
.... sodium , BUT it is
.... sea , BUT it is
.... this manner BUT the dissolved
.... pattern , BUT it is
.... rapid , BUT it is
.... '''
>>> word_lists = [line.split() for line in s.splitlines()]
>>> def key(word_list):
.... return word_list[3], word_list[4], word_list[1], word_list[0]
....[['reaction', 'is', 'BUT', 'by', 'the'],
['pattern', ',', 'BUT', 'it', 'is'],
['rapid', ',', 'BUT', 'it', 'is'],
['sea', ',', 'BUT', 'it', 'is'],
['sodium', ',', 'BUT', 'it', 'is'],
['this', 'manner', 'BUT', 'the', 'dissolved']]

STeVe
 
D

Duncan Booth

Rehceb Rotkiv said:
Wait, I made a mistake. The correct result would be

reaction is BUT by the
pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
this manner BUT the dissolved

because "by the" comes before and "the dissolved" after "it is". Sorry
for the confusion.
"reaction is BUT by the",
"sodium , BUT it is",
"sea , BUT it is",
"this manner BUT the dissolved",
"pattern , BUT it is",
"rapid , BUT it is",
]
data = [ s.split() for s in data]
from pprint import pprint
pprint(data)
[['reaction', 'is', 'BUT', 'by', 'the'],
['sodium', ',', 'BUT', 'it', 'is'],
['sea', ',', 'BUT', 'it', 'is'],
['this', 'manner', 'BUT', 'the', 'dissolved'],
['pattern', ',', 'BUT', 'it', 'is'],
['rapid', ',', 'BUT', 'it', 'is']][['reaction', 'is', 'BUT', 'by', 'the'],
['pattern', ',', 'BUT', 'it', 'is'],
['rapid', ',', 'BUT', 'it', 'is'],
['sea', ',', 'BUT', 'it', 'is'],
['sodium', ',', 'BUT', 'it', 'is'],
['this', 'manner', 'BUT', 'the', 'dissolved']]
 
A

attn.steven.kuo

(snipped)
As each line consists of 5 words, I would break up the data into an array
of five-field-arrays (Would you use lists or tuples or a combination in
Python?). The word "BUT" would be in the middle, with two fields/words
left and two fields/words right of it. I then want to sort this list by

- field 3
- field 4
- field 1
- field 0



import StringIO

buf = """
reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is
""".lstrip()

mockfile = StringIO.StringIO(buf)

tokens = [ line.split() + [ line ] for line in mockfile ]
tokens.sort(key=lambda l: (l[3], l[4], l[1], l[0]))
for l in tokens:
print l[-1],
 
D

Duncan Booth

Peter Otten said:
Or, in Python 2.5:
Thanks, I'd forgotten itemgetter had that strangley assymmetric behaviour
of returning either a single value or a tuple.
 
P

Paulo da Silva

Rehceb Rotkiv escreveu:
Hello everyone,

can I sort a multidimensional array in Python by multiple sort keys? A
litte code sample would be nice!

class MyList(list):
# This is the index of the element to be compared
CmpIndex=0

# Comparision methods
@staticmethod
def __cmp_pars(x,y):
if isinstance(x,MyList):
x=x[MyList.CmpIndex]
if isinstance(y,MyList):
y=y[MyList.CmpIndex]
return x,y
def __cmp__(self,other):
s,o=MyList.__cmp_pars(self,other)
return cmp(s,o)
def __lt__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s<o
def __le__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s<=o
def __gt__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s>o
def __ge__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s>=o
def __eq__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s==o
def __ne__(self,other):
s,o=MyList.__cmp_pars(self,other)
return s!=o

Use:

x=MyList(<list of lists>)
MyList.CmpIndex=2 # Compare by index 2
x.sort()

May be there is a better solution ...
HTH
Paulo
 
A

Alex Martelli

Thomas Krüger said:
You can pass a function as argument to the sort method of a list.
The function should take two arguments and return -1, 0 or 1 as
comparison result. Just like the cmp function.

This will objects in list obj_lst by their id attributes:

def sorter(a, b):
return cmp(a.id, b.id)

obj_lst.sort(sorter)

A MUCH better way to obtain exactly the same semantics would be:

def getid(a):
return a.id

obj_list.sort(key=getid)


Alex
 
?

=?ISO-8859-1?Q?Thomas_Kr=FCger?=

Alex said:
A MUCH better way to obtain exactly the same semantics would be:

def getid(a):
return a.id

obj_list.sort(key=getid)

Frankly speaking the purpose of the example was to show how to pass a
function as argument for the sort method.
Your code may be more efficient but it explains something different.

Thomas
 
S

Steven Bethard

Thomas said:
Frankly speaking the purpose of the example was to show how to pass a
function as argument for the sort method.
Your code may be more efficient but it explains something different.

Yes, but there's almost never a reason to use the cmp= argument to
sort() anymore. It's almost always better to use the key= argument.

STeVe
 
B

bearophileHUGS

Steven Bethard:
there's almost never a reason to use the cmp= argument to
sort() anymore. It's almost always better to use the key= argument.

I always use key now, but maybe cmp uses less memory. There can be few
situations where cmp is better still.

Bye,
bearophile
 
A

Alex Martelli

Steven Bethard said:
Yes, but there's almost never a reason to use the cmp= argument to
sort() anymore. It's almost always better to use the key= argument.

Exactly. Passing a "comparison function" rather than a key-extraction
function is rarely a good idea -- except for some extremely complicated
cases involving e.g. certain string (or more complicated) fields needing
to be sorted in the reverse direction from others. In practice, the
existence of that argument and its prominent position as the first
positional argument is due entirely to backwards compatibility issues.


Alex
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top