sorting list of complex numbers

S

skip

The thread on sorting in Python 3 got me to thinking. How could I sort a
list of complex numbers using key?
>>> lst = [random.random()+random.random()*1j for i in range(10)]
>>> lst
[(0.32672251849959244+0.41428983433288791j), (0.35238056484609881+0.92758203977208264j), (0.19337824038125528+0.16527285180541951j), (0.47569307114525849+0.72381960418815128j), (0.21498813135082351+0.2046308266222292j), (0.30142745756937939+0.35216751451102601j), (0.77355676386939132+0.0023447924287695043j), (0.2547736124606309+0.52234837788936905j), (0.38349189081350132+0.62017617694427096j), (0.58362096773561245+0.87702443273108477j)]

As expected:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers

This works:
[(0.19337824038125528+0.16527285180541951j), (0.21498813135082351+0.2046308266222292j), (0.2547736124606309+0.52234837788936905j), (0.30142745756937939+0.35216751451102601j), (0.32672251849959244+0.41428983433288791j), (0.35238056484609881+0.92758203977208264j), (0.38349189081350132+0.62017617694427096j), (0.47569307114525849+0.72381960418815128j), (0.58362096773561245+0.87702443273108477j), (0.77355676386939132+0.0023447924287695043j)]

but what if I want to sort by real, then imaginary parts? Here's a longer
list with 20 elements where there are only 10 distinct reals but 20 distinct
imaginaries:
[(1+2.73j),
(9+3.77j),
(7+27j),
(8+28j),
(2+2.8600000000000003j),
(4+3.1200000000000001j),
(2+22j),
(9+29j),
(3+2.9900000000000002j),
(6+26j),
2.6000000000000001j,
(8+3.6400000000000001j),
(3+23j),
(5+3.25j),
(1+21j),
(5+25j),
20j,
(6+3.3799999999999999j),
(7+3.5100000000000002j),
(4+24j)]

I can sort by the real parts just fine:
[2.6000000000000001j,
20j,
(1+2.73j),
(1+21j),
(2+2.8600000000000003j),
(2+22j),
(3+2.9900000000000002j),
(3+23j),
(4+3.1200000000000001j),
(4+24j),
(5+3.25j),
(5+25j),
(6+26j),
(6+3.3799999999999999j),
(7+27j),
(7+3.5100000000000002j),
(8+28j),
(8+3.6400000000000001j),
(9+3.77j),
(9+29j)]

but how do I then do a secondary sort by the imaginary part, preserving the
existing ordering on the real parts? Seems like I have to resort to a
Schwartzian transform and map the complex numbers to tuples, sort that, then
map them back. With the cmp key it would have been a fairly trivial task to
define the desired compare-real-then-imag function.

Is there a way to do this using just the key arg, no extra data structures?

Skip
 
S

skip

Duncan> Is a tuple an extra data structure?
Duncan> [2.6000000000000001j,
Duncan> 20j,
Duncan> (1+2.73j),
Duncan> (1+21j),
Duncan> (2+2.8600000000000003j),
Duncan> (2+22j),
Duncan> (3+2.9900000000000002j),
Duncan> (3+23j),
Duncan> (4+3.1200000000000001j),
Duncan> (4+24j),
Duncan> (5+3.25j),
Duncan> (5+25j),
Duncan> (6+3.3799999999999999j),
Duncan> (6+26j),
Duncan> (7+3.5100000000000002j),
Duncan> (7+27j),
Duncan> (8+3.6400000000000001j),
Duncan> (8+28j),
Duncan> (9+3.77j),
Duncan> (9+29j)]

Ah, thanks. I'm still getting used to this key thing, always having relied
...

I tried that. I could have sworn when I read through the output it hadn't
retained the order of the real parts. Wait a minute... I sorted by real
them by imag. So the trick is to sort by primary sort key last.

Timeit suggests the single sort returning the real, imag tuples is faster
than two sorts each on one field, as you might expect since many fewer calls
to the key function are made.

Thanks,

Skip
 
D

Diez B. Roggisch

The thread on sorting in Python 3 got me to thinking. How could I sort a
list of complex numbers using key?
lst = [random.random()+random.random()*1j for i in range(10)]
lst
[(0.32672251849959244+0.41428983433288791j), (0.35238056484609881+0.92758203977208264j), (0.19337824038125528+0.16527285180541951j), (0.47569307114525849+0.72381960418815128j), (0.21498813135082351+0.2046308266222292j), (0.30142745756937939+0.35216751451102601j), (0.77355676386939132+0.0023447924287695043j), (0.2547736124606309+0.52234837788936905j), (0.38349189081350132+0.62017617694427096j), (0.58362096773561245+0.87702443273108477j)]

As expected:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers

This works:
[(0.19337824038125528+0.16527285180541951j), (0.21498813135082351+0.2046308266222292j), (0.2547736124606309+0.52234837788936905j), (0.30142745756937939+0.35216751451102601j), (0.32672251849959244+0.41428983433288791j), (0.35238056484609881+0.92758203977208264j), (0.38349189081350132+0.62017617694427096j), (0.47569307114525849+0.72381960418815128j), (0.58362096773561245+0.87702443273108477j), (0.77355676386939132+0.0023447924287695043j)]

but what if I want to sort by real, then imaginary parts? Here's a longer
list with 20 elements where there are only 10 distinct reals but 20 distinct
imaginaries:
[(1+2.73j),
(9+3.77j),
(7+27j),
(8+28j),
(2+2.8600000000000003j),
(4+3.1200000000000001j),
(2+22j),
(9+29j),
(3+2.9900000000000002j),
(6+26j),
2.6000000000000001j,
(8+3.6400000000000001j),
(3+23j),
(5+3.25j),
(1+21j),
(5+25j),
20j,
(6+3.3799999999999999j),
(7+3.5100000000000002j),
(4+24j)]

I can sort by the real parts just fine:
[2.6000000000000001j,
20j,
(1+2.73j),
(1+21j),
(2+2.8600000000000003j),
(2+22j),
(3+2.9900000000000002j),
(3+23j),
(4+3.1200000000000001j),
(4+24j),
(5+3.25j),
(5+25j),
(6+26j),
(6+3.3799999999999999j),
(7+27j),
(7+3.5100000000000002j),
(8+28j),
(8+3.6400000000000001j),
(9+3.77j),
(9+29j)]

but how do I then do a secondary sort by the imaginary part, preserving the
existing ordering on the real parts? Seems like I have to resort to a
Schwartzian transform and map the complex numbers to tuples, sort that, then
map them back. With the cmp key it would have been a fairly trivial task to
define the desired compare-real-then-imag function.

Is there a way to do this using just the key arg, no extra data structures?

Doesn't (untested)

key=lambda x: (x.real, x.imag)

work?

Diez
 
T

Terry Reedy

Duncan> If you don't like the tuple then just do the two sorts separately:

...

I tried that. I could have sworn when I read through the output it hadn't
retained the order of the real parts. Wait a minute... I sorted by real
them by imag. So the trick is to sort by primary sort key last.

This is called radix sorting. It requires that the sort be order
preserving, that it preserve the order of items with equal keys.
It is an old technique. It was used, for instance, to sort punched
cards on a card sorter that sorted cards on one columm at a time. One
had to be careful to properly remove and combine the cards after each
pass; make one mistake and start over. I once helped someone sort 1000s
of student record cards by 6- or 7-digit id.

Terry Jan Reedy
 
S

Steve Holden

Duncan> If you don't like the tuple then just do the two sorts separately:

...

I tried that. I could have sworn when I read through the output it hadn't
retained the order of the real parts. Wait a minute... I sorted by real
them by imag. So the trick is to sort by primary sort key last.

Timeit suggests the single sort returning the real, imag tuples is faster
than two sorts each on one field, as you might expect since many fewer calls
to the key function are made.
Only half the number, of course. The advantage of the key function is
that each element requires only one call out to a Python function, and
the comparisons then take place using a C-coded comparison function.

But you knew that already, right?

regards
Steve
 
S

Steve Holden

Duncan> If you don't like the tuple then just do the two sorts separately:

...

I tried that. I could have sworn when I read through the output it hadn't
retained the order of the real parts. Wait a minute... I sorted by real
them by imag. So the trick is to sort by primary sort key last.

Timeit suggests the single sort returning the real, imag tuples is faster
than two sorts each on one field, as you might expect since many fewer calls
to the key function are made.
Only half the number, of course. The advantage of the key function is
that each element requires only one call out to a Python function, and
the comparisons then take place using a C-coded comparison function.

But you knew that already, right?

regards
Steve
 
S

skip

Steve> Only half the number, of course. The advantage of the key
Steve> function is that each element requires only one call out to a
Steve> Python function, and the comparisons then take place using a
Steve> C-coded comparison function.

Steve> But you knew that already, right?

I knew about the C comparison function, not about the number of key calls.
I sort of assumed the key function was called whenever necessary since it
could have side effects. I confirmed that the key function is called once
per element instead of once per comparison.

Thanks,

Skip
 
T

Thomas Bellman

Steve Holden said:
Only half the number, of course. The advantage of the key function is
that each element requires only one call out to a Python function, and
the comparisons then take place using a C-coded comparison function.

You don't need any Python-coded function at all. The operator
module is your friend: key=operator.attrgetter('real', 'imag')
will create the required tuples for sorting.
 
S

Steve Holden

Thomas said:
You don't need any Python-coded function at all. The operator
module is your friend: key=operator.attrgetter('real', 'imag')
will create the required tuples for sorting.
True; "requires only one call out to the key function", then. You're
right, attrgetter will be faster still, and it's a really neat solution.

regards
Steve
 
P

Paul Rubin

Duncan Booth said:
If you don't like the tuple then just do the two sorts separately:

That only goes so far though. Suppose instead of complex numbers
you want to sort expressions, like:

a(b,c(d,e),f(g,h))

treat those as parse trees in the obvious way. You want to compare on
the root, and if the roots are equal, compare the left subtree, then
the right subtree, etc. Do you REALLY want to sort over and over
all the way to the max depth of all the trees?

I don't understand what the purpose of was of getting rid of the cmp
arg. It just seems to gratuitously make code hard to write.

Another thing that apparently broke was tuple destructuring in
function args. You can no longer say

def first((a,b)): return a
def second((a,b)): return b

make getters for the first and second elements of a tuple. Sure, you
could use operator.itemgetter for that example, but often you want to
parse out more complicated nested tuples. I do that all the time with
Python 2.5 (typically in conjunction with itertools.groupby) but
if/when I downgrade to 3.0, a bunch of my code is going to break.
 
P

Paul Rubin

but how do I then do a secondary sort by the imaginary part,...
Is there a way to do this using just the key arg, no extra data structures?

Clever solutions involving multiple sorts aside, I think what they
really want you to do is something like (untested):

class mkKey(complex):
def __lt__(self, other):
a = cmp(self.real, other.real)
return a if a else cmp(self.imag, other.imag)

then:

yourlist.sort(key=mkKey)

for fancier structures you'd need a full blown class implementation
with an init method. Either way you end up temporarily allocating a
lot of extra structures, but at least they're not all in memory
simultaneously like in the DSU pattern.
 
G

Gabriel Genellina

En Tue, 18 Nov 2008 08:41:58 -0200, Paul Rubin
Clever solutions involving multiple sorts aside, I think what they
really want you to do is something like (untested):

class mkKey(complex):
def __lt__(self, other):
a = cmp(self.real, other.real)
return a if a else cmp(self.imag, other.imag)

then:

yourlist.sort(key=mkKey)

for fancier structures you'd need a full blown class implementation
with an init method. Either way you end up temporarily allocating a
lot of extra structures, but at least they're not all in memory
simultaneously like in the DSU pattern.

Yes, the keys for all items in the list are all created when the sort
begins, and live until the sort finishes. list.sort(key=...) is actually
implemented using the DSU pattern, something like this:

for i in range(len(alist)):
alist = (key(alist), alist)
# ...sort items...
for i in range(len(alist)):
alist = alist[1]

except a custom `sortwrapper` object is used instead of a 2-item tuple.
 
A

Arnaud Delobelle

Paul Rubin said:
for fancier structures you'd need a full blown class implementation
with an init method. Either way you end up temporarily allocating a
lot of extra structures, but at least they're not all in memory
simultaneously like in the DSU pattern.

The implementation of python sort uses the DSU patterns.
 
T

Terry Reedy

Paul said:
That only goes so far though. Suppose instead of complex numbers
you want to sort expressions, like:

a(b,c(d,e),f(g,h))

treat those as parse trees in the obvious way.

Presuming that a, c, and f are function calls that return Tree
instances, you give Tree a __lt__ member that does what you say below.
You want to compare on
the root, and if the roots are equal, compare the left subtree, then
the right subtree, etc.
> Do you REALLY want to sort over and over
all the way to the max depth of all the trees?

Don't understand this.
I don't understand what the purpose of was of getting rid of the cmp
arg.

Cmp requires a 3-way classification. Sorting only requires 2-way.
cmp was called nlogn times, key n times.
cmp in practical use amounted to calling some key function on both
items, so cmp amounted to 2*n*log(n) key calls pluse extra overhead.
Another thing that apparently broke was tuple destructuring in
function args. You can no longer say

def first((a,b)): return a
def second((a,b)): return b

make getters for the first and second elements of a tuple. Sure, you
could use operator.itemgetter for that example, but often you want to
parse out more complicated nested tuples. I do that all the time with
Python 2.5 (typically in conjunction with itertools.groupby) but
if/when I downgrade to 3.0, a bunch of my code is going to break.

Do your tuple destructuring in the first statement in your body and
nothing will break. Don't upgrade until it is an upgrade for *you*.
 
H

Hrvoje Niksic

Terry Reedy said:
Do your tuple destructuring in the first statement in your body and
nothing will break.

Unless you were using a lambda, which is quite useful as argument to
"sort".
 
T

Terry Reedy

Hrvoje said:
Unless you were using a lambda, which is quite useful as argument to
"sort".

So write a separate def statement.
If you do not want to do that, don't run with 3.0.
 
H

Hrvoje Niksic

Terry Reedy said:
So write a separate def statement.

That is a valid possibility, but loses the succinctness of a short
lambda. Some uses of lambda+unpacking can also be replaced by
operator.itemgetter.
If you do not want to do that, don't run with 3.0.

Tuple unpacking in parameters is a useful feature, but hardly reason
enough to abandon the future of the language.
 
P

Paul Rubin

Why get rid of a useful feature that unclutters code?
So write a separate def statement.
If you do not want to do that, don't run with 3.0.

You are advising avoiding downgrading from 2.5 to 3.0, which is maybe
the best plan for some users under the circumstances, but some of us
normally expect a new major release to be an upgrade rather than a
downgrade.
 
T

Terry Reedy

Paul said:
Why get rid of a useful feature that unclutters code?

Because, as has been posted before, it is rarely used, it is
replaceable, its presence interfered with adding a new signature option,
and its removal uncluttered a bit the C code.
 
S

Steven D'Aprano

Why get rid of a useful feature that unclutters code?

Unfortunately, the people who find it useful are a minority. Most people
don't care for it, and the Python dev team decided that the code they
most wanted to unclutter was their own CPython implementation.

Personally, I'm with you on this one: it's a small step backwards. But
only a *small* step. Whether it is outweighed by the steps forward, well,
time will tell.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,062
Latest member
OrderKetozenseACV

Latest Threads

Top