sort array, apply rearrangement to second

V

Victor Eijkhout

I have two arrays, made with numpy. The first one has values that I want
to use as sorting keys; the second one needs to be sorted by those keys.
Obviously I could turn them into a dictionary of pairs and sort by the
first member, but I think that's not very efficient, at least in space,
and this needs to be done as efficiently as possible.

I could use a hand.

Victor.
 
A

Alf P. Steinbach

* Victor Eijkhout:
I have two arrays, made with numpy. The first one has values that I want
to use as sorting keys; the second one needs to be sorted by those keys.
Obviously I could turn them into a dictionary of pairs and sort by the
first member, but I think that's not very efficient, at least in space,
and this needs to be done as efficiently as possible.

I could use a hand.

Just do the pairing, but in a 'list', not a dictionary (a dictionary is
unordered and can't be sorted). You need to keep track of which keys belong to
which values anyway. And anything in Python is a reference: you're not copying
the data by creating the pairs. That is, the space overhead is proportional to
the number of items but is independent of the data size of each item.


Cheers & hth.,

- Alf
 
S

Steve Holden

Victor said:
I have two arrays, made with numpy. The first one has values that I want
to use as sorting keys; the second one needs to be sorted by those keys.
Obviously I could turn them into a dictionary of pairs and sort by the
first member, but I think that's not very efficient, at least in space,
and this needs to be done as efficiently as possible.

I could use a hand.
Well, my first approach would be to do it as inefficiently as I can ( or
at least no more efficiently than I can with a simple-minded approach)
and then take it from there.

If I believe this is not a premature optimization (a question about
which I am currently skeptical) I'd suggest conversion to a list of
pairs rather than a dict.

Can you use zip() on numpy arrays? That would be the easiest way to
create the list of pairs.

regards
Steve
 
M

MRAB

Victor said:
I have two arrays, made with numpy. The first one has values that I want
to use as sorting keys; the second one needs to be sorted by those keys.
Obviously I could turn them into a dictionary of pairs and sort by the
first member, but I think that's not very efficient, at least in space,
and this needs to be done as efficiently as possible.

I could use a hand.
You could sort a list of the indices, using the first array to provide
the keys.
 
R

Robert Kern

I have two arrays, made with numpy. The first one has values that I want
to use as sorting keys; the second one needs to be sorted by those keys.
Obviously I could turn them into a dictionary of pairs and sort by the
first member, but I think that's not very efficient, at least in space,
and this needs to be done as efficiently as possible.

second[first.argsort()]

Ask numpy questions on the numpy mailing list.

http://www.scipy.org/Mailing_Lists

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
R

Robert Kern

Robert Kern said:
second[first.argsort()]

Really cool. Thanks.
Ask numpy questions on the numpy mailing list.

I will. I thought that this question would have an answer in a generic
python idiom.

When dealing with numpy arrays, the generic Python idiom is often much slower.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
S

Steve Holden

Victor said:
Robert Kern said:
second[first.argsort()]

Really cool. Thanks.
Ask numpy questions on the numpy mailing list.

I will. I thought that this question would have an answer in a generic
python idiom.

Victor.

Not an unreasonable assumption, but it turns out that for most Python
users (estimate PFTA: 97%) numpy/scipt is esoteric knowledge.

regards
Steve
 
R

Raymond Hettinger

I have two arrays, made with numpy. The first one has values that I want
to use as sorting keys; the second one needs to be sorted by those keys.
Obviously I could turn them into a dictionary  of pairs and sort by the
first member, but I think that's not very efficient, at least in space,
and this needs to be done as efficiently as possible.

Alf's recommendation is clean and correct. Just make a list of
tuples.

FWIW, here's a little hack that does the work for you:
values = ['A', 'B', 'C', 'D', 'E']
keys = [50, 20, 40, 10, 30]
keyiter = iter(keys)
sorted(values, key=lambda k: next(keyiter))
['D', 'B', 'E', 'C', 'A']


Raymond
 
S

Steve Howell

I have two arrays, made with numpy. The first one has values that I want
to use as sorting keys; the second one needs to be sorted by those keys..
Obviously I could turn them into a dictionary  of pairs and sort by the
first member, but I think that's not very efficient, at least in space,
and this needs to be done as efficiently as possible.

Alf's recommendation is clean and correct.  Just make a list of
tuples.

FWIW, here's a little hack that does the work for you:
values = ['A', 'B', 'C', 'D', 'E']
keys = [50, 20, 40, 10, 30]
keyiter = iter(keys)
sorted(values, key=lambda k: next(keyiter))

['D', 'B', 'E', 'C', 'A']

Another option:

[values for i in sorted(range(len(keys)), key=lambda i: keys)]

Sort the indexes according to keys values, then use indexes to get the
values.

It might read more clearly when broken out into two lines:
sorted_indexes = sorted(range(len(keys)), key = lambda i: keys)
sorted_indexes [3, 1, 4, 2, 0]
[values for i in sorted_indexes]

['D', 'B', 'E', 'C', 'A']

The advantage of Raymond's solution is that he only creates one new
Python list, whereas my solutions create an intermediate Python list
of integers. I don't think my solution really is that space-wasteful,
though, since by the time the second list gets created, any internal
intermediate lists from CPython's sorted() implementation will
probably have been cleaned up.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top