Sorting an array by multiple elements?

Paul

Hi there, I have an array of arrays that I want to sort by multiple
elements.

Sample data in the array looks like: [ [ id, date, num, name ], [ id,
date, num, name ], ... ]

I need to sort by : (1) name, (2) date, and (3) id.

I can sort by any one element in the array no problem using something
like :
summary_data.sort! { |a,b| a[ 3 ] <=> b[ 3 ] }

Unfortunately, I can't do this line more than once because it blows
away any previous sorting.

I found a few pages on the internet that describe how to "sort an
array of Ruby objects by multiple class fields", however, I don't know
how to create a "class field". I'm looking at an array, not a class.

Is there a way to do this or do I need to convert my data to something
else so I can do what I need?

Please help.

Thanks.
 
Josh Cheek

data = [ [ 1 , Time.now , 4, "barny" ] ,
         [ 1 , Time.now , 1, "cliff" ] ,
         [ 2 , Time.now , 3, "alfie" ] ,
         [ 2 , Time.now , 2, "alfie" ] ]

# you can assign the values of the arrays to specific names by grouping
data.sort do |(a_id,a_date,a_num,a_name) , (b_id,b_date,b_num,b_name)|

  first  = a_id  <=> b_id    # first condition
  second = a_num <=> b_num   # second condition

  first.zero? ? second : first   # if they are equal on first,
                                 # then return second

end.each{ |d| p d }   # inspect each one

__END__

output:
[1, Sun Aug 30 01:55:02 -0500 2009, 1, "cliff"]
[1, Sun Aug 30 01:55:02 -0500 2009, 4, "barny"]
[2, Sun Aug 30 01:55:02 -0500 2009, 2, "alfie"]
[2, Sun Aug 30 01:55:02 -0500 2009, 3, "alfie"]
 
Stefano Crocco

This should do what you want. It first compares the names then, if they're
equal, it compares the dates. If the dates are also equal it compares the ids.

summary_data.sort! do |a, b|
  res = a[3] <=> b[3]
  res = a[1] <=> b[1] if res == 0
  res = a[0] <=> b[0] if res == 0
  res
end

I hope this helps

Stefano
 
David A. Black

Hi --

data = [ [ 1 , Time.now , 4, "barny" ] ,
         [ 1 , Time.now , 1, "cliff" ] ,
         [ 2 , Time.now , 3, "alfie" ] ,
         [ 2 , Time.now , 2, "alfie" ] ]

# you can assign the values of the arrays to specific names by grouping
data.sort do |(a_id,a_date,a_num,a_name) , (b_id,b_date,b_num,b_name)|

  first  = a_id  <=> b_id    # first condition
  second = a_num <=> b_num   # second condition

  first.zero? ? second : first   # if they are equal on first,
                                 # then return second

You can also do:

first.nonzero? || second

which I like the look of though I find myself having to kind of step
through it mentally to keep track....
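
For anyone following along, the nonzero? idiom extends to all three keys from the original question. This is only a sketch: the rows are made-up sample data in the [id, date, num, name] layout, and summary_data is the variable name from the question.

```ruby
# Hypothetical sample rows in the thread's [id, date, num, name] layout.
summary_data = [
  [3, "2009-08-23", 3, "alfie"],
  [5, "2009-08-24", 1, "cliff"],
  [6, "2009-08-23", 3, "alfie"],
  [1, "2009-08-21", 4, "barny"],
  [2, "2009-08-21", 2, "alfie"],
]

sorted = summary_data.sort do |a, b|
  # nonzero? returns nil when the comparison is 0, so || falls
  # through to the next key only when the current one is a tie.
  (a[3] <=> b[3]).nonzero? ||   # 1. name
    (a[1] <=> b[1]).nonzero? || # 2. date
    (a[0] <=> b[0])             # 3. id
end

sorted.each { |row| p row }
```

The last comparison is left without nonzero? so the block still returns 0 (not nil) when two rows tie on every key.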

I agree with botp, though, that it's a good case for #sort_by:

data.sort_by {|id, date, num, name| [id, date, num] }
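
For what it's worth, sort_by with an array key works because Ruby compares arrays element by element (Array#<=>), so the first differing element decides. A minimal sketch, using made-up rows and the name, date, id order from the original question:

```ruby
# Arrays compare element by element, left to right; the first
# non-equal pair decides the result.
id_tiebreak   = ["alfie", "2009-08-23", 3] <=> ["alfie", "2009-08-23", 6]
name_decides  = ["alfie", "2009-08-23", 3] <=> ["barny", "2009-08-21", 1]
p id_tiebreak, name_decides

# Hypothetical rows in the thread's [id, date, num, name] layout.
data = [[6, "2009-08-23", 3, "alfie"],
        [1, "2009-08-21", 4, "barny"],
        [3, "2009-08-23", 3, "alfie"]]

# The block builds one key array per row; rows are ordered by those keys.
sorted = data.sort_by { |id, date, _num, name| [name, date, id] }
sorted.each { |row| p row }
```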


David

--
David A. Black / Ruby Power and Light, LLC / http://www.rubypal.com
Ruby/Rails training, mentoring, consulting, code-review
Latest book: The Well-Grounded Rubyist (http://www.manning.com/black2)

September Ruby training in NJ has been POSTPONED. Details to follow.
 
Paul Carvalho

Thanks for the great feedback! I couldn't find good documentation for
the 'sort_by' and the examples provided didn't work with my data. It
did nothing actually, which surprised me. Josh and Stefano's approach
worked well enough.

This should do what you want. It first compares the names then, if they're
equal, it compares the dates. If the dates are also equal it compares the ids.

summary_data.sort! do |a, b|
 res = a[3] <=> b[3]
 res = a[1] <=> b[1] if res == 0
 res = a[0] <=> b[0] if res == 0
 res
end

This gives me a 98% solution, which is good enough. There's a small
catch or trick with my data that I can't work around.

Details on the last 2%:

Unfortunately, the data in the array is not a fixed size. Sometimes
it is [ id, date, num, name ] and sometimes it may be [ id, date, num,
name1, name2 ] (or more names.. I don't know how many since the data
collection script figures it out as it goes along).

When I do the sort by 'name' (e.g. first = a_name <=> b_name ), all
the records with _additional_ names appear at the bottom of the list
as if they had a different name (unexpected).

For example, the output (now) looks like:

[2, "2009-08-21", 2, "alfie"]
[3, "2009-08-23", 3, "alfie"]
[6, "2009-08-23", 3, "alfie"]
[1, "2009-08-21", 4, "barny"]
[5, "2009-08-24", 1, "cliff"]
[4, "2009-08-21", 1, "cliff", "bob"]

... but I expect/want the output to look like:
[2, "2009-08-21", 2, "alfie"]
[3, "2009-08-23", 3, "alfie"]
[6, "2009-08-23", 3, "alfie"]
[1, "2009-08-21", 4, "barny"]
[4, "2009-08-21", 1, "cliff", "bob"]
[5, "2009-08-24", 1, "cliff"]


Any thoughts?

Cheers!
 
David A. Black

Hi --

Thanks for the great feedback! I couldn't find good documentation for
the 'sort_by' and the examples provided didn't work with my data. It
did nothing actually, which surprised me. Josh and Stefano's approach
worked well enough.

The nothing result seems strange. This:

data = [ [3, "2009-08-23", 3, "alfie"],
[5, "2009-08-24", 1, "cliff"],
[6, "2009-08-23", 3, "alfie"],
[1, "2009-08-21", 4, "barny"],
[2, "2009-08-21", 2, "alfie"] ]
data.sort_by {|id, date, num, name| [name, date, id] }

gives me:

[ [2, "2009-08-21", 2, "alfie"],
[3, "2009-08-23", 3, "alfie"],
[6, "2009-08-23", 3, "alfie"],
[1, "2009-08-21", 4, "barny"],
[5, "2009-08-24", 1, "cliff"]]

For documentation: ri Enumerable#sort_by

Unfortunately, the data in the array is not a fixed size. Sometimes
it is [ id, date, num, name ] and sometimes it may be [ id, date, num,
name1, name2 ] (or more names.. I don't know how many since the data
collection script figures it out as it goes along).

When I do the sort by 'name' (e.g. first = a_name <=> b_name ), all
the records with _additional_ names appear at the bottom of the list
as if they had a different name (unexpected).

For example, the output (now) looks like:

[2, "2009-08-21", 2, "alfie"]
[3, "2009-08-23", 3, "alfie"]
[6, "2009-08-23", 3, "alfie"]
[1, "2009-08-21", 4, "barny"]
[5, "2009-08-24", 1, "cliff"]
[4, "2009-08-21", 1, "cliff", "bob"]

.. but I expect/want the output to look like:
[2, "2009-08-21", 2, "alfie"]
[3, "2009-08-23", 3, "alfie"]
[6, "2009-08-23", 3, "alfie"]
[1, "2009-08-21", 4, "barny"]
[4, "2009-08-21", 1, "cliff", "bob"]
[5, "2009-08-24", 1, "cliff"]


Any thoughts?

Yeah -- sort_by :)

data = [ [2, "2009-08-21", 2, "alfie"],
[6, "2009-08-23", 3, "alfie"],
[1, "2009-08-21", 4, "barny"],
[3, "2009-08-23", 3, "alfie"],
[5, "2009-08-24", 1, "cliff"],
[4, "2009-08-21", 1, "cliff", "bob"] ]

data.sort_by {|id, date, num, name| [name, date, id] }

gives me:

[2, "2009-08-21", 2, "alfie"]
[3, "2009-08-23", 3, "alfie"]
[6, "2009-08-23", 3, "alfie"]
[1, "2009-08-21", 4, "barny"]
[4, "2009-08-21", 1, "cliff", "bob"]
[5, "2009-08-24", 1, "cliff"]


David

 
Paul Carvalho

The nothing result seems strange. This: [snip]

  data = [ [2, "2009-08-21", 2, "alfie"],
           [6, "2009-08-23", 3, "alfie"],
           [1, "2009-08-21", 4, "barny"],
           [3, "2009-08-23", 3, "alfie"],
           [5, "2009-08-24", 1, "cliff"],
           [4, "2009-08-21", 1, "cliff", "bob"] ]

   data.sort_by {|id, date, num, name| [name, date, id] }

gives me:

        [2, "2009-08-21", 2, "alfie"]
        [3, "2009-08-23", 3, "alfie"]
        [6, "2009-08-23", 3, "alfie"]
        [1, "2009-08-21", 4, "barny"]
        [4, "2009-08-21", 1, "cliff", "bob"]
        [5, "2009-08-24", 1, "cliff"]

Thanks David,

I think I know what happened now. When I tried the other
approach, I sorted in place using: data.sort!

But when I used "data.sort_by" it might have sorted it but didn't save
it back to the same array.

"sort_by!" doesn't exist so I'll need to save the sorted data to
another variable.
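
A small sketch of both options, with made-up rows. Assigning the result back works on any Ruby version. Ruby 1.9.2 and later also added an in-place Array#sort_by!, which was not available in the 1.8 series current when this thread was written.

```ruby
# Hypothetical rows in the thread's [id, date, num, name] layout.
data = [[5, "2009-08-24", 1, "cliff"],
        [2, "2009-08-21", 2, "alfie"]]

# Option 1: sort_by returns a new array, so assign it back.
data = data.sort_by { |id, date, _num, name| [name, date, id] }

# Option 2 (Ruby 1.9.2 and later only): sort in place with sort_by!.
more = [[1, "2009-08-21", 4, "barny"],
        [2, "2009-08-21", 2, "alfie"]]
more.sort_by! { |id, date, _num, name| [name, date, id] }

p data
p more
```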

The sort with the sample data above works as I expect, but for some
reason it still isn't sorting correctly with my real data. I'll keep
looking at it. I must be missing something else, although I can't
think what it might be right now.

Sort the data, write it out to file. Should be straightforward.

Cheers.
 
Josh Cheek

I got the same answer as David. Those last two rows should compare "cliff" to
"cliff" and find them equal, then move on to the dates, which are the same
except for the last character. So the one with the date ending in "21"
should come before the one with the date ending in "24". This is what I saw
when I tried it.

Here are some other things you can do, to help you deal with the sometimes
last name issue.

If you are using names, you can use a variable that takes an arbitrary
number of arguments. Something like *name, then if there is one name there,
or two, they both get put into name, which is an array. So if the name was
"barney" then the variable would look like ["barney"] and if there was also
a last name that was "rubble" then the variable would look like
["barney","rubble"].

If you are not naming your variables, but instead using indexes, you can
access them by passing a range, like d[2..-1], which returns a new array
of the values from index 2 (I think name was index 3 with your data)
through index -1, the last element. So it does the same as the above.

Here are a few examples.

def pi( heading , to_inspect )
  puts '' , heading
  to_inspect.each{ |d| p d }
end

data = [ [ 5 , :foo , "alfie" ] ,
         [ 1 , :bar , "barney" , "rubble" ] ,
         [ 1 , :boo , "barney" , "fife" ] ,
         [ 1 , :far , "alfie" ] ]

puts 'sorting by name, then by id'

pi "with sort" ,
   data.sort { |(a_id,a_whatever,*a_name) , (b_id,b_whatever,*b_name)|
     first = a_name <=> b_name
     first.zero? ? a_id <=> b_id : first
   } # using braces so the block binds to sort, not pi


pi "with sort_by" ,
   data.sort_by{ |d| [d[2..-1],d[0]] }


pi "with sort_by" ,
   data.sort_by{ |id,whatever,*name| [name,id] }
 
Paul Carvalho

Got it!
The sort with the sample data above works as I expect, but for some
reason it still isn't sorting correctly with my real data.  I'll keep
looking at it.  I must be missing something else, although I can't
think what it might be right now.

The 'name' element turned out to be an array. (Doh!) I flattened the
data and the sort_by now works correctly, as expected.
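
For the archives, here is roughly what that looks like. The rows below are made up, and I'm assuming the names arrive as a nested array in the fourth position, which may not match everyone's data exactly.

```ruby
# Hypothetical rows where the name field is itself an array.
rows = [
  [5, "2009-08-24", 1, ["cliff"]],
  [4, "2009-08-21", 1, ["cliff", "bob"]],
  [1, "2009-08-21", 4, ["barny"]],
]

# Flatten each row so the first name becomes a plain String at index 3.
flat = rows.map { |row| row.flatten }

# Extra names beyond the fourth element are ignored by the block
# parameters, so only the first name takes part in the sort key.
sorted = flat.sort_by { |id, date, _num, name| [name, date, id] }
sorted.each { |row| p row }
```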

Thanks again for all your help.

Cheers!
 
Chuck Remes

Be advised that sort_by sorts in ascending order by default. If you
want to sort descending, things get a little complicated. For example,
you can sort descending by putting a minus (-) sign in front of your
Fixnum variables. That works nicely. Now try to do that with a string.
Kaboom.

Try this code to make it (mostly) work.

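# Added to the String class so that in Enumerable#sort_by we can
# use unary minus (#-@) to reverse the sort order from ascending to descending.
# Not perfect; see ruby-talk ML 20090320 where it shows that it doesn't
# always work correctly.
# Note: this relies on Ruby 1.8, where s[0] returns a character code
# (a Fixnum). In Ruby 1.9 and later s[0] is a one-character String,
# so 255 - s[0] raises a TypeError.
class String
  def -@
    self.gsub(/./) {|s| (255 - s[0]).chr }
  end
end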

cr
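
An alternative that avoids patching String, offered only as a sketch: use sort with a block and swap the operands for any key that should run descending. The rows are made up, and "name descending, then id ascending" is just an example ordering.

```ruby
# Hypothetical rows in the thread's [id, date, num, name] layout.
data = [[2, "2009-08-21", 2, "alfie"],
        [1, "2009-08-21", 4, "barny"],
        [5, "2009-08-24", 1, "cliff"],
        [4, "2009-08-21", 1, "cliff"]]

sorted = data.sort do |a, b|
  # Swapping the operands (b before a) reverses the order for that key.
  (b[3] <=> a[3]).nonzero? ||  # name, descending
    (a[0] <=> b[0])            # id, ascending
end

sorted.each { |row| p row }
```

This works for any values that respond to <=>, strings included, at the cost of giving up the compact sort_by form.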
 
Robert Dober

data.sort_by {|id, date, num, name| [id, date, num] }

Whoops, I meant [name, date, id].
Maybe you could have avoided this by using
| id, date, _, name |
I say this because the importance of naming is perhaps one of the most
underrated things in programming.
Cheers
R.
 
