Sorting an array by multiple elements?

Discussion in 'Ruby' started by Paul, Aug 30, 2009.

  1. Paul

    Paul Guest

    Hi there, I have an array of arrays that I want to sort by multiple
    elements.

    Sample data in the array looks like: [ [ id, date, num, name ], [ id,
    date, num, name ], ... ]

    I need to sort by : (1) name, (2) date, and (3) id.

    I can sort by any one element in the array no problem using something
    like :
    > summary_data.sort! { |a,b| a[ 3 ] <=> b[ 3 ] }


    Unfortunately, I can't do this line more than once because it blows
    away any previous sorting.

    I found a few pages on the internet that describe how to "sort an
    array of Ruby objects by multiple class fields", however, I don't know
    how to create a "class field". I'm looking at an array, not a class.

    Is there a way to do this or do I need to convert my data to something
    else so I can do what I need?

    Please help.

    Thanks.
     
    Paul, Aug 30, 2009
    #1

  2. Josh Cheek

    Josh Cheek Guest

    [Note: parts of this message were removed to make it a legal post.]

    On Sun, Aug 30, 2009 at 1:30 AM, Paul wrote:

    > Hi there, I have an array of arrays that I want to sort by multiple
    > elements.
    >
    > Sample data in the array looks like: [ [ id, date, num, name ], [ id,
    > date, num, name ], ... ]
    >
    > I need to sort by : (1) name, (2) date, and (3) id.
    >
    > I can sort by any one element in the array no problem using something
    > like :
    > > summary_data.sort! { |a,b| a[ 3 ] <=> b[ 3 ] }

    >
    > Unfortunately, I can't do this line more than once because it blows
    > away any previous sorting.
    >
    > I found a few pages on the internet that describe how to "sort an
    > array of Ruby objects by multiple class fields", however, I don't know
    > how to create a "class field". I'm looking at an array, not a class.
    >
    > Is there a way to do this or do I need to convert my data to something
    > else so I can do what I need?
    >
    > Please help.
    >
    > Thanks.
    >
    >


    data = [ [ 1 , Time.now , 4, "barny" ] ,
             [ 1 , Time.now , 1, "cliff" ] ,
             [ 2 , Time.now , 3, "alfie" ] ,
             [ 2 , Time.now , 2, "alfie" ] ]

    # you can assign the values of the arrays to specific names by grouping
    data.sort do |(a_id,a_date,a_num,a_name) , (b_id,b_date,b_num,b_name)|

      first = a_id <=> b_id     # first condition
      second = a_num <=> b_num  # second condition

      first.zero? ? second : first  # if they are equal on first
                                    # then return second

    end.each{ |d| p d }  # inspect each one

    __END__

    output:
    [1, Sun Aug 30 01:55:02 -0500 2009, 1, "cliff"]
    [1, Sun Aug 30 01:55:02 -0500 2009, 4, "barny"]
    [2, Sun Aug 30 01:55:02 -0500 2009, 2, "alfie"]
    [2, Sun Aug 30 01:55:02 -0500 2009, 3, "alfie"]
     
    Josh Cheek, Aug 30, 2009
    #2

  3. On Sunday 30 August 2009, Paul wrote:
    > |Hi there, I have an array of arrays that I want to sort by multiple
    > |elements.
    > |
    > |Sample data in the array looks like: [ [ id, date, num, name ], [ id,
    > |date, num, name ], ... ]
    > |
    > |I need to sort by : (1) name, (2) date, and (3) id.
    > |
    > |I can sort by any one element in the array no problem using something
    > |
    > |like :
    > |> summary_data.sort! { |a,b| a[ 3 ] <=> b[ 3 ] }
    > |
    > |Unfortunately, I can't do this line more than once because it blows
    > |away any previous sorting.
    > |
    > |I found a few pages on the internet that describe how to "sort an
    > |array of Ruby objects by multiple class fields", however, I don't know
    > |how to create a "class field". I'm looking at an array, not a class.
    > |
    > |Is there a way to do this or do I need to convert my data to something
    > |else so I can do what I need?
    > |
    > |Please help.
    > |
    > |Thanks.
    > |


    This should do what you want. It first compares the names then, if they're
    equal, it compares the dates. If the dates are also equal it compares the ids.

    summary_data.sort! do |a, b|
      res = a[3] <=> b[3]
      res = a[1] <=> b[1] if res == 0
      res = a[0] <=> b[0] if res == 0
      res
    end

    I hope this helps

    Stefano
     
    Stefano Crocco, Aug 30, 2009
    #3
  4. botp

    botp Guest

    On Sun, Aug 30, 2009 at 2:30 PM, Paul wrote:
    > Hi there, I have an array of arrays that I want to sort by multiple
    > elements.


    try #sort_by, eg,

    >data.sort_by{|row| [row[3],row[1],row[0]]}
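
For anyone wondering why an array works as a sort key here: Ruby compares arrays element by element, so the first element that differs decides the order. A minimal sketch of that idea, using invented sample rows and a hypothetical helper name (`sort_rows`) that is not part of the original post:

```ruby
# Sample rows in the [id, date, num, name] layout from the question.
# The values are invented for illustration only.
ROWS = [
  [3, "2009-08-23", 3, "alfie"],
  [5, "2009-08-24", 1, "cliff"],
  [1, "2009-08-21", 4, "barny"],
  [2, "2009-08-21", 2, "alfie"]
]

# Hypothetical helper: sort by name, then date, then id.
# The key [name, date, id] is itself an array, and arrays are compared
# element by element, so it behaves like a composite key.
def sort_rows(rows)
  rows.sort_by { |row| [row[3], row[1], row[0]] }
end

sort_rows(ROWS).each { |row| p row }
```

Note that `sort_by` returns a new array and leaves `ROWS` untouched, so the result has to be assigned somewhere to be kept.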
     
    botp, Aug 30, 2009
    #4
  5. Hi --

    On Sun, 30 Aug 2009, Josh Cheek wrote:

    > On Sun, Aug 30, 2009 at 1:30 AM, Paul wrote:
    >
    >> Hi there, I have an array of arrays that I want to sort by multiple
    >> elements.
    >>
    >> Sample data in the array looks like: [ [ id, date, num, name ], [ id,
    >> date, num, name ], ... ]
    >>
    >> I need to sort by : (1) name, (2) date, and (3) id.
    >>
    >> I can sort by any one element in the array no problem using something
    >> like :
    >>> summary_data.sort! { |a,b| a[ 3 ] <=> b[ 3 ] }

    >>
    >> Unfortunately, I can't do this line more than once because it blows
    >> away any previous sorting.
    >>
    >> I found a few pages on the internet that describe how to "sort an
    >> array of Ruby objects by multiple class fields", however, I don't know
    >> how to create a "class field". I'm looking at an array, not a class.
    >>
    >> Is there a way to do this or do I need to convert my data to something
    >> else so I can do what I need?
    >>
    >> Please help.
    >>
    >> Thanks.
    >>
    >>

    >
    > data = [ [ 1 , Time.now , 4, "barny" ] ,
    > [ 1 , Time.now , 1, "cliff" ] ,
    > [ 2 , Time.now , 3, "alfie" ] ,
    > [ 2 , Time.now , 2, "alfie" ] ]
    >
    > #you can assign the values of the arrays to specific names by grouping
    > data.sort do |(a_id,a_date,a_num,a_name) , (b_id,b_date,b_num,b_name)|
    >
    > first = a_id <=> b_id #first condition
    > second = a_num <=> b_num #second condition
    >
    > first.zero? ? second : first #if they are equal on first
    > #then return second


    You can also do:

    first.nonzero? || second

    which I like the look of though I find myself having to kind of step
    through it mentally to keep track....
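
To make the stepping-through a little easier, here is a small sketch of the `nonzero? ||` idiom extended to three keys. `nonzero?` returns the number itself unless it is zero, in which case it returns nil, so `||` moves on to the next comparison. The method name `compare_rows` and the name, date, id ordering are assumptions taken from the original question, not from the code quoted above:

```ruby
# Hypothetical comparator for rows shaped like [id, date, num, name].
# Each step only matters if every earlier comparison came out equal (0).
def compare_rows(a, b)
  (a[3] <=> b[3]).nonzero? ||   # name first
    (a[1] <=> b[1]).nonzero? || # then date
    (a[0] <=> b[0])             # finally id; 0 here means fully equal
end

rows = [
  [3, "2009-08-23", 3, "alfie"],
  [1, "2009-08-21", 4, "barny"],
  [2, "2009-08-21", 2, "alfie"]
]

rows.sort { |a, b| compare_rows(a, b) }.each { |row| p row }
```

The last comparison is left without `nonzero?` so the block still returns 0, rather than nil, when two rows tie on every key.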

    I agree with botp, though, that it's a good case for #sort_by:

    data.sort_by {|id, date, num, name| [id, date, num] }


    David

    --
    David A. Black / Ruby Power and Light, LLC / http://www.rubypal.com
    Ruby/Rails training, mentoring, consulting, code-review
    Latest book: The Well-Grounded Rubyist (http://www.manning.com/black2)

    September Ruby training in NJ has been POSTPONED. Details to follow.
     
    David A. Black, Aug 30, 2009
    #5
  6. On Sun, 30 Aug 2009, David A. Black wrote:

    > data.sort_by {|id, date, num, name| [id, date, num] }


    Whoops, I meant [name, date, id].


    David

     
    David A. Black, Aug 30, 2009
    #6
  7. Thanks for the great feedback! I couldn't find good documentation for
    the 'sort_by' and the examples provided didn't work with my data. It
    did nothing actually, which surprised me. Josh and Stefano's approach
    worked well enough.

    On Aug 30, 3:04 am, Stefano Crocco wrote:

    > This should do what you want. It first compares the names then, if they're
    > equal, it compares the dates. If the dates are also equal it compares the ids.
    >
    > summary_data.sort! do |a, b|
    >  res = a[3] <=> b[3]
    >  res = a[1] <=> b[1] if res == 0
    >  res = a[0] <=> b[0] if res == 0
    >  res
    > end
    >


    This gives me a 98% solution, which is good enough. There's a small
    catch or trick with my data that I can't work around.

    Details on the last 2%:

    Unfortunately, the data in the array is not a fixed size. Sometimes
    it is [ id, date, num, name ] and sometimes it may be [ id, date, num,
    name1, name2 ] (or more names.. I don't know how many since the data
    collection script figures it out as it goes along).

    When I sort by 'name' (e.g. first = a_name <=> b_name ), all
    the records with _additional_ names appear at the bottom of the list,
    as if they had a different name (unexpected).

    For example, the output (now) looks like:

    [2, "2009-08-21", 2, "alfie"]
    [3, "2009-08-23", 3, "alfie"]
    [6, "2009-08-23", 3, "alfie"]
    [1, "2009-08-21", 4, "barny"]
    [5, "2009-08-24", 1, "cliff"]
    [4, "2009-08-21", 1, "cliff", "bob"]

    ... but I expect/want the output to look like:
    [2, "2009-08-21", 2, "alfie"]
    [3, "2009-08-23", 3, "alfie"]
    [6, "2009-08-23", 3, "alfie"]
    [1, "2009-08-21", 4, "barny"]
    [4, "2009-08-21", 1, "cliff", "bob"]
    [5, "2009-08-24", 1, "cliff"]


    Any thoughts?

    Cheers!
     
    Paul Carvalho, Aug 30, 2009
    #7
  8. Hi --

    On Mon, 31 Aug 2009, Paul Carvalho wrote:

    > Thanks for the great feedback! I couldn't find good documentation for
    > the 'sort_by' and the examples provided didn't work with my data. It
    > did nothing actually, which surprised me. Josh and Stefano's approach
    > worked well enough.


    The nothing result seems strange. This:

    data = [ [3, "2009-08-23", 3, "alfie"],
    [5, "2009-08-24", 1, "cliff"],
    [6, "2009-08-23", 3, "alfie"],
    [1, "2009-08-21", 4, "barny"],
    [2, "2009-08-21", 2, "alfie"] ]
    data.sort_by {|id, date, num, name| [name, date, id] }

    gives me:

    [ [2, "2009-08-21", 2, "alfie"],
    [3, "2009-08-23", 3, "alfie"],
    [6, "2009-08-23", 3, "alfie"],
    [1, "2009-08-21", 4, "barny"],
    [5, "2009-08-24", 1, "cliff"]]

    For documentation: ri Enumerable#sort_by

    > Unfortunately, the data in the array is not a fixed size. Sometimes
    > it is [ id, date, num, name ] and sometimes it may be [ id, date, num,
    > name1, name2 ] (or more names.. I don't know how many since the data
    > collection script figures it out as it goes along).
    >
    > When I do the sort by 'name' (e.g. first = a_name <=> b_name ), all
    > the records with _additional_ names appears at the bottom of the list
    > as if it were a different name (unexpected).
    >
    > For example, the output (now) looks like:
    >
    > [2, "2009-08-21", 2, "alfie"]
    > [3, "2009-08-23", 3, "alfie"]
    > [6, "2009-08-23", 3, "alfie"]
    > [1, "2009-08-21", 4, "barny"]
    > [5, "2009-08-24", 1, "cliff"]
    > [4, "2009-08-21", 1, "cliff", "bob"]
    >
    > .. but I expect/want the output to look like:
    > [2, "2009-08-21", 2, "alfie"]
    > [3, "2009-08-23", 3, "alfie"]
    > [6, "2009-08-23", 3, "alfie"]
    > [1, "2009-08-21", 4, "barny"]
    > [4, "2009-08-21", 1, "cliff", "bob"]
    > [5, "2009-08-24", 1, "cliff"]
    >
    >
    > Any thoughts?


    Yeah -- sort_by :)

    data = [ [2, "2009-08-21", 2, "alfie"],
    [6, "2009-08-23", 3, "alfie"],
    [1, "2009-08-21", 4, "barny"],
    [3, "2009-08-23", 3, "alfie"],
    [5, "2009-08-24", 1, "cliff"],
    [4, "2009-08-21", 1, "cliff", "bob"] ]

    data.sort_by {|id, date, num, name| [name, date, id] }

    gives me:

    [2, "2009-08-21", 2, "alfie"]
    [3, "2009-08-23", 3, "alfie"]
    [6, "2009-08-23", 3, "alfie"]
    [1, "2009-08-21", 4, "barny"]
    [4, "2009-08-21", 1, "cliff", "bob"]
    [5, "2009-08-24", 1, "cliff"]


    David

     
    David A. Black, Aug 30, 2009
    #8
  9. On Aug 30, 11:41 am, "David A. Black" wrote:
    >
    > The nothing result seems strange. This:

    [snip]
    >
    >   data = [ [2, "2009-08-21", 2, "alfie"],
    >            [6, "2009-08-23", 3, "alfie"],
    >            [1, "2009-08-21", 4, "barny"],
    >            [3, "2009-08-23", 3, "alfie"],
    >            [5, "2009-08-24", 1, "cliff"],
    >            [4, "2009-08-21", 1, "cliff", "bob"] ]
    >
    >    data.sort_by {|id, date, num, name| [name, date, id] }
    >
    > gives me:
    >
    >         [2, "2009-08-21", 2, "alfie"]
    >         [3, "2009-08-23", 3, "alfie"]
    >         [6, "2009-08-23", 3, "alfie"]
    >         [1, "2009-08-21", 4, "barny"]
    >         [4, "2009-08-21", 1, "cliff", "bob"]
    >         [5, "2009-08-24", 1, "cliff"]
    >


    Thanks David,

    I think I know what happened now. When I tried the other
    approach, I sorted in place using: data.sort!

    But when I used "data.sort_by" it might have sorted it but didn't save
    it back to the same array.

    "sort_by!" doesn't exist so I'll need to save the sorted data to
    another variable.
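
A short sketch of both options. Assigning the result of `sort_by` back works on any Ruby version. Later Ruby releases (1.9.2 onward) added `Array#sort_by!`, so the sketch checks for it before using it. The helper names `sorted_copy` and `sort_in_place` are made up for this example:

```ruby
# Returns a new sorted array; the argument is left unchanged.
def sorted_copy(rows)
  rows.sort_by { |id, date, num, name| [name, date, id] }
end

# Sorts the array itself. Uses Array#sort_by! where the running Ruby
# provides it, and otherwise replaces the contents with a sorted copy.
def sort_in_place(rows)
  if rows.respond_to?(:sort_by!)
    rows.sort_by! { |id, date, num, name| [name, date, id] }
  else
    rows.replace(sorted_copy(rows))
  end
end

data = [[5, "2009-08-24", 1, "cliff"], [2, "2009-08-21", 2, "alfie"]]
data = sorted_copy(data)  # keep the sorted result by assigning it back
data.each { |row| p row }
```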

    The sort with the sample data above works as I expect, but for some
    reason it still isn't sorting correctly with my real data. I'll keep
    looking at it. I must be missing something else, although I can't
    think what it might be right now.

    Sort the data, write it out to file. Should be straightforward.

    Cheers.
     
    Paul Carvalho, Aug 30, 2009
    #9
  10. Josh Cheek

    Josh Cheek Guest


    On Sun, Aug 30, 2009 at 12:25 PM, Paul Carvalho wrote:

    > On Aug 30, 11:41 am, "David A. Black" wrote:
    > >
    > > The nothing result seems strange. This:

    > [snip]
    > >
    > > data = [ [2, "2009-08-21", 2, "alfie"],
    > > [6, "2009-08-23", 3, "alfie"],
    > > [1, "2009-08-21", 4, "barny"],
    > > [3, "2009-08-23", 3, "alfie"],
    > > [5, "2009-08-24", 1, "cliff"],
    > > [4, "2009-08-21", 1, "cliff", "bob"] ]
    > >
    > > data.sort_by {|id, date, num, name| [name, date, id] }
    > >
    > > gives me:
    > >
    > > [2, "2009-08-21", 2, "alfie"]
    > > [3, "2009-08-23", 3, "alfie"]
    > > [6, "2009-08-23", 3, "alfie"]
    > > [1, "2009-08-21", 4, "barny"]
    > > [4, "2009-08-21", 1, "cliff", "bob"]
    > > [5, "2009-08-24", 1, "cliff"]
    > >

    >
    > Thanks David,
    >
    > I think I know what the happened now. When I tried the other
    > approach, I sorted in place using: data.sort!
    >
    > But when I used "data.sort_by" it might have sorted it but didn't save
    > it back to the same array.
    >
    > "sort_by!" doesn't exist so I'll need to save the sorted data to
    > another variable.
    >
    > The sort with the sample data above works as I expect, but for some
    > reason it still isn't sorting correctly with my real data. I'll keep
    > looking at it. I must be missing something else, although I can't
    > think what it might be right now.
    >
    > Sort the data, write it out to file. Should be straightforward.
    >
    > Cheers.
    >
    >

    I got the same answer as David, those last two should compare "cliff" to
    "cliff" and decide they are the same, then move to dates, which are all the
    same except the last character. So the one with the date ending in "21"
    should be before the one with the date ending in "24". This is what I saw
    when I tried it.

    Here are some other things you can do to help you deal with a last name
    that is only sometimes present.

    If you are using names, you can use a variable that takes an arbitrary
    number of arguments. Something like *name, then if there is one name there,
    or two, they both get put into name, which is an array. So if the name was
    "barney" then the variable would look like ["barney"] and if there was also
    a last name that was "rubble" then the variable would look like
    ["barney","rubble"].

    If you are not naming your variables, but instead using indexes, you can
    access them by passing a range, like d[2..-1], which returns a new array of
    the values from index 2 through index -1, the last element (I think name
    was index 3 with your data). So it will do the same as the above.

    Here are a few examples.

    def pi( heading , to_inspect )
      puts '' , heading
      to_inspect.each{ |d| p d }
    end

    data = [ [ 5 , :foo , "alfie" ] ,
             [ 1 , :bar , "barney" , "rubble" ] ,
             [ 1 , :boo , "barney" , "fife" ] ,
             [ 1 , :far , "alfie" ] ]

    puts 'sorting by name, then by id'

    pi "with sort" ,
      data.sort { |(a_id,a_whatever,*a_name) , (b_id,b_whatever,*b_name)|
        first = a_name <=> b_name
        first.zero? ? a_id <=> b_id : first
      } #using brackets so it is a block to sort, not pi


    pi "with sort_by" ,
      data.sort_by{ |d| [d[2..-1],d[0]] }


    pi "with sort_by" ,
      data.sort_by{ |id,whatever,*name| [name,id] }
     
    Josh Cheek, Aug 30, 2009
    #10
  11. Got it!

    > The sort with the sample data above works as I expect, but for some
    > reason it still isn't sorting correctly with my real data.  I'll keep
    > looking at it.  I must be missing something else, although I can't
    > think what it might be right now.
    >


    The 'name' element turned out to be an array. (Doh!) I flattened the
    data and the sort_by now works correctly, as expected.
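
For readers hitting the same thing, here is a guess at what was going on. It assumes each row kept its names in a nested array; the exact shape of the real data was never posted, so the rows and helper names below are illustrative only. With nested names, the first element of the sort key is an array, and `["cliff"]` sorts before `["cliff", "bob"]` because it is a prefix of it, so the dates never get compared:

```ruby
# Assumed shape of the real data: names held in a nested array.
NESTED_ROWS = [
  [5, "2009-08-24", 1, ["cliff"]],
  [4, "2009-08-21", 1, ["cliff", "bob"]]
]

# Sorting the nested rows: `name` is an array such as ["cliff", "bob"],
# so rows with extra names land after rows with only the first name.
def sort_nested(rows)
  rows.sort_by { |id, date, num, name| [name, date, id] }
end

# Flattening each row first turns [4, date, 1, ["cliff", "bob"]] into
# [4, date, 1, "cliff", "bob"], so `name` is just the first name string
# and the date decides between the two "cliff" rows.
def sort_flattened(rows)
  rows.map { |row| row.flatten }
      .sort_by { |id, date, num, name| [name, date, id] }
end

sort_nested(NESTED_ROWS).each { |row| p row }
sort_flattened(NESTED_ROWS).each { |row| p row }
```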

    Thanks again for all your help.

    Cheers!
     
    Paul Carvalho, Aug 30, 2009
    #11
  12. Chuck Remes

    Chuck Remes Guest

    On Aug 30, 2009, at 2:05 PM, Paul Carvalho wrote:

    > Got it!
    >
    >> The sort with the sample data above works as I expect, but for some
    >> reason it still isn't sorting correctly with my real data. I'll keep
    >> looking at it. I must be missing something else, although I can't
    >> think what it might be right now.
    >>

    >
    > The 'name' element turned out to be an array. (Doh!) I flattened the
    > data and the sort_by now works correctly, as expected.
    >
    > Thanks again for all your help.


    Be advised that sort_by sorts in ascending order by default. If you
    want to sort descending, things get a little complicated. For example,
    you can sort descending by putting a minus (-) sign in front of your
    Fixnum variables. That works nicely. Now try to do that with a string.
    Kaboom.

    Try this code to make it (mostly) work.

    # Added to the String class so in Enumerable#sort_by we can
    # use the #- to reverse the sort order from ascending to descending
    # Not perfect; see ruby-talk ML 20090320 where it shows that it doesn't
    # always work correctly
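    class String
      # Relies on Ruby 1.8, where s[0] is the byte value (an Integer).
      # On Ruby 1.9 and later s[0] is a one-character String; use s.ord there.
      def -@
        self.gsub(/./) {|s| (255 - s[0]).chr }
      end
    end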
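
An alternative that avoids reopening String (newer Ruby versions define `String#-@` themselves, so overriding it can clash with core behavior) is to fall back to `sort` with a block and swap the operands for the key that should be descending. This is only a sketch; the helper name `sort_name_desc` and the choice of keys are assumptions for illustration:

```ruby
# Hypothetical helper for rows shaped like [id, date, num, name]:
# name descending, then date ascending, then id ascending.
# Writing b before a reverses the order for that one key only.
def sort_name_desc(rows)
  rows.sort do |a, b|
    (b[3] <=> a[3]).nonzero? ||   # name, descending
      (a[1] <=> b[1]).nonzero? || # date, ascending
      (a[0] <=> b[0])             # id, ascending
  end
end

rows = [
  [1, "2009-08-21", 4, "alfie"],
  [2, "2009-08-22", 1, "cliff"],
  [3, "2009-08-21", 1, "cliff"]
]

sort_name_desc(rows).each { |row| p row }
```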

    cr
     
    Chuck Remes, Aug 30, 2009
    #12
  13. Robert Dober

    Robert Dober Guest

    On Sun, Aug 30, 2009 at 1:25 PM, David A. Black wrote:
    > On Sun, 30 Aug 2009, David A. Black wrote:
    >
    >>  data.sort_by {|id, date, num, name| [id, date, num] }

    >
    > Whoops, I meant [name, date, id].
    >

    You might have avoided this by using
    | id, date, _, name |
    I say this because naming is perhaps one of the most underrated
    aspects of programming.
    Cheers
    R.
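
A quick sketch of that suggestion, with a made-up helper name (`sort_by_name_date_id`). By convention `_` marks a block parameter that is deliberately ignored, which leaves only the names that can appear in the key:

```ruby
# Hypothetical helper: rows are [id, date, num, name]; num is not part
# of the sort key, so its slot in the block parameters is named _.
def sort_by_name_date_id(rows)
  rows.sort_by { |id, date, _, name| [name, date, id] }
end

rows = [
  [6, "2009-08-23", 3, "alfie"],
  [1, "2009-08-21", 4, "barny"],
  [3, "2009-08-23", 3, "alfie"]
]

sort_by_name_date_id(rows).each { |row| p row }
```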
     
    Robert Dober, Aug 30, 2009
    #13
