Sorting Dates and Times in an array

Paul

Hi there. I am having a bit of trouble trying to solve a particular
sorting problem with an array of data.

Please bear with me for a moment regarding the setup of this problem.
I don't currently have any control over the input data. I am just
trying to format the report of this data as per the requirement.

I have the following sample array data to work with:
@date_array = [['2/22/07','1:03 pm'],['3/9/07','10:45 pm'],
['3/1/07','1:52 pm'],['3/13/07','2:00 pm'],['2/28/07','10:45 am'],
['3/5/07','4:00 pm'],['2/14/07','5:00 pm'],['3/9/07','10:18 am'],
['3/13/07','11:15 am']]

(The actual array contains more data, but this is good enough to work
on this problem. BTW, date format = mm/dd/yy.)

Requirement:
1) Sort in descending order by Date (i.e. @date_array[x][0] )
2) Sort in ascending order by Time (i.e. @date_array[x][1] )

Elsewhere in the script, I use the following line to sort by a
particular element in a row:
@fields.sort! { |a,b| b[ x ] <=> a[ x ] } # i.e. descending sort

This works quite nicely for string and numeric fields, but *not* for
Date or Time string fields. As an example, when I try it with the
above data I get the following:
----
puts 'Descending Sort - By Date:'
@date_array.sort! { |a,b| b[ 0 ] <=> a[ 0 ] }
@date_array.each_index do |row|
puts "row # #{row} = " + @date_array[row].join(" : ")
end
----
Descending Sort - By Date:
row # 0 = 3/9/07 : 10:18 am
row # 1 = 3/9/07 : 10:45 pm
row # 2 = 3/5/07 : 4:00 pm
row # 3 = 3/13/07 : 2:00 pm
row # 4 = 3/13/07 : 11:15 am
row # 5 = 3/1/07 : 1:52 pm
row # 6 = 2/28/07 : 10:45 am
row # 7 = 2/22/07 : 1:03 pm
row # 8 = 2/14/07 : 5:00 pm
----

--> Which is a nice string sort again, but not a date sort.
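To make the symptom concrete, here is a minimal sketch of why the string comparison misorders these rows. It assumes the mm/dd/yy format stated above and uses Date.strptime from the standard 'date' library; the variable name fmt is only for illustration.

```ruby
require 'date'

# String comparison goes character by character, so '9' > '1' makes
# "3/9/07" compare as greater than "3/13/07".
puts '3/9/07' <=> '3/13/07'   # prints 1

# Parsed with an explicit mm/dd/yy format, the 9th comes before the 13th.
fmt = '%m/%d/%y'
puts Date.strptime('3/9/07', fmt) <=> Date.strptime('3/13/07', fmt)   # prints -1
```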

I know I'll have to write a method to deal with these two columns of
data, but I'm not sure where to start. I've tried googling solutions
in both this discussion group and the internet but so far haven't
turned up anything I can use.

Suggestions?
 
Olivier Renaud

On Monday 19 March 2007 at 17:25, Paul wrote:
[snip]

Hi Paul,

You may want to first convert your dates/times to real Time objects, so
that they can be compared.

Here is my solution :

require 'pp'
require 'time'
pp @date_array.sort_by {|ary| ary.map{|elt| Time.parse(elt) } }
[["2/14/07", "5:00 pm"],
["2/22/07", "1:03 pm"],
["2/28/07", "10:45 am"],
["3/1/07", "1:52 pm"],
["3/5/07", "4:00 pm"],
["3/9/07", "10:18 am"],
["3/9/07", "10:45 pm"],
["3/13/07", "11:15 am"],
["3/13/07", "2:00 pm"]]
=> nil

A quick explanation:
@date_array is sorted with sort_by, because we want to sort the array
according to a simple criterion. For the comparison, each element of the
inner array (i.e. the date string and the time string) is converted to a
Time object with Time.parse, and the results are kept in an array (this
is why I used #map).

Doing so, the original arrays of Strings will be sorted according to the
order of the Arrays containing the real Time objects. For example, on a
single element of your original array:
["2/14/07", "5:00 pm"].map{|el| Time.parse(el)}
=> [Wed Feb 14 00:00:00 +0100 2007, Mon Mar 19 17:00:00 +0100 2007]


Instead of comparing arrays that contain one element for the date and
another element for the time, we could have created only one Time object
representing both:

@date_array.sort_by {|ary| Time.parse("#{ary.first} #{ary.last}") }
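A sketch of the same single-key idea with the format spelled out, assuming the mm/dd/yy and 12-hour formats from the question. It uses Time.strptime from the 'time' library instead of Time.parse (a substitution, since newer Ruby versions may not read "3/13/07" as month/day/year), and only a few sample rows. Like the line above, it sorts oldest first.

```ruby
require 'time'

date_array = [['2/22/07', '1:03 pm'], ['3/9/07', '10:45 pm'],
              ['3/9/07', '10:18 am'], ['3/13/07', '11:15 am']]

# One Time per row, built from "date time" with an explicit format.
ascending = date_array.sort_by do |date, time|
  Time.strptime("#{date} #{time}", '%m/%d/%y %I:%M %p')
end

ascending.each { |date, time| puts "#{date} #{time}" }
# 2/22/07 1:03 pm
# 3/9/07 10:18 am
# 3/9/07 10:45 pm
# 3/13/07 11:15 am
```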

Regards.

--
Olivier Renaud
 
Rob Biedenharn

epoch = Time.at(0).utc  # => Thu Jan 01 00:00:00 UTC 1970
@date_array.sort_by {|d,t|
  dp = Date.parse(d,true); tp = Time.parse(t,epoch)
  [ Date.today - dp, tp ]
}.each {|d,t| puts("%8s %8s" % [d,t]) }
3/13/07 11:15 am
3/13/07 2:00 pm
3/9/07 10:18 am
3/9/07 10:45 pm
3/5/07 4:00 pm
3/1/07 1:52 pm
2/28/07 10:45 am
2/22/07 1:03 pm
2/14/07 5:00 pm
=> [["3/13/07", "11:15 am"], ["3/13/07", "2:00 pm"],
["3/9/07", "10:18 am"], ["3/9/07", "10:45 pm"], ["3/5/07", "4:00 pm"],
["3/1/07", "1:52 pm"], ["2/28/07", "10:45 am"], ["2/22/07", "1:03 pm"],
["2/14/07", "5:00 pm"]]

Parse the date with 2-digit year semantics (1969-2068 from the 'true'
arg) and parse the time based on the beginning of the epoch (really
any date would do). The sort is then reversed by date and forward by
time when dates are equal.
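A rough equivalent for newer Ruby versions, offered as a sketch rather than the exact code above. It assumes the mm/dd/yy and 12-hour formats from the question and spells them out with Date.strptime and Time.strptime, since Date.parse may not read "3/13/07" as month/day/year there. The key negates the Julian day number instead of subtracting from Date.today; both make later dates sort first.

```ruby
require 'date'
require 'time'

date_array = [['2/22/07', '1:03 pm'], ['3/9/07', '10:45 pm'],
              ['3/1/07', '1:52 pm'], ['3/13/07', '2:00 pm'],
              ['2/28/07', '10:45 am'], ['3/5/07', '4:00 pm'],
              ['2/14/07', '5:00 pm'], ['3/9/07', '10:18 am'],
              ['3/13/07', '11:15 am']]

sorted = date_array.sort_by do |d, t|
  day   = Date.strptime(d, '%m/%d/%y')
  clock = Time.strptime(t, '%I:%M %p')
  # Negated day number: descending by date.
  # Minutes since midnight: ascending by time within one date.
  [-day.jd, clock.hour * 60 + clock.min]
end

sorted.each { |d, t| puts '%8s %8s' % [d, t] }
```

Assigning the result back (or calling replace with it) updates the original array.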

The use of .sort_by causes the keys to be processed once and is
better for large Arrays than .sort (and if you need .sort!, just do
@date_array = @date_array.sort_by {...} instead).

-Rob

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
 
Robert Klemme

Instead of comparing arrays that contains one element for the date and another
element for the time, we could have created only one Time object representing
both :

@date_array.sort_by {|ary| Time.parse("#{ary.first} #{ary.last}") }

I think this does not meet the OP's requirements (because of descending
date).

epoch = Date.new(0)
@date_array.sort_by {|dt,tm| [epoch - Date.parse(dt), Time.parse(tm)]}

Subtraction from epoch will make the date components negative and thus
lead to descending ordering.

Kind regards

robert
 
Rick DeNatale

(Standard Library) objects are your friend.

rick@frodo:/public/rubyscripts$ cat datesort.rb
date_array = [['2/22/07','1:03 pm'],['3/9/07','10:45 pm'],
['3/1/07','1:52 pm'],['3/13/07','2:00 pm'],['2/28/07','10:45 am'],
['3/5/07','4:00 pm'],['2/14/07','5:00 pm'],['3/9/07','10:18 am'],
['3/13/07','11:15 am']]

p date_array.sort do |a, b|
ddiff = Date.parse(b[0],true) <=> Date.parse(a[0],true)
ddiff == 0 ? Time.parse(a[0]) <=> Time.parse(b[0]) : ddiff
end

rick@frodo:/public/rubyscripts$ ruby datesort.rb
[["2/14/07", "5:00 pm"], ["2/22/07", "1:03 pm"], ["2/28/07", "10:45 am"],
["3/1/07", "1:52 pm"], ["3/13/07", "11:15 am"], ["3/13/07", "2:00 pm"],
["3/5/07", "4:00 pm"], ["3/9/07", "10:18 am"], ["3/9/07", "10:45 pm"]]
rick@frodo:/public/rubyscripts$
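The result printed above is in plain string order ("3/13/07" ahead of "3/5/07"), which is what sort returns when it gets no block. One likely explanation, offered as an assumption about this script rather than something confirmed in the thread: a do...end block attaches to the outermost call, here p, so sort never sees it. A small sketch with made-up numbers:

```ruby
rows = [3, 1, 2]

# do...end binds to p, so sort runs without a block (default ascending order).
p rows.sort do |a, b|
  b <=> a
end
# prints [1, 2, 3]

# Braces bind the block to sort itself.
p rows.sort { |a, b| b <=> a }
# prints [3, 2, 1]
```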


--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

IPMS/USA Region 12 Coordinator
http://ipmsr12.denhaven2.com/

Visit the Project Mercury Wiki Site
http://www.mercuryspacecraft.com/
 
Olivier Renaud

On Monday 19 March 2007 at 18:25, Robert Klemme wrote:
Instead of comparing arrays that contains one element for the date and
another element for the time, we could have created only one Time object
representing both :

@date_array.sort_by {|ary| Time.parse("#{ary.first} #{ary.last}") }

I think this does not meet the OP's requirements (because of descending
date).

epoch = Date.new(0)
@date_array.sort_by {|dt,tm| [epoch - Date.parse(dt), Time.parse(tm)]}

Subtraction from epoch will make the date components negative and thus
lead to descending ordering.

Kind regards

robert

Yes, you're right. I just kept in mind the instruction "sort the dates!"
but I missed the subtlety. Thanks.

--
Olivier Renaud
 
Rick DeNatale

The use of .sort_by causes the keys to be processed once and is
better for large Arrays than .sort (and if you need .sort!, just do
@date_array = @date_array.sort_by {...} instead).

Not exactly the same if you have more than one reference to the Array.
A more general equivalent to sort! would be:

@date_array.replace(@date_array.sort_by {...})

Another one of those sometimes subtle differences between variables and objects.
 
Rob Biedenharn

(Standard Library) objects are your friend.

rick@frodo:/public/rubyscripts$ cat datesort.rb
date_array = [['2/22/07','1:03 pm'],['3/9/07','10:45 pm'],
['3/1/07','1:52 pm'],['3/13/07','2:00 pm'],['2/28/07','10:45 am'],
['3/5/07','4:00 pm'],['2/14/07','5:00 pm'],['3/9/07','10:18 am'],
['3/13/07','11:15 am']]

p date_array.sort do |a, b|
ddiff = Date.parse(b[0],true) <=> Date.parse(a[0],true)
ddiff == 0 ? Time.parse(a[0]) <=> Time.parse(b[0]) : ddiff
end


I think this was meant to be (note Time.parse(a[1]), not [0]):

p date_array.sort { |a,b|
(Date.parse(b[0],true) <=> Date.parse(a[0],true)).nonzero? ||
Time.parse(a[1]) <=> Time.parse(b[1])
}

Rob Biedenharn http://agileconsultingllc.com
(e-mail address removed)
+1 513-295-4739
Skype: rob.biedenharn
 
Paul

I think this does not meet the OP's requirements (because of descending
date).

epoch = Date.new(0)
@date_array.sort_by {|dt,tm| [epoch - Date.parse(dt), Time.parse(tm)]}

Subtraction from epoch will make the date components negative and thus
lead to descending ordering.

Hello Robert, I tried this but I get the following errors:

irb(main):003:0> epoch = Date.new(0)
ArgumentError: wrong number of arguments (1 for 0)
from (irb):3:in `initialize'
from (irb):3

irb(main):005:0> @date_array.sort_by {|dt,tm| [epoch - Date.parse(dt),
Time.parse(tm)]}
NoMethodError: undefined method `parse' for Date:Class
from (irb):5
from (irb):5:in `sort_by'
from (irb):5

Am I missing a 'require' or something to make your lines work?
 
Paul

Not exactly the same if you have more than one reference to the Array.
A more general equivalent to sort! would be:

@date_array.replace(@date_array.sort_by {...})

Okay, I've tried to understand this but I'm not quite getting it yet.
Many of the replies posted here integrate the 'print' and the sorting
function, and while normally I'd applaud the efficiency that's not
what I need. I need to replace the original contents of the array
with the sorted contents. Printing afterwards just lets me confirm
that the sorting worked as expected. (By combining the two steps, I'm
only getting partial solutions.)

I've tried using 'sort_by' but I'm not sure how to apply it to multi-
dimensional arrays. What would be the best way to replace the
contents if the data array is really large?

Should I replace my other sort lines with something other? I noticed
that I cannot just change the 'sort' to 'sort_by' in the following
line:
@date_array.sort! { |a,b| b[ 7 ] <=> a[ 7 ] }

Please let me know. Thanks.

Paul.
 
Phrogz

I've tried using 'sort_by' but I'm not sure how to apply it to multi-
dimensional arrays. What would be the best way to replace the
contents if the data array is really large?

a) Ruby doesn't have multi-dimensional arrays in the core. Unless
you're using something like NArray, I suspect you're thinking of
arrays of arrays. In this case, your problem is 'just coding' - if you
want to sort an array of arrays, you can sort the primary axis and
then the subsequent arrays. Or you can choose to transpose the array
and sort along some other array.

b) Replace the contents? Perhaps you really want map! to convert the
contents of the array in-place, and then to sort_by{} to twiddle the
bits.

Or, use less memory and more CPU by using sort!. (It's generally easy
in computer programming to trade memory for calculation speed, in both
directions.)
 
Paul

a) Ruby doesn't have multi-dimensional arrays in the core. Unless
you're using something like NArray, I suspect you're thinking of
arrays of arrays. In this case, your problem is 'just coding' - if you
want to sort an array of arrays, you can sort the primary axis and
then the subsequent arrays. Or you can choose to transpose the array
and sort along some other array.

Maybe I am thinking of that.. I don't know that terminology. I
understand a multi-dimensional array, but haven't heard of an "array
of arrays" before. They sound the same to me. I'll go with your
phrase. =)
b) Replace the contents? Perhaps you really want map! to convert the
contents of the array in-place, and then to sort_by{} to twiddle the
bits.

Or, use less memory and more CPU by using sort!. (It's generally easy
in computer programming to trade memory for calculation speed, in both
directions.)

Okay, so I don't know what the difference is between these two:
@data.sort! { |a,b| b[x].to_f <=> a[x].to_f } # i.e. descending numeric sort
and
@data.replace( @data.sort { |a,b| b[x].to_f <=> a[x].to_f } ) # i.e. descending numeric sort

Does the former use more CPU and less memory, and the latter the
opposite?

When I checked the Programming Ruby reference it says that Array#map!
is a 'synonym for Array#collect!' and when I check Array#collect! I
don't think it's what I want in this sort function. Or if it is, I
don't know how to write the code for it. And until today I had never
seen 'sort_by' so I'm still trying to figure out that function too.
 
Rick DeNatale

Okay, I've tried to understand this but I'm not quite getting it yet.
Many of the replies posted here integrate the 'print' and the sorting
function, and while normally I'd applaud the efficiency that's not
what I need. I need to replace the original contents of the array
with the sorted contents. Printing afterwards just lets me confirm
that the sorting worked as expected. (By combining the two steps, I'm
only getting partial solutions.)

Well that's what Rob and I were talking about. You've got two
alternatives here.

If you use Rob's suggestion:
@date_array = @date_array.sort_by {...}

You are changing the @date_array VARIABLE to refer to the new sorted array.

This is okay as long as you don't have other variables which refer to
the old array, and which you want to now refer to the new value. For
a shorter example:

a = [1,2,3]
b = a

a = a.reverse
p a # => [3, 2, 1]
p b # => [1, 2, 3]

BUT

a = [1, 2, 3]
b = a
a.reverse! # or a.replace(a.reverse)
p a # => [3, 2, 1]
p b # => [3, 2, 1]

since a and b still both refer to the same object.

I've tried using 'sort_by' but I'm not sure how to apply it to multi-
dimensional arrays. What would be the best way to replace the
contents if the data array is really large?

Well, your example isn't really a multi-dimensional array; actually,
there's really no such thing in Ruby (there are add-ons like NArray,
but that's another story). You've got nested arrays.

Should I replace my other sort lines with something other? I noticed
that I cannot just change the 'sort' to 'sort_by' in the following
line:
@date_array.sort! { |a,b| b[ 7 ] <=> a[ 7 ] }

sort takes an optional block with two arguments to be compared. It
should return -1 if the first argument is less than the second, 0 if
they are equal, and +1 if the first is greater than the second. This
block gets invoked many times while sorting a large array, so it can get
expensive if the block takes any significant time.

sort_by on the other hand takes a block which takes one argument which
is an element in the collection to be sorted. This block returns an
object which is used as a sort key for each element in the collection.
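As a minimal sketch of that difference, here is a descending sort on one column written both ways. The sample rows and the column index x are made up for illustration, and negating the key only works because the column is numeric.

```ruby
rows = [['a', 3.5], ['b', 10.0], ['c', 7.25]]
x = 1  # hypothetical column index

# sort: the block receives two rows and compares them.
by_sort = rows.sort { |a, b| b[x] <=> a[x] }

# sort_by: the block receives one row and returns its sort key.
# Negating a numeric key gives descending order.
by_key = rows.sort_by { |row| -row[x] }

p by_sort  # [["b", 10.0], ["c", 7.25], ["a", 3.5]]
p by_key   # [["b", 10.0], ["c", 7.25], ["a", 3.5]]
```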

I tried to give an example using sort_by before I read your
requirement to have the sort descending by date and ascending by time.
To use sort_by you need to come up with an object which represents
the date and time and sorts that way. I didn't want to work that hard.
<G>
 
Phrogz

Maybe I am thinking of that.. I don't know that terminology. I
understand a multi-dimensional array, but haven't heard of an "array
of arrays" before. They sound the same to me. I'll go with your
phrase. =)

Consider:

a1 = [
[ 1, 2, 3 ],
[ 4, 5, 6 ],
[ 7, 8, 9 ]
]

This looks like a 3x3 array, since you can ask for the value at row 2,
column 0 with a1[2][0]. However, it's really four arrays - one array
with three elements, each of which is an array of 3 elements.

But the 'dimensionality' isn't guaranteed:

a2 = [
[ 1, 2, 3, 4 ],
nil,
[ 5 ],
[ 6, 7, 8, 9, 10, 11, 12 ]
]

You can still ask for a2[2][0], but if you ask for a2[1][0] you'll get
a runtime error when you try to call the #[] method on that nil value.

A true 2-dimensional array would be a single object with the ability
to always access any one of the n x m entries, usually with notation
like a3[ 2, 0 ].
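A small sketch of the ragged case, using the same values as a2. Array#dig is only available from Ruby 2.3 onward, so treat that last line as an assumption about the Ruby version in use.

```ruby
a2 = [[1, 2, 3, 4], nil, [5], [6, 7, 8, 9, 10, 11, 12]]

p a2[2][0]   # 5

begin
  a2[1][0]   # a2[1] is nil, and nil has no [] method
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end

# dig walks the nesting and returns nil instead of raising on a nil row.
p a2.dig(1, 0)   # nil
```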

Okay, so I don't know what the difference is between these two:
@data.sort! { |a,b| b[x].to_f <=> a[x].to_f } # i.e. descending numeric sort
and
@data.replace( @data.sort { |a,b| b[x].to_f <=> a[x].to_f } ) # i.e. descending numeric sort

The former sorts the original array in-place.

The latter first creates a new copy of the original array that is
sorted (the return value of #sort) and then replaces the original with
that array. An extra array is created en route.
When I checked the Programming Ruby reference it says that Array#map!
is a 'synonym for Array#collect!' and when I check Array#collect! I
don't think it's what I want in this sort function. Or if it is, I
don't know how to write the code for it. And until today I had never
seen 'sort_by' so I'm still trying to figure out that function too.

map! is useful if you're trying to transform the values in one array
into another set of values. For example:

irb(main):001:0> a = [ "1", "12", "3.1415" ]
=> ["1", "12", "3.1415"]
irb(main):002:0> b = a.map{ |x| x.to_i }
=> [1, 12, 3]
irb(main):003:0> a
=> ["1", "12", "3.1415"]

As you can see, a new array was created (that 'b' now refers to), but
the original values in 'a' are preserved. However:

irb(main):004:0> a.map!{ |x| x.to_i }
=> [1, 12, 3]
irb(main):005:0> a
=> [1, 12, 3]

The exclamation point indicates (in this case) that the array is being
modified in place.


So, if you find that 'normal' methods are using up too much memory,
and you don't need the original values, my suggestion was:

a.map!{ ...convert strings to Time objects, losing the strings... }
a.sort!{ ...sort as you see fit... }
a.map!{ ...if you need to, convert them to something else... }


Read the documentation on the sort_by method to see how it works. It's
relevant to this discussion that it creates an additional array as it
goes (to store the sort keys and values in, and then the values
themselves). There is no sort_by! method to do it in-place.
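For readers on later Ruby versions: Array#sort_by! was added in Ruby 1.9.2, after this thread was written. A minimal sketch with made-up rows, showing that it reorders the same array object:

```ruby
rows = [['b', 2], ['a', 1], ['c', 3]]
same_object = rows   # a second reference to the same array

rows.sort_by! { |row| row[1] }

p rows                      # [["a", 1], ["b", 2], ["c", 3]]
p same_object.equal?(rows)  # true
```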


However, all this discussion smells like premature optimization. If
you have an array of 100 items that each 'weigh' 1MB, and then you
create 3 copies of that array, you have NOT allocated an extra 300MB.
Each copy of the array has the same lightweight references to those
same heavy objects.
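A quick way to check this, as a sketch with illustrative sizes (three strings of about 1 MB each rather than 100):

```ruby
heavy = Array.new(3) { |i| i.to_s * 1_000_000 }

copy   = heavy.dup
sorted = heavy.sort_by { |s| s }

# The arrays themselves are distinct objects...
p copy.equal?(heavy)          # false
# ...but they hold the very same string objects, not duplicates of them.
p copy[0].equal?(heavy[0])    # true
p sorted.map(&:object_id).sort == heavy.map(&:object_id).sort   # true
```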

Before you spend too much time wondering about the internal
implementation and mangling your code in order to make it as optimal
as possible, you should be certain that the 'problem' you're working
around is really a problem.
 
Phrogz

And until today I had never
seen 'sort_by' so I'm still trying to figure out that function too.

Having recommended you read the documentation, let me try to explain
it (since the documentation isn't as clear as I think I can be here):

Say you have an array of 1,000 elements. When you sort them, the
sorting algorithm might perform (on average) something like 7,000
comparisons to sort your items, but in some cases it might perform
1,000,000 comparisons.

If every element is a number, that's about as good as you're going to
get.

Now imagine that every element is instead a person, and you want to
sort them by something like their body mass index plus their age times
the average weight of all their children. Something that takes a long
time to calculate.

If you write:
my_array.sort{ |a,b| a.long_calc <=> b.long_calc }
then you might have to perform that complex calculation 1,000,000
times to sort your array. Less than ideal.

If instead you write:
my_array.sort_by{ |person| person.long_calc }
then Ruby will first loop through your items and run long_calc for
each one, saving the values. Exactly 1,000 calls to long_calc. THEN it
will use those values to sort your original array, able to quickly
compare the saved result for each.
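To see the difference in numbers, here is a sketch that counts how often the key is computed. The slow_key lambda is a hypothetical stand-in for long_calc, and the exact count for sort depends on the input order, so only the sort_by count is fixed.

```ruby
calls = 0
slow_key = lambda do |n|
  calls += 1   # count every key computation
  n % 100
end

items = (1..1000).to_a.shuffle

calls = 0
items.sort { |a, b| slow_key.call(a) <=> slow_key.call(b) }
sort_calls = calls      # two computations per comparison

calls = 0
items.sort_by { |n| slow_key.call(n) }
sort_by_calls = calls   # one computation per element

puts "sort:    #{sort_calls} key computations"
puts "sort_by: #{sort_by_calls} key computations"
```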


That's the performance reason to use sort_by. I personally use it
because it's much less typing. :)
 
Rick DeNatale

If you write:
my_array.sort{ |a,b| a.long_calc <=> b.long_calc }
then you might have to perform that complex calculation 1,000,000
times to sort your array. Less than ideal.

If instead you write:
my_array.sort_by{ |person| person.long_calc }
then Ruby will first loop through your items and run long_calc for
each one, saving the values. Exactly 1,000 calls to long_calc. THEN it
will use those values to sort your original array, able to quickly
compare the saved result for each.


That's the performance reason to use sort_by. I personally use it
because it's much less typing. :)


And of course, as the documentation for sort_by points out, it's not
always wise to use it for performance reasons. According to the doc

a = (1..100000).map {rand(100000)}

a.sort runs an order of magnitude faster than:
a.sort_by {|a| a}

It's also not in general easy to come up with a sort_by block which
does the kind of multi-attribute sort with different sort orders for
each attribute that the OP was looking for.
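The comparison from the documentation can be tried with the standard Benchmark module. This is only a sketch: the timings, and how large the gap is, will vary with the Ruby version and machine.

```ruby
require 'benchmark'

a = (1..100_000).map { rand(100_000) }

Benchmark.bm(8) do |bm|
  bm.report('sort')    { a.sort }
  bm.report('sort_by') { a.sort_by { |n| n } }
end
```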
 
Martin DeMello

It's also not in general easy to come up with a sort_by block which
does the kind of multi-attribute sort with different sort orders for
each attribute that the OP was looking for.

I like this hack:

class RevCmp
attr_reader :this

def initialize(obj)
@this = obj
end

def <=>(other)
other.this <=> @this
end
end

def rev(obj)
RevCmp.new(obj)
end

a.sort_by {|x| [x.foo, rev(x.bar), x.baz]}
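Applied to the dates from the question, the wrapper might be used as in the sketch below. The strptime formats are assumptions based on the stated mm/dd/yy and 12-hour input, and only a few sample rows are shown.

```ruby
require 'date'
require 'time'

# Wrapper as above: comparing two wrappers compares the wrapped
# objects in the opposite direction.
class RevCmp
  attr_reader :this

  def initialize(obj)
    @this = obj
  end

  def <=>(other)
    other.this <=> @this
  end
end

def rev(obj)
  RevCmp.new(obj)
end

date_array = [['2/22/07', '1:03 pm'], ['3/9/07', '10:45 pm'],
              ['3/13/07', '2:00 pm'], ['3/9/07', '10:18 am'],
              ['3/13/07', '11:15 am']]

sorted = date_array.sort_by do |d, t|
  clock = Time.strptime(t, '%I:%M %p')
  # Reversed date first, then minutes since midnight in normal order.
  [rev(Date.strptime(d, '%m/%d/%y')), clock.hour * 60 + clock.min]
end

sorted.each { |d, t| puts "#{d} #{t}" }
```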

martin
 
Robert Klemme

I think this does not meet the OP's requirements (because of descending
date).

epoch = Date.new(0)
@date_array.sort_by {|dt,tm| [epoch - Date.parse(dt), Time.parse(tm)]}

Subtraction from epoch will make the date components negative and thus
lead to descending ordering.

Hello Robert, I tried this but I get the following errors:

irb(main):003:0> epoch = Date.new(0)
ArgumentError: wrong number of arguments (1 for 0)
from (irb):3:in `initialize'
from (irb):3

irb(main):005:0> @date_array.sort_by {|dt,tm| [epoch - Date.parse(dt),
Time.parse(tm)]}
NoMethodError: undefined method `parse' for Date:Class
from (irb):5
from (irb):5:in `sort_by'
from (irb):5

Am I missing a 'require' or something to make your lines work?

Maybe it's the version?

13:25:57 [~]: irb
irb(main):001:0> Date.new 0
NameError: uninitialized constant Date
from (irb):1
from :0
irb(main):002:0> require 'date'
=> true
irb(main):003:0> Date.new 0
=> #<Date: 3442115/2,0,2299161>
irb(main):004:0> RUBY_VERSION
=> "1.8.5"
irb(main):005:0>

Kind regards

robert
 
Paul

Thanks to everyone for your info and feedback. I wanted to avoid any
additional 'require' statements in the script so I reviewed the 'Time'
class some more. After having reviewed all of the solutions, I came
up with my own that seems to work for both the date and time fields:
@date_array.sort! {|a,b| Time.parse( b[ 0 ] ) <=> Time.parse( a[ 0 ] ) } # i.e. descending sort by date

Given that I am manipulating an array of arrays that may not really
get to be more than 1,000 lines, I think this will work well enough
for now. I would not have been able to figure it out if I hadn't had
all of your solutions to review and compare. Thanks again.
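The line above compares only the date column, so rows that share a date are not ordered by time. One possible extension, as a sketch: it keeps the comparator style, adds the time as a tie-breaker with nonzero?, and uses Time.strptime with explicit formats (an assumption based on the mm/dd/yy and 12-hour input). The names date_fmt and minutes are made up for illustration.

```ruby
require 'time'

date_fmt = '%m/%d/%y'

# Minutes since midnight for a string such as "1:03 pm".
minutes = lambda do |s|
  t = Time.strptime(s, '%I:%M %p')
  t.hour * 60 + t.min
end

date_array = [['2/22/07', '1:03 pm'], ['3/9/07', '10:45 pm'],
              ['3/13/07', '2:00 pm'], ['3/9/07', '10:18 am'],
              ['3/13/07', '11:15 am']]

date_array.sort! do |a, b|
  # Descending by date; nonzero? returns nil on a tie.
  by_date = Time.strptime(b[0], date_fmt) <=> Time.strptime(a[0], date_fmt)
  # On a tie, fall through to ascending by time.
  by_date.nonzero? || (minutes.call(a[1]) <=> minutes.call(b[1]))
end

date_array.each { |row| puts row.join(' : ') }
```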

Cheers! Paul.


 
