Is this a sensible implementation for Array#group_by?

Max Williams

I was looking for a method to split an array into smaller arrays based
on some property of the members (computed by a code block). I came up with
the following and wanted to know what you guys think about a) whether
it's sensible and b) whether it's been done already:

class Array
  def group_by
    result_array = []
    self.each do |element|
      key = yield element
      if (found = result_array.assoc(key))
        found[1] << element
      else
        result_array << [key, [element]]
      end
    end
    return result_array.collect{|a| a[1]}
  end
end
=> ["apple", "banana", "pear", "plum", "nectarine", "orange", "melon"]
# group by length of name
=> [["apple", "melon"], ["banana", "orange"], ["pear", "plum"], ["nectarine"]]
# group by whether it has an e
=> [["apple", "pear", "nectarine", "orange", "melon"], ["banana", "plum"]]

One thing I haven't done is make it sort the groups, since the grouping
key is just an object and most objects aren't sortable. I could get round
this, but not without slowing it down, and the user can always sort the
results by comparing the first member of each subarray.
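
The sorting-afterwards idea could look roughly like this. It is only a sketch: the `groups` literal is copied from the length-grouped output above, and the variable names are made up for illustration.

```ruby
# Result of grouping the fruit names by length, as shown above.
groups = [["apple", "melon"], ["banana", "orange"], ["pear", "plum"], ["nectarine"]]

# Every member of a group shares the same key, so the first member is
# enough to recompute the key and order the groups by it.
sorted = groups.sort_by { |group| group.first.length }

p sorted
# => [["pear", "plum"], ["apple", "melon"], ["banana", "orange"], ["nectarine"]]
```

This only works when the recomputed key is comparable (here an Integer), which is the reason the method itself does not sort.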

I'm just after some feedback really. I'm guessing I just couldn't find
the good implementation of it :)
 
Max Williams

Adam said:
It seems sensible to me, although I'd use a hash instead of an
associative array - it just looks cleaner. I didn't check the
performance difference though.

I thought about a hash first, but for some reason shied away from a hash
where the keys could be any object, including nil. There's no reason to
be afraid of that though, is there? I think a hash would probably be
faster.
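
On the nil question: nil is an ordinary hash key in Ruby. A minimal sketch, using a hypothetical helper named `group_values` (not from this thread) so that nothing in Array or Enumerable gets overridden:

```ruby
# Hypothetical helper: groups items by the block's return value using a
# Hash, then returns only the groups (not the keys).
def group_values(items)
  groups = {}
  items.each do |item|
    key = yield(item)
    (groups[key] ||= []) << item   # create the bucket on first use
  end
  groups.values
end

words = ["apple", nil, "pear", "plum", nil]

# The block returns nil for nil entries, so nil becomes a grouping key.
by_length = group_values(words) { |w| w && w.length }

p by_length
```

Both nil entries end up in a single group. Note that the order of `values` follows insertion order only on Ruby 1.9 and later; on 1.8 the order of hash entries is not guaranteed.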
 
Max Williams

It's also in Ruby 1.8.7 Enumerable.

Ah... we're still on 1.8.6 round these parts. We really need to upgrade;
I keep seeing this cool stuff.

Another possibility: use a set.

investigates.... ah yes, sets, I'd completely overlooked those. For some
reason the Set class is hard to find in the API docs, or at least hard for
me to find in this particular set of docs - http://www.ruby-doc.org/core/

Converting to_set, then calling divide, then calling to_a again can't be
very efficient though, can it?
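
For reference, the set round trip might look something like this. It is a sketch only; the variable names are made up, and the results are sorted purely to make the printed output predictable.

```ruby
require 'set'

fruits = %w(apple banana pear plum nectarine orange melon)

# With a one-argument block, Set#divide groups members by the block's
# return value and returns a Set of Sets.
divided = fruits.to_set.divide { |f| f.length }

# Convert back to plain arrays.
groups = divided.map { |s| s.to_a.sort }.sort

p groups
# => [["apple", "melon"], ["banana", "orange"], ["nectarine"], ["pear", "plum"]]

# Caveat: a Set drops duplicates, so repeated elements are lost.
p %w(pear pear plum).to_set.size
# => 2
```

Besides the conversion cost, the duplicate-dropping behaviour means this is not a drop-in replacement for grouping an array.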

Thanks
 
Erik Veenstra

Indeed, Facets does have an Enumerable#group_by. And it has an
Enumerable#cluster_by as well. And the latter is the one you're
looking for, because you want an Array and not a Hash.

Group_by uses each, because it's faster than inject.

Regards,
Erik V. - http://www.erikveen.dds.nl/

----------------------------------------------------------------

module Enumerable
  def group_by
    res = {}
    each{|e| (res[yield(e)] ||= []) << e}
    res
  end

  def cluster_by(&block)
    #group_by(&block).values # In case of unsortable keys.
    group_by(&block).sort.transpose.pop || []
  end
end

----------------------------------------------------------------

a = %w(apple banana pear plum nectarine orange melon)

a.group_by{|e| e.length}
# ==> {5=>["apple", "melon"], 6=>["banana", "orange"], 9=>["nectarine"], 4=>["pear", "plum"]}
a.cluster_by{|e| e.length}
# ==> [["pear", "plum"], ["apple", "melon"], ["banana", "orange"], ["nectarine"]]

----------------------------------------------------------------
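
The `sort.transpose.pop` chain in cluster_by is dense, so here is a step-by-step sketch of what it does. It assumes a Ruby where Enumerable#group_by is available (1.8.7 or later, or 1.8.6 with Facets); the variable names are made up for illustration.

```ruby
fruits = %w(apple banana pear plum nectarine orange melon)

grouped = fruits.group_by { |f| f.length }   # Hash: key => members

# Hash#sort yields [key, members] pairs ordered by key.
pairs = grouped.sort
# => [[4, ["pear", "plum"]], [5, ["apple", "melon"]],
#     [6, ["banana", "orange"]], [9, ["nectarine"]]]

# transpose turns the list of pairs into [all_keys, all_member_lists];
# the last element (what pop returns) is the list of clusters.
keys, clusters = pairs.transpose
p keys       # => [4, 5, 6, 9]
p clusters   # => [["pear", "plum"], ["apple", "melon"], ["banana", "orange"], ["nectarine"]]

# With keys that cannot be compared (true and false), sort raises,
# which is what the commented-out .values variant is for.
by_e = fruits.group_by { |f| f.include?("e") }
sortable = begin
  by_e.sort
  true
rescue ArgumentError
  false
end
p sortable      # => false
p by_e.values
```

The `|| []` at the end of cluster_by covers the empty case, where transpose returns an empty array and pop returns nil.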
 
Max Williams

Adam Shelley wrote:
I think the Facets library already has a similar method.

Erik said:
Indeed, Facets does have an Enumerable#group_by. And it has an
Enumerable#cluster_by as well. And the latter is the one you're
looking for, because you want an Array and not a Hash.

Facets - investigates again... now that is *very* good indeed. Wow. I
had a feeling this would exist already in a better form :)

Thanks a lot, everyone.
 
Robert Klemme

2008/9/9 Max Williams said:
Facets - investigates again... now that is *very* good indeed. Wow. I
had a feeling this would exist already in a better form :)

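If I wanted to do it myself, I'd probably write:

module Enumerable
  def group_by
    # Missing keys get a fresh array on first access.
    result = Hash.new {|h,k| h[k] = []}
    # yield is parenthesized so the call is unambiguous inside [].
    each {|el| result[yield(el)] << el}
    result
  end
end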
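
One practical difference between the default-block hash above and the `||=` style is worth knowing. This is only an illustrative sketch with made-up variable names:

```ruby
# Default-block style: any lookup of a missing key stores a new array.
with_default = Hash.new { |h, k| h[k] = [] }
with_default[:a] << 1
with_default[:missing]          # merely reading creates an entry
p with_default.keys             # => [:a, :missing]

# ||= style: the bucket is created only when something is appended.
plain = {}
(plain[:a] ||= []) << 1
plain[:missing]                 # returns nil, stores nothing
p plain.keys                    # => [:a]
```

So a caller who later probes the returned hash for an absent key will silently add empty groups to it in the first style; using `fetch` or `key?` for the lookup avoids that.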

Kind regards

robert
 
