how do you do this

George George

Given an array of strings of different lengths, e.g.
x = ["abc","abcde","def","xyzwj"]
how can you efficiently create new arrays that each hold the strings of the
same length? For example, the above array can be transformed into

x1 = ["abc","def"]
x2 = ["abcde","xyzwj"]

Thank you.
 
Ilan Berci

y = {}
x.each do |v|
y[v.length] || = []
y[v.length] << v
end
y.values

or if you prefer less lines..

x.inject({}) do |h, v|
(y[v.length] || = []) << v
h
end.values
 
Paul Smith

Well, here's something close:

h = {}
x.each do |i|
  h[i.length] ||= []
  h[i.length] << i
end

h is now a hash: {3=>["abc", "def"], 5=>["abcde", "xyzwj"]}

That's close enough to what you want that I'm sure you can run with
it. Look in the "Group by unique entries of a hash" thread for more
ideas.

--
Paul Smith
http://www.nomadicfun.co.uk
 
Jesús Gabriel y Galán

You might want to look at group_by:

%w{abc ads adfdf adfdw fefm mfekmw fmdms}.group_by {|x| x.length}
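
Applied to the array from the original question, this might look like the sketch below. It assumes Ruby 1.8.7 or later, where Enumerable#group_by is built in; the variable names are only illustrative.

```ruby
# The array from the original question, with the comma after "abcde" restored.
x = ["abc", "abcde", "def", "xyzwj"]

# group_by returns a hash keyed by the block's result, here the string length.
by_length = x.group_by { |s| s.length }

# .values drops the keys and leaves just the groups.
groups = by_length.values

p by_length
p groups
```

The hash form is handy when you want the strings of one particular length; the .values form matches the x1/x2 arrays asked for.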

Jesus.
 
Paul Smith

Ilan Berci said:

y = {}
x.each do |v|
  y[v.length] || = []
  y[v.length] << v
end
y.values

LOL :) I love Ruby and Rubytalk :)
or if you prefer less lines..

x.inject({}) do |h, v|
  (y[v.length] || = []) << v
  h
end.values

Must.... master..... inject.....

 
Paul Smith

2009/10/1 Jesús Gabriel y Galán said:
You might want to look at group_by:

%w{abc ads adfdf adfdw fefm mfekmw fmdms}.group_by {|x| x.length}

I really can't believe Ruby sometimes. This is so freaking awesome!

Must get back to real work...
 
Bertram Scharpf

Hi,

On Thursday, 01 Oct 2009, 21:23:30 +0900, George George wrote:
Given an array of strings e.g.
x = ["abc","abcde","def","xyzwj"] and of different lengths,
how can you efficiently create new arrays of strings which are of the
same length. for example the above array can be transformed into

x1 = ["abc","def"]
x2 = ["abcde","xyzwj"]

x = %w(abc abcde def xyzwj)
x.inject( Hash.new { |h,k| h[k] = [] }) { |h,e| h[e.length].push e ; h }

Bertram
 
Harry Kakueki

p x.map{|a| a.length}.uniq.map{|b| x.select{|c| c.length == b}}

#> [["abc", "def"], ["abcde", "xyzwj"]]


Harry
 
Jesús Gabriel y Galán

Hi --



It's interesting how often the need for group_by without the keys
comes up. Meaning, in this case, to get the new arrays you'd
ultimately do:

arr.group_by(&:length).values

Yup, although in this case, I'm going to guess that he will either

- Access the list of words of a specific number

desired_length = something
a = %w{abc ads adfdf adfdw fefm mfekmw fmdms}.group_by {|x| x.length}

a[desired_length]

- Sort the groups by length

a = %w{abc ads adfdf adfdw fefm mfekmw fmdms}.group_by {|x| x.length}
a.sort.map {|x| x[1]} # or something
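
Both uses can be sketched as follows. The value 5 for desired_length is only an example, and the variable names are illustrative rather than taken from the thread.

```ruby
words = %w{abc ads adfdf adfdw fefm mfekmw fmdms}
by_length = words.group_by { |w| w.length }

# Look up one group directly; a length with no words gives nil.
desired_length = 5
five_letter = by_length[desired_length]
missing = by_length[10]

# Hash#sort yields [length, group] pairs ordered by length;
# map then keeps only the groups.
sorted_groups = by_length.sort.map { |_length, group| group }

p five_letter
p sorted_groups
```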

and I believe there was at least one similar case mentioned here
recently. I wonder whether it would be cool to have a method that did
this -- in effect:

module Enumerable
  def group_by_without_keys(&block)
    group_by(&block).values
  end
end

I'm not sure what it should be called, though.

values_grouped_by ?

Jesus.
 
Ryan Davis

Ilan Berci said:

y = {}
x.each do |v|
y[v.length] || = []
y[v.length] << v
end
y.values

or if you prefer less lines..

x.inject({}) do |h, v|
(y[v.length] || = []) << v
h
end.values

Syntax error in both cases. It needs to be "||=", not "|| =".
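
With the operator written as `||=`, the two snippets might look like the sketch below. The inject version as posted also refers to `y` where its block parameter is `h`, so that is changed here too. `x` is assumed to be the array from the original question.

```ruby
x = ["abc", "abcde", "def", "xyzwj"]

# each version: create the bucket on first use, then append to it.
y = {}
x.each do |v|
  y[v.length] ||= []
  y[v.length] << v
end
each_result = y.values

# inject version: the block must return the accumulator h on every pass.
inject_result = x.inject({}) do |h, v|
  (h[v.length] ||= []) << v
  h
end.values

p each_result
p inject_result
```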

Well... inject ALWAYS loses, but fanboys sure seem to like it for no
good reason.

By using better names and the right tool for the job, this becomes a
LOT more readable, maintainable, and faster all in one fell swoop:

by_length = Hash.new { |h,k| h[k] = [] }
strings.each do |string|
  by_length[string.length] << string
end
by_length.values # I think this part is a mistake, but I wanted to match

I think the readability is more important than speed by a long shot...
But just in case you're not convinced, check out the benchmarks:

% ./blah.rb 10000
# of iterations = 10000
                   user     system      total        real
null_time      0.000000   0.000000   0.000000 (  0.001370)
mine           7.790000   0.050000   7.840000 (  7.869737)
yours-inject  15.170000   0.050000  15.220000 ( 15.554334)
yours-each    11.850000   0.100000  11.950000 ( 12.013553)

inject is twice as slow as mine. stop using it.
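
The benchmark script itself was not posted. A comparison along these lines could be set up with the standard Benchmark module; the sketch below is an assumption about its general shape, not the original blah.rb, and the iteration count and labels are placeholders. Timings depend on the machine and Ruby version, so none are shown.

```ruby
require 'benchmark'

strings = ["abc", "abcde", "def", "xyzwj"]
iterations = 10_000 # placeholder; raise it for steadier timings

# Hash with a default block, filled by each.
with_each = lambda do
  by_length = Hash.new { |h, k| h[k] = [] }
  strings.each { |string| by_length[string.length] << string }
  by_length.values
end

# inject with ||=, in the style of the earlier reply.
with_inject = lambda do
  strings.inject({}) do |h, v|
    (h[v.length] ||= []) << v
    h
  end.values
end

Benchmark.bm(12) do |b|
  b.report("each")   { iterations.times { with_each.call } }
  b.report("inject") { iterations.times { with_inject.call } }
end
```

Both lambdas return the same groups, so only the running time differs.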
 
Josh Cheek

I generalized yours, and made the returned groups sorted by the results from
the call. In this more comparable situation, inject is about 11% slower, not
twice as slow.

Inject Test
Rehearsal --------------------------------------------------
Without Inject  14.160000   0.100000  14.260000 ( 14.364824)
With Inject     15.950000   0.120000  16.070000 ( 16.258609)
---------------------------------------- total: 30.330000sec

                     user     system      total        real
Without Inject  14.200000   0.110000  14.310000 ( 14.553592)
With Inject     16.000000   0.120000  16.120000 ( 16.422186)

Inject is about 11.38% slower


Here is the code:


#!/usr/bin/env ruby
require 'benchmark'

class Symbol
  def to_proc
    Proc.new{|obj| obj.send self } # give 1.9ish syntax
  end
end


module Enumerable

  def group_by_without_inject( &get_key )
    groups = Hash.new { |h,k| h[k] = Array.new }
    each do |obj|
      groups[ get_key[obj] ] << obj
    end
    groups.keys.sort!.map!{|key| groups[key] }
  end

  def group_by_with_inject( &get_key )
    groups = inject Hash.new{ |h,k| h[k] = Array.new } do |groups,obj|
      groups[ get_key[obj] ] << obj
      groups
    end
    groups.keys.sort!.map!{|key| groups[key] }
  end

end


puts "Inject Test"
benchmarks = Benchmark.bmbm do|b|
  x = ["abc","abcde","def","xyzwj"]

  b.report("Without Inject") do
    500_000.times{ x.group_by_without_inject &:length }
  end

  b.report("With Inject") do
    500_000.times{ x.group_by_with_inject &:length }
  end
end

benchmarks.map!{|b| b.real }
percent_slower = sprintf( "%.2f" , 100 - 100 * benchmarks.first / benchmarks.last )
puts '' , "Inject is about #{ percent_slower }% slower"
 
Tony Arcieri

Must.... master..... inject.....

I think it's much more readable to build hashes with something like:

h = {}
blah.each do |v|
h[...] = ...
end

than:

blah.inject({}) do |h, v|
h[...] = ...
h
end

Gratuitous use of inject FTL. Ruby isn't an immutable state functional
language.
 
Ryan Davis

I generalized yours, and made the returned groups sorted by the results from
the call. In this more comparable situation, inject is about 11% slower, not
twice as slow.

This "more comparable" situation is full of bugs and isn't comparable.

Yes, I should have said "your [ilan's] inject version is twice as slow
as mine" instead of "inject is twice as slow as mine" but my numbers
still stand. Use the right tool for the job and it'll pay off
in both maintainability and speed.

Your version isn't maintainable, has bugs(*) and obfuscates a ton,
missing my point entirely. Simpler code wins HANDS DOWN. As I said the
first time: "I think the readability is more important than speed by a
long shot". FWIW, my results running your code as-is were exactly 2x
yours (22% slower, not 11% slower).

*) calling sort! within a law of demeter violation is ALWAYS a bug.
*) calling (almost) any bang method on a temporary value is usually a
bug.
 
David A. Black

Hi --

Must.... master..... inject.....

I think it's much more readable to build hashes with something like:

h = {}
blah.each do |v|
h[...] = ...
end

than:

blah.inject({}) do |h, v|
h[...] = ...
h
end

I agree, and I think that it's nice that 1.9 provides
Enumerator#with_object, which lets you avoid that explicit feeding of
the accumulator back into the loop.
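
As a sketch of that idea, assuming Ruby 1.9 or later: each_with_object, and equivalently each.with_object, pass the same accumulator object to every iteration and return it at the end, so the block does not have to hand it back. Note that the block arguments come in the opposite order from inject.

```ruby
x = ["abc", "abcde", "def", "xyzwj"]

# Enumerable#each_with_object: the block receives (element, accumulator).
grouped = x.each_with_object(Hash.new { |h, k| h[k] = [] }) do |s, h|
  h[s.length] << s
end

# The same thing via an enumerator and Enumerator#with_object.
grouped_again = x.each.with_object({}) do |s, h|
  (h[s.length] ||= []) << s
end

p grouped
p grouped_again
```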


David
 
Josh Cheek

Your version isn't maintainable, has bugs(*) and obfuscates a ton, missing
my point entirely. Simpler code wins HANDS DOWN. As I said the first time:
"I think the readability is more important than speed by a long shot".

Perhaps, but it is simpler because it is too specific to the given
problem. Once it must be rewritten in several different places, it is no
longer more maintainable. It will also clutter the code, making it less
readable.

*) calling sort! within a law of demeter violation is ALWAYS a bug.

That is fair.

*) calling (almost) any bang method on a temporary value is usually a bug.

Why is that?
 
Yossef Mendelssohn

*) calling (almost) any bang method on a temporary value is usually a bug.

Why is that?

What's the point of calling a bang method there? You don't care at all
about the object you're mutating, and it seems like a blatant case of
premature optimization.

Now, consider the specific cases where the bang method doesn't return
the same value as the non-bang (viz. uniq!).
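
A small illustration of that difference, using only core Array methods: uniq returns an array every time, while uniq! returns nil when there was nothing to remove, so chaining off a bang call on a temporary value can fail.

```ruby
words = ["abc", "def", "abcde"]

# Non-bang: always returns an array, whether or not anything changed.
lengths = words.map { |w| w.length }.uniq

no_dups = [1, 2, 3]
plain = no_dups.uniq    # a new array with the same elements
banged = no_dups.uniq!  # nil, because no duplicates were removed

p lengths
p plain
p banged

# Chaining on the bang result would raise NoMethodError on nil here:
#   [1, 2, 3].uniq!.map { |n| n * 2 }
```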
 
Josh Cheek

I see, thank you.
 
