How to define doubles: Array-to-hash

T

Tom Ha

Hi there,

Assume the following array (contains doubles):

original = [1, 2, 3, 3, 3, 3, 4, 4, 5]

What's the most efficient way to obtain a hash that tells me what
elements of "original" have more than 1 appearance/entry (= doubles)?

The result should look like this:

result = {"3"=>4, "4"=>2}

Thanks for your help!
Tom-n00b
 
T

Tim Hunter

Tom said:
Hi there,

Assume the following array (contains doubles):

original = [1, 2, 3, 3, 3, 3, 4, 4, 5]

What's the most efficient way to obtain a hash that tells me what
elements of "original" have more than 1 appearance/entry (= doubles)?

The result should look like this:

result = {"3"=>4, "4"=>2}

Thanks for your help!
Tom-n00b

I won't argue that it's the most "efficient" but the following both
works and is easy to understand.


irb(main):007:0> original.each do |o|
irb(main):008:1* o = o.to_s
irb(main):009:1> result[o] ||= 0
irb(main):010:1> result[o] += 1
irb(main):011:1> end
=> [1, 2, 3, 3, 3, 3, 4, 4, 5]
irb(main):012:0> result
=> {"1"=>1, "2"=>1, "3"=>4, "4"=>2, "5"=>1}
irb(main):013:0> result.delete_if {|k, v| v == 1}
=> {"3"=>4, "4"=>2}
 
B

Bertram Scharpf

Hi,

Am Montag, 10. Aug 2009, 00:54:31 +0900 schrieb Tom Ha:
original = [1, 2, 3, 3, 3, 3, 4, 4, 5]

The result should look like this:

result = {"3"=>4, "4"=>2}

Here are two:

original.inject({}) { |h,e| h[e] ||= 0 ; h[e] += 1 ; h }

and

h = Hash.new { |h,k| h[k] = 0 }
original.each { |e| h[e] += 1 }
h

then

h.reject! { |k,v| k == 1 }

Bertram
 
7

7stud --

Bertram said:
Here are two:

original.inject({}) { |h,e| h[e] ||= 0 ; h[e] += 1 ; h }

and

h = Hash.new { |h,k| h[k] = 0 }
original.each { |e| h[e] += 1 }
h

then

h.reject! { |k,v| k == 1 }

Bertram

Your code doesn't produce the desired result, the op wanted an efficient
solution, and this:

h = Hash.new { |h,k| h[k] = 0 }

is equivalent to:

h = Hash.new(0)


How about a regex?

original = [1, 2, 3, 3, 3, 3, 4, 4, 5]
results = {}

str = original.to_s

str.scan(/((\d)\2+)/).each do |match|
results[match[1]] = match[0].length
end

p results

--output:--
{"3"=>4, "4"=>2}

...but that is no faster than:


results = Hash.new(0)

original.each do |elmt|
results[elmt.to_s] += 1
end

results.delete_if {|key, val| val == 1}

p results

--output:--
{"3"=>4, "4"=>2}
 
D

David A. Black

Hi --

Hi,

Am Montag, 10. Aug 2009, 08:17:42 +0900 schrieb 7stud --:
Bertram said:
h = Hash.new { |h,k| h[k] = 0 }

h = Hash.new { |h,k| h[k] = 0 }

is equivalent to:

h = Hash.new(0)

Arrgh. Of course.

Actually they're not equivalent.
h = Hash.new(0) => {}
h[1] => 0
h => {}
h = Hash.new {|h,k| h[k] = 0 } => {}
h[1] => 0
h
=> {1=>0}


David

--
David A. Black / Ruby Power and Light, LLC / http://www.rubypal.com
Q: What's the best way to get a really solid knowledge of Ruby?
A: Come to our Ruby training in Edison, New Jersey, September 14-17!
Instructors: David A. Black and Erik Kastner
More info and registration: http://rubyurl.com/vmzN
 
B

Bertram Scharpf

Hi,

Am Montag, 10. Aug 2009, 15:32:44 +0900 schrieb David A. Black:
Am Montag, 10. Aug 2009, 08:17:42 +0900 schrieb 7stud --:
Bertram Scharpf wrote:
h = Hash.new { |h,k| h[k] = 0 }

h = Hash.new { |h,k| h[k] = 0 }

is equivalent to:

h = Hash.new(0)

Arrgh. Of course.

Actually they're not equivalent.
h = Hash.new(0) => {}
h[1] => 0
h => {}
h = Hash.new {|h,k| h[k] = 0 } => {}
h[1] => 0
h
=> {1=>0}

Arrgh. Of course.

I remind that my code called Hash#[] just in

h[e] += 1

and therefore it is even more efficient to use the default value
solution.

Bertram
 
L

Lars Christensen

Your code doesn't produce the desired result, the op wanted an efficient
solution, and this:

Actually, he wanted the "most efficient".

This is faster than the inject/hash method:

result = []
lastv = nil
count = 1
original.sort.each do |v|
if v == lastv
count += 1
else
result.push(lastv, count) if lastv
count = 1
lastv = v
end
end
result.push(lastv, count)
hash = Hash[*result]

This is O(n logn+n) while the inject method should be theoretically be
O(n) (assuming that Hash tables lookups/insertions are constant-time
and inject is linear). However, its significantly faster (250%) on
1.8.6-mswin32 and a little faster (50%) on 1.9.1-mingw32, even on
large arrays (millions of elements). Anyone who can explain why?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,598
Members
45,150
Latest member
MakersCBDReviews
Top