You are switching subjects midway: originally you started out with
memory consumption being the issue. Now you are talking about speed. Your
solution is neither fast nor easy on memory. For speed, see the
benchmark below. It is not easy on memory because you still create
a *ton* of temporary one-element arrays (four per iteration, if I'm not
mistaken). This avoids the single large copy, but allocating and
GC'ing all these temporaries imposes a significant overhead. The
obvious and simple solution here is to do it all in *one* iteration.
So why invent something new if an easy and efficient solution is
available?
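
To make "one iteration" concrete, here is a minimal sketch. The method names `filter_hex` and `filter_hex_chained` and the sample range are made up for illustration; only the filter conditions are taken from the benchmark in this thread. The single-pass version builds just the result array, while the chained version allocates one intermediate array per step.

```ruby
# Hypothetical helper: filter, transform and filter again in a single pass.
def filter_hex(enum)
  result = []
  enum.each do |x|
    next unless x > 280               # first select
    s = "0x#{x}"                      # collect
    result << s unless s == "0x15"    # second select
  end
  result
end

# Chained equivalent: every step returns a new temporary array.
def filter_hex_chained(enum)
  enum.select { |x| x > 280 }.
       collect { |x| "0x#{x}" }.
       select { |x| x != "0x15" }
end

p filter_hex(279..282)   # prints ["0x281", "0x282"]
```

Both methods return the same elements in the same order; they differ only in how many temporary objects are created along the way.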
I can get comparable results for my solution:

Rehearsal --------------------------------------------------
serially         7.410000   0.120000   7.530000 (  7.731576)
classic          5.890000   0.080000   5.970000 (  6.123102)
---------------------------------------- total: 13.500000sec

                     user     system      total        real
serially         6.500000   0.000000   6.500000 (  6.581172)
classic          3.160000   0.010000   3.170000 (  3.199095)

See http://xtargets.com/snippets/posts/show/69 for details.

What kills me, and accounts for the remaining overhead, is the lack of
a fast way to *non-recursively* flatten an array of arrays. My solution
is:

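  def self.flatten(o)
    # Sum the sub-array lengths so the result can be preallocated.
    length = o.inject(0) {|m,i| m + i.length}
    out = Array.new(length)
    base = 0
    o.each do |sub|
      top = base + sub.length
      out[base...top] = sub   # exclusive range: exactly sub.length slots
      base = top
    end
    out   # return the flattened array; #each alone would return o
  end
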
It is much faster than the simple inject-and-concat
method, but still too slow.

Any ideas (apart from coding it in C)?
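
For comparison, here is a hedged sketch of two ways to flatten exactly one level in plain Ruby. The names `flatten_one_level` and `flatten_by_concat` are hypothetical and not from the thread. The first preallocates the result and uses the start/length form of `Array#[]=`; the second simply appends with `Array#concat`. No timings are claimed for either.

```ruby
# Hypothetical helper: preallocate, then copy each sub-array into its slot.
def flatten_one_level(arrays)
  total = arrays.inject(0) { |sum, sub| sum + sub.length }
  out = Array.new(total)
  base = 0
  arrays.each do |sub|
    out[base, sub.length] = sub   # replaces exactly sub.length slots
    base += sub.length
  end
  out                             # return the result, not the receiver of #each
end

# Hypothetical helper: grow the result in place with Array#concat.
def flatten_by_concat(arrays)
  out = []
  arrays.each { |sub| out.concat(sub) }
  out
end

nested = [[1, 2], [], [3, [4, 5]], [6]]
p flatten_one_level(nested)   # prints [1, 2, 3, [4, 5], 6]
```

Nested arrays below the first level are left untouched, which is what "non-recursively" asks for. On Ruby 1.8.7 and later, `Array#flatten` also accepts a depth argument, so `nested.flatten(1)` gives the same one-level result.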
Even without flattening you won't beat #each:
13:19:42 [Temp]: ./ser.rb
Rehearsal -----------------------------------------------------
classic            3.188000   0.000000   3.188000 (  3.392000)
serially          12.968000   0.031000  12.999000 ( 13.770000)
serially!         12.563000   0.000000  12.563000 ( 13.187000)
serially_chunk     3.578000   0.000000   3.578000 (  3.628000)
serially_chunk_rk  3.594000   0.000000   3.594000 (  3.643000)
inject             4.015000   0.016000   4.031000 (  4.124000)
each               2.922000   0.016000   2.938000 (  3.091000)
------------------------------------------- total: 42.891000sec

                       user     system      total        real
classic            3.110000   0.000000   3.110000 (  3.237000)
serially          12.515000   0.000000  12.515000 ( 12.973000)
serially!         12.000000   0.031000  12.031000 ( 12.471000)
serially_chunk     2.938000   0.000000   2.938000 (  3.140000)
serially_chunk_rk  3.172000   0.000000   3.172000 (  3.331000)
inject             3.781000   0.000000   3.781000 (  3.839000)
each               2.891000   0.000000   2.891000 (  2.956000)

robert

#!ruby
require 'benchmark'

ITER = 10

module Enumerable
  def serially(&b)
    a = []
    each{|x| t = [x].instance_eval(&b); a << t[0] unless t.empty?}
    a
  end
end

require 'enumerator'

class Serializer
  attr_accessor :obj

  def collect(*args, &block)
    @obj.collect!(*args, &block)
  end

  def select(*args, &block)
    @obj = @obj.select(*args, &block)
  end

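  def self.flatten(o)
    # Sum the sub-array lengths so the result can be preallocated.
    length = o.inject(0) {|m,i| m + i.length}
    out = Array.new(length)
    base = 0
    o.each do |sub|
      top = base + sub.length
      out[base...top] = sub   # exclusive range: exactly sub.length slots
      base = top
    end
    out   # return the flattened array; #each alone would return o
  end
end
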
module Enumerable
  # One-level flatten of the receiver itself (there is no @obj here).
  def flatten
    length = inject(0) {|m,i| m + i.length}
    out = Array.new(length)
    base = 0
    each do |sub|
      out[base...base+sub.length] = sub
      base = base + sub.length
    end
    out
  end

  def serially_chunk(slice=50, &b)
    t = []
    o = Serializer.new
    self.each_slice(slice) do |x|
      o.obj = x
      o.instance_eval(&b)
      t << o.obj
    end
    Serializer.flatten t
  end

  def serially_chunk_rk(slice=50, &b)
    t = []
    o = Serializer.new
    each_slice(slice) do |x|
      o.obj = x
      o.instance_eval(&b)
      t.concat o.obj
    end
    t
  end
end

a = (1..50000)

Benchmark.bmbm(15) do |bench|
  bench.report("classic") do
    ITER.times do
      # a = (1..50000)
      b = a.select {|x| x>280}.collect{|x|"0x" << x.to_s}.select{|x| x != "0x15"}
    end
  end

  bench.report("serially") do
    ITER.times do
      # a = (1..50000)
      b = a.serially { select {|x| x>280}.collect{|x|"0x" << x.to_s}.select{|x| x != "0x15"} }
    end
  end

  bench.report("serially!") do
    ITER.times do
      # a = (1..50000)
      b = a.serially { select {|x| x>280}.collect! {|x|"0x" << x.to_s}.select {|x| x != "0x15"} }
    end
  end

  bench.report("serially_chunk") do
    ITER.times do
      # a = (1..50000)
      b = a.serially_chunk(1000) do
        select {|x| x > 280}
        collect do |x|
          "0x" << x.to_s
        end
        select {|x| x != "0x15"}
      end
    end
  end

  bench.report("serially_chunk_rk") do
    ITER.times do
      # a = (1..50000)
      b = a.serially_chunk_rk(1000) do
        select {|x| x > 280}
        collect do |x|
          "0x" << x.to_s
        end
        select {|x| x != "0x15"}
      end
    end
  end

  bench.report("inject") do
    ITER.times do
      # a = (1..50000)
      b = a.inject([]) do |acc, x|
        if x > 280
          x = "0x" << x.to_s
          acc << x unless x == "0x15"
        end
        acc
      end
    end
  end

  bench.report("each") do
    ITER.times do
      # a = (1..50000)
      b = []
      a.each do |x|
        if x > 280
          x = "0x" << x.to_s
          b << x unless x == "0x15"
        end
      end
    end
  end
end