It looks like unpack is a clear winner:
On second thought, it looks like the problem with scan was the
construction of the regex: using scan without pre-compiling the regex
is about twice as slow as unpack, but using scan with a pre-compiled
regexp looks like it's about 100 times faster!
rick@frodo:~/rubyscripts$ cat benchstringsplit.rb
require 'benchmark'
include Benchmark
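class String
  # Note: Regexp.new('/./') treats the slashes as literal characters, so this
  # pattern matches "/", any character, "/" and never matches the all-letter
  # test string; Regexp.new('.') or the literal /./ matches one character.
  To_chars_regex = Regexp.new('/./')

  def to_chars_array_with_unpack
    unpack('a' * length)
  end

  def to_chars_array_with_scan
    scan(/./)
  end

  def to_chars_array_with_scan_precomp
    scan To_chars_regex
  end
end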
iterations = 100
str = "abcdefghijklmnopqrstuvwxyz" * 5
bmbm do |x|
  5.times do
    x.report("unpack #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_unpack
      end
    end
    x.report("scan #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_scan
      end
    end
    x.report("scan-precomp #{str.length} character string") do
      iterations.times do
        str.to_chars_array_with_scan_precomp
      end
    end
    # The labels are interpolated here, but bmbm runs the report blocks only
    # after this loop has finished, so every block sees the final str.
    str += str
  end
end
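Before reading the timings, it is worth checking that the three methods return the same array. The following is a minimal sketch of such a check, not part of the original script; the variable names are made up for illustration, and it assumes the intent of To_chars_regex was to match any single character.

```ruby
# Sanity check: compare what each approach actually returns.
str = "abcdefghijklmnopqrstuvwxyz"

from_string  = Regexp.new('/./')  # built from the string "/./", as in the script
from_literal = /./                # regexp literal: any single character

unpacked      = str.unpack('a' * str.length)
scanned       = str.scan(from_literal)
scanned_slash = str.scan(from_string)

puts "unpack and scan(/./) agree: #{unpacked == scanned}"
puts "matches found by Regexp.new('/./'): #{scanned_slash.length}"
puts "same pattern on \"a/b/c\": #{'a/b/c'.scan(from_string).inspect}"
```

If the pre-compiled pattern finds no matches in the test string, its scan does almost no work per call, so its timing is not comparable with the other two.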
rick@frodo:~/rubyscripts$ ruby benchstringsplit.rb
Rehearsal ----------------------------------------------------------------------
unpack 130 character string         0.960000   0.010000   0.970000 (  0.984373)
scan 130 character string           2.150000   0.000000   2.150000 (  2.178162)
scan-precomp 130 character string   0.010000   0.000000   0.010000 (  0.012862)
unpack 260 character string         0.910000   0.000000   0.910000 (  0.910658)
scan 260 character string           2.040000   0.000000   2.040000 (  2.100734)
scan-precomp 260 character string   0.010000   0.000000   0.010000 (  0.010890)
unpack 520 character string         0.940000   0.000000   0.940000 (  0.942446)
scan 520 character string           1.990000   0.000000   1.990000 (  2.020499)
scan-precomp 520 character string   0.010000   0.000000   0.010000 (  0.010869)
unpack 1040 character string        0.980000   0.010000   0.990000 (  0.995709)
scan 1040 character string          2.140000   0.000000   2.140000 (  2.160120)
scan-precomp 1040 character string  0.010000   0.000000   0.010000 (  0.013315)
unpack 2080 character string        1.130000   0.000000   1.130000 (  1.214512)
scan 2080 character string          2.110000   0.000000   2.110000 (  2.132072)
scan-precomp 2080 character string  0.010000   0.000000   0.010000 (  0.011119)
------------------------------------------------------------ total: 15.420000sec
                                        user     system      total        real
unpack 130 character string         1.270000   0.000000   1.270000 (  1.338689)
scan 130 character string           2.530000   0.000000   2.530000 (  2.710398)
scan-precomp 130 character string   0.010000   0.000000   0.010000 (  0.011328)
unpack 260 character string         1.350000   0.000000   1.350000 (  1.445329)
scan 260 character string           2.420000   0.000000   2.420000 (  2.532545)
scan-precomp 260 character string   0.000000   0.000000   0.000000 (  0.010712)
unpack 520 character string         1.080000   0.010000   1.090000 (  1.086219)
scan 520 character string           2.120000   0.000000   2.120000 (  2.128990)
scan-precomp 520 character string   0.010000   0.000000   0.010000 (  0.010815)
unpack 1040 character string        1.080000   0.000000   1.080000 (  1.078558)
scan 1040 character string          2.120000   0.000000   2.120000 (  2.129707)
scan-precomp 1040 character string  0.010000   0.000000   0.010000 (  0.010945)
unpack 2080 character string        1.210000   0.000000   1.210000 (  1.267488)
scan 2080 character string          2.460000   0.000000   2.460000 (  2.627165)
scan-precomp 2080 character string  0.010000   0.000000   0.010000 (  0.012961)
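The timings above barely change as the labelled string length doubles, which suggests that every block ran against the same string. Below is a hedged sketch of how the benchmark might be restructured so that each report times its own input and the constant pattern really matches one character. The helper names (chars_with_unpack and so on), ANY_CHAR and ITERATIONS are hypothetical, introduced only for this illustration; no timings are claimed for it.

```ruby
require 'benchmark'

ANY_CHAR   = /./   # hypothetical constant: matches any single character
ITERATIONS = 100

def chars_with_unpack(s)
  s.unpack('a' * s.length)
end

def chars_with_scan(s)
  s.scan(/./)
end

def chars_with_scan_const(s)
  s.scan(ANY_CHAR)
end

# Build all inputs up front so each report block closes over its own string.
base   = "abcdefghijklmnopqrstuvwxyz" * 5
inputs = (0...5).map { |i| base * (2**i) }   # lengths 130, 260, 520, 1040, 2080

# Fail early if the three approaches disagree on any input.
inputs.each do |s|
  expected = chars_with_unpack(s)
  unless chars_with_scan(s) == expected && chars_with_scan_const(s) == expected
    raise "methods disagree for a #{s.length} character string"
  end
end

Benchmark.bmbm do |x|
  inputs.each do |s|
    x.report("unpack #{s.length} character string") do
      ITERATIONS.times { chars_with_unpack(s) }
    end
    x.report("scan #{s.length} character string") do
      ITERATIONS.times { chars_with_scan(s) }
    end
    x.report("scan-const #{s.length} character string") do
      ITERATIONS.times { chars_with_scan_const(s) }
    end
  end
end
```

As far as I know, a regexp literal without interpolation is compiled once when the code is parsed, so the literal and the constant would be expected to perform about the same; running the sketch is the way to confirm that on a given Ruby version.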