duplicating characters in a string

Adam Akhtar

If I have a string "abc" and want to turn it into "aabbcc", how do
I go about it?

I thought I'd convert it into an array and then use string.map{|x| x << x},

though I haven't tested that yet. Is there a better way, or a string method
that saves me from having to convert it into an array?
 
Tim Hunter

Adam said:
If I have a string "abc" and want to turn it into "aabbcc", how do
I go about it?

I thought I'd convert it into an array and then use string.map{|x| x << x},

though I haven't tested that yet. Is there a better way, or a string method
that saves me from having to convert it into an array?

I strongly doubt that this is the only way, but I'll offer

irb(main):004:0> "abc".gsub(/./) { |c| c + c}
=> "aabbcc"

Now let's see how many other ways people can think of...
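
For comparison, here is a minimal sketch that puts the block form next to the array route Adam asked about. It assumes Ruby 1.9 or later, where String#chars gives back the individual characters; the variable names are only for illustration.

```ruby
str = "abc"

# Block form of gsub: the block receives each matched character,
# and its return value replaces the match.
via_gsub = str.gsub(/./) { |c| c + c }

# The array route from the original question: split into characters,
# double each one, then join the pieces back into a single string.
via_map = str.chars.map { |c| c * 2 }.join

puts via_gsub  # aabbcc
puts via_map   # aabbcc
```

One difference to keep in mind: /./ does not match a newline unless the regexp has the m flag, so the two versions only agree on single-line input.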
 
Gary Wright

If I have a string "abc" and want to turn it into "aabbcc", how do
I go about it?

I thought I'd convert it into an array and then use string.map{|x| x << x},

though I haven't tested that yet. Is there a better way, or a string method
that saves me from having to convert it into an array?

Ask 10 Ruby programmers this question and I'm sure you'll get 11
different answers. If your definition of 'best' is 'fastest' you'll
simply have to code and benchmark a couple of solutions. Here is one
way to do it:
=> "aabbcc"

Gary Wright
 
Adam Akhtar

Ask 10 Ruby programmers this question and I'm sure you'll get 11
different answers. If your definition of 'best' is 'fastest' you'll
simply have to code and benchmark a couple of solutions.

Well, actually I'm new to Ruby, so although the fastest/best solutions would be
appreciated, I'd prefer ones that are not too advanced or
cryptic to read but at the same time not too inefficient.

Thanks for your suggestions so far!
 
7stud --

Adam said:
Well, actually I'm new to Ruby, so although the fastest/best solutions would be
appreciated, I'd prefer ones that are not too advanced or
cryptic to read but at the same time not too inefficient.

Thanks so far for your suggestions!!


str = 'abc'
repeat = 2

new_str = ""
# Walk the valid indexes only; the last one is str.length - 1.
0.upto(str.length - 1) do |i|
  new_str << str[i, 1] * repeat
end

puts new_str
 
7stud --

Adam said:
Well, actually I'm new to Ruby, so although the fastest/best solutions would be
appreciated, I'd prefer ones that are not too advanced or
cryptic to read but at the same time not too inefficient.

Thanks so far for your suggestions!!

str = 'abc'
duplicate = 2

new_str = ""
str.each_byte do |byte|
  duplicate.times do
    new_str << byte
  end
end

puts new_str
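
A side note on the byte loop: it is only equivalent to doubling characters when every character is a single byte. A hedged sketch of the same idea per character, assuming Ruby 1.9 or later for String#each_char; repeat_chars is a made-up name, not a core method.

```ruby
# Hypothetical helper: repeat every character of str `repeat` times.
def repeat_chars(str, repeat = 2)
  new_str = "".dup                 # unfrozen, empty buffer to append to
  str.each_char do |char|
    new_str << char * repeat       # append the character `repeat` times
  end
  new_str
end

puts repeat_chars("abc")           # aabbcc
puts repeat_chars("abc", 3)        # aaabbbccc
```

Because it iterates over characters rather than bytes, a multi-byte character such as "é" is repeated as a whole instead of being split.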
 
Sebastian Hungerecker

Robert said:
You do not need groups:

irb(main):003:0> "abc".gsub /./, '\\&\\&'
=> "aabbcc"

You do not need double backslashes either:
=> "aabbcc"
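
The snippet itself seems to have been lost from the post above; presumably it was the single-backslash spelling. A small sketch, offered as an assumption about what was meant, showing that inside single quotes both spellings are the same replacement string:

```ruby
# Inside single quotes, '\&' and '\\&' are the same two-character string:
# a backslash followed by an ampersand.
puts '\&' == '\\&'                 # true

# So either spelling works as a gsub replacement; \& stands for the whole match.
puts "abc".gsub(/./, '\&\&')       # aabbcc
puts "abc".gsub(/./, '\\&\\&')     # aabbcc
```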
 
Joel VanderWerf

Rodrigo said:
"abc".gsub(/./) { |x| x * 2 }

That's the most elegant way to refer to captures, but if speed matters:

"abc".gsub(/(.)/, "\\1\\1")

The "\\1" is always easy to mess up, which is why I generally prefer the
block form.

It's only about a factor of two faster, according to the following, but
that matters sometimes.

require 'benchmark'

Benchmark.bmbm do |b|
  s = "abc" * 1_000_000
  re = /(.)/

  b.report("s.gsub(re){...}") do
    s.gsub(re) { |x| x * 2 }
  end

  b.report("s.gsub(re,...)") do
    s.gsub(re, "\\1\\1")
  end
end

__END__

Rehearsal ---------------------------------------------------
s.gsub(re){...}   5.970000   0.000000   5.970000 (  6.009089)
s.gsub(re,...)    2.900000   0.060000   2.960000 (  3.148006)
------------------------------------------ total: 8.930000sec

                      user     system      total        real
s.gsub(re){...}   5.760000   0.010000   5.770000 (  5.925049)
s.gsub(re,...)    2.800000   0.060000   2.860000 (  2.914432)
 
Robert Klemme

2008/3/6 said:
You do not need double backslashes either:

=> "aabbcc"

I know but I prefer to have them in because it's clearer what happens.
Sometimes a single backslash works and sometimes not. IMHO some of
the recurring discussions about the number of backslashes needed for
replacement strings would not happen or be easier and shorter if there
was a clear rule that \x always results in a non escaped string or
error and in order to have a backslash in a string there must be two
in the source. In other words, I would forbid '\1' and instead
require '\\1'.

Kind regards

robert
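
To make the "sometimes a single backslash works and sometimes not" point concrete, a short sketch of how the quoting style changes what gsub actually receives (plain Ruby string-literal rules, nothing version-specific as far as I know):

```ruby
# Single quotes: a lone backslash before a digit is kept literally,
# so both spellings are the two-character string backslash + "1".
single  = '\1'
doubled = '\\1'
puts single == doubled             # true

# Double quotes: "\1" is an octal escape for the byte 0x01,
# so the back-reference is silently lost.
escaped = "\1"
puts escaped == "\x01"             # true

puts "abc".gsub(/(.)/, '\1\1')     # aabbcc
puts "abc".gsub(/(.)/, "\\1\\1")   # aabbcc
puts "abc".gsub(/(.)/, "\1\1") == "\x01\x01" * 3   # true
```

The last line is the failure mode: no error is raised, the replacement just turns into control characters.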
 
Robert Klemme

2008/3/6 said:
That's the most elegant way to refer to captures, but if speed matters:

"abc".gsub(/(.)/, "\\1\\1")

The "\\1" is always easy to mess up, which is why I generally prefer the
block form.

It's only about a factor of two faster, according to the following, but
that matters sometimes.

Rehearsal ---------------------------------------------------
s.gsub(re){...}   5.970000   0.000000   5.970000 (  6.009089)
s.gsub(re,...)    2.900000   0.060000   2.960000 (  3.148006)
------------------------------------------ total: 8.930000sec

                      user     system      total        real
s.gsub(re){...}   5.760000   0.010000   5.770000 (  5.925049)
s.gsub(re,...)    2.800000   0.060000   2.860000 (  2.914432)

What version did you test with? I get much more dramatic differences:

15:56:11 ~
$ ruby /c/Temp/gs.rb
Rehearsal -----------------------------------------------------------
s.gsub(re){|x| x * 2}    25.313000   0.031000  25.344000 ( 25.354000)
s.gsub(re){|x| x << x}   22.812000   0.000000  22.812000 ( 22.845000)
s.gsub(re, '\1\1')        6.516000   0.015000   6.531000 (  6.539000)
s.gsub(re, '\&\&')        6.578000   0.016000   6.594000 (  6.595000)
s.gsub(/./){|x| x * 2}   25.172000   0.016000  25.188000 ( 25.182000)
s.gsub(/./){|x| x << x}  22.843000   0.000000  22.843000 ( 22.857000)
s.gsub(/(.)/, '\1\1')     6.344000   0.015000   6.359000 (  6.355000)
s.gsub(/./, '\&\&')       6.188000   0.032000   6.220000 (  6.217000)
------------------------------------------------ total: 121.891000sec

                              user     system      total        real
s.gsub(re){|x| x * 2}    25.484000   0.015000  25.499000 ( 25.502000)
s.gsub(re){|x| x << x}   22.813000   0.031000  22.844000 ( 22.856000)
s.gsub(re, '\1\1')        6.312000   0.000000   6.312000 (  6.337000)
s.gsub(re, '\&\&')        6.359000   0.000000   6.359000 (  6.359000)
s.gsub(/./){|x| x * 2}   25.922000   0.000000  25.922000 ( 25.994000)
s.gsub(/./){|x| x << x}  22.672000   0.015000  22.687000 ( 22.707000)
s.gsub(/(.)/, '\1\1')     6.375000   0.016000   6.391000 (  6.389000)
s.gsub(/./, '\&\&')       6.235000   0.000000   6.235000 (  6.239000)
16:00:22 ~
$ ruby -v
ruby 1.8.6 (2007-03-13 patchlevel 0) [i386-cygwin]
16:00:26 ~
$ cat /c/Temp/gs.rb
require 'benchmark'

s = ("abc" * 1_000_000).freeze
re = /(.)/
re2 = /./

Benchmark.bmbm do |b|
  b.report("s.gsub(re){|x| x * 2}") do
    s.gsub(re) { |x| x * 2 }
  end

  b.report("s.gsub(re){|x| x << x}") do
    s.gsub(re) { |x| x << x }
  end

  b.report("s.gsub(re, '\\1\\1')") do
    s.gsub(re, "\\1\\1")
  end

  b.report("s.gsub(re, '\\&\\&')") do
    s.gsub(re, "\\&\\&")
  end

  b.report("s.gsub(/./){|x| x * 2}") do
    s.gsub(/./) { |x| x * 2 }
  end

  b.report("s.gsub(/./){|x| x << x}") do
    s.gsub(/./) { |x| x << x }
  end

  b.report("s.gsub(/(.)/, '\\1\\1')") do
    s.gsub(/(.)/, "\\1\\1")
  end

  b.report("s.gsub(/./, '\\&\\&')") do
    s.gsub(/./, "\\&\\&")
  end
end
16:00:31 ~
$

Kind regards

robert
 
Joel VanderWerf

Robert said:
What version did you test with? I get much more dramatic differences:

$ ruby -v
ruby 1.8.6 (2007-09-24 patchlevel 111) [i686-linux]

Surprising. The slower cases in your results are all the block cases.
Does cygwin ruby perform poorly with blocks in general? Or is it because
my ruby is compiled for i686 instead of i386?
 
Todd Benson

On Fri, Mar 7, 2008 at 9:02 AM, Robert Klemme said:
What version did you test with? I get much more dramatic differences:

FreeBSD was way faster (probably due to optimized compiling) but the
differences were about the same. This is on the same machine...


FreeBSD:
Rehearsal -----------------------------------------------------------
s.gsub(re){|x| x * 2}     9.523438   0.015625   9.539062 (  9.875376)
s.gsub(re){|x| x << x}    8.453125   0.031250   8.484375 (  9.576175)
s.gsub(re, '\1\1')        3.320312   0.039062   3.359375 (  3.482295)
s.gsub(re, '\&\&')        3.429688   0.000000   3.429688 (  3.545766)
s.gsub(/./){|x| x * 2}    9.523438   0.015625   9.539062 (  9.885323)
s.gsub(/./){|x| x << x}   8.460938   0.015625   8.476562 (  9.092159)
s.gsub(/(.)/, '\1\1')     3.312500   0.007812   3.320312 (  3.436467)
s.gsub(/./, '\&\&')       3.117188   0.007812   3.125000 (  3.251097)
------------------------------------------------- total: 49.273438sec

                              user     system      total        real
s.gsub(re){|x| x * 2}     9.539062   0.007812   9.546875 (  9.913748)
s.gsub(re){|x| x << x}    8.312500   0.023438   8.335938 (  8.757943)
s.gsub(re, '\1\1')        3.335938   0.000000   3.335938 (  3.456838)
s.gsub(re, '\&\&')        3.335938   0.007812   3.343750 (  3.463104)
s.gsub(/./){|x| x * 2}    9.351562   0.031250   9.382812 (  9.720232)
s.gsub(/./){|x| x << x}   8.343750   0.023438   8.367188 (  8.666177)
s.gsub(/(.)/, '\1\1')     3.304688   0.031250   3.335938 (  3.447005)
s.gsub(/./, '\&\&')       3.093750   0.039062   3.132812 (  3.404753)


WinXP:
Rehearsal -----------------------------------------------------------
s.gsub(re){|x| x * 2}    25.844000   0.235000  26.079000 ( 26.062000)
s.gsub(re){|x| x << x}   22.906000   0.312000  23.218000 ( 23.250000)
s.gsub(re, '\1\1')        8.016000   0.391000   8.407000 (  8.406000)
s.gsub(re, '\&\&')        8.000000   0.484000   8.484000 (  8.485000)
s.gsub(/./){|x| x * 2}   25.437000   0.438000  25.875000 ( 25.875000)
s.gsub(/./){|x| x << x}  22.766000   0.437000  23.203000 ( 23.218000)
s.gsub(/(.)/, '\1\1')     7.875000   0.656000   8.531000 (  8.532000)
s.gsub(/./, '\&\&')       7.609000   0.500000   8.109000 (  8.109000)
------------------------------------------------ total: 131.906000sec

                              user     system      total        real
s.gsub(re){|x| x * 2}    25.750000   0.609000  26.359000 ( 26.359000)
s.gsub(re){|x| x << x}   23.469000   0.266000  23.735000 ( 23.734000)
s.gsub(re, '\1\1')        8.500000   0.609000   9.109000 (  9.109000)
s.gsub(re, '\&\&')        8.296000   0.672000   8.968000 (  8.969000)
s.gsub(/./){|x| x * 2}   26.032000   0.391000  26.423000 ( 26.422000)
s.gsub(/./){|x| x << x}  23.234000   0.390000  23.624000 ( 23.625000)
s.gsub(/(.)/, '\1\1')     8.734000   0.547000   9.281000 (  9.282000)
s.gsub(/./, '\&\&')       8.235000   0.656000   8.891000 (  8.890000)


My personal favorite, though, is still s.gsub(/./) {|x| x * 2}.

Todd
 
7stud --

Robert said:
What version did you test with? I get much more dramatic differences:


And taking your non-perl champion:

s.gsub(/(.)/, '\1\1')

(I refuse to consider any code that uses perl syntax), and pitting it
against:

s = 'abc'
new_str = ""
s.each_byte do |byte|
  2.times do
    new_str << byte
  end
end


I get:

gsub:
t1 exec time(1,000,000 loops): 5.625847 total

each_byte:
t2 exec time(1,000,000 loops): 5.325978 total
 
Joel VanderWerf

7stud said:
And taking your non-perl champion:

s.gsub(/(.)/, '\1\1')

(I refuse to consider any code that uses perl syntax), and pitting it
against:

s = 'abc'
new_str = ""
s.each_byte do |byte|
  2.times do
    new_str << byte
  end
end


I get:

gsub:
t1 exec time(1,000,000 loops): 5.625847 total

each_byte:
t2 exec time(1,000,000 loops): 5.325978 total

Hm, I don't see an improvement, unless I replace 2.times... with an
explicit unrolling of that inner loop (which is heading downhill in the
elegance department):

require 'benchmark'

Benchmark.bmbm do |b|
  s = "abc" * 1_000_000
  re = /(.)/

  b.report("s.gsub(re) {|x| x*2}") do
    s.gsub(re) {|x| x*2}
  end

  # Backslashes are doubled in this label because it is double-quoted;
  # a bare "\1" there would be an octal escape, not a backslash and a 1.
  b.report("s.gsub(re, '\\1\\1')") do
    s.gsub(re, '\1\1')
  end

  b.report("7stud1") do
    new_str = ""
    s.each_byte do |byte|
      2.times do
        new_str << byte
      end
    end
  end

  b.report("7stud2") do
    new_str = ""
    s.each_byte do |byte|
      new_str << byte << byte
    end
  end
end

__END__

Rehearsal ---------------------------------------------------
s.gsub(re){...}   6.220000   0.010000   6.230000 (  6.325230)
s.gsub(re,...)    3.020000   0.060000   3.080000 (  3.115286)
7stud             4.110000   0.000000   4.110000 (  4.142912)
7stud             2.050000   0.000000   2.050000 (  2.072398)
----------------------------------------- total: 15.470000sec

                      user     system      total        real
s.gsub(re){...}   5.910000   0.020000   5.930000 (  5.993456)
s.gsub(re,...)    3.000000   0.000000   3.000000 (  3.017119)
7stud             4.100000   0.020000   4.120000 (  4.300772)
7stud             2.030000   0.010000   2.040000 (  2.068190)
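
Before reading much into timings like these, it may be worth confirming that the candidates actually produce the same string. A quick sketch on a small input; the variable names are mine, and the byte loops only match the gsub forms because the input is plain ASCII.

```ruby
s = "abc" * 3
re = /(.)/

block_form  = s.gsub(re) { |x| x * 2 }
string_form = s.gsub(re, '\1\1')

# Byte loop with the inner 2.times block.
looped = "".dup
s.each_byte { |byte| 2.times { looped << byte } }

# Byte loop with the inner loop unrolled by hand.
unrolled = "".dup
s.each_byte { |byte| unrolled << byte << byte }

results = [block_form, string_form, looped, unrolled]
puts results.uniq.length == 1   # true
puts block_form                 # aabbccaabbccaabbcc
```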
 
Sebastian Hungerecker

Robert said:
In other words, I would forbid '\1' and instead
require '\\1'.

Then why have single quotes at all?
Personally, I'd rather have \ always be a literal backslash inside single
quotes and just live with the fact that you can't have single quotes inside
single quotes.
 
William James

(I refuse to consider any code that uses perl syntax)

So you won't be able to do this any more:

2 + 2

You'll have to use Forth:

2 2 +

or Lisp:

(+ 2 2)

Some awk syntax:

gsub( /./, "&&", s )
 
William James

If I have a string "abc" and want to turn it into "aabbcc", how do
I go about it?

I thought I'd convert it into an array and then use string.map{|x| x << x},

though I haven't tested that yet. Is there a better way, or a string method
that saves me from having to convert it into an array?

s = "aabbcc"
==>"aabbcc"
s = s.squeeze
==>"abc"
s = s.unsqueeze
==>"aabbcc"

Yes, that should have been saved till April 1.
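
For anyone tempted to try it: String#squeeze is real, but unsqueeze is not a core method. A hypothetical stand-in, written as a plain helper rather than a String patch and built on the gsub block form from earlier in the thread, might look like this:

```ruby
# String#squeeze is real: it collapses runs of repeated characters.
puts "aabbcc".squeeze            # abc

# "unsqueeze" is made up; this helper repeats every character `times` times.
# The m flag lets . match newlines too.
def unsqueeze(str, times = 2)
  str.gsub(/./m) { |c| c * times }
end

puts unsqueeze("abc")            # aabbcc
puts unsqueeze("abc", 3)         # aaabbbccc
```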
 
