Characters and strings oddness

Stefan Kruger

Hi there Rubyists -

I'm trying to learn the language, coming from a long background in Perl.

Here's what I want to do: pick out a double-quoted string inside another
string, respecting embedded, backslashed quotes:

line = ' a string "properly \"quoted\" that ends" here '
quoted = '"properly \"quoted\" that ends"'

If Ruby had the same regexes as Perl, I'd say something like

line.gsub!(/^\s*(\".*?(?<!\\)\")\s*/, '')

and have my quoted string pop out in $1. In fact, TextMate groks that
regex, too. Ruby doesn't like the negative look-behind, unfortunately.

Ok, let's do a loop then. I finally arrived at:

string = "\""
if line.gsub!(/^.*?(\")/, '')
  (0..line.length).each do |i|
    string << line[i]
    break if i>0 && (line[i, 1] == "\"") && (line[i-1, 1] != '\\')
  end
end

which works. However, I'm puzzled by Ruby's way of handling strings. A string
is - essentially - a sequence of bytes, not unlike a char[] in C. Is there
really no way of defining character literals in Ruby? I was surprised to find
that I couldn't say

Stefan-Krugers-Computer:~ stefan$ irb
irb(main):001:0> string = 'this is a string'
=> "this is a string"
irb(main):002:0> string[0] == 't'
=> false

whereas I *can* say

irb(main):003:0> string[0, 1] == 't'
=> true

Now, in my little loop experiment above I tried the following:

delim = "\""
string = delim
if line.gsub!(/^.*?(\")/, '')
  (0..line.length).each do |i|
    string << line[i]
    break if i>0 && (line[i, 1] == delim) && (line[i-1, 1] != '\\')
  end
end

TypeError: can't convert nil into String

method << in test.rb at line 9
at top level in test.rb at line 9
at top level in test.rb at line 8

I'm at a loss to understand why that gives an error.
 
Zachary Holt

Stefan said:
Hi there Rubyists -

Hi.

Stefan said:
irb(main):002:0> string[0] == 't'

string[0].chr == 't'
or
string[0] == ?t

Stefan said:
delim = "\""
string = delim
if line.gsub!(/^.*?(\")/, '')
  (0..line.length).each do |i|
    string << line[i]
    break if i>0 && (line[i, 1] == delim) && (line[i-1, 1] != '\\')
  end
end

TypeError: can't convert nil into String

method << in test.rb at line 9
at top level in test.rb at line 9
at top level in test.rb at line 8

I'm at a loss to understand why that gives an error.

Try 0...line.length (three dots, not two). You're indexing past the
end of the string.
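
A minimal sketch of the loop with the exclusive range suggested above. It makes two assumptions that the thread does not state: the append inside the loop is taken to be a single character (rest[i, 1]), and string starts from a copy of delim (delim.dup), because "string = delim" makes both names refer to the same String object and << modifies that object in place, which would stop the comparison against delim from ever matching. The name rest is introduced here for illustration, and the sample line is assumed to contain an opening quote.

```ruby
line  = ' a string "properly \"quoted\" that ends" here '
delim = '"'

# Copy the delimiter: `string = delim` would alias the same object,
# so appending to `string` would also change `delim`.
string = delim.dup

# Drop everything up to and including the opening quote.
rest = line.sub(/^.*?"/, '')

# Exclusive range (three dots): i stops at rest.length - 1,
# so the loop never indexes past the end of the string.
(0...rest.length).each do |i|
  string << rest[i, 1]
  break if i > 0 && rest[i, 1] == delim && rest[i - 1, 1] != '\\'
end

puts string
```

With the inclusive range 0..rest.length, the last index is one past the final character; indexing there with a single integer returns nil, which << cannot append.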
 
Phrogz

Stefan said:
irb(main):002:0> string[0] == 't'

Zachary said:
string[0].chr == 't'
or
string[0] == ?t

or string[0..0] == 't'

It's a common gotcha that indexing a string by a single integer
returns the (integer) CODE of the character/byte at that location, not
the one-character string containing that byte.
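
The behaviour described above applies to Ruby 1.8; from Ruby 1.9 onward, string[0] returns a one-character string instead of an integer. The sketch below sticks to comparisons that give the same answer on either side of that change. The variable names first_by_length and first_by_range are illustrative only.

```ruby
string = 'this is a string'

# Start index plus length: always a one-character String.
first_by_length = string[0, 1]

# A one-element range: also always a String.
first_by_range = string[0..0]

puts first_by_length == 't'   # true
puts first_by_range == 't'    # true

# ?t is the character literal: an Integer code in Ruby 1.8,
# a one-character String in 1.9+. Either way it has the same
# type as string[0], so the comparison holds.
puts string[0] == ?t          # true
```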
 
Daniel DeLorme

Stefan said:
line = ' a string "properly \"quoted\" that ends" here '
quoted = '"properly \"quoted\" that ends"'

If Ruby had the same regexes as Perl, I'd say something like

line.gsub!(/^\s*(\".*?(?<!\\)\")\s*/, '')

and have my quoted string pop out in $1. In fact, TextMate groks that
regex, too.

How is that supposed to work? With the initial anchor there's no way
that regexp would ever match, even if Ruby supported look-behind: the
sample line has other text before the opening quote. I suggest this:
/"((?:\\.|[^\\])+)"/

Daniel
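
A short sketch of the regex approach on Stefan's sample line. Two details here are assumptions rather than anything from the thread: the character class also excludes the double quote ([^"\\]), so the match stops at the first unescaped closing quote instead of relying on backtracking, and the look-behind variant assumes Ruby 1.9 or later, where the regex engine supports (?<!...). The names escaped_aware and lookbehind are illustrative.

```ruby
line   = ' a string "properly \"quoted\" that ends" here '
quoted = '"properly \"quoted\" that ends"'

# Either an escaped character (backslash + anything) or any
# character that is neither a quote nor a backslash.
escaped_aware = /"(?:\\.|[^"\\])*"/

# Ruby 1.9+ only: lazy match up to a quote not preceded by a backslash.
# No leading ^ anchor, since text precedes the opening quote.
lookbehind = /".*?(?<!\\)"/

puts line[escaped_aware] == quoted   # true
puts line[lookbehind] == quoted      # true
```

The look-behind form treats any quote that follows a backslash as escaped, so it mishandles a quoted string that ends in an escaped backslash. The alternation form consumes escapes in pairs and does not have that problem.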
 
