regular expressions question

Gregory Brown

It reminds me of a saying: "If you throw a rock into a pack of dogs, the
dog that yelps loudest is the one that got hit."

That's the point you missed. Rubyists don't throw rocks at dogs.
 
Zach Dennis

Jeff said:
Quite simply, Ruby is *supposed* to be about consistency ... Having the
"everything is an object, principle of least surprise" mantra, then
using these which act like a global ( $ ) but aren't actually ( local
scope ) is just vile.
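
For anyone unsure what is meant by variables that look global but are not, here is a minimal sketch in Ruby. The method names are made up for illustration only. What it relies on is standard Ruby behaviour: $~ and the numbered variables derived from it ($1, $2, ...) are local to the current method and thread, despite the $ prefix.

```ruby
# Hypothetical helper, used only to show scoping.
def match_inside_method
  "abc" =~ /(b)/   # sets $~ and $1 for this method's frame only
  $1
end

# Hypothetical driver: match in one frame, then call a method that matches again.
def scope_demo
  "xyz" =~ /(y)/
  inner = match_inside_method
  # $1 here still refers to the match made in this method, not the inner one.
  [$1, inner]
end

p scope_demo   # => ["y", "b"]
```

If $1 were a true global, the first element would have been overwritten with "b" by the inner match.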

You have made a common mistake: you are thinking Ruby is meant for your
principle of least surprise, when it is Matz's least surprise...

Quoting Matz,

"Besides that, he doesn't understand what POLS means. Since someone
will surprise for any arbitrary choice, it is impossible to satisfy
"least surprise" in his sense. The truth is two folds: a) when there
are two or more choices in the language design decision, I take the
one that makes _me_ surprise least. "

http://groups.google.com/group/comp.lang.ruby/msg/875965f29cfb77bc

Zach
 
Ryan Leavengood

I never called out anyone in particular, didn't have anyone in mind in
fact, but it's funny how some people are insecure enough that they're
getting offended by what I wrote, and getting so defensive that they
feel the need to lash out in return.

I'm not sure if anyone lashed out really, just pointed out your
rudeness for your own good. You see, by writing the way you have here,
you have sown ill will, and any further communication from you will
automatically be tainted by your previous behavior. Your opinion will
automatically be deemed less important than most other people's. I'm
sure you don't care, in your infinite wisdom and unfailing purity of
perfect thought, but I figured you should know.

It reminds me of a saying: "If you throw a rock into a pack of dogs, the
dog that yelps loudest is the one that got hit."

That's lovely, LOL.

Ryan
 
Adam Sroka

<<bunch of snipped stuff about rudeness and such>>

I'm kind of new here, but it seems like debating the quality of one
another's rudeness is slightly OT.

Incidentally, I earned my living as a back-end Perl programmer for a
couple years early in my career. I wrote literate OO code in Perl. The
thing I loved about Perl was that, like a natural language, there were
several ways to say the same thing: some obvious, some flowery, some a
bit too terse. Among the things I did *not* like about Perl were some of
the wacky implicit variables like $1, etc, and especially @_. However,
I never needed to use them explicitly, because when I wanted them they
were always there implicitly (Those who know Perl know what I am saying.)

My point is this: I am (or was) a Perl programmer and I do not like the
"Perlish" syntax one bit. I didn't like it in Perl, and I don't like it
in Ruby either. However, I object to the notion that Perlishness makes
it offensive. Non-obviousness is what makes it offensive. The matcher
syntax is much clearer.
 
Ryan Leavengood

I'm kind of new here, but it seems like debating the quality of one
another's rudeness is slightly OT.

You are right of course, plus I was starting to get hypocritical and
be rude myself.

My point is this: I am (or was) a Perl programmer and I do not like the
"Perlish" syntax one bit. I didn't like it in Perl, and I don't like it
in Ruby either. However, I object to the notion that Perlishness makes
it offensive. Non-obviousness is what makes it offensive. The matcher
syntax is much clearer.

I had a strong dislike for Perl after some bad experiences at one job,
but for some reason the "Perlisms" in Ruby don't bother me as much,
because as a whole Ruby tends to be so much more readable. Like
anything, I think there are times when using the $1 variables is
appropriate, and times when using MatchData is appropriate. To say one
or the other is the only proper way to code Ruby regular expression
matching is getting a little too pedantic.
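
To make the two styles concrete, here is a minimal sketch of the same extraction written both ways. The input string, pattern, and method names are invented for illustration and do not come from the thread.

```ruby
# Style 1: Perl-like variables, set as a side effect of =~.
def version_with_globals(line)
  if line =~ /(\d+)\.(\d+)/
    [$1, $2]
  end
end

# Style 2: an explicit MatchData object returned by Regexp#match.
def version_with_matchdata(line)
  m = /(\d+)\.(\d+)/.match(line)
  m && [m[1], m[2]]
end

p version_with_globals("version 1.8")    # => ["1", "8"]
p version_with_matchdata("version 1.8")  # => ["1", "8"]
```

Both return nil when nothing matches. The first is shorter for a quick one-off test; the second keeps the match in a named local that can be passed around.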

In the same way I can't control perceived mailing list "rudeness", I
don't think the people in the MatchData camp can control how other
people code (no matter how passionate they are about the topic.)
Especially on a section of Ruby code style that is not really debated
(whereas lots of people will denounce camelCaseMethods etc.)

Anyhow, even this discussion is off-topic for the original post, so
I'm stopping here.

Ryan
 
Robert Klemme

ako... said:
yes, thank you. this is a better description of the problem. i am not
a native english speaker, so maybe this is one of the reasons why my
question is not clear.

i saw a solution to this problem that uses split at the end. it of
course won't work if you change your example and allow quoted strings
in source-words and destination-words. a quoted string can contain
anything, spaces too and your keywords too, so the subsequent split
won't work.

well, i did not realise that the term "group's captures" is that rare.
i thought it was a standard term. but maybe i am brainwashed by
microsoft. so i have this code in .net which might help to clarify
what i am talking about:

string text = "One car red car blue car";
string pat = @"^(?:(\w+)\s+)*(\w+)$";
Regex r = new Regex(pat, RegexOptions.IgnoreCase);

// Match the regular expression pattern against a text string.
Match m = r.Match(text);
if (m.Success)
{
    Console.WriteLine("match: [{0}]", m);
    foreach (Group g in m.Groups)
    {
        Console.WriteLine("group: [{0}]", g);
        foreach (Capture c in g.Captures)
        {
            Console.WriteLine("\tcapture: [{0}]", c);
        }
    }
}

the output is:

match: [One car red car blue car]
group: [One car red car blue car]
    capture: [One car red car blue car]
group: [blue]
    capture: [One]
    capture: [car]
    capture: [red]
    capture: [car]
    capture: [blue]
group: [car]
    capture: [car]

as you see, the first group is $0, the second group is $1, and the
third is $2. but $1 and $2 contain captures too. it is as if $1 and
$2 were arrays in Ruby.
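
For comparison, a minimal sketch of what Ruby returns for the same pattern and text. The method name is made up for this sketch. A group that repeats keeps only its last capture, so there is no list of captures per group.

```ruby
# Same pattern and text as the .NET example; the method name is hypothetical.
def ruby_groups(text)
  m = /^(?:(\w+)\s+)*(\w+)$/i.match(text)
  m && m.to_a   # [whole match, group 1, group 2]
end

p ruby_groups("One car red car blue car")
# => ["One car red car blue car", "blue", "car"]
```

Group 1 holds only "blue", the value from the final repetition, which corresponds to the "group: [blue]" line in the .NET output.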

in my opinion this is a big limitation of ruby's regular expressions.
it just must be as powerful as .net ;-)

konstantin

I don't know whether your question was answered in the lengthy thread
already. In case not: in Ruby to get all matches of a group you need to
iterate through the whole string with #scan. There is no such thing as this
feature of .net - and frankly I haven't missed it so far. To get at all the
words in your example (with the text in s) this is sufficient:
s.scan(/\w+/)
=> ["One", "car", "red", "car", "blue", "car"]

If you actually need group matches, you'll have to do something like this:
s.scan(/\w(\w+)/).map{|m| m[0]}
=> ["ne", "ar", "ed", "ar", "lue", "ar"]

Alternatively:
ma = []
=> []
s.scan(/\w(\w+)/) {|m| ma << m[0]}
=> "One car red car blue car"
ma
=> ["ne", "ar", "ed", "ar", "lue", "ar"]

Of course this is quite a silly example... The main point here is that you
must refrain from anchoring the regexp at the beginning if you want to
iterate like this.
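
One way to get close to the per-group capture lists from the .NET output is to validate the whole string with an anchored regexp first and then collect the pieces with an unanchored #scan. This is only a sketch under assumptions: the method name and the returned shape are invented here, and the validation step is an addition for illustration rather than part of the suggestion above.

```ruby
# Hypothetical helper: returns [captures of group 1, capture of group 2]
# for the pattern ^(?:(\w+)\s+)*(\w+)$, or nil if the text does not match.
def group_captures(text)
  # Anchored check: the whole string must be words separated by whitespace.
  return nil unless /\A(?:\w+\s+)*\w+\z/.match(text)

  # Unanchored scan: walk the string and collect every word.
  words = text.scan(/\w+/)
  [words[0...-1], words.last]
end

p group_captures("One car red car blue car")
# => [["One", "car", "red", "car", "blue"], "car"]
```

The anchored regexp only answers yes or no; the iteration is done by the unanchored one, for the reason given above.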

HTH

robert
 
