newbie pattern match question :-)

davidcsnow

I'm new to Perl, and am trying to match a string enclosed within
parentheses, storing said string in a variable. Simple enough I would
guess.

An example of the text to search would be: "hello(world)", with the
variable holding 'world' once done.

How far off am I with:

$text = "hello(world)";
$new = $text =~ m/\(*\)/;

Quite far off, I am assuming!!

Thanks in advance,
David.
 
Gunnar Hjalmarsson

> How far off am I with:
>
> $text = "hello(world)";
> $new = $text =~ m/\(*\)/;

There are at least three things that need to be corrected:

1) The '*' symbol is a quantifier in a Perl regular expression, meaning
zero or more of the preceding character or character class.
Accordingly, you need to insert that character or character class
before the '*'.

perldoc perlre

2) You need parentheses to make the regex capture the desired string.

3) Also, $new needs to be surrounded by parentheses in order to enforce
list context.

perldoc perlop (the m// operator)
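
Putting the three corrections together could look roughly like the sketch below. The choice of ".*" as the thing for '*' to repeat is an assumption (the points above do not say which character class to use), and "use strict", "use warnings" and the $matched variable are additions for illustration only.

```perl
use strict;
use warnings;

my $text = "hello(world)";

# Scalar context: the match only reports success or failure.
my $matched = $text =~ m/\((.*)\)/;

# List context (note the parentheses around $new): the match
# returns the captured substring instead.
my ($new) = $text =~ m/\((.*)\)/;

print "scalar context: $matched\n";
print "list context:   $new\n";
```

The only difference between the two assignments is the pair of parentheses on the left-hand side, which is what point 3 is about.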
 
Chris Mattern

> How far off am I with:
>
> $text = "hello(world)";
> $new = $text =~ m/\(*\)/;
> Quite far off, I am assuming!!

Close, but not quite. First problem: you are making a typical
newbie mistake--confusing regexp syntax with the file-matching
or "globbing" syntax found in many shells (and, for that
matter, in Perl when you do filename globbing). "*" doesn't
mean "match any number of any character"; it means "match
any number (including zero) of the preceding character."
So this matches zero or more left parens, followed
by one right paren. In the string "hello(world)" your regexp
will match ")". The regexp character for "match any
character (except newline)" is the period (.). So the
expression you want for "match any number of any characters"
is ".*".
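
To see the difference for yourself, something like this should do. It is only a sketch: the extra pair of capturing parentheses around each whole pattern is my addition so the matched text can be printed, and the variable names are made up for the demonstration.

```perl
use strict;
use warnings;

my $text = "hello(world)";

# Original pattern: zero or more "(" followed by one ")".
# The outer parentheses capture whatever the whole pattern matched.
my ($star_only) = $text =~ m/(\(*\))/;

# Same idea, but with "." giving "*" something to repeat.
my ($dot_star) = $text =~ m/(\(.*\))/;

print "original pattern matched: $star_only\n";
print "with .* it matched:       $dot_star\n";
```

The first line printed shows just the closing paren; the second shows the whole parenthesized part, parens included.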

Second problem: m// doesn't return what it matched. Well,
it can, but not in a scalar context, and the methods by which
you can get it to return what's matched are not, I think,
the most useful for what you're trying to do here. m//
returns 1 if you got a match and "" if you didn't, which
is handy if you're using it to check a conditional, but
less so if you want to extract what was matched. There
are a variety of ways to get what was matched, but for
your case the best way is probably to use capturing
parentheses, particularly since you *don't* want the
whole match--you want to strip off the parentheses.
This will do the trick:

if ($text =~ m/\((.*)\)/) {
    $new = $1;
}
else {
    # it wasn't there!
}

After a successful match, the capturing parentheses
populate the number variables--the first capture
goes in $1, the second in $2 and so on. Note that
these variables *only* get changed if the match
was successful, which is why the code I gave
makes the assignment only on a successful match.
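
Here is a small sketch of why that matters. The second test string and all the variable names are invented for the demonstration; the point is only that a failed match leaves the old value of $1 lying around.

```perl
use strict;
use warnings;

# First match succeeds and sets $1.
my $first_ok      = "hello(world)" =~ m/\((.*)\)/;
my $after_success = $1;

# Second match fails, so $1 is left untouched.
my $second_ok     = "no parens here" =~ m/\((.*)\)/;
my $after_failure = $1;

print "first match ok:  ", ($first_ok  ? "yes" : "no"), ", \$1 = $after_success\n";
print "second match ok: ", ($second_ok ? "yes" : "no"), ", \$1 = $after_failure\n";
```

Both lines report the same $1, even though the second match found nothing, so always check the result of the match before using $1.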

--
Christopher Mattern

"Which one you figure tracked us?"
"The ugly one, sir."
"...Could you be more specific?"
 
davidcsnow

Thanks for the help guys - great advice there!!!

The below does exactly what I was looking for, so thanks for the help,
and for the advice on the doc.

$text = "hello(world)";
($new) = $text =~ /\((.*)\)/;

kind regards,
David.
 
A. Sinan Unur

> $text = "hello(world)";
> ($new) = $text =~ /\((.*)\)/;

You might also benefit from reading

perldoc -q matching

Sinan
 
Tad McClellan

> The below does exactly what I was looking for,

Have you tried it with a string like this yet?

my $text = "hello(world) hello(dolly)";

> $text = "hello(world)";
> ($new) = $text =~ /\((.*)\)/;

Why aren't there "my" declarations on those statements?

You do have "use strict" turned on, don't you?



You are likely to want non-greedy matching:

my($new) = $text =~ /\((.*?)\)/;

or greedy-but-not-paren matching:

my($new) = $text =~ /\(([^)]*)\)/;

which is altogether too ugly, so let's do an "x"treme makeover:

my ($new) = $text =~ /
    \(           # an open paren
    (            # memory 1 contains...
        [^)]*    # ...chars that are not close parens
    )            # end of memory 1
    \)           # a close paren
/x;

or, if you _want_ to get more than just the first one:

my @new = $text =~ /
    \(           # an open paren
    (            # memory 1 contains...
        [^)]*    # ...chars that are not close parens
    )            # end of memory 1
    \)           # a close paren
/xg;
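
To compare the variants side by side on the two-paren string, a quick sketch along these lines should work (the variable names are made up for the comparison; the patterns are the ones shown above):

```perl
use strict;
use warnings;

my $text = "hello(world) hello(dolly)";

my ($greedy)    = $text =~ /\((.*)\)/;       # greedy: runs to the LAST close paren
my ($lazy)      = $text =~ /\((.*?)\)/;      # non-greedy: stops at the first close paren
my ($not_paren) = $text =~ /\(([^)]*)\)/;    # greedy, but cannot cross a close paren
my @all         = $text =~ /\(([^)]*)\)/g;   # every parenthesized chunk

print "greedy:    $greedy\n";
print "lazy:      $lazy\n";
print "not-paren: $not_paren\n";
print "all:       @all\n";
```

The greedy version swallows everything between the first open paren and the last close paren, which is almost certainly not what you want with input like this.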
 
