Regex confusion...

guthrie

sorry for the beginner question, but...

With this code
my $img = "0-12345-abc";
print " Match.1 ", (defined $img);
print " Match.2 ", ($img =~ /\S/);
print " Matched::", $1;
print " Match.3 ", ($img =~ /^(\d)-(\d+)-(\w)$/);
print " Matched::", $1, ", ", $2, ", ", $3;
I expected:
true, true, "0-12345-abc", true,
and then: 0, 12345, abc

Instead I get:
true true "" false
"" "" ""

Actually: img= (0-52557-wind)
Match.1 1 Match.2 1 Matched::
Match.3 Matched::, ,

Seems simple enough, what am I missing!
- why doesn't the first match, against /\S/, return the full string?
- why doesn't the second (extracting) match work?

The actual code I'm trying for is:
if(defined $img and $img =~ /\S/) {
if ($img =~ /^(\d)-(\d+)-(\w)$/)
{ my ($t, $zip, $type) = ($1, $2, $3); }
else { die "ERROR: invalid URL arguments :: ${img}\n";}
print "Debug: (", $t, ", ", $zip, ", ", $type, ")\n"; ## Debug...

Thanks for any hints. Sorry for the confusion!
Greg
 
guthrie

Oops, small correction;
With this code
my $img = "0-12345-abc";
print " Match.1 ", (defined $img);
print " Match.2 ", ($img =~ /\S/);
print " Matched::", $&;
print " pre-Matched::", $`, "\n";
print " post-Matched::", $', "\n";
I get::
Match.1 1 Match.2 1 Matched::0
pre-Matched::
post-Matched::-52557-wind

I expected:
pre="", match="0-12345-abc", post=""
 
Narthring

$img =~ /^(\d)-(\d+)-(\w+)$/

\w matches a single 'word' character, not an entire word.
 
John W. Krahn

guthrie said:
Seems simple enough, what am I missing!
- why doesn't the full string first match against /\S/ return the
string

The character class \S matches a single character so it can't match the full
string. The expression ($img =~ /\S/) will only return "true" or "false"
because you don't use the /g global option and/or you don't have any capturing
parentheses in the pattern.
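
A minimal sketch of the point above, using the sample string from the question. The variable names (@plain, @one_char, @whole) are only for illustration and are not from the original post. It shows what the match returns in list context with and without capturing parentheses:

```perl
use strict;
use warnings;

my $img = "0-12345-abc";

# No capturing parentheses: in list context a successful match returns (1).
my @plain = ($img =~ /\S/);

# With capturing parentheses, the captured text is returned instead.
my @one_char = ($img =~ /(\S)/);    # \S is one non-whitespace character
my @whole    = ($img =~ /(\S+)/);   # \S+ is a run of them

print "plain:    @plain\n";      # plain:    1
print "one char: @one_char\n";   # one char: 0
print "whole:    @whole\n";      # whole:    0-12345-abc
```

So to get the whole string back, the pattern needs both a quantifier (\S+) and parentheses around it.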

- why doesn't the second (extracting) match work.

Because the pattern /^(\d)-(\d+)-(\w)$/ doesn't match the string
'0-52557-wind'. -(\w)$ will only match one character between a hyphen and
the end of the line but your string has four characters (wind) between the
hyphen and the end of the line.
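
As a quick check of this, here is a small sketch that runs both patterns against the real string. The names $single and $plus are made up for the example:

```perl
use strict;
use warnings;

my $img = "0-52557-wind";

# (\w) allows exactly one word character before the end of the string.
my $single = ($img =~ /^(\d)-(\d+)-(\w)$/)  ? "match" : "no match";

# (\w+) allows one or more, so "wind" fits.
my $plus   = ($img =~ /^(\d)-(\d+)-(\w+)$/) ? "match" : "no match";

print "with \\w:  $single\n";   # with \w:  no match
print "with \\w+: $plus\n";     # with \w+: match
```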



John
 
guthrie

-- Many thanks, very silly of me.

I thought these were word & space matches, not just a single
character.
I did (mis-) read the documentation several times! :)

Thanks again.
Greg
 
Ben Morrow

guthrie said:
sorry for the beginner question, but...

With this code
my $img = "0-12345-abc";
print " Match.1 ", (defined $img);
print " Match.2 ", ($img =~ /\S/);
print " Matched::", $1;

You should never use the $N variables without checking the match
succeeded. In any case, your pattern has no capturing parens, so $1 will
be empty.
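
One reason the check matters, sketched with made-up strings ($first, $second) that are not from the thread: a failed match does not reset the capture variables, so $1 can still hold the value from an earlier successful match.

```perl
use strict;
use warnings;

my $first  = "abc";
my $second = "xyz";

$first =~ /(b)/;                        # succeeds; $1 is now "b"
my $ok = ($second =~ /(q)/) ? 1 : 0;    # fails; $1 is left untouched
my $stale = $1;                         # still "b" from the first match

print "second match ok: $ok\n";         # second match ok: 0
print "\$1 is now: $stale\n";           # $1 is now: b
```

Reading $1 only inside the branch where the match succeeded avoids picking up a leftover value like this.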

Others have already noted that \S and \w only match single characters.
The actual code I'm trying for is:
if(defined $img and $img =~ /\S/) {
if ($img =~ /^(\d)-(\d+)-(\w)$/)
{ my ($t, $zip, $type) = ($1, $2, $3); }

This can be simplified to

if (
my ($t, $zip, $type) =
$img =~ /^(\d)-(\d+)-(\w+)$/
) {

which avoids the need to use the $N variables altogether.
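
A runnable sketch of how that might look in full. The sub name parse_img is hypothetical and only wraps the construct above so it can be tried on its own. The list assignment is true when the match succeeds and false otherwise, and $t, $zip and $type are in scope inside the block:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the three fields, or an empty list on failure.
sub parse_img {
    my ($img) = @_;
    return unless defined $img;

    if (my ($t, $zip, $type) = $img =~ /^(\d)-(\d+)-(\w+)$/) {
        # The captures are already copied into lexicals here.
        return ($t, $zip, $type);
    }
    return;
}

my @parts = parse_img("0-52557-wind")
    or die "ERROR: invalid URL arguments\n";
print "Debug: (", join(", ", @parts), ")\n";   # Debug: (0, 52557, wind)
```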

Ben
 
comp.llang.perl.moderated

...





It might be quicker to check for success first, then do the assignment

$_ = "......";
if ( /^(\d)-(\d+)-(\w+)$/ )
{
# use $1, $2, $3 or assign
($t, $zip, $type) = ($1, $2,
^^^^^^^^^^^^^^^^^^^^^^^^^^

Are you sure? Wouldn't your solution require an extra copy from $N?
 
