Regex: Exact semantics of ^ and $ when using /m

Wolfgang Thomas · Jun 24, 2006

Hi,

I am afraid that this question has been asked before, but I could not
find the answer in the FAQ nor in the "Programming Perl" book, nor by
googling.

My question refers to the /m modifier for regular expressions.
According to "Programming Perl" /m lets ^ and $ match next to new lines
within the string instead of considering only the beginning and end of
the string.

Therefore I wonder why the following example does not match:

my $s = "123\n456";
if ($s =~ /3$^4/m) {print "match (4)\n";}

Even more confusing (for me) is that
if ($s =~ /3$4/m) {print "match (2)\n";}
matches, whereas
if ($s =~ /34/m) {print "match (3)\n";}
does not match.

Could someone please point me to an explanation of that behavior?

DJ Stunks · Jun 24, 2006

Wolfgang said:
Hi,

I am afraid that this question has been asked before, but I could not
find the answer in the FAQ nor in the "Programming Perl" book, nor by
googling.

Are you aware that Perl comes with documentation of its own for all the
functions and syntax that you might ever want to use?

I would suggest perlre (perldoc perlre).

My question refers to the /m modifier for regular expressions.
According to "Programming Perl" /m lets ^ and $ match next to new lines
within the string instead of considering only the beginning and end of
the string.

You have your answer: "next to". ^ and $ are called "zero-width
assertions", which means they match a position in the string, but they
do not consume any characters from it.

From perlre:

By default, the "^" character is guaranteed to match only the
beginning of the string, the "$" character only the end (or
before the newline at the end), and Perl does certain optimizations
with the assumption that the string contains only one line. Embedded
newlines will not be matched by "^" or "$". You may, however, wish
to treat a string as a multi-line buffer, such that the "^" will
match after any newline within the string, and "$" will match before
any newline. At the cost of a little more overhead, you can do this
by using the /m modifier on the pattern match operator.
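To make the quoted paragraph concrete, here is a small sketch that lists the offsets at which ^ and $ match, with and without /m. The helper name match_offsets is made up for this illustration; @- is Perl's built-in array of match start offsets.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the start offset of every match of $re in $text.
sub match_offsets {
    my ($re, $text) = @_;
    my @offsets;
    while ($text =~ /$re/g) {
        push @offsets, $-[0];    # $-[0] is where the current match starts
    }
    return @offsets;
}

my $s = "123\n456";    # the newline sits at offset 3

print join(",", match_offsets(qr/^/,  $s)), "\n";    # 0    (start of string only)
print join(",", match_offsets(qr/^/m, $s)), "\n";    # 0,4  (also just after the newline)
print join(",", match_offsets(qr/$/,  $s)), "\n";    # 7    (end of string only)
print join(",", match_offsets(qr/$/m, $s)), "\n";    # 3,7  (also just before the newline)
```

Note that with /m the $ matches at offset 3 and the ^ at offset 4. They are never at the same position, because the newline lies between them.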

Therefore I wonder why the following example does not match:

my $s = "123\n456";
if ($s =~ /3$^4/m) {print "match (4)\n";}

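This is because there is a character after that $ and before that ^: a
\n. With /m, $ matches just before the newline and ^ just after it, but
neither one consumes it, so the pattern has to match the newline itself.

Try (the single-quote delimiters turn off variable interpolation):
if ($s =~ m'3$.*^4'ms) {print "match (4)\n";}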
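A minimal sketch of the same point, comparing the pattern with and without something that consumes the newline. The variable names are only for illustration. All three patterns use m'...' so that Perl does no variable interpolation and every $ stays a regex anchor.

```perl
use strict;
use warnings;

my $s = "123\n456";

# Nothing between $ and ^: both anchors would have to hold at the same
# position, which is impossible here because the newline separates them.
my $adjacent = $s =~ m'3$^4'm ? 1 : 0;

# An explicit \n between the anchors consumes the newline.
my $with_newline = $s =~ m'3$\n^4'm ? 1 : 0;

# With /s the dot also matches a newline, so .* can consume it.
my $with_dotstar = $s =~ m'3$.*^4'ms ? 1 : 0;

print "adjacent anchors: $adjacent\n";        # 0
print "explicit newline: $with_newline\n";    # 1
print "dot-star with /s: $with_dotstar\n";    # 1
```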

Even more confusing (for me) is that
if ($s =~ /3$4/m) {print "match (2)\n";}
matches,

Did you have warnings enabled? If so, did you notice the complaint
"Use of uninitialized value in concatenation (.) or string at..."? The
compiler is not taking that '$' as a regex metacharacter; it is
grouping it with the 4 and assuming you are trying to interpolate $4.
Since $4 is not defined, it interpolates as an empty string, so the
match is now for /3/, which matches.
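The interpolation effect can be checked with a short sketch like the one below (variable names are illustrative). It assumes no earlier successful match has set $4, and it silences the uninitialized-value warning only around the interpolating pattern.

```perl
use strict;
use warnings;

my $s = "123\n456";

my $interpolated;
{
    # $4 is undefined here, so it interpolates as an empty string
    # and the pattern that actually runs is just /3/.
    no warnings 'uninitialized';
    $interpolated = $s =~ /3$4/m ? 1 : 0;
}

# Single-quote delimiters disable interpolation: now $ is the end-of-line
# anchor, and a literal 4 can never follow it directly.
my $literal = $s =~ m'3$4'm ? 1 : 0;

print "with interpolation:    $interpolated\n";    # 1
print "without interpolation: $literal\n";         # 0
```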

Could someone please point me to an explanation of that behavior?

HTH,
-jp

Mumia W. · Jun 24, 2006

Wolfgang said:
Hi,

Hi Wolfgang.

Therefore I wonder why the following example does not match:

my $s = "123\n456";
if ($s =~ /3$^4/m) {print "match (4)\n";}
[...]

^ only matches the beginning of a line when it appears at the beginning
of the RE, and $ only matches the end of a line when it appears at the
end of the RE.

Use \n to match newlines embedded inside an RE:
if ($s =~ /3\n4/) { ... }

Dr.Ruud · Jun 24, 2006

Mumia W. wrote:

^ only matches the beginning of a line when it appears at the
beginning of the RE, and $ only matches the end of a line when it
appears at the end of the RE.

No.

perl -wle '"a\nb" =~ /a$(?:\n)^b/m and print 1'

perl -wle '"a\nb" =~ / a $ \n ^ b /mx and print 1'

perl -wle '"a\nb" =~ / a $ \s ^ b /mx and print 1'

perl -wle '"a\nb" =~ / a $ (?:[^^]) ^ b /mx and print 1'

etc.
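As a rough illustration of what these one-liners show, the sketch below records where the match starts and ends, using the built-in @- and @+ arrays. The pattern has five elements, but only three of them consume a character.

```perl
use strict;
use warnings;

my $text = "a\nb";
my ($start, $end) = (-1, -1);

# Same pattern as the second one-liner: /x lets the spaces separate the
# anchors from their neighbours, so $ is not read as part of a variable.
if ($text =~ / a $ \n ^ b /mx) {
    ($start, $end) = ($-[0], $+[0]);
}

# "a", "\n" and "b" account for the whole span; $ and ^ add no width.
print "matched from offset $start to offset $end\n";    # 0 to 3
```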

Mumia W. · Jun 25, 2006

Dr.Ruud said:
Mumia W. wrote:

^ only matches the beginning of a line when it appears at the
beginning of the RE, and $ only matches the end of a line when it
appears at the end of the RE.

No.

perl -wle '"a\nb" =~ /a$(?:\n)^b/m and print 1'

perl -wle '"a\nb" =~ / a $ \n ^ b /mx and print 1'

perl -wle '"a\nb" =~ / a $ \s ^ b /mx and print 1'

perl -wle '"a\nb" =~ / a $ (?:[^^]) ^ b /mx and print 1'

etc.

Hmm, it seems I was wrong about "^"; thanks.

Wolfgang Thomas · Jun 25, 2006

DJ said:
You have your answer: "next to". ^ and $ are called "zero-width
assertions", which means they match a position in the string, but they
do not consume any characters from it.

That's the key to understanding the behavior.

Thank you guys.
