Want metachars escaped so they are not interpreted in regexp

Markus Dehmann

I have a pattern that I want to match against some text. The pattern may
contain some chars that happen to be metachars, but I don't want them to
be interpreted as metachars:

    my $pattern = 'Hello...?';
    $_ = 'Oh Hello x';
    if (m/$pattern/) {
        print "Oh no, it matches!\n";  # it matches indeed!
    }

How do I prepare the $pattern properly before using it in the regexp?

I used something like this:

    sub encode {
        my $s = $_[0];
        $s =~ s/(.)/sprintf "\\x%x",ord($1)/ge;
        return $s;
    }

to encode the pattern before using it in the regexp, but I think it's
inefficient and inelegant...?

Does anyone have a better escape function?

Thanks!
Markus

P.S.: I'm sorry if this is a FAQ (it should be!), but I googled and
didn't find anything.
 
Anno Siegel

Markus Dehmann said:
I have a pattern that I want to match against some text. The pattern may
contain some chars that happen to be metachars, but I don't want them to
be interpreted as metachars:

perldoc -f quotemeta

Anno
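
To make the pointer concrete, here is a minimal sketch of the two usual ways to apply quotemeta: calling it directly, or wrapping the interpolated variable in \Q...\E inside the regexp. The second test string ('Say Hello...? again') is made up for illustration and is not from the thread.

```perl
use strict;
use warnings;

my $pattern = 'Hello...?';

# quotemeta returns a copy with every ASCII non-word character backslashed
my $quoted = quotemeta($pattern);
print "$quoted\n";

# \Q...\E applies the same quoting while the variable is interpolated
my @results;
for my $text ('Oh Hello x', 'Say Hello...? again') {
    my $hit = $text =~ /\Q$pattern\E/ ? 'match' : 'no match';
    push @results, "$text: $hit";
    print "$text: $hit\n";
}
```

With the quoting in place, the dots and the question mark are matched literally, so 'Oh Hello x' should no longer match.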
 
robic0

Be careful how you use strings mixed with escape characters when they
are written as literal constants in your program. For the most part
you don't want to fight your interpreter's expression parser.
Once the interpreter has digested the constant string, I call it
"in-solution", as in chemistry. When it is "in-solution" it is safe from
mangling by editors and parsers. Typically, the string you want to match
against would be data read in by your program from a file, in which case
it is always "in-solution".
-robic0-


use strict;
use warnings;

my ($pat_convert);

$pat_convert = convertPatternMeta ( 'Hello...?' );
showMatchResult ($pat_convert, 'Hello...? this is a big string x');
showMatchResult ($pat_convert, 'Oh Hello x');

$pat_convert = convertPatternMeta ( '*?+' );
showMatchResult ($pat_convert, 'Hello...? this (*?+) is a big string x');
showMatchResult ($pat_convert, '*?+ and so is this');

## ------------------------------------
## Helpers
##
sub convertPatternMeta
{
    my ($pattern) = shift;
    # the backslash must come first, so that backslashes added
    # for the later characters are not escaped a second time
    my @regx_esc_codes =
    (
        "\\", '/', '(', ')', '[', ']', '?', '|',
        '+', '.', '*', '$', '^', '{', '}', '@'
    );
    foreach my $tc (@regx_esc_codes) {
        # code template for regex
        my $xxx = "\$pattern =~ s/\\$tc/\\\\\\$tc/g;";
        eval $xxx;
        if ($@) {
            # the compiler will show the offending escape char;
            # add that char to @regx_esc_codes
            $@ =~ s/^[\s]+//s; $@ =~ s/[\s]+$//s;
            die "$@";
        }
    }
    return $pattern;
}
##
sub showMatchResult
{
    my ($pattern, $string) = @_;
    my $result_txt = '';
    my ($result) = $string =~ /$pattern/;
    if ($result) { $result_txt = 'DOES match' }
    else { $result_txt = 'Does NOT match' }
    print "\nString: $string\n$result_txt\nPattern: $pattern\n";
}
__DATA__

String: Hello...? this is a big string x
DOES match
Pattern: Hello\.\.\.\?

String: Oh Hello x
Does NOT match
Pattern: Hello\.\.\.\?

String: Hello...? this (*?+) is a big string x
DOES match
Pattern: \*\?\+

String: *?+ and so is this
DOES match
Pattern: \*\?\+
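
For comparison, the built-in quotemeta produces the same escaped patterns for these inputs without any string eval. The sketch below reruns the four cases above; literal_match is a hypothetical helper name introduced here, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $needle occurs literally in $haystack
sub literal_match {
    my ($needle, $haystack) = @_;
    my $re = quotemeta $needle;   # backslash every non-word character
    return $haystack =~ /$re/ ? 1 : 0;
}

# The same pattern/string pairs used in the test run above
my @cases = (
    ['Hello...?', 'Hello...? this is a big string x'],
    ['Hello...?', 'Oh Hello x'],
    ['*?+',       'Hello...? this (*?+) is a big string x'],
    ['*?+',       '*?+ and so is this'],
);

for my $case (@cases) {
    my ($needle, $haystack) = @$case;
    my $verdict = literal_match($needle, $haystack) ? 'DOES match' : 'Does NOT match';
    print "\nString: $haystack\n$verdict\nPattern: ", quotemeta($needle), "\n";
}
```

One difference worth knowing: quotemeta escapes every ASCII character that is not a letter, digit, or underscore, so it also covers characters such as '#' and whitespace that the hand-written list does not include.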
 
