FAQ 6.17 How do I efficiently match many regular expressions at once?


PerlFAQ Server

This is an excerpt from the latest version of perlfaq6.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

6.17: How do I efficiently match many regular expressions at once?

(contributed by brian d foy)

If you have Perl 5.10 or later, this is almost trivial. You just smart
match against an array of regular expression objects. Be aware that the
smart match operator was later marked experimental, starting with
Perl 5.18:

    my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ );

    if( $string ~~ @patterns ) {
        ...
    }

The smart match stops when it finds a match, so it doesn't have to try
every expression.
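
For code that cannot rely on smart match, here is a minimal sketch of the
same idea using "first" from the core List::Util module, which also stops
at the first success. The helper name first_matching_pattern and the test
strings are assumptions made up for this illustration:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ );

# Return the first pattern that matches $string, or undef if none does.
# first() stops calling the block as soon as it returns true.
sub first_matching_pattern {
    my( $string, @candidates ) = @_;
    return first { $string =~ $_ } @candidates;
}

for my $string ( 'Fred Flintstone', 'Barney Rubble', 'Dino' ) {
    my $hit = first_matching_pattern( $string, @patterns );
    print defined $hit
        ? "$string matches $hit\n"
        : "$string matches nothing\n";
}
```

Because "first" hands back the pattern itself, you also learn which
expression matched, which a plain true/false test does not tell you.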

Before Perl 5.10, you have a bit of work to do. You want to avoid
compiling a regular expression every time you want to match it. In this
example, perl must recompile the regular expression for every iteration
of the "foreach" loop since it has no way to know what $pattern will be:

    my @patterns = qw( foo bar baz );

    LINE: while( <> ) {
        # $pattern is interpolated, so the regex is rebuilt on each pass
        foreach my $pattern ( @patterns ) {
            if( /\b$pattern\b/i ) {
                print;
                next LINE;
            }
        }
    }

The "qr//" operator showed up in perl 5.005. It compiles a regular
expression, but doesn't apply it. When you use the pre-compiled version
of the regex, perl does less work. In this example, I inserted a "map"
to turn each pattern into its pre-compiled form. The rest of the script
is the same, but faster:

    # compile each pattern once, before the loop starts
    my @patterns = map { qr/\b$_\b/i } qw( foo bar baz );

    LINE: while( <> ) {
        foreach my $pattern ( @patterns ) {
            if( /$pattern/ ) {
                print;
                next LINE;
            }
        }
    }
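
If you want to check the speed difference on your own perl, here is a
rough sketch using the core Benchmark module. The sample lines and the
sub names count_interpolated and count_precompiled are made up for this
illustration, and the numbers depend on your perl version and data, so
treat the output as a hint rather than a verdict:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @words = qw( foo bar baz );

# 1000 sample lines; half of them contain the word "bar".
my @lines = map { ( "line $_ has a bar in it\n", "line $_ has nothing\n" ) } 1 .. 500;

# Interpolate the plain string into the pattern on every pass.
sub count_interpolated {
    my $count = 0;
    LINE: for my $line ( @lines ) {
        for my $word ( @words ) {
            if( $line =~ /\b$word\b/i ) {
                $count++;
                next LINE;
            }
        }
    }
    return $count;
}

# Compile each pattern once, then reuse the regex objects.
my @compiled = map { qr/\b$_\b/i } @words;

sub count_precompiled {
    my $count = 0;
    LINE: for my $line ( @lines ) {
        for my $regex ( @compiled ) {
            if( $line =~ $regex ) {
                $count++;
                next LINE;
            }
        }
    }
    return $count;
}

# Run each sub for at least one CPU second and print a comparison table.
cmpthese( -1, {
    interpolated => \&count_interpolated,
    precompiled  => \&count_precompiled,
} );
```

Both subs must return the same count; only the time they take should
differ.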

In some cases, you may be able to make several patterns into a single
regular expression. Beware of alternations that force the regex engine
to backtrack heavily, though.

    my $regex = join '|', qw( foo bar baz );

    LINE: while( <> ) {
        print if /\b(?:$regex)\b/i;
    }
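
The joined alternation above assumes the words contain no regex
metacharacters. Here is a sketch of a more defensive variant, assuming
the inputs are literal words: it escapes each word with "quotemeta",
puts longer words first so that a word is preferred over its own prefix,
and compiles the result once with "qr//". The helper name
build_word_regex and the sample lines are hypothetical:

```perl
use strict;
use warnings;

# Build one case-insensitive, whole-word pattern from literal words.
sub build_word_regex {
    my @words = @_;
    my $alternation = join '|',
        map  { quotemeta $_ }                    # treat "." and friends literally
        sort { length($b) <=> length($a) }       # longest alternative first
        @words;
    return qr/\b(?:$alternation)\b/i;
}

my $regex = build_word_regex( qw( foo bar baz foo.bar ) );

my @lines = (
    "Some FOO here\n",
    "food and barn\n",
    "a foo.bar token\n",
    "fooxbar\n",
);

# Keep only the lines that contain one of the words.
my @hits = grep { $_ =~ $regex } @lines;
print @hits;
```

Note that "\b" only behaves as a word boundary when the words start and
end with word characters, so this sketch does not suit entries such as
"c++".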

For more details on regular expression efficiency, see *Mastering
Regular Expressions* by Jeffrey Friedl. He explains how regular
expression engines work and why some patterns are surprisingly
inefficient. Once you understand how perl applies regular expressions,
you can tune them for individual situations.



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.
 