FAQ 6.17 How do I efficiently match many regular expressions at once?

Discussion in 'Perl Misc' started by PerlFAQ Server, Apr 28, 2011.

    This is an excerpt from the latest version of perlfaq6.pod, which
    comes with the standard Perl distribution. These postings aim to
    reduce the number of repeated questions as well as allow the community
    to review and update the answers. The latest version of the complete
    perlfaq is at http://faq.perl.org .

    --------------------------------------------------------------------

    6.17: How do I efficiently match many regular expressions at once?

    (contributed by brian d foy)

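    If you have Perl 5.10 or later, this is almost trivial. Note that the
    smart match operator was marked experimental in Perl 5.18 and
    deprecated in Perl 5.38, so check the documentation for your perl
    before relying on it. You smart match against an array of regular
    expression objects: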

    my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ );

    if( $string ~~ @patterns ) {
        ...
    }

    The smart match stops when it finds a match, so it doesn't have to try
    every expression.
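
    Where the smart match operator is not available or not wanted, the
    same stop-at-the-first-match behavior can be sketched with "first"
    from the core List::Util module. This is only an illustration, not
    part of the original answer; the helper name first_matching_pattern
    is made up for the example:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ );

# Hypothetical helper: returns the first pattern that matches $string,
# or undef if none does. first() stops as soon as the block is true,
# so later patterns are not tried.
sub first_matching_pattern {
    my( $string, @candidates ) = @_;
    return first { $string =~ $_ } @candidates;
}

for my $string ( 'Fred', 'Barney', 'Dino' ) {
    my $hit = first_matching_pattern( $string, @patterns );
    print defined $hit ? "$string matched\n" : "$string did not match\n";
}
```

    Returning the pattern itself, rather than a plain true value, lets the
    caller see which expression matched.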

    Before Perl 5.10, you have a bit more work to do. You want to avoid
    compiling a regular expression every time you want to match it. In this
    example, perl must recompile the regular expression for every iteration
    of the "foreach" loop since it has no way to know what $pattern will be:

    my @patterns = qw( foo bar baz );

    LINE: while( <DATA> ) {
        foreach my $pattern ( @patterns ) {
            if( /\b$pattern\b/i ) {
                print;
                next LINE;
            }
        }
    }

    The "qr//" operator showed up in perl 5.005. It compiles a regular
    expression, but doesn't apply it. When you use the pre-compiled version
    of the regex, perl does less work. In this example, I inserted a "map"
    to turn each pattern into its pre-compiled form. The rest of the script
    is the same, but faster:

    my @patterns = map { qr/\b$_\b/i } qw( foo bar baz );

    LINE: while( <> ) {
        foreach my $pattern ( @patterns ) {
            if( /$pattern/ ) {
                print;
                next LINE;
            }
        }
    }
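
    One detail that makes the "map" version work is that a "qr//" object
    carries its own flags, so the /i applied when it was built still holds
    when the object is used on its own or interpolated into a larger
    pattern. A small sketch of that behavior, with made-up sample strings:

```perl
use strict;
use warnings;

# The /i flag is stored inside the compiled object.
my $word = qr/\bfoo\b/i;

# Used directly: matches despite the different case.
print "direct match\n" if 'A FOO here' =~ $word;

# Interpolated: the outer pattern has no /i, but the embedded
# object keeps its own case-insensitivity.
print "interpolated match\n" if 'say FOO twice' =~ /say\s+$word\s+twice/;

# The outer, case-sensitive part still has to match exactly.
print "outer part is case-sensitive\n" unless 'SAY foo twice' =~ /say\s+$word\s+twice/;
```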

    In some cases, you may be able to make several patterns into a single
    regular expression. Beware of situations that require backtracking
    though.

    my $regex = join '|', qw( foo bar baz );

    LINE: while( <> ) {
        print if /\b(?:$regex)\b/i;
    }
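
    The join above assumes the strings are already valid patterns. If
    they are meant as literal text, a hedged variation is to escape each
    one with "quotemeta" and to put longer strings first, since
    alternation tries its branches from left to right. The list of
    literals and the helper name find_literal are assumptions made for
    this sketch:

```perl
use strict;
use warnings;

# Hypothetical literal strings; "a.b" contains a regex metacharacter.
my @literals = ( 'foo', 'foobar', 'a.b' );

# Escape each literal, longest first, then join into one alternation.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @literals;

# Compile once; the word boundaries and /i mirror the example above.
my $regex = qr/\b(?:$alternation)\b/i;

# Hypothetical helper: returns the matched text, or undef.
sub find_literal {
    my( $text ) = @_;
    return $text =~ /($regex)/ ? $1 : undef;
}

for my $line ( 'plain foo', 'a.b here', 'axb here' ) {
    my $hit = find_literal( $line );
    print "$line: ", ( defined $hit ? $hit : 'no match' ), "\n";
}
```

    Without "quotemeta", the "." in "a.b" would match any character, so
    "axb" would be reported as a hit.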

    For more details on regular expression efficiency, see *Mastering
    Regular Expressions* by Jeffrey Friedl. He explains how regular
    expression engines work and why some patterns are surprisingly
    inefficient. Once you understand how perl applies regular expressions,
    you can tune them for individual situations.



    --------------------------------------------------------------------

    The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
    are not necessarily experts in every domain where Perl might show up,
    so please include as much information as possible and relevant in any
    corrections. The perlfaq-workers also don't have access to every
    operating system or platform, so please include relevant details for
    corrections to examples that do not work on particular platforms.
    Working code is greatly appreciated.

    If you'd like to help maintain the perlfaq, see the details in
    perlfaq.pod.
     