FAQ 6.9 How can I quote a variable to use in a regex?

Discussion in 'Perl Misc' started by PerlFAQ Server, Apr 12, 2011.

  1. This is an excerpt from the latest version of perlfaq6.pod, which
    comes with the standard Perl distribution. These postings aim to
    reduce the number of repeated questions as well as allow the community
    to review and update the answers. The latest version of the complete
    perlfaq is at http://faq.perl.org .

    --------------------------------------------------------------------

    6.9: How can I quote a variable to use in a regex?

    The Perl parser will expand $variable and @variable references in
    regular expressions unless the delimiter is a single quote. Remember,
    too, that the right-hand side of a "s///" substitution is considered a
    double-quoted string (see perlop for more details). Remember also that
    any regex special characters in the variable will be acted on unless
    you precede the variable with \Q. Here's an example:

    $string = "Placido P. Octopus";
    $regex = "P.";

    $string =~ s/$regex/Polyp/;
    # $string is now "Polypacido P. Octopus"

    Because "." is special in regular expressions, and can match any single
    character, the regex "P." here has matched the "Pl" in the original
    string.

    To escape the special meaning of ".", we use "\Q":

    $string = "Placido P. Octopus";
    $regex = "P.";

    $string =~ s/\Q$regex/Polyp/;
    # $string is now "Placido Polyp Octopus"

    The use of "\Q" causes the "." in the regex to be treated as a regular
    character, so that "P." matches a "P" followed by a dot.
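
    As a small sketch of the same idea (the names $escaped and $copy are
    only for illustration and are not part of the FAQ text): quotemeta()
    is the function form of \Q, and \E ends the quoted span so that the
    rest of the pattern keeps its usual regex meaning.

```perl
use strict;
use warnings;

my $string = "Placido P. Octopus";
my $regex  = "P.";

# quotemeta() is the function form of \Q: it backslashes every
# non-word character, so "P." becomes "P\."
my $escaped = quotemeta($regex);

# \E ends the quoted span; the \s+ after it is still a real regex.
(my $copy = $string) =~ s/\Q$regex\E\s+/Polyp /;

print "$escaped\n";   # P\.
print "$copy\n";      # Placido Polyp Octopus
```

    Without the \E, the \s+ would be quoted as well and would only match
    a literal backslash, "s" and "+".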



    --------------------------------------------------------------------

    The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
    are not necessarily experts in every domain where Perl might show up,
    so please include as much information as possible and relevant in any
    corrections. The perlfaq-workers also don't have access to every
    operating system or platform, so please include relevant details for
    corrections to examples that do not work on particular platforms.
    Working code is greatly appreciated.

    If you'd like to help maintain the perlfaq, see the details in
    perlfaq.pod.
    PerlFAQ Server, Apr 12, 2011
    #1

  2. PerlFAQ Server

    ccc31807 Guest

    On Apr 12, 6:00 am, PerlFAQ Server <> wrote:
    > 6.9: How can I quote a variable to use in a regex?


    I have applications that process files in a remote directory that I
    get using SCP. The files that I process are named like
    'USA_20110412.txt' with each day's file having the date. Sometimes
    they miss a day, and the next day I'll pick up two files from the
    directory. I pick up my files with a file glob, like this: 'USA_*.txt'
    and it's worked for five or six years without a hitch.

    This application runs for a number of different kinds of files with
    different prefixes, so I'll get files like 'USA_*.txt', 'USB_*.txt',
    'USC_*.txt', etc. I use different rules for processing each kind of
    file.

    Users have complained from time to time about having to process
    multiple files individually, so this morning I decided to fix this and
    put the common code in a module and pass the file glob to the
    function. (The variable $CONFIG{GLOB_A} contains a string like
    'USA_*.txt'.) I call the function like this:

    COMMON::process_files($CONFIG{GLOB_A});

    and COMMON.pm contains a function definition that starts like this:

    sub process_files
    {
        my $glob = shift;
        ...
        do_something if $file =~ /$glob/;
        ...
    }

    Guess what, guys? It didn't work! Unfortunately, the OS file glob
    'USA_*.txt' should have been given to the regex as 'USA_.*txt' (notice
    that the dot and the star have swapped places).

    Question: is there any way I can use the OS file glob in a regex
    without changing it? Can I put my file descriptor in a variable and
    pass it reliably to the regex?

    CC.
    ccc31807, Apr 12, 2011
    #2

  3. PerlFAQ Server

    Uri Guttman Guest

    >>>>> "c" == ccc31807 <> writes:

    c> Guess what, guys? It didn't work! Unfortunately, the OS file glob
    c> 'USA_*.txt' should have been given to the regex as 'USA_.*txt' (notice
    c> that the dot and the star have swapped places).

    you don't have it right either way. . is not a meta char in globs, it
    matches . (since . is a common part separator in file names). * in globs
    matches 0 or more char in the current spot - it does not matter the char
    to the left.

    in regexes, . matches any one char and * matches 0 or more of the char
    to its left.

    so just swapping * and . makes no sense.

    c> Question: is there any way I can use the OS file glob in a regex
    c> without changing it? Can I put my file descriptor in a variable and
    c> pass it reliably to the regex?

    there is no direct way to use a glob pattern in a regex. there may be a
    module that can convert it for you. the simplest solution is to replace
    . with \. and * with .* (in that order). that will handle those two
    chars. there are other glob things you may need but most users don't
    know them.
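
    A minimal sketch of that two-step replacement, assuming only "." and
    "*" need handling. The helper name glob_to_regex is hypothetical, and
    anchoring the result to the whole file name is an added assumption;
    other glob features such as ?, [...] and {...} are not covered.

```perl
use strict;
use warnings;

# Hypothetical helper: handles only "." and "*", as described above.
sub glob_to_regex {
    my ($glob) = @_;
    (my $re = $glob) =~ s/\./\\./g;  # "." -> "\." first ...
    $re =~ s/\*/.*/g;                # ... then "*" -> ".*"
    return qr/\A$re\z/;              # assumption: match the whole name
}

my $re = glob_to_regex('USA_*.txt');

for my $file ('USA_20110412.txt', 'USA_txt', 'USB_20110412.txt') {
    printf "%-18s %s\n", $file, ($file =~ $re ? 'match' : 'no match');
}
```

    The order matters: if * were expanded to .* first, the second step
    would escape the freshly added dot and produce \.* instead.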

    uri

    --
    Uri Guttman ------ -------- http://www.sysarch.com --
    ----- Perl Code Review , Architecture, Development, Training, Support ------
    --------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
    Uri Guttman, Apr 12, 2011
    #3
  4. PerlFAQ Server

    brian d foy Guest

    In article
    <>,
    ccc31807 <> wrote:

    > Question: is there any way I can use the OS file glob in a regex
    > without changing it? Can I put my file descriptor in a variable and
    > pass it reliably to the regex?


    Isn't that what glob() is for?
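
    A rough sketch of what that could look like, assuming the files have
    already been copied into a local directory. The temporary directory
    and the file names here are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Basename qw(basename);

# Create a scratch directory with a few empty sample files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(USA_20110411.txt USA_20110412.txt USB_20110412.txt)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $name: $!";
    close $fh;
}

# glob() expands the pattern against the file system, so the glob
# string is used unchanged and no regex is involved.
my @files = map { basename($_) } glob("$dir/USA_*.txt");
print "$_\n" for @files;
```

    This only helps when the names come from a directory listing; it does
    not test an arbitrary string against the pattern.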
    brian d foy, Apr 12, 2011
    #4
  5. PerlFAQ Server

    ccc31807 Guest

    On Apr 12, 1:43 pm, "Uri Guttman" <> wrote:
    > so just swapping * and . makes no sense.


    As a file glob, 'USA_*.txt' gets all the files that begin with 'USA_'
    and end with '.txt' with (perhaps) a few characters in between.

    As a regular expression, /USA_.*txt/ matches a string that contains
    the literal characters 'USA_', followed by zero or more characters,
    followed by 'txt'.

    As a matter of fact, swapping the dot and star actually didn't work in
    my application (due to other reasons not material here). I solved the
    problem by assigning all the literal characters before the star to a
    variable, and then matched the variable. The fact that the file name
    ends in 'txt' doesn't matter.
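
    A sketch of that workaround, under the assumption that the prefix
    should match at the start of the file name (the variable names and
    sample file names are illustrative only). Using \Q keeps any special
    characters in the prefix literal, as in the FAQ answer.

```perl
use strict;
use warnings;

my $glob = 'USA_*.txt';

# Everything before the first "*" is the literal prefix.
my ($prefix) = split /\*/, $glob;

my @files = ('USA_20110412.txt', 'USB_20110412.txt', 'USA_20110413.txt');

# \Q...\E quotes the prefix; \A anchors it to the start of the name.
my @hits = grep { /\A\Q$prefix\E/ } @files;

print "$_\n" for @hits;
```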

    I would have liked to use the same value both as a file glob for the
    purpose of getting the file (which I do by running pscp as an external
    process) and as a regular expression to process just the files I need,
    but in the end it doesn't matter -- except maybe as a reminder that
    the file glob syntax and the regex syntax aren't identical.

    CC.
    ccc31807, Apr 12, 2011
    #5
  6. PerlFAQ Server

    Uri Guttman Guest

    >>>>> "c" == ccc31807 <> writes:

    c> On Apr 12, 1:43 pm, "Uri Guttman" <> wrote:
    >> so just swapping * and . makes no sense.


    c> As a file glob, 'USA_*.txt' gets all the files that begin with 'USA_'
    c> and end with '.txt' with (perhaps) a few characters in between.

    please don't tell me how globs work.

    c> As a regular expression, /USA_.*txt/ matches a string that contains
    c> the literal characters 'USA_', followed by zero or more characters,
    c> followed by 'txt'.

    please don't tell me how regexes work.


    c> As a matter of fact, swapping the dot and star actually didn't work in
    c> my application (due to other reasons not material here). I solved the
    c> problem by assigning all the literal characters before the star to a
    c> variable, and then matched the variable. The fact that the file name
    c> ends in 'txt' doesn't matter.

    it wouldn't work under any circumstances. regexes are not globs.

    the regex will match USA_txt without the . which you should have. the
    glob MUST have a . matched. they are different patterns.
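
    A short illustration of the difference, using made-up file names. The
    second pattern is one possible hand translation of the glob, with the
    dot escaped and the whole name anchored.

```perl
use strict;
use warnings;

my @names = ('USA_20110412.txt', 'USA_txt', 'xUSA_20110412.txt.bak');

# The swapped pattern: unanchored, and the dot before "txt" is gone.
my @swapped = grep { /USA_.*txt/ } @names;

# Dot escaped, star expanded, whole name anchored.
my @strict  = grep { /\AUSA_.*\.txt\z/ } @names;

print "swapped: @swapped\n";
print "strict:  @strict\n";
```

    The swapped pattern accepts all three names, while the stricter one
    accepts only the first.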

    c> I would have liked to use the same value both as a file glob for the
    c> purpose of getting the file (which I do by running pscp as an external
    c> process) and as a regular expression to process just the files I need,
    c> but in the end it doesn't matter -- except maybe as a reminder that
    c> the file glob syntax and the regex syntax aren't identical.

    did you see what i said about a proper way to fix it? i don't think
    so. swapping is wrong on many levels. my solution is correct on all
    levels.

    uri

    Uri Guttman, Apr 12, 2011
    #6
  7. PerlFAQ Server

    Willem Guest

    brian d foy wrote:
    ) In article
    )<>,
    ) ccc31807 <> wrote:
    )
    )> Question: is there any way I can use the OS file glob in a regex
    )> without changing it? Can I put my file descriptor in a variable and
    )> pass it reliably to the regex?
    )
    ) Isn't that what glob() is for?

    Wouldn't it be nice if there were a glob() that worked on strings ?


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
    Willem, Apr 12, 2011
    #7
  8. PerlFAQ Server

    Alan Curry Guest

    In article <>,
    Willem <> wrote:
    >brian d foy wrote:
    >
    >Wouldn't it be nice if there were a glob() that worked on strings ?


    An excellent summary of the original long-form question.

    There is such a function in the POSIX C library, called fnmatch(). It
    doesn't seem to be present in POSIX.pm, but look what I got by googling
    "perl fnmatch":

    http://search.cpan.org/~mjp/File-FnMatch-0.02/FnMatch.pm

    Seems to be just the thing.

    --
    Alan Curry
    Alan Curry, Apr 12, 2011
    #8
  9. On 2011-04-12, Willem <> wrote:
    > Wouldn't it be nice if there were a glob() that worked on strings ?


    Did you actually inspect File::Glob before doing your wishful
    thinking? (You claim you do not need learning how glob() operates,
    right?)

    Ilya
    Ilya Zakharevich, Apr 12, 2011
    #9
  10. On 2011-04-12, Ilya Zakharevich <> wrote:
    > On 2011-04-12, Willem <> wrote:
    >> Wouldn't it be nice if there were a glob() that worked on strings ?

    >
    > Did you actually inspect File::Glob before doing your wishful
    > thinking? (You claim you do not need learning how glob() operates,
    > right?)


    My apologies - in the last sentence I mixed you up with someone else. :-(

    Yours,
    Ilya
    Ilya Zakharevich, Apr 12, 2011
    #10
  11. PerlFAQ Server

    ccc31807 Guest

    On Apr 12, 5:44 pm, Tad McClellan <> wrote:
    > We need a reminder that the funny characters in different languages
    > mean different things?


    Unfortunately, sometimes we do, especially when we don't concentrate
    on the particular tool currently in use.

    I use vi, emacs, and Textpad for different things, and sometimes I
    have all three open and in use at the same time, and sometimes I type
    a Ctl-p in vi, or a :x in Textpad, or a Ctl-v in emacs, and
    momentarily wonder what's wrong with the command. I'm just not good
    enough to juggle three balls at the same time. Hell, I'm often not
    good enough to juggle ONE ball at the same time!

    CC.
    ccc31807, Apr 13, 2011
    #11