Calculating a negated character class

Discussion in 'Perl Misc' started by Klaus, Jun 18, 2012.

  1. Klaus

    Klaus Guest

    Hello everybody,

    I am trying to do a simple task: creating a regular expression with
    qr{...}xms containing a simple character class. Then I can obviously
    create the negated character class by putting a caret symbol at the
    beginning inside the [...].

    So far so good.

    However, when I try (naively) to calculate the negated class from the
    original character class, I get a compile time error:

    Invalid [] range "x-i" in regex;
    marked by <-- HERE in m/[^(?msx-i <-- HERE :[abc123])]/
    at C:\test_regexp.pl line 6.

    Here is my program:
    =========
    use 5.012;
    use warnings;

    my $regexp_positive = qr{[abc123]}xms;
    my $regexp_negated = qr{[^abc123]}xms;
    my $calculated_negated = qr{[^$regexp_positive]}xms;

    say "regexp_positive = $regexp_positive";
    say "regexp_negated = $regexp_negated";
    say "calculated_negated = $calculated_negated";
    ========

    I understand that putting "(?msx-i" into a character class is not the
    way forward, but how do I calculate the negated character class?

    Thanks in advance for your response.

    -- Klaus
    Klaus, Jun 18, 2012
    #1
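
A minimal sketch of why the interpolation in the post above fails, assuming only core Perl: a qr// object stringifies together with its flag wrapper, so the wrapper text ends up inside the brackets. The names $source and $compiled are introduced here for illustration and are not part of the original program. The exact error text depends on the Perl version, so the sketch only traps and prints it.

```perl
use strict;
use warnings;

my $regexp_positive = qr{[abc123]}xms;

# Interpolating a qr// object inserts its stringified form,
# including the "(?...:" wrapper and the closing ")".
my $source = "[^$regexp_positive]";
print "pattern source: $source\n";

# The pattern is compiled at run time here, so eval can trap the failure.
my $compiled = eval { qr{$source}xms };
print "compile error: $@" unless defined $compiled;
```

The printed pattern source shows the wrapper sitting inside the character class, which is what the regex compiler rejects.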

  2. Klaus

    Klaus Guest

    On 18 June, 14:50, Ben Morrow wrote:
    > Quoth Klaus:
    > > I understand that putting "(?msx-i" into a character class is not the
    > > way forward, but how do I calculate the negated character class ?

    >
    >     my $cclass          = "abc123";
    >     my $regexp_positive = qr/[$cclass]/xms;
    >     my $regexp_negated  = qr/[^$cclass]/xms;


    Thanks for your reply. I can see more clearly now.

    So the way forward is isolating the class from the regexp construct.

    Here is an updated version of my original program and it works!

    =============
    use 5.012;
    use warnings;

    my $regexp1_orig = qr{[abc123]}xms;
    my $regexp2_orig = qr{[^def456]}xms;

    say "regexp1_orig = $regexp1_orig";
    say "regexp2_orig = $regexp2_orig";

    my $regexp1_negated = negated($regexp1_orig);
    my $regexp2_negated = negated($regexp2_orig);

    say "regexp1_negated = $regexp1_negated";
    say "regexp2_negated = $regexp2_negated";

    sub negated {
        my ($caret, $class) =
            $_[0] =~ m{\A \(\? [\w\-]* : \[ (\^?) (.*?) \]\) \z}xms
            or die "Can't parse regexp: $_[0]";

        my $neg_caret = $caret eq '^' ? '' : '^';
        my $neg_regexp = qr{[$neg_caret$class]}xms;

        return $neg_regexp;
    }
    =============

    The output is:

    regexp1_orig = (?msx-i:[abc123])
    regexp2_orig = (?msx-i:[^def456])
    regexp1_negated = (?msx-i:[^abc123])
    regexp2_negated = (?msx-i:[def456])
    Klaus, Jun 18, 2012
    #2
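
The suggestion quoted in the post above (keep the class body as a plain string and build both regexes from it) can be sketched as follows. The helper name class_pair is hypothetical, and the sketch assumes the class body contains no unescaped "]" or other characters that would need escaping inside brackets.

```perl
use strict;
use warnings;

# Hypothetical helper: build the positive and the negated class
# from the same raw class body, so no regex ever has to be parsed.
sub class_pair {
    my ($class_body) = @_;
    return ( qr/[$class_body]/xms, qr/[^$class_body]/xms );
}

my ($positive, $negated) = class_pair('abc123');

# Each character should match exactly one of the two regexes.
for my $char (qw(a 3 z)) {
    printf "%s: positive=%s negated=%s\n", $char,
        ( $char =~ $positive ? 'yes' : 'no' ),
        ( $char =~ $negated  ? 'yes' : 'no' );
}
```

Compared with parsing the stringified regex, this avoids any dependence on how a given Perl version stringifies qr// objects.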

  3. Klaus

    Klaus Guest

    On 18 June, 18:31, Ben Morrow wrote:
    > Quoth Klaus:
    > > sub negated {
    > >     my ($caret, $class) = $_[0] =~ m{\A \(\? [\w\-]* : \[ (\^?)(.*?) \]\) \z}xms

    >
    > In 5.14 the stringification syntax for qrs has changed. It now looks
    > like
    >
    >     (?^umsx:[abc123])
    >
    > This was done to allow for future extensions to the set of /x flags. You
    > can either adjust your code to take account of this, or, better, use the
    > regexp_pattern function exported by the re module:
    >
    >     use re qw/regexp_pattern/;
    >
    >     my ($pattern, $flags) = regexp_pattern $_[0];
    >     my ($caret, $class) = $pattern =~ /\A \[ (\^?) (.*?) \] \z/xms
    >         or die ...;


    Thank you very much for this information. I wasn't aware that the
    stringification syntax differs between versions of Perl.

    I will use re qw/regexp_pattern/ as follows:

    ==============
    use 5.012;
    use warnings;

    use re qw/regexp_pattern/;

    my $regexp1_orig = qr{[abc123]}xms;
    my $regexp2_orig = qr{[^def456]}xms;

    say "regexp1_orig = $regexp1_orig";
    say "regexp2_orig = $regexp2_orig";

    my $regexp1_negated = negated($regexp1_orig);
    my $regexp2_negated = negated($regexp2_orig);

    say "regexp1_negated = $regexp1_negated";
    say "regexp2_negated = $regexp2_negated";

    sub negated {
        my ($pattern, $flags) = regexp_pattern($_[0]);
        my ($caret, $class) =
            $pattern =~ m{\A \[ (\^?) (.*?) \] \z}xms
            or die "Can't parse regexp: $_[0]";

        my $neg_caret = $caret eq '^' ? '' : '^';
        my $neg_regexp = qr{[$neg_caret$class]}xms;

        return $neg_regexp;
    }
    ==============
    Klaus, Jun 18, 2012
    #3
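
The negated() function in the post above reads $flags from regexp_pattern but always recompiles with xms. A possible variant, sketched here under the assumption that every flag reported by regexp_pattern is also accepted inside a (?FLAGS:...) group, re-applies the original flags instead. The name negated_keep_flags is hypothetical and does not appear in the thread.

```perl
use strict;
use warnings;
use re qw(regexp_pattern);

# Hypothetical variant of negated() that carries the original flags over.
sub negated_keep_flags {
    my ($regexp) = @_;
    my ($pattern, $flags) = regexp_pattern($regexp);

    # Accept only a pattern that is a single bracketed class, as in the thread.
    my ($caret, $class) = $pattern =~ m{\A \[ (\^?) (.*?) \] \z}xms
        or die "Can't parse regexp: $regexp";

    my $neg_caret = $caret eq '^' ? '' : '^';

    # Scope the original flags to the rebuilt class with (?FLAGS:...).
    my $source = sprintf '(?%s:[%s%s])', $flags, $neg_caret, $class;
    return qr{$source};
}

my $vowel     = qr{[aeiou]}i;
my $not_vowel = negated_keep_flags($vowel);

for my $char (qw(E x)) {
    printf "%s: vowel=%s not_vowel=%s\n", $char,
        ( $char =~ $vowel     ? 'yes' : 'no' ),
        ( $char =~ $not_vowel ? 'yes' : 'no' );
}
```

With the /i flag preserved, "E" matches the original class and is rejected by the negated one. A version that hardcodes xms would drop /i and let "E" match both.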