I'm struggling with an EZ way to do this regex

Discussion in 'Perl Misc' started by advice please wireless 802.11 on RH8, Sep 11, 2008.

  1. I'm pretty good at regexes, at least for most common uses. But
    although I can brute force a solution here I'm not happy with it!

    Let's say we have an array like

    my @a = qw(10 20 22 23 25);

    and some text like

    '44,33,4.44.64.10,32,25,88,20,6,55'

    and I want a regex that replaces any number in the string with say
    'XX', as long as that number is not in the array @a, yielding:

    $_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'

    The most *elegant* approach I've dreamed up is to join the array with
    OR (|), then somehow use that to compare in the text. But I'm not sure
    how to negatively compare.

    my $a = join '|',@a;
    s/(something)($a)/XXX/g;

    I think this may be one of those oddball assertions that I never
    mastered.

    My other idea was to @t = split /,/
    then iterate over each element with

    grep /^$element$/,@t

    but that ain't so pretty either..



    Can someone give me a nudge in the right direction to do this in a
    single, simple, elegant regex with no array conversions or looping? I
    can usually dream one up but not this time!
    advice please wireless 802.11 on RH8, Sep 11, 2008
    #1

  2. advice please wireless 802.11 on RH8

    Peter Scott Guest

    On Thu, 11 Sep 2008 06:43:04 -0700, advice please wireless 802.11 on RH8
    wrote:
    > Lets say we have an array like
    >
    > my @a = qw(10 20 22 23 25);
    >
    > and some text like
    >
    > '44,33,4.44.64.10,32,25,88,20,6,55'
    >
    > and I want a regex that replaces any number in the string with say
    > 'XX', as long as that number is not in the array @a, yielding:
    >
    > $_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'


    my @a = qw(10 20 22 23 25);
    $_ = '44,33,4.44.64.10,32,25,88,20,6,55';
    my %keep = map { $_, 1 } @a;
    s/(\d+)/$keep{$1} ? $1 : 'XX'/ge;
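
    For anyone who wants to run this as-is, here is a minimal sketch of the
    same hash-plus-/e idea wrapped in a sub. The name mask_unlisted is made
    up for illustration and is not from the post above.

```perl
use strict;
use warnings;

# Hypothetical helper: mask every digit run in $text that is not in @keep.
sub mask_unlisted {
    my ($text, @keep) = @_;
    my %keep = map { $_ => 1 } @keep;   # lookup set of numbers to leave alone
    # /e evaluates the replacement as Perl code; /g applies it to each digit run
    $text =~ s/(\d+)/$keep{$1} ? $1 : 'XX'/ge;
    return $text;
}

print mask_unlisted('44,33,4.44.64.10,32,25,88,20,6,55', qw(10 20 22 23 25)), "\n";
# prints XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX
```

    Because \d+ always grabs the whole digit run before the hash lookup,
    a '1' in the list cannot accidentally protect the '1' in '12'.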

    --
    Peter Scott
    http://www.perlmedic.com/
    http://www.perldebugged.com/
    Peter Scott, Sep 11, 2008
    #2

  3. advice please wireless 802.11 on RH8

    Ben Morrow Guest

    Quoth "advice please wireless 802.11 on RH8" <>:
    > I'm pretty good at regexes- at least for most common uses. But
    > although I can brute force a solution here I'm not happy with it!
    >
    > Lets say we have an array like
    >
    > my @a = qw(10 20 22 23 25);
    >
    > and some text like
    >
    > '44,33,4.44.64.10,32,25,88,20,6,55'
    >
    > and I want a regex that replaces any number in the string with say
    > 'XX', as long as that number is not in the array @a, yielding:
    >
    > $_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'
    >
    > The most *elegant* approach I've dreamed up is to join the array with
    > OR (|), then somehow use that to compare in the text. But I'm not sure
    > how to negatively compare.
    >
    > my $a = join '|',@a;
    > s/(something)($a)/XXX/g;
    >
    > I think this may be one of those oddball assertions that I never
    > mastered.


    Something like

    s/ (?! $a ) \d+ /XX/gx

    is what you want, but that hits lots of nasty corner cases like '1'
    being in the array and '12' in the string. I *think*

    s/ (^|\D) (?! (?: $a) (?: \D|$) ) \d+ /$1XX/gx

    works correctly, but that's hardly pretty. With 5.10 you can remove the
    nasty $1 capture using \K:

    s/ (?: ^|\D) \K (?! (?: $a) (?: \D|$) ) \d+ /XX/gx

    but it's not much of an improvement.
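
    To make the corner case concrete, here is a small sketch that runs the
    bare lookahead and the anchored one side by side, with '1' in the list
    and '12' in the string. The sub names and the quotemeta call are
    additions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: bare negative lookahead, no anchoring at all.
sub mask_naive {
    my ($text, @keep) = @_;
    my $alt = join '|', map { quotemeta } @keep;
    $text =~ s/ (?! $alt ) \d+ /XX/gx;
    return $text;
}

# Hypothetical helper: require a non-digit (or string start) before the
# number, and a non-digit (or string end) after any listed number.
sub mask_anchored {
    my ($text, @keep) = @_;
    my $alt = join '|', map { quotemeta } @keep;
    $text =~ s/ (^|\D) (?! (?: $alt ) (?: \D|$) ) \d+ /${1}XX/gx;
    return $text;
}

print mask_naive('12,1,10', 1), "\n";      # prints 1XX,1,1XX
print mask_anchored('12,1,10', 1), "\n";   # prints XX,1,XX
```

    The naive version refuses to start a match on any '1', then happily
    matches the digits that follow it, which is why '12' comes out as '1XX'.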

    I would put the numbers to be matched in a hash:

    my %ok;
    @ok{@a} = (1) x @a;  # a bare "= 1" would set only the first key

    and then split the string and match against the hash:

    my @split = split /\D/;
    for (@split) {
        $_ = "XX" unless $ok{$_};
    }
    $_ = join ",", @split;

    Ben

    --
    I have two words that are going to make all your troubles go away.
    "Miniature". "Golf".
    Ben Morrow, Sep 11, 2008
    #3
  4. advice please wireless 802.11 on RH8

    Peter Scott Guest

    On Thu, 11 Sep 2008 15:16:08 +0100, Ben Morrow wrote:
    > I would put the numbers to be matched in a hash:
    >
    > my %ok;
    > @ok{@a} = 1;
    >
    > and then split the string and match against the hash:
    >
    > my @split = split /\D/;
    > for (@split) {
    > $_ = "XX" unless $ok{$_};
    > }
    > $_ = join ",", @split;


    Not all of the inter-digit characters in the input string were commas.

    --
    Peter Scott
    http://www.perlmedic.com/
    http://www.perldebugged.com/
    Peter Scott, Sep 11, 2008
    #4
  5. advice please wireless 802.11 on RH8

    Ben Morrow Guest

    Quoth Peter Scott <>:
    > On Thu, 11 Sep 2008 15:16:08 +0100, Ben Morrow wrote:
    > > I would put the numbers to be matched in a hash:
    > >
    > > my %ok;
    > > @ok{@a} = 1;
    > >
    > > and then split the string and match against the hash:
    > >
    > > my @split = split /\D/;
    > > for (@split) {
    > > $_ = "XX" unless $ok{$_};
    > > }
    > > $_ = join ",", @split;

    >
    > Not all of the inter-digit characters in the input string were commas.


    I noticed that, but the OP mentioned split /,/ so I presumed they were
    typos. If not, something like

    my @split = split /(\D+)/;
    for (@split) {
        /\D/ and next;
        $_ = "XX" unless $ok{$_};
    }
    $_ = join "", @split;

    should do.
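
    A self-contained sketch of the capturing-split approach, for reference.
    The sub name mask_by_split is hypothetical, and two details are
    assumptions added here: the hash is filled with (1) x @keep so every
    key is true, and empty fields are skipped so a string that starts with
    a separator does not gain a stray 'XX'.

```perl
use strict;
use warnings;

# Hypothetical helper: split on non-digit runs, keeping the separators.
sub mask_by_split {
    my ($text, @keep) = @_;
    my %ok;
    @ok{@keep} = (1) x @keep;            # every listed number maps to true
    # The capturing group makes split return the separators too
    my @parts = split /(\D+)/, $text;
    for my $part (@parts) {
        next if $part eq '' or $part =~ /\D/;   # empty field or separator
        $part = 'XX' unless $ok{$part};
    }
    return join '', @parts;              # separators go back unchanged
}

print mask_by_split('44,33,4.44.64.10,32,25,88,20,6,55', qw(10 20 22 23 25)), "\n";
# prints XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX
```

    Since the original separators are rejoined verbatim, the dots in the
    input survive, which was the objection to joining on ','.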

    Ben

    --
    If you put all the prophets, | You'd have so much more reason
    Mystics and saints | Than ever was born
    In one room together, | Out of all of the conflicts of time.
    The Levellers, 'Believers'
    Ben Morrow, Sep 11, 2008
    #5
  6. [A complimentary Cc of this posting was NOT [per weedlist] sent to
    Ben Morrow
    <>], who wrote in article <>:
    > > '44,33,4.44.64.10,32,25,88,20,6,55'
    > >
    > > and I want a regex that replaces any number in the string with say 'XX',


    I do not know what a "number" is. I assume you mean "a sequence of digits".

    > Something like
    >
    > s/ (?! $a ) \d+ /XX/gx


    s/ \b (?! (?: $a ) \b ) \d+ /XX/gx

    Hope this helps,
    Ilya
    Ilya Zakharevich, Sep 11, 2008
    #6
  7. advice please wireless 802.11 on RH8

    Ben Morrow Guest

    Quoth Ilya Zakharevich <>:
    > [A complimentary Cc of this posting was NOT [per weedlist] sent to
    > Ben Morrow
    > <>], who wrote in article
    > <>:
    > > > '44,33,4.44.64.10,32,25,88,20,6,55'
    > > >
    > > > and I want a regex that replaces any number in the string with say 'XX',

    >
    > I do not know what is a "number". I assume you mean "a sequence of digits".
    >
    > > Something like
    > >
    > > s/ (?! $a ) \d+ /XX/gx

    >
    > s/ \b (?! $a \b ) \d+ /XX/gx


    Duh! I was thinking I needed a \d\D boundary, but of course for the
    string given a \w\W boundary works just as well.
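
    A short sketch of where the two kinds of boundary differ. With \b, a
    digit run glued to a letter (as in 'x44') has no boundary in front of
    it and is left alone. The second sub emulates a digit/non-digit
    boundary with lookarounds. Both sub names, the (?: ) grouping around
    the alternation, and the lookaround variant are assumptions for
    illustration, not something posted in this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: word-boundary version (\w\W boundary).
sub mask_word_boundary {
    my ($text, @keep) = @_;
    my $alt = join '|', map { quotemeta } @keep;
    # Group the alternation so the trailing \b applies to every alternative
    $text =~ s/ \b (?! (?: $alt ) \b ) \d+ /XX/gx;
    return $text;
}

# Hypothetical helper: digit-boundary version (\d\D boundary via lookarounds).
sub mask_digit_boundary {
    my ($text, @keep) = @_;
    my $alt = join '|', map { quotemeta } @keep;
    # (?<! \d ) means "no digit before"; (?! \d ) means "no digit after"
    $text =~ s/ (?<! \d ) (?! (?: $alt ) (?! \d ) ) \d+ /XX/gx;
    return $text;
}

print mask_word_boundary('44,x44,10', 10), "\n";    # prints XX,x44,10
print mask_digit_boundary('44,x44,10', 10), "\n";   # prints XX,xXX,10
```

    For input that contains only digits and punctuation, as in the
    original question, the two behave the same.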

    Thanks

    Ben

    --
    "If a book is worth reading when you are six, *
    it is worth reading when you are sixty." [C.S.Lewis]
    Ben Morrow, Sep 12, 2008
    #7
