I'm struggling with an EZ way to do this regex

advice please wireless 802.11 on RH8

I'm pretty good at regexes, at least for most common uses. But
although I can brute-force a solution here, I'm not happy with it!

Let's say we have an array like

my @a = qw(10 20 22 23 25);

and some text like

'44,33,4.44.64.10,32,25,88,20,6,55'

and I want a regex that replaces any number in the string with say
'XX', as long as that number is not in the array @a, yielding:

$_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'

The most *elegant* approach I've dreamed up is to join the array with
OR (|), then somehow use that to compare in the text. But I'm not sure
how to negatively compare.

my $a = join '|',@a;
s/(something)($a)/XXX/g;

I think this may be one of those oddball assertions that I never
mastered.

My other idea was to @t = split /,/
and then check each element against the array with

grep /^$element$/, @a

but that ain't so pretty either.



Can someone give me a nudge in the right direction to do this in a
single, simple, elegant regex with no array conversions or looping? I
can usually dream one up but not this time!
 
Peter Scott

Lets say we have an array like

my @a = qw(10 20 22 23 25);

and some text like

'44,33,4.44.64.10,32,25,88,20,6,55'

and I want a regex that replaces any number in the string with say
'XX', as long as that number is not in the array @a, yielding:

$_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'

my @a = qw(10 20 22 23 25);
$_ = '44,33,4.44.64.10,32,25,88,20,6,55';
my %keep = map { $_, 1 } @a;
s/(\d+)/$keep{$1} ? $1 : 'XX'/ge;
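
A minimal sketch of the same hash-plus-/e idea wrapped in a sub, so it can be tried on other strings. The name mask_numbers is hypothetical and not from the thread; the copy-then-substitute idiom is used so the input string is left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every run of digits that is not in the
# keep list with 'XX'. Keys are compared as strings, so '010' and '10'
# count as different numbers.
sub mask_numbers {
    my ($text, @keep_list) = @_;
    my %keep = map { $_ => 1 } @keep_list;

    # /e evaluates the replacement as Perl code for each match.
    (my $out = $text) =~ s/(\d+)/$keep{$1} ? $1 : 'XX'/ge;
    return $out;
}

print mask_numbers('44,33,4.44.64.10,32,25,88,20,6,55',
                   qw(10 20 22 23 25)), "\n";
# prints XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX
```

Because \d+ is greedy and the lookup happens on the whole captured run, a kept '1' cannot accidentally protect '12'.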
 
Ben Morrow

Quoth "advice please wireless 802.11 on RH8":
I'm pretty good at regexes- at least for most common uses. But
although I can brute force a solution here I'm not happy with it!

Lets say we have an array like

my @a = qw(10 20 22 23 25);

and some text like

'44,33,4.44.64.10,32,25,88,20,6,55'

and I want a regex that replaces any number in the string with say
'XX', as long as that number is not in the array @a, yielding:

$_ = 'XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX'

The most *elegant* approach I've dreamed up is to join the array with
OR (|), then somehow use that to compare in the text. But I'm not sure
how to negatively compare.

my $a = join '|',@a;
s/(something)($a)/XXX/g;

I think this may be one of those oddball assertions that I never
mastered.

Something like

s/ (?! $a ) \d+ /XX/gx

is what you want, but that hits lots of nasty corner cases like '1'
being in the array and '12' in the string. I *think*

s/ (^|\D) (?! (?: $a) (?: \D|$) ) \d+ /$1XX/gx

works correctly, but that's hardly pretty. With 5.10 you can remove the
nasty $1 capture using \K:

s/ (?: ^|\D) \K (?! (?: $a) (?: \D|$) ) \d+ /XX/gx

but it's not much of an improvement.
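
To make the corner case concrete, here is a small sketch comparing the bare lookahead with the delimited one. The sub names and the keep list (1 and 10) are made up for illustration; the patterns are the ones given above, with the alternation passed in as a parameter.

```perl
use strict;
use warnings;

# Bare negative lookahead: it fails at '1', then restarts inside the number.
sub mask_naive {
    my ($s, $alt) = @_;
    $s =~ s/ (?! $alt ) \d+ /XX/gx;
    return $s;
}

# Delimited version: a number must start after ^ or a non-digit, and a
# kept number must be followed by a non-digit or the end of the string.
sub mask_anchored {
    my ($s, $alt) = @_;
    $s =~ s/ (^|\D) (?! (?: $alt ) (?: \D|$) ) \d+ /${1}XX/gx;
    return $s;
}

my $alt = join '|', qw(1 10);    # hypothetical keep list

print mask_naive('12', $alt), "\n";             # prints 1XX
print mask_anchored('12,10,1,21', $alt), "\n";  # prints XX,10,1,XX
```

The naive form turns '12' into '1XX' because the lookahead only rejects the position where a kept number starts, and the engine then retries one character further along.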

I would put the numbers to be matched in a hash:

my %ok;
@ok{@a} = (1) x @a;   # one true value per key; a bare "= 1" sets only the first key

and then split the string and match against the hash:

my @split = split /\D/;
for (@split) {
$_ = "XX" unless $ok{$_};
}
$_ = join ",", @split;

Ben
 
Peter Scott

I would put the numbers to be matched in a hash:

my %ok;
@ok{@a} = (1) x @a;   # one true value per key; a bare "= 1" sets only the first key

and then split the string and match against the hash:

my @split = split /\D/;
for (@split) {
$_ = "XX" unless $ok{$_};
}
$_ = join ",", @split;

Not all of the inter-digit characters in the input string were commas.
 
Ben Morrow

Quoth Peter Scott:
Not all of the inter-digit characters in the input string were commas.

I noticed that, but the OP mentioned split /,/ so I presumed they were
typos. If not, something like

my @split = split /(\D+)/;
for (@split) {
/\D/ and next;
$_ = "XX" unless $ok{$_};
}
$_ = join "", @split;

should do.
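
A self-contained sketch of this split-and-rejoin approach, assuming the hash is filled with a true value for every key. The sub name mask_by_split is hypothetical. It also skips the empty leading field that split returns when the string starts with a separator, which would otherwise be turned into 'XX'.

```perl
use strict;
use warnings;

# Hypothetical helper: split on non-digit runs, keep the separators,
# mask the digit fields that are not in the keep list, and rejoin.
sub mask_by_split {
    my ($text, @keep_list) = @_;

    my %ok;
    @ok{@keep_list} = (1) x @keep_list;   # a true value for every key

    # The capturing parentheses make split return the separators too.
    my @parts = split /(\D+)/, $text;
    for my $part (@parts) {
        next if $part eq '' or $part =~ /\D/;   # empty field or separator
        $part = 'XX' unless $ok{$part};
    }
    return join '', @parts;
}

print mask_by_split('44,33,4.44.64.10,32,25,88,20,6,55',
                    qw(10 20 22 23 25)), "\n";
# prints XX,XX,XX.XX.XX.10,XX,25,XX,20,XX,XX
```

Joining on the empty string restores whatever separators were in the input, so the dots in '4.44.64.10' survive.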

Ben
 
Ilya Zakharevich

[A complimentary Cc of this posting was NOT [per weedlist] sent to Ben Morrow]

I do not know what is a "number". I assume you mean "a sequence of digits".
Something like

s/ (?! $a ) \d+ /XX/gx

s/ \b (?! (?: $a ) \b ) \d+ /XX/gx

Hope this helps,
Ilya
 
Ben Morrow

Quoth Ilya Zakharevich:
[A complimentary Cc of this posting was NOT [per weedlist] sent to
Ben Morrow

I do not know what is a "number". I assume you mean "a sequence of digits".
Something like

s/ (?! $a ) \d+ /XX/gx

s/ \b (?! (?: $a ) \b ) \d+ /XX/gx

Duh! I was thinking I needed a \d\D boundary, but of course for the
string given a \w\W boundary works just as well.
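
A quick sketch of the \b form on a case the sample string does not cover, assuming the alternation is wrapped in (?: ... ) so that the trailing \b applies to every alternative and not just the last one. The sub name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: mask digit runs using word boundaries.
# Without the (?: ... ) group, "10|20 \b" would parse as "10" or "20\b",
# and a kept 10 would also protect 100.
sub mask_wordbound {
    my ($text, @keep_list) = @_;
    my $alt = join '|', @keep_list;

    (my $out = $text) =~ s/ \b (?! (?: $alt ) \b ) \d+ /XX/gx;
    return $out;
}

print mask_wordbound('100,10', qw(10 20)), "\n";   # prints XX,10
```

Since \b is a \w\W boundary, digits glued to letters or underscores (as in 'abc10') are not treated as separate numbers by this pattern.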

Thanks

Ben
 
