Replace all occurrences of a char, except the first

Bart Van der Donck

Hello,

I got intrigued by the following challenge which seems to be
impossible.

Is it possible to write a regular expression that replaces all
occurrences of a character except the first occurrence?

a''bc'def'g' -> a'bcdefg
'''ab'cd'efg -> 'abcdefg
abc'd'e''f'g -> abc'defg
etc.
 
John W. Krahn

Bart said:
I got intrigued by the following challenge which seems to be
impossible.

Is it possible to write a regular expression that replaces all
occurrences of a character except the first occurrence?

a''bc'def'g' -> a'bcdefg
'''ab'cd'efg -> 'abcdefg
abc'd'e''f'g -> abc'defg
etc.

$ perl -e'
my @x = ( "a##bc#def#g#", "###ab#cd#efg", "abc#d#e##f#g" );
for ( @x ) {
print;
my $count;
s/(#)/ $count++ ? "" : $1 /eg;
print " -> $_\n";
}
'
a##bc#def#g# -> a#bcdefg
###ab#cd#efg -> #abcdefg
abc#d#e##f#g -> abc#defg
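
The counter trick above can be wrapped in a small helper so the target character is a parameter. This is only a sketch: the name keep_first and the $char argument are made up for illustration and are not from the thread. \Q...\E is used so that a character such as "." or "*" is matched literally.

```perl
use strict;
use warnings;

# keep_first is a hypothetical helper name, not from the thread.
# It keeps the first occurrence of $char in $s and deletes the rest.
sub keep_first {
    my ($s, $char) = @_;
    my $count = 0;
    # The first match sees $count == 0 and is put back unchanged;
    # every later match sees a non-zero count and is replaced by "".
    $s =~ s/(\Q$char\E)/ $count++ ? '' : $1 /eg;
    return $s;
}

for my $line ("a''bc'def'g'", "'''ab'cd'efg", "abc'd'e''f'g") {
    my $result = keep_first($line, "'");
    print "$line -> $result\n";
}
```

Because $count is declared inside the sub, it starts from zero for every string, just as the "my $count;" inside the loop does in the one-liner.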




John
 
Tim Greer

Bart said:
Hello,

I got intrigued by the following challenge which seems to be
impossible.

Is it possible to write a regular expression that replaces all
occurrences of a character except the first occurrence?

a''bc'def'g' -> a'bcdefg
'''ab'cd'efg -> 'abcdefg
abc'd'e''f'g -> abc'defg
etc.

I assume you mean that it'll have to locate any character that's
repeated (not a predetermined one) and only remove those additional
occurrences, and only use a regular expression?

I'd use a different method to accomplish the task, but it is indeed
interesting to try and do this only with a regex.

The following should work and only remove the second or later instance
of any character (so a string such as "'''sfsf'et'st464y'''" will
result in "'sfet46y"). Pretty cool, eh?

Here's the regex logic:

$string =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;

Here's an example based on your strings, using only a regular
expression:

script.pl:

#!/usr/bin/perl
use warnings;
use strict;

my $linea = "a''bc'def'g'";
my $lineb = "'''ab'cd'efg";
my $linec = "abc'd'e''f'g";

print "$linea -> ";
$linea =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;
print "$linea\n";

print "$lineb -> ";
$lineb =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;
print "$lineb\n";

print "$linec -> ";
$linec =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;
print "$linec\n";


The output:
~]$ ./script.pl
a''bc'def'g' -> a'bcdefg
'''ab'cd'efg -> 'abcdefg
abc'd'e''f'g -> abc'defg

Is that what you were trying to do?
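
One caveat with the $` one-liner: $1 is interpolated into the inner match as a pattern, so a "." in the input would match any earlier character and a "*" would be a syntax error. A minimal sketch of two ways around that, assuming the goal is "keep the first occurrence of every character": quote the capture with \Q...\E, or track seen characters in a hash. The sub names dedupe_prematch and dedupe_seen are hypothetical, and the hash version is just one possible "different method", not necessarily the one meant above.

```perl
use strict;
use warnings;

# Same idea as the $` one-liner, but the captured character is quoted
# with \Q...\E so it is compared literally. /s lets "." match newlines too.
sub dedupe_prematch {
    my ($s) = @_;
    $s =~ s|(.)| ($` =~ m/\Q$1\E/) ? '' : $1 |seg;
    return $s;
}

# Alternative sketch: remember each character in a hash instead of
# rescanning the prematch for every character.
sub dedupe_seen {
    my ($s) = @_;
    my %seen;
    $s =~ s/(.)/ $seen{$1}++ ? '' : $1 /seg;
    return $s;
}

for my $line ("'''sfsf'et'st464y'''", "a.b*c.*") {
    my $by_prematch = dedupe_prematch($line);
    my $by_hash     = dedupe_seen($line);
    print "$line -> $by_prematch / $by_hash\n";
}
```

The hash version does one lookup per character, while the prematch version searches everything before the current position each time.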
 
Tad J McClellan

Bart Van der Donck said:
I got intrigued by the following challenge which seems to be
impossible.

Is it possible to write a regular expression that replaces all
occurrences of a character except the first occurrence?


No, it is not possible to replace anything with only a regular expression.

A regular expression either "matches" or "does not match", it cannot "replace".

An operator that uses a regular expression as one of its operands,
such as s///, can probably do that without much trouble though.
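
To make the distinction concrete, here is a small sketch (the variable names are arbitrary): the same compiled pattern is handed first to the match operator, which only reports matches, and then to the substitution operator, which is what actually changes a string.

```perl
use strict;
use warnings;

my $re  = qr/'/;               # the regular expression: a pattern, nothing more
my $str = "a''bc'def'g'";

# m//g in list context returns the matches; assigning through the empty
# list counts them. The string itself is left untouched.
my $matches = () = $str =~ /$re/g;
print "matched $matches times, string is still $str\n";

# s///g is the operator that does the replacing, here on a copy.
(my $copy = $str) =~ s/$re//g;
print "after s///g: $copy\n";
```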
 
Ben Morrow

Tim Greer said:
I assume you mean that it'll have to locate any character that's
repeated (not a predetermined one) and only remove those additional
occurrences, and only use a regular expression?

I'd use a different method to accomplish the task, but it is indeed
interesting to try and do this only with a regex.

The following should work and only remove the second or later instance
of any character (so a string such as "'''sfsf'et'st464y'''" will
result in "'sfet46y"). Pretty cool, eh?

Here's the regex logic:

$string =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;

Without using /e:

~% perl -le'$_ = "abccbdcdc"; 1 while s/(.)(.*)\1/$1$2/g; print'
abcd

Using 5.10's \K we can remove the replacement part:

~% perl5.10.0 -le'$_="abccbdcdc"; 1 while s/(.).*\K\1//g; print'
abcd

and if we reverse the string before and after (so we can use look*ahead*
instead, which can be variable-length) we can remove the while loop:

~% perl -le'$_ = reverse "abccbdcdc"; s/(.)(?=.*\1)//g;
print scalar reverse'
abcd
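
For use in a script rather than on the command line, the loop and the reverse/lookahead variants can be written as subs. This is a sketch only; the names dedupe_loop and dedupe_reverse are hypothetical, and /s is added as an assumption so that "." also matches newlines.

```perl
use strict;
use warnings;

# Loop variant: each pass drops later copies of some character;
# repeat until no character has a duplicate left.
sub dedupe_loop {
    my ($s) = @_;
    1 while $s =~ s/(.)(.*)\1/$1$2/sg;
    return $s;
}

# Reverse variant: in the reversed string, "has an earlier copy" becomes
# "has a later copy", which a variable-length lookahead can test.
sub dedupe_reverse {
    my ($s) = @_;
    my $r = reverse $s;             # scalar context: reverses the string
    $r =~ s/(.)(?=.*\1)//sg;        # delete every char that occurs again later
    return scalar reverse $r;
}

for my $line ("abccbdcdc", "a''bc'def'g'") {
    my $looped   = dedupe_loop($line);
    my $reversed = dedupe_reverse($line);
    print "$line -> $looped / $reversed\n";
}
```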

Ben
 
John W. Krahn

Bart said:
I got intrigued by the following challenge which seems to be
impossible.

Is it possible to write a regular expression that replaces all
occurrences of a character except the first occurrence?

a''bc'def'g' -> a'bcdefg
'''ab'cd'efg -> 'abcdefg
abc'd'e''f'g -> abc'defg
etc.

$ perl -e'
my @x = ( "a##bc#def#g#", "###ab#cd#efg", "abc#d#e##f#g" );
for ( @x ) {
print;
/#/ && substr( $_, $+[0] ) =~ tr/#//d;
print " -> $_\n";
}
'
a##bc#def#g# -> a#bcdefg
###ab#cd#efg -> #abcdefg
abc#d#e##f#g -> abc#defg
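
A note on how the substr/tr version works, with a sketch using the apostrophe from the original examples. $+[0] is the offset just past the end of the last successful match, so substr( $_, $+[0] ) is everything after the first occurrence; substr can be used as an lvalue, and tr///d deletes the character from that tail only. The sub name keep_first_quote is hypothetical. Since tr/// does not interpolate variables, the character is hard-coded here.

```perl
use strict;
use warnings;

# keep_first_quote is a hypothetical name; it keeps the first apostrophe
# in the string and deletes all later ones.
sub keep_first_quote {
    my ($s) = @_;
    if ($s =~ /'/) {
        # $+[0] is the end offset of the match just made, so this
        # substr covers only the text after the first apostrophe.
        substr($s, $+[0]) =~ tr/'//d;
    }
    return $s;
}

for my $line ("a''bc'def'g'", "'''ab'cd'efg", "abc'd'e''f'g") {
    my $result = keep_first_quote($line);
    print "$line -> $result\n";
}
```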



John
 
Tim Greer

Ben said:
Without using /e:

~% perl -le'$_ = "abccbdcdc"; 1 while s/(.)(.*)\1/$1$2/g; print'
abcd

Using 5.10's \K we can remove the replacement part:

~% perl5.10.0 -le'$_="abccbdcdc"; 1 while s/(.).*\K\1//g; print'
abcd

and if we reverse the string before and after (so we can use
look*ahead* instead, which can be variable-length) we can remove the
while loop:

~% perl -le'$_ = reverse "abccbdcdc"; s/(.)(?=.*\1)//g;
print scalar reverse'
abcd

Ben

This is why I enjoy Perl. There are usually several ways of doing it. I
was purposely trying to be quirky, but in all seriousness, for anyone
reading this thread, Ben's suggestion above is better (more efficient than
using my solution with $` -- it was all in good fun, though). I wasn't
familiar with \K (I'm still on 5.8.x), so that's pretty cool.
 