Mark
Given a line of arbitrary text, possibly followed by some leading
portion of the string ' Reference #\d+' (where \d+ represents one or
more digit characters), I want to output the line without that
trailing ' Reference...' portion. For example, the input line
'some arbitrary text Refer' would become 'some arbitrary text'.

Here are two programs that appear to do what I want, but both feel
overly complicated for this task. I'm looking for a simpler solution,
possibly one using a better regular expression than the one I chose in
my first sample code.
First sample:
use strict ;
use warnings ;
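# Single-quote delimiters disable variable interpolation ; /x lets the
# alternatives sit one per line, so spaces and '#' must be escaped.
my $re = qr'^(.*)\ ( (R$)|
                     (Re$)|
                     (Ref$)|
                     (Refe$)|
                     (Refer$)|
                     (Refere$)|
                     (Referen$)|
                     (Referenc$)|
                     (Reference\ {0,1}$)|
                     (Reference\ \#\d{0,}$)
                   )'x ;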
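while (<DATA>) {
    chomp ;
    print "in : >$_<\n" ;
    # In list context the match returns its captures ; the first is $1,
    # the text in front of the partial ' Reference #\d+' suffix.
    if (my ($result) = /$re/) {
        print "out: >$result<\n" ;
    }
    else {
        print "out: >$_<\n" ;
    }
}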
__DATA__
Refer
One Referenc
two three Reference
xx yy Reference Reference
def Refere Reference #xx
abc the def Refere Reference #
abc the def Refere Reference #12
Second sample:
use strict ;
use warnings ;
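my $PATTERN = 'Reference #000000' ;   # a complete sample suffix to test against
my $pos ;
while (<DATA>) {
    chomp ;
    # Find the last ' R' ; $pos ends up at the index of that 'R'.
    $pos = -1 ;
    while ((my $ind = index($_,' R',$pos)) != -1) {
        $pos = $ind + 1 ;
    }
    print "in : >$_<\n" ;
    my $result = $_ ;
    if ($pos > 0) {
        # Build a regex from the tail : literal text (escaped so that
        # metacharacters in the input stay literal), trailing digits as \d+.
        my $tail = substr($_,$pos) ;
        my $digits = ($tail =~ s/\d+$//) ? '\d+' : '' ;
        my $re = quotemeta($tail) . $digits ;
        # Strip the tail only if it matches the start of the sample suffix.
        if ($PATTERN =~ /^$re/) {
            $result = substr($_,0,$pos-1) ;
        }
    }
    print "out: >$result<\n" ;
}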
__DATA__
Refer
One Referenc
two three Reference
xx yy Reference Reference
def Refere Reference #xx
abc the def Refere Reference #
abc the def Refere Reference #12
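
For comparison, here is a minimal sketch of how the same rule might be written as a single substitution. It builds nested optional groups from the literal text 'eference #', so the suffix is allowed to stop after any character of ' Reference #' or after the digits. The names strip_reference, $suffix_re and $inner are made up for illustration, and the sketch assumes the line has already been chomped.

```perl
use strict;
use warnings;

# Build " R(?:e(?:f(?:e(...(?:\#\d*)?...)?)?)?)?\z" from the literal text.
# Working from the last character inwards, each character wraps the rest
# of the suffix in one more optional non-capturing group.
my $inner = '\d*';
for my $char (reverse split //, 'eference #') {
    $inner = '(?:' . quotemeta($char) . $inner . ')?';
}
my $suffix_re = qr/ R$inner\z/;

# Hypothetical helper: return the line without a trailing (possibly
# partial) ' Reference #\d+' suffix; other lines come back unchanged.
sub strip_reference {
    my ($line) = @_;
    (my $out = $line) =~ s/$suffix_re//;
    return $out;
}

print 'out: >', strip_reference('some arbitrary text Refer'), "<\n";
```

With the sample data above, 'xx yy Reference Reference' should come out as 'xx yy Reference', and 'def Refere Reference #xx' should be left unchanged because '#xx' is not a prefix of '#\d+'.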