xyz88888
I encountered a problem with a Perl script I wrote, and I would like
to find out if this is just a problem with the ActiveState Perl
interpreter, or a problem in the Perl spec itself. I am using the
ActivePerl interpreter, build 820 version 5.8.8. I looked for other
free downloadable Perl interpreters to compare results, but had no
luck. So I'm hoping to get some feedback from any of you interested
folks.
The input: a multimedia production storyboard in ASCII-text format.
Sections of the file contain dialogue for voice-overs, and I want to
parse out just this stuff and ignore the rest. In this case, I need
all lines starting with the string '(rules script) "' copied to the
output file of my script.
The process: my Perl script reads in the file line-by-line, does
string matching against the first part of each line and based on some
logic, determines if and how that line should be copied to the output
file.
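A minimal sketch of that filtering step, for anyone who wants something small to run. The sample lines and the helper name voiceover_lines are made up for illustration; the real script reads from the storyboard file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines that start with the marker
# '(rules script) "'. The parentheses are escaped so they match literally.
sub voiceover_lines {
    my @lines = @_;
    my @kept;
    for my $line (@lines) {
        push @kept, $line if $line =~ /^\(rules script\) "/;
    }
    return @kept;
}

# Made-up sample data standing in for the storyboard file.
my @sample = (
    'Some heading line',
    '(rules script) "Sample dialogue."',
    'Notes',
);

my @kept = voiceover_lines(@sample);
print "$_\n" for @kept;
```

Only the second sample line should come through, since it is the only one that begins with the marker.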
The problem: in some instances, the string used in the matching
function needs to be stripped out before the line is copied to the
output file. I do this using the "positional parameters" ($1, $2,...)
from the match command. Whenever the string contains non-alphanumeric
characters (with the "\" escape), those characters seem to cause the
substitution command to fail. I reached that conclusion after trying
numerous variations and comparing their outputs.
The code: For brevity I am only including the combinations of match
conditionals and the substitution commands I tried that are related to
this specific problem. I'll copy the whole script down below for those
who want to see the whole thing, although everything else works fine.
The match function is written to match the string '(rules script) "'
at the beginning of a line.
Variation 1:
} elsif (/(^\(rules script\) ")/) {
    s/$1//;
    print OUTFILE;
}
Result 1:
(rules script) "If you need.....
Comments:
This was the first attempt. The outer parens go with the elsif, the
middle parens set the value for $1 (or so I expected!), and the inner
parens are escaped as they are part of the string matching. Based on
the output I concluded no sub-ing of any kind was performed - input
matched output verbatim.
Variation 2:
} elsif (/(^\(rules script\) \")/) {
    s/$1//;
    print OUTFILE;
}
Result 2:
(rules script) "If you need.....
Comments:
My first suspicion was the double-quote at the end of the match. So I
put a "\" before it to ensure it was treated as a literal and copied
into the value for $1. No change.
Variation 3:
} elsif (/(^\(rules script\) ")/) {
    print STDERR "\n1=$1\n";
    s/\($1\)\s\"//;
    print OUTFILE;
}
Result 3:
(to STDERR)1=(rules script) "
OUTPUT: (rules script) "If you need.....
Comments:
My suspicion was confirmed. $1 has the correct string after the
successful match, but for some reason the sub command is failing. I
hoped that putting the non-alphanums into the first parameter of
the sub command would work, but it did not. On to 4 (and qualified
success!)...
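For anyone who wants to poke at this, here is a small diagnostic sketch that prints the pattern text the sub command ends up with once $1 is filled in. It assumes that an interpolated variable is read as pattern syntax rather than as literal text; the sample line and the variable names are made up.

```perl
use strict;
use warnings;

# Made-up sample line standing in for a real storyboard line.
my $line = '(rules script) "Sample dialogue."';

my ($captured, $pattern_v1, $pattern_v3);
if ($line =~ /(^\(rules script\) ")/) {
    $captured = $1;

    # Pattern text seen by s/$1//; in Variations 1 and 2.
    $pattern_v1 = $captured;

    # Pattern text seen by s/\($1\)\s\"//; in Variation 3.
    $pattern_v3 = '\(' . $captured . '\)\s\"';
}

print "captured : $captured\n";
print "pattern 1: $pattern_v1\n";
print "pattern 3: $pattern_v3\n";
```

In both printed patterns the parentheses that came from $1 carry no backslash, so under the assumption above they would act as a grouping construct instead of matching literal parentheses in the line.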
Variation 4:
} elsif (/(rules script)/) {
    print STDERR "\n1=$1\n";
    s/\($1\)\s\"//;
    print OUTFILE;
}
Result 4:
(to STDERR)1=(rules script) "
OUTPUT: If you need.....
Comments:
At last, I got the result I wanted! Unfortunately I had to literally
spell it out for the script. Wanted to rule out that it was the non-
alphanums that were mucking up the sub.
Going for broke, I tried doing the sub and matching together in the
conditional test, and that worked fine too.
} elsif (s/^(\(rules script\)\s\")//) {
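A stand-alone sketch of that combined form, assuming a made-up sample line. It relies on s/// returning true when it replaces something, and on $1 still holding the captured prefix afterwards.

```perl
use strict;
use warnings;

# Made-up sample line; the real input comes from the storyboard file.
my $text = '(rules script) "Sample dialogue."';
my $prefix;

# The substitution doubles as the conditional test: it is true only
# when the prefix was found and removed.
if ($text =~ s/^(\(rules script\)\s")//) {
    $prefix = $1;    # the text that was stripped
}

print "prefix=[$prefix] rest=[$text]\n";
```

Because the pattern is written out literally here, nothing depends on how $1 is interpolated into a later pattern.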
Conclusions:
* using the positional parameter ($1) in sub-ing = ok
* using non-alphanums in sub-ing = ok
* using a positional parameter containing non-alphanums in sub-ing =
NA-AH!
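To make it easier to try this on another interpreter, here is a self-contained test case that runs Variations 1, 3 and 4 against one made-up sample line. It also adds one extra variation that is not in my list above: wrapping $1 in \Q...\E (quotemeta), on the assumption that $1 is being re-read as a pattern rather than as literal text. The sub names are hypothetical.

```perl
use strict;
use warnings;

my $sample = '(rules script) "Sample dialogue."';   # made-up test line

# Variation 1: capture the whole prefix, then reuse $1 as the pattern.
sub variation_1 {
    my ($line) = @_;
    if ($line =~ /(^\(rules script\) ")/) {
        $line =~ s/$1//;
    }
    return $line;
}

# Variation 3: same capture, with escaped parens added around $1.
sub variation_3 {
    my ($line) = @_;
    if ($line =~ /(^\(rules script\) ")/) {
        $line =~ s/\($1\)\s\"//;
    }
    return $line;
}

# Variation 4: capture only the alphanumeric part.
sub variation_4 {
    my ($line) = @_;
    if ($line =~ /(rules script)/) {
        $line =~ s/\($1\)\s\"//;
    }
    return $line;
}

# Extra variation (an assumption to test): quote $1 with \Q...\E.
sub variation_quoted {
    my ($line) = @_;
    if ($line =~ /(^\(rules script\) ")/) {
        $line =~ s/\Q$1\E//;
    }
    return $line;
}

my %result = (
    v1     => variation_1($sample),
    v3     => variation_3($sample),
    v4     => variation_4($sample),
    quoted => variation_quoted($sample),
);

for my $name (qw(v1 v3 v4 quoted)) {
    my $status = $result{$name} eq $sample ? 'unchanged' : 'stripped';
    print "$name: $status -> $result{$name}\n";
}
```

If the \Q...\E variation strips the prefix where Variations 1 and 3 do not, that would point at how $1 is interpolated rather than at the interpreter build.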
So my curiosity is whether this is a bug in the ActiveState version of
Perl, the Perl spec itself (not likely), or my logic (usually my first
suspicion, but this time I think I'm off the hook of guilt, no?).
Please reply if you have any insight or gave it a go yourself and got
some useful results. Thanks.
- DK
P.S. As promised, here is the whole script.
$x = 0;
$cont = "f";
$csm = "f";
open (INFILE, "lcm01_v1d-TO.txt");
open (OUTFILE, ">outfile.txt");
while (<INFILE>) {
    ++$x;
    chomp;
    if (/^(lcm01_\d{3}\w*)\s?/) {
        $csm = "f";
        print OUTFILE "\n$_\nNARRATOR: ";
    } elsif (s/^(\(rules script\)\s\")//) {
        print STDERR "\n1=$1\n";
        s/\($1\)\s\"//;
        print STDERR "\n$x:\n$_\n";
        s/"\s*$/ /;
        print OUTFILE;
        print OUTFILE "When you have completed this exercise, click the Next button to continue.\n";
    } elsif ($cont eq "t") {
        chomp; chomp; chomp;
        if ((/^Notes$/)
            || (/^Correct Answer:$/)
            || (//))
        {
            $cont = "f";
        } elsif (/"\s*$/) {
            $cont = "f";
            s/"\s*$/\n/;
            print OUTFILE;
        } else {
            print OUTFILE;
        }
    } elsif ((! /TRAINER:/)
             && (/([A-Z]+):\s+"?(.+)/))
    {
        $cont = "t";
        if ($1 eq "CSM") {
            $csm = "t";
            $line = "\n$1: $2";
        } elsif (($line = $2) =~ m/^Click/) {
            $cont = "f";
            if ($csm eq "f") {
                $line = "\n$line";
            } else {
                $line = "\nNARRATOR: $line\n";
            }
        } elsif ($1 eq "NARRATION") {
            $line = "\n$1: $2";
        } elsif ($csm eq "t") {
            $line = "\n$1: $2";
        } elsif ($csm eq "f") {
            $line = "$2";
        }
        if ($line =~ m/("\s*)$/) {
            $line =~ s/$1/\n/;
            $cont = "f";
        }
        print OUTFILE $line;
    }
}
close INFILE;
close OUTFILE;
end;