Brendan Byrd/SineSwiper
I'm in the middle of fixing the LXR tool to work with Perl. I'm almost
finished, but I've run into a problem that completely fries my brain.
In order to detect Perl subroutine/variable declarations, I need to
remove comments and strings so that they don't get mistaken for
declarations.
My current code works for most cases:
# Remove escaped/variable characters for strings/comments
$contents =~ s/\\\$//g;
$contents =~ s/[\\\$][\"\'\#]//g;
# Remove literal strings
$contents =~ s/\"[^\"]*?\"//gs;
$contents =~ s/\'[^\']*?\'//gs;
# Remove comments
$contents =~ s/\#[^\n]*//g;
However, a comment containing a quote mark screws up everything:
# I'll be back
I can switch the order of the removes, but then I encounter the reverse
problem: a string with a pound sign.
$aaa = '#FF0088';
The comment-removal line kills the second quote mark, and the
string-removal line then runs away and gobbles up everything in sight
(until it hits another string with the same pound sign problem). Is
there any sort of regexp that would work for this program:
# Sine's program
$aaa = 'test #'; # test statement
$bbb = 'halo #
there'; # multi-line using ' marks
# #'#'#'#''######'
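
One direction that might work, offered as a rough sketch rather than a tested fix for LXR: do both removals in a single substitution, so the regex engine scans left to right and whichever construct starts first (string or comment) is consumed whole. The helper name strip_strings_and_comments is made up for illustration. The sketch assumes plain '...' and "..." strings with backslash escapes only; it does not attempt q//, qq//, here-docs, regex literals, POD, or $#array, which would each need extra handling.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LXR: delete strings and comments in a
# single left-to-right pass, so whichever construct starts first wins.
sub strip_strings_and_comments {
    my ($contents) = @_;
    $contents =~ s{
          " (?: [^"\\] | \\. )* "    # double-quoted string, escapes allowed
        | ' (?: [^'\\] | \\. )* '    # single-quoted string, escapes allowed
        | \# [^\n]*                  # comment running to the end of the line
    }{}gsx;
    return $contents;
}

# The sample program from the post, kept verbatim in a non-interpolating here-doc.
my $program = <<'END';
# Sine's program
$aaa = 'test #'; # test statement
$bbb = 'halo #
there'; # multi-line using ' marks
# #'#'#'#''######'
END

print strip_strings_and_comments($program);
```

On the sample program this should leave only the two assignments with their strings and trailing comments removed. Note that deleting a multi-line string also deletes its newline, so line numbers shift just as they do with the separate substitutions.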