Removing Perl comments and strings using regexps

  • Thread starter Brendan Byrd/SineSwiper

Brendan Byrd/SineSwiper

I'm in the middle of fixing the LXR tool to work with Perl. Almost
finished, but I've run into a problem that completely fries my brain.
In order to detect Perl subroutine/variable declarations, I need to
remove comments and strings so that they don't get confused for
declarations.

My current code works for most cases:

# Remove escaped/variable characters for strings/comments
$contents =~ s/\\\$//g;
$contents =~ s/[\\\$][\"\'\#]//g;

# Remove literal strings
$contents =~ s/\"[^\"]*?\"//gs;
$contents =~ s/\'[^\']*?\'//gs;

# Remove comments
$contents =~ s/\#[^\n]*//g;

However, a comment that contains a quote mark screws up
everything:

# I'll be back

I can switch the order of the removes, but then I encounter the reverse
problem: a string with a pound sign.

$aaa = '#FF0088';

The comment-removal line kills the second quote mark, and the
string-removal line then runs away and gobbles up everything in sight
(until it hits another string with the same pound sign problem). Is there
any sort of regexp that would work for this program:

# Sine's program
$aaa = 'test #'; # test statement
$bbb = 'halo #
there'; # multi-line using ' marks
# #'#'#'#''######'
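
One common way around the ordering problem described above is to stop running separate passes and match strings and comments in a single alternation, so that whichever construct starts first in the text wins. The following is only a minimal sketch under that assumption, not a full Perl tokenizer: the helper name strip_strings_and_comments is hypothetical, and it deliberately ignores q//, qq//, here-documents, regex literals, and POD.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LXR): remove comments and simple
# quoted strings in ONE left-to-right pass, so a '#' inside a string
# and a quote mark inside a comment can no longer confuse each other.
sub strip_strings_and_comments {
    my ($contents) = @_;
    $contents =~ s{
        ( \\. | \$[\#"'] )              # 1: backslash escape or punctuation variable, kept
      | " [^"\\]* (?: \\. [^"\\]* )* "  # double-quoted string, may span lines
      | ' [^'\\]* (?: \\. [^'\\]* )* '  # single-quoted string, may span lines
      | \# [^\n]*                       # comment up to end of line
    }{ defined $1 ? $1 : '' }gsex;
    return $contents;
}

# The test program from the post, in a non-interpolating here-document.
my $program = <<'END_OF_PROGRAM';
# Sine's program
$aaa = 'test #'; # test statement
$bbb = 'halo #
there'; # multi-line using ' marks
# #'#'#'#''######'
END_OF_PROGRAM

print strip_strings_and_comments($program);
```

On the sample program this leaves only the two assignments with empty right-hand sides. Note that removing a multi-line string also removes its newline, so line numbers shift; a cross-referencer that cares about line numbers would need to put the newlines back in the replacement.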
 
Jay Tilton

: I'm in the middle of fixing the LXR tool to work with Perl. Almost
: finished, but I've run into a problem that completely fries my brain.
: In order to detect Perl subroutine/variable declarations, I need to
: remove comments and strings so that they don't get confused for
: declarations.

How about just running the program through the Xref backend?

perl -MO=Xref foo.pl
 
Brendan Byrd/SineSwiper

Jay said:
: How about just running the program through the Xref backend?
:
: perl -MO=Xref foo.pl

Interesting. Is there a way to load that as a module and send the input
of a variable (the program) to Xref?
 
Tassilo v. Parseval

Thus spoke Brendan Byrd/SineSwiper:
: Interesting. Is there a way to load that as a module and send the input
: of a variable (the program) to Xref?

If such a way existed, I'd like to know it, too. As I understand the
B:: modules and the generic compiler backend O, there is no such way.
They are triggered in a CHECK block right after a script has been
compiled. A B:: module simply provides a callback that is invoked on the
optree of a script. So it can't work on strings.

Tassilo
 
Brendan Byrd/SineSwiper

Tassilo said:
: If such a way existed, I'd like to know it, too. As I understand the
: B:: modules and the generic compiler backend O, there is no such way.
: They are triggered in a CHECK block right after a script has been
: compiled. A B:: module simply provides a callback that is invoked on the
: optree of a script. So it can't work on strings.

Well, if I could code this into the current development version of LXR,
it wouldn't need strings (it's all looking at files anyway), and I could
just call another instance of perl with a piped open. A little sloppy,
but it beats reinventing the wheel.
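
The "another instance of perl" idea can be sketched roughly as follows. This is an illustration under assumptions, not LXR code: the helper name xref_report is hypothetical, and the sketch only collects whatever B::Xref prints on standard output without trying to parse it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LXR): run the B::Xref backend on a
# file in a child perl and return the lines of its report.
sub xref_report {
    my ($file) = @_;

    # List-form piped open bypasses the shell; $^X is the perl binary
    # that is running this script, so the child uses the same perl.
    open(my $xref, '-|', $^X, '-MO=Xref', $file)
        or die "Cannot start $^X: $!";
    my @lines = <$xref>;

    # close() on a pipe returns false if the child exited non-zero,
    # for example when the file does not compile.
    close($xref)
        or warn "perl -MO=Xref exited with status $?\n";

    return @lines;
}

# Usage: perl thisscript.pl foo.pl
print xref_report($ARGV[0]) if @ARGV;
```

Because the backend works on the compiled optree, the file has to compile cleanly, including any modules it loads, which is a stricter requirement than the regexp approach has.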
 
