Perl regex to remove c-comments, taking into account string literals

Saeed

I have been searching for a code example that removes C-style comments,
but none of the ones I found take string literals into account, e.g.

----------------------------------------------------
/*
** a comment
*/

printf /* blah */ ("Comments begin with /*\n" );

printf ( "Comments end with */\n" ); /* blah */
 
Lukas Mai

Saeed wrote:
I have been searching for a code example that removes C-style comments,
but none of the ones I found take string literals into account, e.g.
[...]

That's a FAQ; see perldoc -q comments. But that solution is incomplete,
too:

/??/
* foo *\
/
is a single comment, according to the C standard. "??/" is a trigraph
expanding to "\", and backslash-newline pairs are deleted before
tokenizing the program, so the above is equivalent to

/* foo */
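
To make the two translation steps concrete, here is a minimal sketch that applies them by hand to the three-line example above. It handles only the ??/ trigraph (the other trigraphs are ignored), and the variable names are just for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The three-line comment from the example, written out explicitly.
my $src = "/??/\n* foo *\\\n/\n";

# Step 1 (only the one trigraph relevant here): ??/ becomes a backslash.
(my $phase1 = $src) =~ s{\?\?/}{\\}g;

# Step 2: delete every backslash that is immediately followed by a newline.
(my $phase2 = $phase1) =~ s{\\\n}{}g;

print $phase2;    # prints "/* foo */" followed by a newline
```

The script below does not run these steps up front. Its regex instead allows a backslash-newline or ??/-newline pair at each place inside the comment delimiters where one could occur.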

The following script should do the job:

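#!/usr/local/bin/perl -wp0777
use strict;

# This script reads files, removes C comments,
# and prints the results to stdout.
# -0777 slurps each file whole; -p applies the substitution and prints.

s{
    /                                   # a slash may start a comment
    (?: (?: \\ | \?\?/ ) \n )*          # optional line splices (backslash or ??/ before newline)
    (?:
        / (?: (?: \\ | \?\?/ ) \n | [^\n] )*            # line comment, continued across splices
    |
        \* [^*]* \*+ (?: (?: \\ | \?\?/ ) \n )*         # block comment body up to a run of stars
        (?: [^/*] [^*]* \*+ (?: (?: \\ | \?\?/ ) \n )* )*
        (/)                                             # group 1: closing slash of a block comment
    )
|
    (                                   # group 2: text to keep
        " (?: (?: \\ | \?\?/ ) . | [^"] )* "            # string literal
    |
        ' (?: (?: \\ | \?\?/ ) . | [^'] )* '            # character constant
    |
        . [^'"/]*                                       # anything else
    )
}{
    (defined $1 ? ' ' : '') . (defined $2 ? $2 : '')
}gsex
__END__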
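
As a quick check, the same substitution can be wrapped in a subroutine and run on the sample from the original question. This is only a sketch: the name strip_c_comments is made up for the example, and the regex is copied unchanged from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution from the script to one string.
sub strip_c_comments {
    my ($code) = @_;
    $code =~ s{
        /
        (?: (?: \\ | \?\?/ ) \n )*
        (?:
            / (?: (?: \\ | \?\?/ ) \n | [^\n] )*
        |
            \* [^*]* \*+ (?: (?: \\ | \?\?/ ) \n )*
            (?: [^/*] [^*]* \*+ (?: (?: \\ | \?\?/ ) \n )* )*
            (/)
        )
    |
        (
            " (?: (?: \\ | \?\?/ ) . | [^"] )* "
        |
            ' (?: (?: \\ | \?\?/ ) . | [^'] )* '
        |
            . [^'"/]*
        )
    }{
        (defined $1 ? ' ' : '') . (defined $2 ? $2 : '')
    }gsex;
    return $code;
}

# The sample from the question; the quoted heredoc keeps \n as two characters.
my $input = <<'END_C';
/*
** a comment
*/

printf /* blah */ ("Comments begin with /*\n" );

printf ( "Comments end with */\n" ); /* blah */
END_C

my $output = strip_c_comments($input);
print $output;
```

Each block comment is replaced by a single space, so neighbouring tokens stay separated. The /* and */ inside the two string literals are left alone because the string alternative consumes the whole literal before the comment alternative gets a chance to match there.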

HTH, Lukas
 
