Perl regex to remove c-comments, taking into account string literals

Discussion in 'Perl Misc' started by Saeed, Jul 8, 2004.

  1. Saeed

    Saeed Guest

    I have been searching for a code example that removes C-style comments,
    but none of the ones I found take string literals into account, e.g.

    ----------------------------------------------------
    /*
    ** a comment
    */

    printf /* blah */ ("Comments begin with /*\n" );

    printf ( "Comments end with */\n" ); /* blah */
    ----------------------------------------------------

    I want this stripped to:

    ----------------------------------------------------
    printf ("Comments begin with /*\n" );

    printf ( "Comments end with */\n" );
    ----------------------------------------------------

    but the samples I've seen would most probably give:

    ----------------------------------------------------
    printf ("Comments begin with

    \n" );
    ----------------------------------------------------
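
To make the failure concrete, here is a minimal sketch of the kind of naive substitution being described. The pattern is an assumption about what such samples typically look like, not a quote from any particular one; `$sample` and `$naive` are illustrative names.

```perl
use strict;
use warnings;

# The C fragment from the post, kept literal by the single-quoted heredoc.
my $sample = <<'END_C';
/*
** a comment
*/

printf /* blah */ ("Comments begin with /*\n" );

printf ( "Comments end with */\n" ); /* blah */
END_C

# Assumed naive approach: delete everything from any /* to the next */,
# with no awareness of string literals.
(my $naive = $sample) =~ s{/\*.*?\*/}{}gs;

print $naive;
```

The `/*` inside the first string is treated as a comment opener, and the match runs on to the `*/` inside the second string, so the second printf call is swallowed along with the comments.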
     
    Saeed, Jul 8, 2004
    #1

  2. Lukas Mai

    Lukas Mai Guest

    Saeed wrote:

    > I have been searching for a code example that removes C-style comments,
    > but none of the ones I found take string literals into account, e.g.

    [...]

    That's a FAQ; see perldoc -q comments. But that solution is incomplete,
    too:

    /??/
    * foo *\
    /
    is a single comment, according to the C standard. "??/" is a trigraph
    expanding to "\", and backslash-newline pairs are deleted before
    tokenizing the program, so the above is equivalent to

    /* foo */
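
The two translation steps described above can be sketched in a few lines. This is only an illustration: `splice_lines` is a hypothetical helper name, and it handles just the `??/` trigraph and line splicing, not the other trigraphs.

```perl
use strict;
use warnings;

# Hypothetical helper: expand the ??/ trigraph, then splice lines.
sub splice_lines {
    my ($src) = @_;
    $src =~ s{\?\?/}{\\}g;   # the trigraph ??/ becomes a backslash
    $src =~ s{\\\n}{}g;      # each backslash-newline pair is deleted
    return $src;
}

# The three-line example from the post.
my $input = "/??/\n* foo *\\\n/\n";
print splice_lines($input);   # prints "/* foo */" and a newline
```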

    The following script should do the job:

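    #!/usr/local/bin/perl -wp0777
    use strict;

    # this script reads files, removes C comments,
    # and prints the results to stdout
    # (-0777 slurps each file whole; -p prints it after the substitution)

    s{
        /                                   # a comment starts with a slash
        (?: (?: \\ | \?\?/) \n)*            # optional line splices (backslash or ??/, then newline)
        (?:
            / (?: (?: \\ | \?\?/) \n | [^\n] )*         # // comment, continued across spliced lines
        |
            \* [^*]* \*+ (?: (?: \\ | \?\?/) \n)*       # /* comment, up to a run of stars
            (?: [^/*][^*]* \*+ (?: (?: \\ | \?\?/) \n)* )*
            (/)                             # closing slash; group 1 marks a /* */ comment
        )
    |
        (                                   # group 2: text to keep
            " (?: (?: \\ | \?\?/) . | [^"])* "          # string literal
        |
            ' (?: (?: \\ | \?\?/) . | [^'])* '          # character constant
        |
            . [^'"/]*                       # any other code, up to the next quote or slash
        )
    }{
        # a /* */ comment becomes one space; kept text is copied unchanged
        (defined $1 ? ' ' : '') . (defined $2 ? $2 : '')
    }gsex
    __END__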
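
For readers who want to try the substitution without the `-p0777` command-line switches, here is a minimal sketch that wraps the same pattern in a function and runs it on the sample from the first post. The name `strip_c_comments` and the variables `$sample` and `$clean` are assumptions made for this illustration; the pattern itself is copied from the script above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution from the script above.
sub strip_c_comments {
    my ($src) = @_;
    $src =~ s{
        /
        (?: (?: \\ | \?\?/ ) \n )*
        (?:
            / (?: (?: \\ | \?\?/ ) \n | [^\n] )*
        |
            \* [^*]* \*+ (?: (?: \\ | \?\?/ ) \n )*
            (?: [^/*] [^*]* \*+ (?: (?: \\ | \?\?/ ) \n )* )*
            (/)
        )
    |
        (
            " (?: (?: \\ | \?\?/ ) . | [^"] )* "
        |
            ' (?: (?: \\ | \?\?/ ) . | [^'] )* '
        |
            . [^'"/]*
        )
    }{
        (defined $1 ? ' ' : '') . (defined $2 ? $2 : '')
    }gsex;
    return $src;
}

# The C fragment from the first post, kept literal by the heredoc.
my $sample = <<'END_C';
/*
** a comment
*/

printf /* blah */ ("Comments begin with /*\n" );

printf ( "Comments end with */\n" ); /* blah */
END_C

my $clean = strip_c_comments($sample);
print $clean;
```

Both string literals come through intact. Each `/* */` comment is replaced by a single space, so the output differs from the hand-written target in the first post only in whitespace.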

    HTH, Lukas
    --
    print+74.117.115.116,,qq.\c!..not::.her,Perl=>q$hacker,$,!($,=$")
     
    Lukas Mai, Jul 10, 2004
    #2