reg representing blanks

Discussion in 'Perl Misc' started by Sean, Mar 28, 2006.

  1. Sean

    Sean Guest

    I want to find comment lines in a C file, so basically any line starting
    with "//" will count. We need to allow for any programmer's habits,
    i.e., there could be any combination of (\t)* and ( )* preceding "//".
    How can I express all possible cases in a regular expression?

    Thanks,
    Sean
     
    Sean, Mar 28, 2006
    #1

  2. Paul Lalli

    Paul Lalli Guest

    Sean wrote:
    > I want to find comment lines in a C file, so basically any line starting
    > with "//" will count. We need to allow for any programmer's habits,
    > i.e., there could be any combination of (\t)* and ( )* preceding "//".
    > How can I express all possible cases in a regular expression?


    By not reinventing the wheel:

    http://search.cpan.org/~abigail/Regexp-Common-2.120/lib/Regexp/Common/comment.pm

    Paul Lalli
     
    Paul Lalli, Mar 28, 2006
    #2

  3. A. Sinan Unur

    "Sean" <> wrote in news:1143515321.509994.260710@u72g2000cwu.googlegroups.com:

    > I want to find comment lines in a C file, so basically any line starting
    > with "//" will count. We need to allow for any programmer's habits,
    > i.e., there could be any combination of (\t)* and ( )* preceding "//".
    > How can I express all possible cases in a regular expression?


    perldoc perlre

    \s represents whitespace in a regex.

    #!/usr/bin/perl

    use strict;
    use warnings;

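    # Sample string containing several kinds of whitespace.
    my $s = "\t \n a\f \r";

    # Print the hex code of every whitespace character that \s matches.
    while ( $s =~ /(\s)/g ) {
        printf "%2.2X\n", ord $1;
    }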

    __END__

    But, of course, follow Paul's recommendation on how to parse C comments.
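
    Applied to the original question, \s (or the narrower [ \t]) can be
    anchored at the start of the line. What follows is only a minimal sketch,
    assuming that nothing but blanks and tabs may precede "//". The sub name
    is hypothetical, and the pattern knows nothing about strings or block
    comments:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the first non-blank characters on the
# line are "//". Strings and /* ... */ blocks are NOT taken into account.
sub is_slash_comment_line {
    my ($line) = @_;
    return $line =~ m{^[ \t]*//} ? 1 : 0;
}

my @samples = (
    "// plain",
    "\t  // indented",
    "int x; // trailing",
    "x = a / b;",
);

for my $line (@samples) {
    printf "%-22s %s\n", $line,
        is_slash_comment_line($line) ? "comment" : "other";
}
```

    Using [ \t] instead of \s keeps the match on a single line, since \s
    would also accept newlines when the pattern is run against a whole file.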

    Sinan

    --
    A. Sinan Unur <>
    (remove .invalid and reverse each component for email address)

    comp.lang.perl.misc guidelines on the WWW:
    http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
     
    A. Sinan Unur, Mar 28, 2006
    #3
  4. Lukas Mai

    Lukas Mai Guest

    Sean <> wrote:
    > I want to find comment lines in a C file, so basically any line starting
    > with "//" will count. We need to allow for any programmer's habits,
    > i.e., there could be any combination of (\t)* and ( )* preceding "//".
    > How can I express all possible cases in a regular expression?


    "//" doesn't start a comment in C90, which is what most compilers
    implement. Furthermore, your approach doesn't work in all possible cases:

    /*
    // not a one-line comment
    */

    "\
    // not a comment"

    /\
    / this is a comment

    /??/
    / so is this

    /**/ // or this


    What do you want to do with the found comments?

    Lukas

    --
    #!/usr/bin/perl -p0777
    s{/(?:(?:\\|\?\?/)\n)*(?:/(?:(?:\\|\?\?/)\n|[^\n])*|\*[^*]*\*+(?:(?:\\|\?
    \?/)\n)*(?:[^/*][^*]*\*+(?:(?:\\|\?\?/)\n)*)*(/))|("(?:(?:\\|\?\?/).|[^"]
    )*"|'(?:(?:\\|\?\?/).|[^'])*'|.[^'"/]*)}{(defined $1 ? ' ' : '') . $2}xgse
     
    Lukas Mai, Mar 28, 2006
    #4
  5. Sean

    Sean Guest

    For
    /*
    // not a one-line comment
    */
    We can recognize "/*" first and, from there on, treat everything as
    a comment until we recognize a "*/".

    My intention is simply to have some fun: getting the ratio of "lines of
    code" to "lines of comments" in glibc and gcc. As I am fairly new to
    Perl, I'd like to exercise it a bit :)
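
    A toy version of that count might look like the sketch below. It is
    deliberately naive: it assumes comments start at the beginning of a line
    (after optional whitespace), ignores string literals, trigraphs and
    backslash-newline, and treats a line as either all comment or all code.
    The sub name and the sample input are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy classifier (hypothetical name): counts lines that look like
# comments versus lines that look like code. Blank lines are skipped.
# The result is only approximate, for the reasons given above.
sub count_lines {
    my @lines = @_;
    my ($code, $comment, $in_block) = (0, 0, 0);
    for my $line (@lines) {
        next if $line =~ /^\s*$/;                  # skip blank lines
        if ($in_block) {
            $comment++;
            $in_block = 0 if $line =~ m{\*/};      # block comment ends here
        }
        elsif ($line =~ m{^\s*//}) {
            $comment++;                            # whole-line // comment
        }
        elsif ($line =~ m{^\s*/\*}) {
            $comment++;
            $in_block = 1 unless $line =~ m{\*/};  # stays open past this line
        }
        else {
            $code++;
        }
    }
    return ($code, $comment);
}

my @src = (
    "/* header",
    "   // still inside the block comment",
    " */",
    "int main(void) {",
    "    // say hello",
    "    return 0;",
    "}",
);
my ($code, $comment) = count_lines(@src);
printf "code: %d, comment: %d\n", $code, $comment;
```

    For the sample above this reports 3 code lines and 4 comment lines.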

    Thanks,
    Sean
     
    Sean, Mar 28, 2006
    #5
  6. Tad McClellan

    Sean <> wrote:
    > For
    > /*
    > // not a one-line comment
    > */
    > We can recognize "/*" first, and from there on, treat everything as
    > comments till we recognize a "*/"



    printf "/* // also not a one-line comment */";

    or, even worse:

    printf "/* // also not a one-line comment";
    // lots of real code
    printf "*/";


    You need a Real Parser to do a real parse.

    To do a parse of a "context free grammar" (as most programming
    languages are) you need an approach that is up to the task.
    Regular expressions are not up to that task.

    In other words, a mathematician can _prove_ that it is not
    possible to use regular expressions to parse a context
    free language.


    > My intention is to simply for fun



    A Toy Parse might be "good enough" for a learning experience.


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Mar 28, 2006
    #6
  7. Lukas Mai

    Lukas Mai Guest

    Tad McClellan <> wrote:
    > Sean <> wrote:
    >> For
    >> /*
    >> // not a one-line comment
    >> */
    >> We can recognize "/*" first, and from there on, treat everything as
    >> comments till we recognize a "*/"

    >
    >
    > printf "/* // also not a one-line comment */";
    >
    > or, even worse:
    >
    > printf "/* // also not a one-line comment";
    > // lots of real code
    > printf "*/";
    >
    >
    > You need a Real Parser to do a real parse.
    >
    > To do a parse of a "context free grammar" (as most programming
    > languages are) you need an approach that is up to the task.
    > Regular expressions are not up to that task.


    Yeah, but this only requires tokenizing the input, not a full parse.
    Recognizing C comments can be done with a regex. OK, the regex is long
    and ugly, but it works. See my code in <e0apje$sbh$01$-online.com>.
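
    A stripped-down sketch of that tokenizing idea, assuming no trigraphs
    and no backslash-newline splices (both of which the signature code does
    handle): string literals, character literals and comments are matched as
    alternatives of one pattern, so a "//" or "/*" inside a string is
    consumed as part of the string and never mistaken for a comment. The sub
    name is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the comments found in a chunk of C source.
# Simplified on purpose: no trigraphs, no backslash-newline handling.
sub find_comments {
    my ($src) = @_;
    my @found;
    while ($src =~ m{
          " (?: \\. | [^"\\] )* "      # string literal: consumed, not kept
        | ' (?: \\. | [^'\\] )* '      # character literal: consumed as well
        | ( /\* .*? \*/ )              # first group: block comment
        | ( // [^\n]* )                # second group: line comment
    }xgs) {
        my $comment = defined $1 ? $1 : $2;
        push @found, $comment if defined $comment;
    }
    return @found;
}

my $c_source = <<'END_C';
printf("/* // also not a one-line comment */");
int x = 1; // a real comment
/* block
   // not a one-line comment
*/
END_C

print "[$_]\n" for find_comments($c_source);
```

    On the sample, the printf string is skipped and two comments are
    reported: the trailing "// a real comment" and the three-line block.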

    HTH, Lukas
     
    Lukas Mai, Mar 29, 2006
    #7