reg representing blanks

Sean

I want to find comment lines in a C file, so basically any line starting
with "//" will count. We need to allow for any programmer's habits,
i.e., there could be any combination of (\t)* and ( )* preceding the "//".
How can I express all possible cases in a regular expression?

Thanks,
Sean
 
A. Sinan Unur

Sean said:
I want to find comment lines in a C file, so basically any line starting
with "//" will count. We need to allow for any programmer's habits,
i.e., there could be any combination of (\t)* and ( )* preceding the "//".
How can I express all possible cases in a regular expression?

perldoc perlre

\s represents whitespace in a regex.

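#!/usr/bin/perl

use strict;
use warnings;

# A sample string with several kinds of whitespace around one letter.
my $s = "\t \n a\f \r";

# Print the character code (in hex) of every character that \s matches.
while ( $s =~ /(\s)/g ) {
    printf "%2.2X\n", ord $1;
}

__END__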

But, of course, follow Paul's recommendation on how to parse C comments.
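For the question as asked, a minimal sketch might anchor the match at the start of the line and allow any run of whitespace before the "//". The helper name is_line_comment and the sample lines are made up for illustration, and this covers only the simple case, not the corner cases of real C.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: true if the line is nothing but optional
# leading whitespace followed by "//".
sub is_line_comment {
    my ($line) = @_;
    return $line =~ m{^\s*//} ? 1 : 0;
}

# Made-up sample lines, one per case.
my @lines = (
    "// plain comment",
    "\t  // indented with a tab and spaces",
    "int x = 0; // trailing comment, not counted here",
    "int y = 1;",
);

for my $line (@lines) {
    printf "%-3s %s\n", is_line_comment($line) ? 'yes' : 'no', $line;
}
```

If only tabs and spaces should be allowed in front, [ \t]* could replace \s* in the pattern.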

Sinan
 
Lukas Mai

Sean said:
I want to find comment lines in a C file, so basically any line starting
with "//" will count. We need to allow for any programmer's habits,
i.e., there could be any combination of (\t)* and ( )* preceding the "//".
How can I express all possible cases in a regular expression?

"//" doesn't start a comment in C90, which is what most compilers
implement. Furthermore, your approach doesn't work in all possible cases:

/*
// not a one-line comment
*/

"\
// not a comment"

/\
/ this is a comment

/??/
/ so is this

/**/ // or this
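To make the point concrete, here is a small sketch that feeds the five fragments above to a naive "optional whitespace, then //" test, line by line. The helper name naive_hits and the yes/no labels are only illustrative; the labels follow the descriptions given with each fragment (the trigraph case assumes a compiler that actually processes trigraphs).

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Each entry: [ C source fragment, does it really contain a // comment? ]
my @cases = (
    [ "/*\n// not a one-line comment\n*/", 'no'  ],
    [ "\"\\\n// not a comment\"",          'no'  ],
    [ "/\\\n/ this is a comment",          'yes' ],
    [ "/??/\n/ so is this",                'yes' ],
    [ "/**/ // or this",                   'yes' ],
);

# Count the lines of a fragment that the naive test flags as // comments.
sub naive_hits {
    my ($src) = @_;
    my @hits = grep { m{^\s*//} } split /\n/, $src;
    return scalar @hits;
}

for my $case (@cases) {
    my ($src, $truth) = @$case;
    my $guess = naive_hits($src) ? 'yes' : 'no';
    printf "naive: %-3s  actual: %-3s  %s\n",
        $guess, $truth, $guess eq $truth ? 'ok' : 'WRONG';
}
```

The naive test gets all five fragments wrong: it flags the first two, which are not // comments, and misses the last three, which are.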


What do you want to do with the found comments?

Lukas
 
Sean

For
/*
// not a one-line comment
*/
We can recognize "/*" first, and from there on, treat everything as
comments till we recognize a "*/"

My intention is simply to have fun: getting the ratio of lines of code
to lines of comments in glibc and gcc. As I am fairly new to Perl, I'd
like to exercise it a bit :)
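A toy version of that idea could look like the sketch below. It assumes a deliberately crude rule: a line counts as a comment if it lies inside a /* ... */ block or begins (after optional whitespace) with "//" or "/*", blank lines are skipped, and everything else counts as code. The name count_lines and the sample source are made up for illustration, and the sketch knows nothing about string literals, line splices, or trigraphs.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Toy classifier: returns (code_lines, comment_lines) for a string of C source.
# It is fooled by comment markers inside string literals, by line splices,
# and by trigraphs.
sub count_lines {
    my ($src) = @_;
    my ($code, $comment, $in_block) = (0, 0, 0);

    for my $line (split /\n/, $src) {
        next if $line =~ /^\s*$/ && !$in_block;    # skip blank lines outside comments

        if ($in_block) {
            $comment++;
            $in_block = 0 if $line =~ m{\*/};       # block comment ends on this line
        }
        elsif ($line =~ m{^\s*//}) {
            $comment++;
        }
        elsif ($line =~ m{^\s*/\*}) {
            $comment++;
            $in_block = 1 unless $line =~ m{\*/};   # comment stays open past this line
        }
        else {
            $code++;
            # A block comment may open after some code and continue below.
            $in_block = 1 if $line =~ m{/\*(?!.*\*/)};
        }
    }
    return ($code, $comment);
}

my $sample = <<'END_C';
/*
// not a one-line comment
*/
int main(void)
{
    // say hello
    return 0;   /* trailing comment */
}
END_C

my ($code, $comment) = count_lines($sample);
printf "code: %d  comment: %d\n", $code, $comment;   # prints "code: 4  comment: 4"
```

For a real file, the source would be read into one string first (for example by slurping the file), and the ratio is then just one count divided by the other.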

Thanks,
Sean
 
Tad McClellan

Sean said:
For
/*
// not a one-line comment
*/
We can recognize "/*" first, and from there on, treat everything as
comments till we recognize a "*/"


printf "/* // also not a one-line comment */";

or, even worse:

printf "/* // also not a one-line comment";
// lots of real code
printf "*/";


You need a Real Parser to do a real parse.

To do a parse of a "context free grammar" (as most programming
languages are) you need an approach that is up to the task.
Regular expressions are not up to that task.

In other words, a mathematician can _prove_ that it is not
possible to use (textbook) regular expressions to parse
context-free languages in general.

Sean said:
My intention is simply to have fun


A Toy Parse might be "good enough" for a learning experience.
 
Lukas Mai

Tad McClellan said:
printf "/* // also not a one-line comment */";

or, even worse:

printf "/* // also not a one-line comment";
// lots of real code
printf "*/";


You need a Real Parser to do a real parse.

To do a parse of a "context free grammar" (as most programming
languages are) you need an approach that is up to the task.
Regular expressions are not up to that task.

Yeah, but this only requires tokenizing the input, not a full parse.
Recognizing C comments can be done with a regex. OK, the regex is long
and ugly, but it works. See my code in <[email protected]>.
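The code referred to above is not reproduced in this thread. As a rough illustration of the tokenizing idea only, the sketch below matches comments and string or character literals in one alternation, so that comment markers inside literals are consumed as part of the literal and never seen as comments. The name find_comments and the sample source are made up, and the sketch ignores line splices, trigraphs, and preprocessor oddities.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical helper: returns the comments found in a string of C source.
# Literals are matched as well, so comment markers inside them are skipped.
sub find_comments {
    my ($src) = @_;
    my @comments;

    while ($src =~ m{
          ( /\* .*? \*/            # captured: block comment
          | // [^\n]*              # captured: line comment
          )
        | " (?: \\. | [^"\\] )* "  # string literal, consumed but not captured
        | ' (?: \\. | [^'\\] )* '  # character literal, consumed but not captured
    }gsx) {
        push @comments, $1 if defined $1;
    }
    return @comments;
}

my $sample = <<'END_C';
printf("/* // also not a one-line comment");
// lots of real code
printf("*/");
int x = '"';   /* a real
                  block comment */
END_C

print "comment: $_\n" for find_comments($sample);
```

On this sample the sketch reports two comments: the "// lots of real code" line and the two-line block comment at the end. Neither printf string is reported.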

HTH, Lukas
 
