program which removes comments from C source


Ceriousmall

I just need some feedback on this code. It's one of the K&R exercises.

/* 2011/05 Ceriousmall. . . .
* this program is code-named delcom and as such it deletes all
comments from a C Program. . . . */

#include <stdio.h>

#define ASTR '*'
#define BSLASH '/'
#define FSLASH '\\'
#define DBQTS '\"'
#define SNQTS '\''
#define SPC ' '

#define IN 1 /* in a comment or quoted string*/
#define OUT 0 /* outside of a comment or quoted string */
#define MAXLINE 1024 /* maximum input line size */

int cmt_state, qts_state, start_of_comment; /* state of comments,
state of quoted strings and the beginning of comments */

/* assigns the character string to line[] */
int gotline(char line[], int max)
{
int c, length;

for (length = 0; length < max-2 && (c = getchar()) != EOF && c !=
'\n'; ++length)
line[length] = c;

if (c == '\n')
line[length++] = c;

line[length] = '\0';

if (c == EOF)
return c;
}

/* time to process our line removing any comments found */
void delcom(char line[], int in, int out)
{
int i;

for (i = 0; line[i] != '\0'; ++i)
if (line[i] == DBQTS && qts_state == out && cmt_state == out) { /*
we must ensure quoted strings are ignored if they are outside
comments*/
if (line[i-1] != SNQTS && line[i-1] != FSLASH) /* this
test whether this is a character constants or quoted string */
qts_state = in;
}
else if (line[i] == DBQTS && qts_state == in)
qts_state = out; /* lets us know when we have reached the end
of the string */

else if (line[i] == BSLASH && line[i+1] == ASTR) { /* found a
slash, peek to see if an asterisk follows forming a viable comment */
if (cmt_state == out && qts_state == out) { /* make sure we
are not currently in a comment or quoted string*/
cmt_state = in; /* let the world know that we are now in a
comment */
start_of_comment = i++; /* mark the beginning of
this comment */
}
}
else if (line[i] == '\n' && cmt_state == in) { /* if we started
a comment and it has not been closed before the newline */
while (start_of_comment < i) { /* we need to remove all of its
contents */
line[start_of_comment] = SPC;
++start_of_comment;
}
start_of_comment = 0; /* the next line will start as a comment
*/
}
else if (line[i] == ASTR && line[i+1] == BSLASH) /* found the
closing marker for the end of a comment */
if (cmt_state == in) { /* check if we are currently in a
comment */
++i; /* and if we are */
while (start_of_comment <= i) {
line[start_of_comment] = SPC; /* its time to start the replace
procedure */
++start_of_comment;
}
cmt_state = out; /* comments have been removed so we reset this
to its initial value */
}

if (cmt_state == in)
if (line[i] == '\0' && line[i-1] != '\n') { /* if we encounter a
file that ends without a newline character */
while (start_of_comment < i) { /* we check for any trailing
comments and remove them */
line[start_of_comment] = SPC;
++start_of_comment;
}
cmt_state = out; /* it is wise to close this */
}
}

/* time to execute the entire construct */
int main(void)
{
char line[MAXLINE+1];
int at_start, at_end, return_val;

at_start = cmt_state = qts_state = OUT;

while (at_start != EOF) {
return_val = gotline(line, MAXLINE); /* get a line */
at_end = return_val == EOF;
delcom(line, IN, OUT); /* remove all comments from the line */
printf("%s", line); /* now print the edited line */

if (at_end)
at_start = return_val; /* if end-of-file was found we terminate
the program */
}
return 0;
}
 

Ceriousmall


I'll fix the comments and post again, my bad.......
 

Keith Thompson

Ceriousmall said:
I just need some feedback on this code. Its one of the K&R exercises.

/* 2011/05 Ceriousmall. . . .
* this program is code-named delcom and as such it deletes all
comments from a C Program. . . . */

#include <stdio.h>

#define ASTR '*'
#define BSLASH '/'
#define FSLASH '\\'

You've defined BSLASH as a forward slash and FSLASH as a backslash.

#define DBQTS '\"'

The backslash isn't necessary; you can just write '"'.

#define SNQTS '\''
#define SPC ' '

But really, what benefit do these macros give you? The abbrvtns r vry
tsr -- sorry, I mean the abbreviations are very terse -- and IMHO this:

line[i+1] == '*'

is much clearer than this:

line[i+1] == ASTR

or even this:

line[i+1] == ASTERISK

anyway. It's not as if the value of ASTR is ever going to change.

#define IN 1 /* in a comment or quoted string*/
#define OUT 0 /* outside of a comment or quoted string */

It might be clearer to define a variable that's true if you're inside a
comment or quoted string, false otherwise. Or define an enumerated
type.
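
As a rough sketch of the enumerated-type idea: the type, enumerator and function names below are hypothetical and do not appear in the posted program, which keeps the same information in the two int globals cmt_state and qts_state.

```c
#include <stdbool.h>

/* Hypothetical replacement for the IN/OUT macros and the two state globals:
 * one variable that says exactly which kind of text is being scanned. */
enum lexer_state {
    IN_CODE,       /* ordinary source text */
    IN_COMMENT,    /* inside a block comment */
    IN_STRING,     /* inside a "..." literal */
    IN_CHARCONST   /* inside a '...' constant */
};

/* A comment opener only counts when it is seen in ordinary code. */
static bool comment_may_start(enum lexer_state s)
{
    return s == IN_CODE;
}
```

With a single state variable, combinations such as "in a comment and in a string at the same time" cannot be represented at all, so they no longer need to be tested for.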

Some things to consider:

C99 introduced // comments. Do you want to handle those?

A comment delimiter is also ignored if it appears in a character
constant or header name:

#include </*.h>
int c = '/*';

I'm afraid that's as far as I got. I might take another look later.

[snip]
 

Ceriousmall

I just need some feedback on this code. It's one of the K&R exercises.

/* 2011/05 Ceriousmall. . . .
 * this program is code-named delcom and as such it deletes all
* comments from a C Program. . . . */

#include <stdio.h>

#define ASTR    '*'
#define BSLASH  '/'
#define FSLASH  '\\'
#define DBQTS   '\"'
#define SNQTS   '\''
#define SPC     ' '

#define IN         1    /* in a comment or quoted string*/
#define OUT        0    /* outside of a comment or quoted string */
#define MAXLINE 1024    /* maximum input line size */

int cmt_state, qts_state, start_of_comment;     /* state of comments,
quoted strings and comment's start */

/* assigns the character string to line[] */
int gotline(char line[], int max)
{
        int c, length;

        for (length = 0; length < max-2 && (c = getchar()) != EOF &&
c != '\n'; ++length)
                line[length] = c;

        if (c == '\n')
                line[length++] = c;

        line[length] = '\0';

        if (c == EOF)
                return c;
}

/* time to process our line removing any comments found */
void delcom(char line[], int in, int out)
{
        int i;

        for (i = 0; line[i] != '\0'; ++i)
                if (line[i] == DBQTS && qts_state == out && cmt_state == out) {
                        /* ignore quoted strings */
                        if (line[i-1] != SNQTS && line[i-1] != FSLASH)
                                /* check for character constants */
                                qts_state = in;
                }
                else if (line[i] == DBQTS && qts_state == in)
                        qts_state = out;        /* we have reached the end of the string */

                else if (line[i] == BSLASH && line[i+1] == ASTR) {
                        /* found a slash, check for an asterisk */
                        if (cmt_state == out && qts_state == out) {
                                cmt_state = in;         /* we are now in a comment */
                                start_of_comment = i++; /* mark the beginning of this comment */
                        }
                }
                else if (line[i] == '\n' && cmt_state == in) {
                        /* comments not closed before the newline */
                        while (start_of_comment < i) {
                                /* we remove all of their contents */
                                line[start_of_comment] = SPC;
                                ++start_of_comment;
                        }
                        start_of_comment = 0;   /* the next line will start as a comment */
                }
                else if (line[i] == ASTR && line[i+1] == BSLASH)
                        /* found the closing marks for a comment */
                        if (cmt_state == in) {  /* check if we are currently in a comment */
                                ++i;            /* and if we are */
                                while (start_of_comment <= i) {
                                        line[start_of_comment] = SPC;   /* we start the replace procedure */
                                        ++start_of_comment;
                                }
                                cmt_state = out;        /* comments removed so we reset this */
                        }

        if (cmt_state == in)
                if (line[i] == '\0' && line[i-1] != '\n') {
                        /* if we encounter a file that ends without a '\n' */
                        while (start_of_comment < i) {
                                /* we check for any trailing comments and remove them */
                                line[start_of_comment] = SPC;
                                ++start_of_comment;
                        }
                        cmt_state = out;        /* it is wise to close this */
                }
}

/* time to execute the entire construct */
int main(void)
{
        char line[MAXLINE+1];
        int at_start, at_end, return_val;

        at_start = cmt_state = qts_state = OUT;

        while (at_start != EOF) {
                return_val = gotline(line, MAXLINE);  /* get a line */
                at_end = return_val == EOF;
                delcom(line, IN, OUT);                /* remove all comments from the line */
                printf("%s", line);                   /* now print the edited line */

                if (at_end)
                        at_start = return_val;        /* if end-of-file was found we terminate the program */
        }
        return 0;
}

This should be easily read and understood by all......
 

Ceriousmall



This is fighting me, so I'll remove all the comments for now.....
 

Ceriousmall

Better.....................................

/* 2011/05 Ceriousmall. . . .
* this program is code-named delcom
* and as such it deletes all comments from a C Program. . . . */

#include <stdio.h>

#define ASTR '*'
#define BSLASH '/'
#define FSLASH '\\'
#define DBQTS '\"'
#define SNQTS '\''
#define SPC ' '

#define IN 1 /* in a comment or quoted string*/
#define OUT 0 /* outside of a comment or quoted string */
#define MAXLINE 1024 /* maximum input line size */

int cmt_state, qts_state, start_of_comment;

/* assigns the character string to line[] */
int gotline(char line[], int max)
{
int c, length;

for (length = 0; length < max-2 && (c = getchar()) != EOF && c !=
'\n'; ++length)
line[length] = c;

if (c == '\n')
line[length++] = c;

line[length] = '\0';

if (c == EOF)
return c;
}

/* time to process our line removing any comments found */
void delcom(char line[], int in, int out)
{
int i;

for (i = 0; line[i] != '\0'; ++i)
if (line[i] == DBQTS && qts_state == out && cmt_state == out) {
if (line[i-1] != SNQTS && line[i-1] != FSLASH)
qts_state = in;
}
else if (line[i] == DBQTS && qts_state == in)
qts_state = out;

else if (line[i] == BSLASH && line[i+1] == ASTR) {
if (cmt_state == out && qts_state == out) {
cmt_state = in;
start_of_comment = i++;
}
}
else if (line[i] == '\n' && cmt_state == in) {
while (start_of_comment < i) {
line[start_of_comment] = SPC;
++start_of_comment;
}
start_of_comment = 0;
}
else if (line[i] == ASTR && line[i+1] == BSLASH)
if (cmt_state == in) {
++i;
while (start_of_comment <= i) {
line[start_of_comment] = SPC;
++start_of_comment;
}
cmt_state = out;
}

if (cmt_state == in)
if (line[i] == '\0' && line[i-1] != '\n') {
while (start_of_comment < i) {
line[start_of_comment] = SPC;
++start_of_comment;
}
cmt_state = out;
}
}

/* time to execute the entire construct */
int main(void)
{
char line[MAXLINE+1];
int at_start, at_end, return_val;

at_start = cmt_state = qts_state = OUT;

while (at_start != EOF) {
return_val = gotline(line, MAXLINE);
at_end = return_val == EOF;
delcom(line, IN, OUT);
printf("%s", line);

if (at_end)
at_start = return_val;
}
return 0;
}
 

Ian Collins

I just need some feedback on this code. Its one of the K&R exercises.

/* 2011/05 Ceriousmall. . . .
* this program is code-named delcom and as such it deletes all
* comments from a C Program. . . . */

#include<stdio.h>

#define ASTR '*'
#define BSLASH '/'
#define FSLASH '\\'
#define DBQTS '\"'
#define SNQTS '\''
#define SPC ' '

Just a general observation while you sort out your layout!

As Keith pointed out, those defines are more trouble than they are
worth. If you are going to use them, use something with a meaningful
name, such as

const char space = ' ';

Similarly for the function names, rather than 'delcom' accompanied by
explanatory comments, give the function a name like
'removeCommentsFromLine'.

A good general rule is if something requires an explanatory comment, it
is wrongly named.
 

Ike Naar

int gotline(char line[], int max)
{
int c, length;

for (length = 0; length < max-2 && (c = getchar()) != EOF && c !=
'\n'; ++length)
line[length] = c;

if (c == '\n')
line[length++] = c;

line[length] = '\0';

if (c == EOF)
return c;
}

The gotline() function does not always return a value.
What is the purpose of the ``if (c == EOF)'' test?
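
For comparison, here is a sketch of a line reader that returns a value on every path. The name get_line, the FILE * parameter and the choice to return the number of characters stored are assumptions made for illustration; none of them come from the posted program.

```c
#include <stdio.h>

/* Read one line (at most max-1 characters) into line[] and always
 * NUL-terminate it. max must be at least 1.
 * Returns the number of characters stored, or EOF when end of input
 * was reached before any character could be read. */
int get_line(char line[], int max, FILE *fp)
{
    int c = 0, length = 0;

    while (length < max - 1 && (c = getc(fp)) != EOF) {
        line[length++] = (char)c;
        if (c == '\n')          /* keep the newline, then stop */
            break;
    }
    line[length] = '\0';

    return (length == 0 && c == EOF) ? EOF : length;
}
```

A caller can then loop with `while (get_line(line, MAXLINE, stdin) != EOF)`, and a final line that lacks a newline is still delivered before EOF is reported.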

int main(void)
{
char line[MAXLINE+1];
int at_start, at_end, return_val;

at_start = cmt_state = qts_state = OUT;

while (at_start != EOF) {
return_val = gotline(line, MAXLINE);
at_end = return_val == EOF;
delcom(line, IN, OUT);
printf("%s", line);

if (at_end)
at_start = return_val;
}
return 0;
}

The last element of line[] is never used; it could be declared as
line[MAXLINE];

at_end and return_val are redundant; the following does the same thing:

int main(void)
{
char line[MAXLINE+1];
int at_start;

at_start = cmt_state = qts_state = OUT;

while (at_start != EOF) {
at_start = gotline(line, MAXLINE);
delcom(line, IN, OUT);
printf("%s", line);
}
return 0;
}

The program does not perform well if there are lines longer than MAXLINE
(you can verify this by setting MAXLINE to a small value, e.g. 32, and
let the program process itself as input).

Why do you process the input line-by-line anyway? The program only needs
to buffer a few (probably two) characters, not an entire line; comment
text can be replaced by spaces on-the-fly, no need to postpone this
until end-of-line or end-of-comment is seen.
The line handling makes the program more complicated than necessary.
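
A minimal sketch of the character-at-a-time approach, under some loudly stated assumptions: the names are hypothetical, only block comments are handled (no // comments, trigraphs or line splicing), and the function works on strings so it can be exercised directly. In a real filter the loop would read with getchar() and write with putchar(); the only buffering needed is the single held-back slash.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state names; SLASH and STAR remember one previous character. */
enum state { CODE, SLASH, COMMENT, STAR, STRING, STRING_ESC, CHARCON, CHARCON_ESC };

/* Blank out a comment character but keep newlines so line numbers survive. */
static char blank(char c) { return c == '\n' ? '\n' : ' '; }

/* Copy in to out, replacing block comments with spaces as they are seen.
 * out needs room for strlen(in) + 1 bytes. Returns the length written. */
size_t strip_comments(const char *in, char *out)
{
    enum state st = CODE;
    size_t n = 0;

    assert(in != NULL && out != NULL);
    for (; *in != '\0'; ++in) {
        char c = *in;
        switch (st) {
        case CODE:
            if (c == '/') { st = SLASH; break; }   /* hold the slash back */
            out[n++] = c;
            if (c == '"') st = STRING;
            else if (c == '\'') st = CHARCON;
            break;
        case SLASH:
            if (c == '*') {                        /* comment opener found */
                out[n++] = ' ';
                out[n++] = ' ';
                st = COMMENT;
                break;
            }
            out[n++] = '/';                        /* it was an ordinary slash */
            if (c == '/') break;                   /* new slash is held back now */
            out[n++] = c;
            st = (c == '"') ? STRING : (c == '\'') ? CHARCON : CODE;
            break;
        case COMMENT:
            out[n++] = blank(c);
            if (c == '*') st = STAR;
            break;
        case STAR:                                 /* previous char was an asterisk */
            out[n++] = blank(c);
            st = (c == '/') ? CODE : (c == '*') ? STAR : COMMENT;
            break;
        case STRING:
            out[n++] = c;
            st = (c == '\\') ? STRING_ESC : (c == '"') ? CODE : STRING;
            break;
        case CHARCON:
            out[n++] = c;
            st = (c == '\\') ? CHARCON_ESC : (c == '\'') ? CODE : CHARCON;
            break;
        case STRING_ESC:                           /* escaped char: copy blindly */
            out[n++] = c;
            st = STRING;
            break;
        case CHARCON_ESC:
            out[n++] = c;
            st = CHARCON;
            break;
        }
    }
    if (st == SLASH)
        out[n++] = '/';                            /* input ended on a held-back slash */
    out[n] = '\0';
    return n;
}
```

Because nothing is stored per line, the MAXLINE limit disappears, and a comment that spans several lines needs no special case.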
 

Ceriousmall

Lots of good info as usual. A difference in perspective really helps
with my coding; I find that it's easy to get trapped by your own coding
style. The learning curve never ends..........
 

Tim Rentsch

Ceriousmall said:
I just need some feedback on this code. [snip]

Please excuse a blunt comment. It's abysmal.

I say it's abysmal because information about how the program
works at a high level is either murky or just not there.
Per-line comments, for example, are a very poor substitute
for one well written, whole-program comment (probably a
paragraph or two) explaining how the program works.

Alternatively, one can try to show high-level program logic
by the actual source code. For example:

int
main(){
    int c;

    while( c = next_input(), c != EOF ){

        if( c == '/' && peek() == '*' ) skip_over_comment();

        else if( c == '\"' ) read_and_print_string();

        else if( c == '\'' ) read_and_print_CC();

        else output( c );
    }

    return 0;
}


Reading just this one function, it's clear at a high level
how the program works. I don't mean to suggest this
particular approach is the only one or the best one, just
that it illustrates the idea of having a clear high-level
idea of how the program will work and writing code that
directly reflects that idea. Make sense?

I recommend trying another iteration (starting from
scratch is a good idea), perhaps along the lines of
the example but in any case trying to follow the
advice about making high-level logic clear in the
program.

Also, lexical note: please convert TABS to SPACES before
posting.
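
The helpers named in the loop above are not defined in the post. One possible way to fill them in is sketched below, with several assumptions: input and output are backed by strings rather than stdin/stdout so the code can be tested directly, a comment is replaced by a single space, and copy_quoted() and strip() are hypothetical names added for the sketch.

```c
#include <stdio.h>   /* for EOF */

/* Hypothetical string-backed input and output, standing in for stdin/stdout. */
static const char *in_pos;
static char *out_pos;

static int  next_input(void) { return *in_pos ? (unsigned char)*in_pos++ : EOF; }
static int  peek(void)       { return *in_pos ? (unsigned char)*in_pos   : EOF; }
static void output(int c)    { *out_pos++ = (char)c; }

/* Entered after the slash was read and peek() showed the asterisk. */
static void skip_over_comment(void)
{
    int c, prev = 0;

    next_input();                        /* consume the asterisk */
    while ((c = next_input()) != EOF && !(prev == '*' && c == '/'))
        prev = c;
    output(' ');                         /* assumption: a comment becomes one space */
}

/* Copy a quoted literal whose opening delimiter was already read. */
static void copy_quoted(int quote)
{
    int c;

    output(quote);
    while ((c = next_input()) != EOF) {
        output(c);
        if (c == '\\') {                 /* escape: copy the next char blindly */
            if ((c = next_input()) == EOF)
                return;
            output(c);
        } else if (c == quote) {
            return;                      /* closing delimiter */
        }
    }
}

static void read_and_print_string(void) { copy_quoted('"'); }
static void read_and_print_CC(void)     { copy_quoted('\''); }

/* Same loop as in the post, driven by a string; out needs strlen(in) + 1 bytes. */
void strip(const char *in, char *out)
{
    int c;

    in_pos = in;
    out_pos = out;
    while (c = next_input(), c != EOF) {
        if (c == '/' && peek() == '*') skip_over_comment();
        else if (c == '"')  read_and_print_string();
        else if (c == '\'') read_and_print_CC();
        else output(c);
    }
    *out_pos = '\0';
}
```

Each helper consumes one whole construct before returning, so the main loop never has to know whether it is inside a comment or a literal.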
 
