Should use of goto completely avoided in programs?

SUMIT · Sep 6, 2005

I wrote a program for removing comment from C source file for which
i...
1.first constructed a DFA
2.used goto statement to write a program.

now it was very easy to model DFA using goto & i suppose it was also
efficient.

so was the use of goto right here?

thanx

Sumit
PUCSD

Richard Heathfield · Sep 6, 2005

SUMIT said:

I wrote a program for removing comment from C source file for which
i...
1.first constructed a DFA
2.used goto statement to write a program.

now it was very easy to model DFA using goto & i suppose it was also
efficient.

so was the use of goto right here?

If it made the program clearer to read and maintain than the alternatives,
yes. Otherwise, no. And the odds are good that it didn't.

State machines such as DFAs are normally implemented (in procedural
languages, at least) using a construct such as C's switch (multi-way
select). In OOP, you would probably do it using something a bit more - er -
OO.

The problem with goto is that it tends to make maintenance a pig.

John Bode · Sep 6, 2005

SUMIT said:
I wrote a program for removing comment from C source file for which
i...
1.first constructed a DFA
2.used goto statement to write a program.

now it was very easy to model DFA using goto & i suppose it was also
efficient.

so was the use of goto right here?

thanx

Sumit
PUCSD

Whether it was right or not depends on the answers to the following
questions:

1. Is this a one-off, quick-n-dirty job that's never going to be
touched again?

2. Is the code easy to understand by someone who didn't write it?

3. Can the code easily be updated to reflect any potential changes in
the DFA?

If the answer to 1 is "no", then the answers to both 2 and 3 had better
be "yes"; otherwise you need to re-evaluate how you are using gotos.

Gotos by themselves aren't a bad thing, but you have to be disciplined
in how you use them, otherwise they can lead to maintenance headaches.
You should never replace a structured construct (if-else, for, while,
do-while, switch) with a goto. You should never branch into a block,
and you should branch forward only. In some cases they may interfere
with compiler optimizations.

jiushengliu · Sep 6, 2005

SUMIT said:

If it made the program clearer to read and maintain than the alternatives,
yes. Otherwise, no. And the odds are good that it didn't.

State machines such as DFAs are normally implemented (in procedural
languages, at least) using a construct such as C's switch (multi-way
select). In OOP, you would probably do it using something a bit more - er -
OO.

The problem with goto is that it tends to make maintenance a pig.

Keith Thompson · Sep 6, 2005

SUMIT said:
I wrote a program for removing comment from C source file for which
i...
1.first constructed a DFA
2.used goto statement to write a program.

now it was very easy to model DFA using goto & i suppose it was also
efficient.

so was the use of goto right here?

I'll agree with everyone else: Maybe.

Any program maps a problem space (the real-world problem you're trying
to solve) onto a program space (the implementation of your program).
Ideally, a control-flow construct should have some meaning in the
problem space. For example, an if-else should usually correspond to
some decision in the problem space; a loop should correspond to some
kind of iteration in the problem space.

A goto statement usually applies only to the program itself; it rarely
corresponds to anything in the problem space. An exception to this is
an explicit finite state machine, where a goto (a jump from one place
in your program to another) can actually correspond to a state
transition.

The other cases where a goto is acceptable in C are exception handling
(giving up on processing and jumping to wrap-up code) and breaking out
of a nested loop. These both, in my opinion, represent gaps in the
language. If the language provided an explicit construct to break out
of a nested loop, for example, a goto statement wouldn't be necessary
there. And yes, a hypothetical multi-level break would be
functionally equivalent to a goto; the difference is that it wouldn't
be spelled g-o-t-o, and it couldn't be mistaken by a reader for
undisciplined spaghetti code.

Tim Rentsch · Sep 7, 2005

SUMIT said:
I wrote a program for removing comment from C source file for which
i...
1.first constructed a DFA
2.used goto statement to write a program.

now it was very easy to model DFA using goto & i suppose it was also
efficient.

so was the use of goto right here?

Other responders have given answers like "probably not",
"maybe", or "it depends". Here's another opinion: No.

I'll elaborate a bit on that two-letter answer.

First, there's an excellent chance the program you have now
is wrong. It's not too hard to remove C comments if one
ignores the corner cases, but dealing with the corner cases
adds noticeably to the complexity. Using goto doesn't scale
up very well as the DFA gets more complicated.

Second, even if it was clear to you when you wrote it, it
probably won't be so clear to the next guy, or yourself when
you have to look at it again later. Right now you've got
all the structure in your head; but someone who doesn't
have all the structure in their head will have to discover
it looking over the various goto's (presumably in a single
large function body).

Third, perhaps most important, there's a simple and clear
way to write the code that needs no goto's at all. Here's
a sketch:

int
main(){
State s;
int c;

s = NORMAL;
do {
c = getchar();
switch( s ){
case NORMAL: s = normal_next(c); break;

case STRING: s = string_next(c); break;
case STRING_1: s = string_1_next(c); break;

case CHARLIT: s = charlit_next(c); break;
case CHARLIT_1: s = charlit_1_next(c); break;

case SLASH: s = slash_next(c); break;

case COMMENT: s = comment_next(c); break;
case COMMENT_1: s = comment_1_next(c); break;

...
}
} while( c != EOF );

return appropriate_return_code();
}

State
normal_next( int c ){
if( c == '"' ){ out(c); return STRING; }
if( c == '\'' ){ out(c); return CHARLIT; }
if( c == '/' ){ return SLASH; }

...

out(c);
return NORMAL;
}

... etc ...

I expect a program along these lines would be somewhat
longer than the program you wrote, and probably take
slightly longer to write (because of typing time if nothing
else). However, when the time comes to fix or improve the
program, you'll be glad you took the time to structure the
code in a way that scales up better.

Incidentally, the removal of comments from C source cannot,
strictly speaking, be done with just a finite state machine
(aka DFA). Additional state (in principle unbounded) is
needed. (I'll just leave this topic here, without saying
why, in case some comp.lang.c readers would enjoy figuring
that out on their own.)

Antonio Contreras · Sep 7, 2005

Tim Rentsch wrote:

Incidentally, the removal of comments from C source cannot,
strictly speaking, be done with just a finite state machine
(aka DFA). Additional state (in principle unbounded) is
needed. (I'll just leave this topic here, without saying
why, in case some comp.lang.c readers would enjoy figuring
that out on their own.)

Nested comments? By the way, are they legal? (I know they weren't, but,
are they in C99?)

Lawrence Kirby · Sep 7, 2005

Tim Rentsch wrote:

Nested comments? By the way, are they legal? (I know they weren't, but,
are they in C99?)

Comments don't nest in C99, although you can put /* */ comments around
the // comments that C99 supports.

Lawrence

Michael Wojcik · Sep 7, 2005

Tim Rentsch wrote:

Nested comments? By the way, are they legal? (I know they weren't, but,
are they in C99?)

No, thank goodness. They'd break existing code. Not *good* existing
code, perhaps, but existing nonetheless.

I thought perhaps Tim was referring to incomplete comments in
included files, but that's illegal (C90 5.1.1.2 phase 3); or to
incomplete comments controlled by #if, but comments are removed
(translation phase 3) before preprocessing directives are processed
(phase 4).

Similarly, string literals can't span lines (in conforming code), and
string literals are pp-tokens, so you can't play games with enclosing
(what appear to be) comments in strings using preprocessing
directives.

Trigraphs don't require unbounded state, and at any rate there are no
trigraphs for "*" or "/". It should be legal to separate a comment
delimiter using backslash-continuation, but that only requires a
fixed amount of state, too.

It's probably something obvious that I'm just overlooking.

--
Michael Wojcik (e-mail address removed)

Be sure to push the button of the bottom, and push the button of the
settlement page indicated next only once, there is fear of the bottom
rhinoceros multiplex lesson money. -- Sukebe Net

Tim Rentsch · Sep 8, 2005

No, thank goodness. They'd break existing code. Not *good* existing
code, perhaps, but existing nonetheless.

I thought perhaps Tim was referring to incomplete comments in
included files, but that's illegal (C90 5.1.1.2 phase 3); or to
incomplete comments controlled by #if, but comments are removed
(translation phase 3) before preprocessing directives are processed
(phase 4).

Similarly, string literals can't span lines (in conforming code), and
string literals are pp-tokens, so you can't play games with enclosing
(what appear to be) comments in strings using preprocessing
directives.

Trigraphs don't require unbounded state, and at any rate there are no
trigraphs for "*" or "/". It should be legal to separate a comment
delimiter using backslash-continuation, but that only requires a
fixed amount of state, too.

It's probably something obvious that I'm just overlooking.

The hint is, one of Michael's comments is getting warm.

Incidentally, it wasn't obvious to me either when I first started
thinking about how to do comment removal.

Anonymous 7843 · Sep 10, 2005

Incidentally, the removal of comments from C source cannot,
strictly speaking, be done with just a finite state machine
(aka DFA). Additional state (in principle unbounded) is
needed. (I'll just leave this topic here, without saying
why, in case some comp.lang.c readers would enjoy figuring
that out on their own.)

/\
\
\
\
* Are you referring to comments like this? *\
\
\
\
/

Tim Rentsch · Sep 10, 2005

/\
\
\
\
* Are you referring to comments like this? *\
\
\
\
/

Yes. The start of that comment needs unbounded
lookahead to tell that it's a comment rather than,
for example, a '/=' operator.

P.J. Plauger · Sep 10, 2005

Yes. The start of that comment needs unbounded
lookahead to tell that it's a comment rather than,
for example, a '/=' operator.

Uh, no. A comment is syntactically equivalent to a single
space character, which turns any comment of the form
"//**/=" into "/ =".

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Tim Rentsch · Sep 10, 2005

P.J. Plauger said:
Uh, no. A comment is syntactically equivalent to a single
space character, which turns any comment of the form
"//**/=" into "/ =".

Right, however that isn't the case I was talking
about. I meant this one:

/\
\
\
\
\
\
=

which looks like it might be a comment until
the scan gets to the '=' character.

Anonymous 7843 · Sep 12, 2005

Yes. The start of that comment needs unbounded
lookahead to tell that it's a comment rather than,
for example, a '/=' operator.

Unbounded lookahead is not the same as unbounded state. The repetition
of backslash/newline is two states no matter how long it goes on for
and is well within the abilities of a DFA.

However, catering to backslash/newline for every possible token
is a tedious way to implement a lexer. One might prefer to paste the
backslash/newlines together in an early phase, even before removing
comments.

Tim Rentsch · Sep 13, 2005

Unbounded lookahead is not the same as unbounded state. The repetition
of backslash/newline is two states no matter how long it goes on for
and is well within the abilities of a DFA.

However, catering to backslash/newline for every possible token
is a tedious way to implement a lexer. One might prefer to paste the
backslash/newlines together in an early phase, even before removing
comments.

I think you may have missed the significance of the context here.
We weren't talking about writing a lexer. The requirement is for
a program to remove comments from C source code, and except for
removing comments leaves the original source completely intact
(not counting possibly inserting spaces to preserve the token
boundaries around the removed comments).

In this context it's important to know how many \<NL> pairs have
been skipped over, because they may need to be output if they
don't belong to a comment. Needing that count is what changes
the problem to one that can't be done with a finite state
machine.

Anonymous 7843 · Sep 13, 2005

I think you may have missed the significance of the context here.
We weren't talking about writing a lexer. The requirement is for
a program to remove comments from C source code, and except for
removing comments leaves the original source completely intact
(not counting possibly inserting spaces to preserve the token
boundaries around the removed comments).

In this context it's important to know how many \<NL> pairs have
been skipped over, because they may need to be output if they
don't belong to a comment. Needing that count is what changes
the problem to one that can't be done with a finite state
machine.

You have piqued my curiousity. I'm going to read about flex some more
and have a go at writing a proper comment-removal thingy. I think the
trailing context feature would be sufficient to decide whether or not
to emit backslash/newline, but I'm going to try it out before digging
myself any deeper into the hole.

Tim Rentsch · Sep 13, 2005

You have piqued my curiousity. I'm going to read about flex some more
and have a go at writing a proper comment-removal thingy. I think the
trailing context feature would be sufficient to decide whether or not
to emit backslash/newline, but I'm going to try it out before digging
myself any deeper into the hole.

I expect the trailing context feature is sufficient to resolve
the question. Of course, using the trailing context feature
already (in the general case) makes the program exceed what a
finite state machine can do: the trailing context lookahead has
to remember where to continue scanning after checking the
trailing context, and remembering where to go back to takes (in
the theoretical sense) unbounded memory.

Can I jump to a case inside a switch using goto?	31	Apr 3, 2014
The use of goto in C	9	Jan 24, 2006
Goto still considered helpful	49	Oct 5, 2007
Possibly better loop construct, also labels+goto important and on the fly compiler idea.	82	Oct 17, 2013
Typography of programs	83	Jun 30, 2011
GOTO in Javascript	18	Jul 7, 2008
How to use PDF-lib and how to center each line of texts on the page?	1	Aug 16, 2023
Data saving in condition of changing reality	0	Apr 29, 2022

Should use of goto completely avoided in programs?

SUMIT

Richard Heathfield

John Bode

jiushengliu

Keith Thompson

Tim Rentsch

Antonio Contreras

Lawrence Kirby

Michael Wojcik

Tim Rentsch

Anonymous 7843

Tim Rentsch

P.J. Plauger

Tim Rentsch

Anonymous 7843

Tim Rentsch

Anonymous 7843

Tim Rentsch

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads