Enhancing 'break' and 'continue': Levels vs. Labels

Kenny McCormack

Some other thread drifted off into this topic, but I feel it deserves a
thread of its own.

I have frequently been an advocate of adding the 'break n' construct from
the standard Unix shell to C, since there are situations where you would
like to break out of more than one loop. It seems harmless enough; like
everything else in the world, if you don't like it, don't use it. But for
some reason, suggesting it makes some people's blood pressure rise.

I should also add that part of the problem is that 'break' is overloaded.
If you are in a 'switch' statement inside of a 'for' or 'while' statement,
you can't break out of the loop. It seems pretty clear to me that if we had
it to do all over again, we'd use some other verb in the 'switch' context
(C-shell has 'breaksw'...).
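
A minimal sketch of the overloading described above, in standard C. The function name count_until_q and the flag variable done are hypothetical, chosen only for illustration. The 'break' inside the 'switch' leaves only the 'switch', so the loop needs a flag (or a goto) to stop:

```c
#include <stddef.h>

/* Counts the characters that precede the first 'q' in s.
   Hypothetical example: the names are not from the thread. */
size_t count_until_q(const char *s)
{
    size_t n = 0;
    int done = 0;   /* flag standing in for "break out of the loop" */

    for (size_t i = 0; s[i] != '\0' && !done; i++) {
        switch (s[i]) {
        case 'q':
            done = 1;   /* ask the enclosing loop to stop */
            break;      /* this break leaves only the switch */
        default:
            n++;
            break;      /* likewise: leaves only the switch */
        }
    }
    return n;
}
```

Calling count_until_q("abqcd") stops counting at the 'q'; without the flag in the loop condition, the loop would simply carry on with the next character.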

The current break and continue statements can also be replaced
by goto. For that matter, all control statements can be replaced
by goto and if.

This was written in response to another poster who was arguing against the
"break label" idea. Explanation: "break label" is the often-advanced
counter-proposal to "break level", in which instead of allowing, say, "break
2;", you allow "break somewhere;", the latter construct being, as far as I
can tell, exactly equivalent to "goto somewhere;".

As you can see, I also don't see the point in "break label" at the
bits-and-bytes-assembler-level, but I think the following things can be said
in its defense:

1) It avoids the dreaded "g word" from soiling up your pristine code.

2) It may well be the case that arbitrarily targeted "goto"s are not
allowed and that a "syntactic" "break label" construct could be
implemented to "goto" places that one could not otherwise "goto".

It is, thus, 2) above that is most interesting from a discussion
point-of-view. It has never been clear to me which "goto"s are allowed in
C (since I very rarely use goto, except in the simplest cases) and, in fact,
if any are disallowed.

--
"The anti-regulation business ethos is based on the charmingly naive notion
that people will not do unspeakable things for money." - Dana Carpender

Quoted by Paul Ciszek (pciszek at panix dot com). But what I want to know
is why is this diet/low-carb food author making pithy political/economic
statements?

Nevertheless, the above quote is dead-on, because, the thing is - business
in one breath tells us they don't need to be regulated (which is to say:
that they can morally self-regulate), then in the next breath tells us that
corporations are amoral entities which have no obligations to anyone except
their officers and shareholders, then in the next breath they tell us they
don't need to be regulated (that they can morally self-regulate) ...
 
Tom St Denis

It is, thus, 2) above that is most interesting from a discussion
point-of-view.  It has never been clear to me, which "goto"s are allowed in
C (since I very rarely use goto, except in the simplest cases) and, in fact,
if any are disallowed.

[I shall note it's ironic you want to have a discussion here when you
spend so much time harassing people otherwise, but in the interests of
being a better human being than you I'll engage in conversation
here...]

Usually the only gotos that are a problem are the ones that go

a) backwards [going up in the function]

and

b) To targets spread out throughout the function.

The typical accepted form is to have goto destinations clustered at
the end of a function to perform cleanup [like an exception handler].

There are asinine standards out there that all but ban "goto" and
"break" [as well as "continue"] but they're problematic. They favour
coding like

err = do_something(...);
if (err == OK) {
    err = do_something_else(...);
    if (err == OK) {
        err = do_...
    }
}
.... etc

Which can get nested fairly deep and becomes a pain to read. Whereas

if ((err = blah()) != OK) { goto error; }
if ((err = foo()) != OK) { goto error; }
if ((err = bar()) != OK) { goto error; }

is much easier to read and also shorter...
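
A minimal sketch of that pattern with the jump target placed at the end of the function. Everything here is hypothetical and chosen for illustration: the function duplicate_twice, the error codes, and the label name cleanup (the fragment above calls it error). Every failure jumps forward to one place, and the success path falls through the same cleanup code:

```c
#include <stdlib.h>
#include <string.h>

enum { OK = 0, ERR_ARG = 1, ERR_NOMEM = 2 };   /* hypothetical error codes */

/* Stores a newly allocated string holding src written twice in *out.
   Returns OK on success; the caller then owns *out and must free() it. */
int duplicate_twice(const char *src, char **out)
{
    int err = OK;
    char *tmp = NULL;
    char *result = NULL;
    size_t len = 0;

    if (src == NULL || out == NULL) { err = ERR_ARG; goto cleanup; }
    len = strlen(src);

    if ((tmp = malloc(len + 1)) == NULL) { err = ERR_NOMEM; goto cleanup; }
    memcpy(tmp, src, len + 1);

    if ((result = malloc(2 * len + 1)) == NULL) { err = ERR_NOMEM; goto cleanup; }
    memcpy(result, tmp, len);
    memcpy(result + len, tmp, len + 1);   /* second copy brings the terminator */

    *out = result;
    result = NULL;   /* ownership has passed to the caller */

cleanup:
    free(result);    /* free(NULL) is a no-op, so this is safe on every path */
    free(tmp);
    return err;
}
```

Initialising every pointer to NULL up front is what lets a single label serve all the exits.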

Tom
 
David Mathog

This was written in response to another poster who was arguing against the
"break label" idea.  Explanation: "break label" is the often-advanced
counter-proposal to "break level", in which instead of allowing, say, "break
2;", you allow "break somewhere;", the latter construct being, as far as I
can tell, exactly equivalent to "goto somewhere;".

Ah, but it depends on what you label. If the label just refers to a
line, then you are correct. But what if the label identifies the loop
or switch on that line too? It wouldn't be hard to implement this:

foo: for(i=0;i<TOP;i++){
  if(whatever)break foo;        /* equivalent to break (no goto
                                   equivalent with just these labels) */
  if(whatever-1)continue foo;   /* equivalent to continue and
                                   "goto foo" */
  bar: switch(data){
    case 1:
      break foo;      /* out of the for loop */
    case 2:
      continue foo;   /* to the top of the for loop */
    case 3:
      switch(data[i-1]){
        case 1:
          break bar;      /* out of the outer switch */
        case 2:
          break;          /* out of this switch */
        case 3:
          continue foo;   /* to the top of the for loop */
        case 4:
          break foo;      /* out of the for loop */
        case 5:
          /* continue bar - not allowed! No continue jump to
             the top of a switch */
        case 6:
          /* break targ - not allowed! No jumps except outward
             in nest */
      }
    default:
  }
  targ: switch(otherdata){
    ...
  }
  /* more code */
}

Here the label applies not only in the existing goto sense, but also
as a label for the associated loop or switch context on the same line
(or up to the next ; or set of {}). It is maybe better thought of as a
name for the loop or switch. For a loop both continue and break would
be enhanced by that label to "do the right thing". For jumping with
respect to switches, only break would be enhanced (no "continue" to
the top of a switch, anywhere in the nested set). Similarly, no
enhanced break or continue would move in any direction other than
upward in the nested set.

One advantage of doing things this way is that editing operations
which accidentally merge switch/loop constructs could be caught by the
compiler if they contained illegal "break label" and "continue label"
statements, which might have been perfectly valid (but not what was
intended) if they were just break and continue.

Regards,

David Mathog
 
lawrence.jones

Kenny McCormack said:
I have frequently been an advocate of adding the 'break n' construct from
the standard Unix shell to C, since there are situations where you would
like to break out of more than one loop. It seems harmless enough; like
everything else in the world, if you don't like it, don't use it. But for
some reason, suggesting it makes some people's blood pressure rise.

It's exceedingly fragile -- adding or removing a loop (or switch)
requires tracking down all the break and continue statements for the
surrounding statements and adjusting the count. That's why using a
label rather than a depth is generally regarded as a much better idea.

This was written in response to another poster who was arguing against the
"break label" idea. Explanation: "break label" is the often-advanced
counter-proposal to "break level", in which instead of allowing, say, "break
2;", you allow "break somewhere;", the latter construct being, as far as I
can tell, exactly equivalent to "goto somewhere;".

Not at all, it's the *loop* (or switch) that's labeled, not the point
you're jumping to that's labeled. Thus, ``break foo;'' and ``continue
foo;'' would go to radically different places, neither of which is the
same place that ``goto foo;'' would go.

2) It may well be the case that arbitrarily targeted "goto"s are not
allowed and that a "syntactic" "break label" construct could be
implemented to "goto" places that one could not otherwise "goto".

There is no place you can ``break'' or ``continue'' to that you
couldn't just as well ``goto'' to. They're purely syntactic sugar.
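
A minimal sketch of that equivalence in today's C, assuming a hypothetical function count_clean_rows and hypothetical label names outer_continue and outer_break. The two gotos land where a labeled ``continue'' and a labeled ``break'' on the outer loop would land, which are two different places:

```c
#include <stddef.h>

/* Scans a rows x cols matrix stored row by row in m.
   A row containing a negative value is skipped; scanning stops
   entirely at the first element equal to stop.
   Returns the number of rows scanned to the end without either event. */
int count_clean_rows(size_t rows, size_t cols, const int *m, int stop)
{
    int clean = 0;

    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            int v = m[r * cols + c];
            if (v == stop)
                goto outer_break;      /* what "break outer;" would do */
            if (v < 0)
                goto outer_continue;   /* what "continue outer;" would do */
        }
        clean++;
outer_continue:
        ;   /* before C23 a label must be followed by a statement */
    }
outer_break:
    return clean;
}
```

Note that neither label sits on the ``for'' line itself: one is at the very end of the loop body, the other just after the loop.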
 
BartC

It's exceedingly fragile -- adding or removing a loop (or switch)
requires tracking down all the break and continue statements for the
surrounding statements and adjusting the count. That's why using a
label rather than a depth is generally regarded as a much better idea.

Adding a loop (or switch) in C now is a lot *more* fragile.

It's not a question of simply tracking down all the break and continue
statements, but you have to restructure the whole thing!

Being able to merely change break to break 1, or whatever, would be a
godsend.
 
Ian Collins

Adding a loop (or switch) in C now is a lot *more* fragile.

It's not a question of simply tracking down all the break and continue
statements, but you have to restructure the whole thing!

Or just do what I do: don't use break and continue. They both make code
more fragile!
 
Jens

I have frequently been an advocate of adding the 'break n' construct from
the standard Unix shell to C, since there are situations where you would
like to break out of more than one loop.  It seems harmless enough; like
everything else in the world, if you don't like it, don't use it.  But for
some reason, suggesting it makes some people's blood pressure rise.

If you set aside gotos and longjmp for a moment, higher-level breaks
change the complexity of the control-flow graph. Programs without
gotos and higher-level breaks have a bounded "tree-width"; to state it
briefly, they are very similar to a tree. This helps when you have to
compute data dependencies, do register assignment, avoid spills and
stuff like that.

Java has these higher-level breaks and continues, and we showed a
while ago that this can lead to arbitrarily high treewidth, that is,
to control-flow graphs that are arbitrarily complex.

I would think of gotos (and longjmp) as "exceptional" change of
control (mostly they are used like that) and break and continue as
"algorithmic" change of control. So adding such a construct would
allow one to express much more complicated control flow with a "normal"
construct. To my taste this would be relatively little gain for
considerable difficulty in implementing a good optimizer.

Jens
 
robertwessel2

If you set aside gotos and longjmp for a moment, higher-level breaks
change the complexity of the control-flow graph. Programs without
gotos and higher-level breaks have a bounded "tree-width"; to state it
briefly, they are very similar to a tree. This helps when you have to
compute data dependencies, do register assignment, avoid spills and
stuff like that.

Java has these higher-level breaks and continues, and we showed a
while ago that this can lead to arbitrarily high treewidth, that is,
to control-flow graphs that are arbitrarily complex.

I would think of gotos (and longjmp) as "exceptional" change of
control (mostly they are used like that) and break and continue as
"algorithmic" change of control. So adding such a construct would
allow one to express much more complicated control flow with a "normal"
construct. To my taste this would be relatively little gain for
considerable difficulty in implementing a good optimizer.


Of course an optimizer would be free to continue doing whatever it's
doing for single level breaks and continues, and interpret the multi-
level versions as their goto equivalents. Admittedly some optimizers
simply give up if they find *any* gotos in a routine, but those appear
to be somewhat rare.

In any case, if my goal was to make life easy for the optimizer, the
language wouldn’t look much like C.
 
BartC

If you set aside gotos and longjmp for a moment, higher-level breaks
change the complexity of the control-flow graph. Programs without
gotos and higher-level breaks have a bounded "tree-width"; to state it
briefly, they are very similar to a tree. This helps when you have to
compute data dependencies, do register assignment, avoid spills and
stuff like that.

Some people just don't care. They just want more expressiveness without
being forced to write convoluted code with unnecessary variables.

(And if it is really that straightforward to convert multi-level breaks into
standard code, why can't the compiler do it? Then the programmer doesn't
have to do the conversion both ways: converting multi-level breaks into some
cryptic code, and later deducing that that was simply trying to exit from an
inner loop!)
 
Michael Press

Some people just don't care. They just want more expressiveness without
being forced to write convoluted code with unnecessary variables.

(And if it is really that straightforward to convert multi-level breaks into
standard code, why can't the compiler do it? Then the programmer doesn't
have to do the conversion both ways: converting multi-level breaks into some
cryptic code, and later deducing that that was simply trying to exit from an
inner loop!)

Code itself is a data structure. When a nested loop gets so
deep as to be unwieldy, then it is time to design and
construct a data structure external to the code that does
not fracture under the load; one that immediately suggests
a pellucid block of code that writes itself.
 
K4 Monk

It is, thus, 2) above that is most interesting from a discussion
point-of-view.  It has never been clear to me, which "goto"s are allowed in
C (since I very rarely use goto, except in the simplest cases) and, in fact,
if any are disallowed.

[I shall note it's ironic you want to have a discussion here when you
spend so much time harassing people otherwise, but in the interests of
being a better human being than you I'll engage in conversation
here...]

Usually the only gotos that are a problem are the ones that go

a) backwards [going up in the function]

and

b) To targets spread out throughout the function.

The typical accepted form is to have goto destinations clustered at
the end of a function to perform cleanup [like an exception handler].

What is the difference between a goto and a longjmp? People seem to
favor the latter for exception handling in C, but not gotos.
 
BartC

Michael Press said:
Code itself is a data structure. When a nested loop gets so
deep as to be unwieldy, then it is time to design and
construct a data structure external to the code that does
not fracture under the load; one that immediately suggests
a pellucid block of code that writes itself.

There's nothing to stop someone doing that.

And there's nothing to stop someone doing that *after* they've been using
'break N' while developing some code. After all, these break statements can
come and go, same as any others, and there's no point in turning the code
upside down, when that break could disappear five minutes later.

For example, you might find that an inner loop is getting too complex and
decide to put that into its own function. Now a multi-level break is no
longer needed.
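
A minimal sketch of that refactoring, assuming a hypothetical helper find_in_matrix that searches a matrix stored row by row. Once both loops live in their own function, a plain return leaves them together and no 'break N' is needed:

```c
#include <stddef.h>

/* Looks for the first element equal to target in a rows x cols matrix m.
   On success stores its position in *row_out and *col_out and returns 1;
   returns 0, leaving both outputs untouched, if target is absent. */
int find_in_matrix(size_t rows, size_t cols, const int *m, int target,
                   size_t *row_out, size_t *col_out)
{
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (m[r * cols + c] == target) {
                *row_out = r;
                *col_out = c;
                return 1;   /* leaves both loops at once */
            }
        }
    }
    return 0;
}
```

The cost is that everything the loops need has to be passed in as parameters, which is part of why this is a restructuring rather than a one-word change.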

And the nested loop doesn't need to get so deep; just two levels is enough!

As it is, C with its current one-level break already imposes arbitrary
restrictions: you can use 'break' from inside statement A, but not from
inside statement B.
 
James Kuyper

On 03/25/2011 01:10 AM, K4 Monk wrote:
....
What is the difference between a goto and a longjmp? People seem to
favor the latter for exception handling in C, but not gotos.

A goto can only jump to a different label in the same function. A
longjmp() can take you back to any function that is still active - that
includes the current function, but also the calling function, and the
function that called it, all the way back to main(). That's the key
reason why longjmp() is favored for exception handling. However, there
are many other differences as well:

There are some complicated restrictions on where a setjmp() call can be
made. Those restrictions are not arbitrary - they serve to make
implementation of setjmp/longjmp safer and easier. A label that serves
as the destination for a goto has far fewer restrictions on where it
can be placed.

longjmp() returns a value that is used as the return value of setjmp()
after the jump has succeeded; how that value is used is up to the
developer. No such data-passing mechanism is associated with goto.

After returning from a longjmp() call, local variables with automatic
storage duration that are not declared volatile cannot be counted on to
have retained the value they had at the time longjmp() was called.
There's no such problem with goto.
 
James Kuyper

longjmp() returns a value that is used as the return value of setjmp()

Correction:

longjmp() takes an argument whose value is used as the return value of
setjmp().
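
A minimal sketch of those points, using only <setjmp.h> from the standard library. All names here (recovery_point, inner, middle, run_with_recovery, and the E_ codes) are hypothetical. The jump crosses two function calls, the value given to longjmp() comes back as the return value of setjmp(), setjmp() is called only as the controlling expression of a switch (one of the permitted contexts), and the local that changes between the two calls is declared volatile:

```c
#include <setjmp.h>

enum { E_INPUT = 1, E_RANGE = 2 };   /* hypothetical failure codes */

static jmp_buf recovery_point;       /* filled in by setjmp() below */

static void inner(int fail_code)
{
    if (fail_code != 0)
        longjmp(recovery_point, fail_code);   /* does not return */
}

static void middle(int fail_code)
{
    inner(fail_code);   /* an intermediate frame that the jump unwinds past */
}

/* Returns 0 if middle() completed normally. After a longjmp() it returns
   10 or 20 (selected by the code passed to longjmp) plus steps_done. */
int run_with_recovery(int fail_code)
{
    /* volatile because it is modified after setjmp() and read after longjmp() */
    volatile int steps_done = 0;

    switch (setjmp(recovery_point)) {
    case 0:                 /* direct return from setjmp() */
        steps_done = 1;
        middle(fail_code);
        steps_done = 2;
        return 0;
    case E_INPUT:           /* arrived here through longjmp() */
        return 10 + steps_done;
    case E_RANGE:
        return 20 + steps_done;
    default:
        return -1;          /* a code this sketch does not recognise */
    }
}
```

A goto could not express this, since inner() and run_with_recovery() are different functions.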
 
