Assigned gotos in standard C

A

Adam

To explain why I am trying to do a non-C thing in C, I am writing a
translator for a toy language targeting C.
To do this I need a way to perform assigned gotos, by which I mean I
have a number of labels LABEL_0, LABEL_0, .. LABEL_N, and wish to be
able to take an int k and do goto LABEL_k. ( I can't use function
pointers as they would be called recursively, and one soon runs out of
stack space.)

In GCC and similar, I can use the && operator to take the address of
labels, and that works fine. However I want a second solution for
those compilers that do not support this extension.
I would also like that performing the goto takes constant time, and
does not depend on the number of labels, it is to be expected that
there will be many labels.

My current solution uses setjmp/longjmp, it creates an array jmp_buf
jump_table[ size], and then initialises it by
LABEL_k : if(!setjmp( jump_table[k] ) { goto LABEL_(k+1) }
I can then go to LABEL_k by calling longjump(jump_table[k])

I have the following questions:

Is this approach correct? Not having used setjmp/longjmp before, I
want to check I'm not making some mistake.

I need that the value of one particular variable is not changed by the
longjmp, and so it can not be stored in a register. I can ensure this
by declaring it volatile, yes?

Finally, and most importantly, is there a better way to do this?
 
B

Ben Bacarisse

Adam said:
To explain why I am trying to do a non-C thing in C, I am writing a
translator for a toy language targeting C.
To do this I need a way to perform assigned gotos, by which I mean I
have a number of labels LABEL_0, LABEL_0, .. LABEL_N, and wish to be
able to take an int k and do goto LABEL_k. ( I can't use function
pointers as they would be called recursively, and one soon runs out of
stack space.)

Maybe not. If you wrap up each labeled block as a function that returns
either a function pointer or an index into your function pointer table
you can turn the chain of GOTOs into a loop:

void run_program(struct state *state, function *start_function)
{
function *f = start_function;
while ((f = start_function(state)) != stop_function);
}

Of course, you then have to re-organise everything to fit this pattern
and this might not be convenient or even sensible.
In GCC and similar, I can use the && operator to take the address of
labels, and that works fine. However I want a second solution for
those compilers that do not support this extension.
I would also like that performing the goto takes constant time, and
does not depend on the number of labels, it is to be expected that
there will be many labels.

My current solution uses setjmp/longjmp, it creates an array jmp_buf
jump_table[ size], and then initialises it by
LABEL_k : if(!setjmp( jump_table[k] ) { goto LABEL_(k+1) }

That confused me at first. LABEL_(k+1) is s sort of meta-syntax and the
chain of gotos is to get the jump buffers initialised.
I can then go to LABEL_k by calling longjump(jump_table[k])

I have the following questions:

Is this approach correct? Not having used setjmp/longjmp before, I
want to check I'm not making some mistake.

Looks OK but there are issues to do with the values of local variables.
I see that comes next...
I need that the value of one particular variable is not changed by the
longjmp, and so it can not be stored in a register. I can ensure this
by declaring it volatile, yes?

Yes, though that's an odd way of putting it so I am not certain I
follow.

Objects local to the setjmp will have indeterminate values (C's way of
saying you can't tell) if (a) the change between setjmp and longjmp and
(b) they are not declared volatile.

longjmp won't change any variables but you may not be able to be sure
what values they hold if they are not volatile.
Finally, and most importantly, is there a better way to do this?

I'd use a switch and replace labels with integers. longjmp is overkill
if you are coming from the same function. If you are coming from
outside the function, I'd still use a switch but I'd switch on the value
returned though longjmp rather than having an array of jump buffers.

Even then, I'd try to get rid of longjmp by maybe calling the function I
want to "jump into" with an argument that specifies the case label.
 
M

Malcolm McLean

To explain why I am trying to do a non-C thing in C, I am writing a
translator for a toy language targeting C.
To do this I need a way to perform assigned gotos, by which I mean I
have a number of labels LABEL_0, LABEL_0, .. LABEL_N,

Finally, and most importantly, is there a better way to do this?
Translator creates a list of jump labels and an expression that
creates a labe.

You need to ouput
a) A function that converts the label expression into an index into
the jump list.
b) A switch on the index
c) The jump labels themselves embedded into the C.


so

int labelstoindex( char *toylabel)
{
if!(strcmp(toylabel, "LABEL0"))
return 0;
if(!strcmp(toylabel, "LABEL1"))
return 1;
etc
}

In the C

switch(labelstoindex( gettoylabel() )
{
case 0: goto LABEL1;
case 1: goto LABEL2;
etc
}
...

LABEL1:
printf("This is the code to execute for LABEL1")
goto cleanup;
LABEL2:
printf("This is the code to execute for LABEL2");
goto cleanup;
}

You might be able to collapse some of these steps, depending on your
rules for label assignment. However you proably need to keep the
labels theselves out of the switch, so they can be jumped to from
other code.

The "goto considered harmful" rule doesn't apply here, by they way.
The human-readable structure is in the toy language, not in the source
the compiler spits out.
 
A

Adam

I understand that a switch statement would allow this sort of flow
control; however, will this not lead to each jump requiring linear
time to compute, as the code works its way through each case?
 
B

BartC

Adam said:
To explain why I am trying to do a non-C thing in C, I am writing a
translator for a toy language targeting C.
To do this I need a way to perform assigned gotos, by which I mean I
have a number of labels LABEL_0, LABEL_0, .. LABEL_N, and wish to be
able to take an int k and do goto LABEL_k. ( I can't use function
pointers as they would be called recursively, and one soon runs out of
stack space.)

In GCC and similar, I can use the && operator to take the address of
labels, and that works fine. However I want a second solution for
those compilers that do not support this extension.
I would also like that performing the goto takes constant time, and
does not depend on the number of labels, it is to be expected that
there will be many labels.

A 'toy' language suggests that performance isn't too critical.

In that case, the simplest C solution is to use a switch statement inside a
loop. The switch index will be k, ranging from 0 to N, and the labels just
become case labels.

Calling functions (ie. calling then returning) from a dispatch loop, via a
function table indexed by k, is another standard way. Putting each handler
into it's own function is better than having a single giant function.

For minimal overheads, you might need to use assembly code, but it gets
fiddly. And non-portable.
 
B

BartC

Adam said:
I understand that a switch statement would allow this sort of flow
control; however, will this not lead to each jump requiring linear
time to compute, as the code works its way through each case?

No. If the case values span a limited range (eg. 0 to 255, with or without
gaps), then a jumptable is likely used. This is very fast.

The overheads of a switch are testing the index is in range, and jumping
back to the start of the switch.
 
M

Malcolm McLean

I understand that a switch statement would allow this sort of flow
control; however, will this not lead to each jump requiring linear
time to compute, as the code works its way through each case?
Depends on the compiler.
If you make all the labels contiguous from 0-N, and the code is just
goto ..., any decent compiler will produce a computed jump. If you
have funny fall-throughs and odd labels, then it might resort to an
if .. else ladder.
 
G

Gene

Maybe not.  If you wrap up each labeled block as a function that returns
either a function pointer or an index into your function pointer table
you can turn the chain of GOTOs into a loop:

  void run_program(struct state *state, function *start_function)
  {
       function *f = start_function;
       while ((f = start_function(state)) != stop_function);
  }

Of course, you then have to re-organise everything to fit this pattern
and this might not be convenient or even sensible.

I believe Ben means

void run_program(struct state *state, function *start_function)
{
function *f = start_function;
while ((f = f(state)) != stop_function)
/* skip */;
}

This is a very old technique called a trampoline or trampolining. The
state would contain a stack for arguments. Caller pushes and callee
pops. I've been told by a guy who knows that it dates from the time
when functional language researchers were trying to emit C code as you
are, but many C compilers had hard limits on the numbers of goto
labels they could handle.

One other thing to check -- if you have a well-defined set of
compilers to target -- is whether the compilers are good enough at
tail-recursion removal to use tail calls as gotos. Not long ago I did
an experiment to implement a finite automaton with tail calls
representing state transitions. Both gcc and Visual C compilers did a
perfect job of turning tail calls to jumps. I expect a compiler based
on e.g. the LLVM infrastructure could do this even for calls among
different source files.
 
A

Adam

I believe Ben means

void run_program(struct state *state, function *start_function)
{
     function *f = start_function;
    while ((f = f(state)) != stop_function)
        /* skip */;

}

This is a very old technique called a trampoline or trampolining.  The
state would contain a stack for arguments.  Caller pushes and callee
pops.  I've been told by a guy who knows that it dates from the time
when functional language researchers were trying to emit C code as you
are, but many C compilers had hard limits on the numbers of goto
labels they could handle.

One other thing to check -- if you have a well-defined set of
compilers to target -- is whether the compilers are good enough at
tail-recursion removal to use tail calls as gotos. Not long ago I did
an experiment to implement a finite automaton with tail calls
representing state transitions.  Both gcc and Visual C compilers did a
perfect job of turning tail calls to jumps.  I expect a compiler based
on e.g. the LLVM infrastructure could do this even for calls among
different source files.

Sadly in this case the recursion is not tail recursion, so this sort
of approach would not work without a lot of extra effort.
 
B

Ben Bacarisse

Gene said:
I believe Ben means

void run_program(struct state *state, function *start_function)
{
function *f = start_function;
while ((f = f(state)) != stop_function)
/* skip */;
}

<Sigh>. Yes, thanks, I did.

<snip>
 
B

Ben Bacarisse

Adam said:
On Aug 29, 8:08 pm, Gene <[email protected]> wrote:

Sadly in this case the recursion is not tail recursion, so this sort
of approach would not work without a lot of extra effort.

I don't follow that. A goto is always tail recursive in some sense
since it is not a call.

I'm suggesting that this (unsafe) code:

l0: i = 0;
/* implied goto l1 */
l1: c = getchar();
if (c == EOF) goto l3;
if (c == '\n') goto l2;
b[i++] = c;
goto l1:
l2: b = 0;
printf("%s\n", b);
goto l0:
l3: printf("Done\n");

Turns into:

struct state { int i, c; char b[100]; };

typedef void (*fptype)(void);
typedef fptype function(struct state *);

function *stop_function = 0;
function l0, l1, l2, l3;

fptype l0(struct state *sp)
{
sp->i = 0;
return (fptype)l1;
}

fptype l1(struct state *sp)
{
sp->c = getchar();
if (sp->c == EOF) return (fptype)l3;
if (sp->c == '\n') return (fptype)l2;
sp->b[sp->i++] = sp->c;
return (fptype)l1;
}

fptype l2(struct state *sp)
{
sp->b[sp->i] = 0;
printf("%s\n", sp->b);
return (fptype)l0;
}

fptype l3(struct state *sp)
{
printf("Done\n");
return (fptype)stop_function;
}

int main(void)
{
struct state state;
function *fp = l0;
while ((fp = (function *)fp(&state)) != stop_function);
}

(The types are hairy because C can't represent a function that returns a
pointer to the same type of function as itself. You have to cast at the
return. C people: please advise. Is there a better way to write/fudge
these function pointer types?)

The point being that you can put the function pointers into arrays or
complex expressions and, in essence, do a computed goto. It is a very
flexible organisation but, as I said, it may not suit your purpose.
 
B

Ben Bacarisse

William Ahern said:
Ben Bacarisse said:
(The types are hairy because C can't represent a function that returns a
pointer to the same type of function as itself. You have to cast at the
return. C people: please advise. Is there a better way to write/fudge
these function pointer types?)

You can fudge it if you "return" the function pointer through a parameter.
The trick is that you can short-circuit the infinite regression by excluding
declaration of the [parameter] type altogether.

struct state;

typedef void (*fun_t)();

void foo(fun_t *fp, struct state *sp) {
*fp = &foo;
}

This might not fit the bill, though. I couldn't follow the discussion about
tail recursion because I didn't get those posts.

That's a nice option, though I prefer to be able to simply "return X;"
since it is supposed to be the translation of "goto X;" a goto. Since
it is machine-generate code I shouldn't even consider this aspect but
somehow I can't help it!
 
T

Tim Rentsch

Please excuse me, I can't resist editorializing briefly:

while( f && f != stop_function ) f = f( state );


I'm suggesting that this (unsafe) code:

l0: i = 0;
/* implied goto l1 */
l1: c = getchar();
if (c == EOF) goto l3;
if (c == '\n') goto l2;
b[i++] = c;
goto l1:
l2: b = 0;
printf("%s\n", b);
goto l0:
l3: printf("Done\n");

Turns into:

struct state { int i, c; char b[100]; };

typedef void (*fptype)(void);
typedef fptype function(struct state *);

function *stop_function = 0;
function l0, l1, l2, l3;

fptype l0(struct state *sp)
{
sp->i = 0;
return (fptype)l1;
}

fptype l1(struct state *sp)
{
sp->c = getchar();
if (sp->c == EOF) return (fptype)l3;
if (sp->c == '\n') return (fptype)l2;
sp->b[sp->i++] = sp->c;
return (fptype)l1;
}

fptype l2(struct state *sp)
{
sp->b[sp->i] = 0;
printf("%s\n", sp->b);
return (fptype)l0;
}

fptype l3(struct state *sp)
{
printf("Done\n");
return (fptype)stop_function;
}

int main(void)
{
struct state state;
function *fp = l0;
while ((fp = (function *)fp(&state)) != stop_function);
}

(The types are hairy because C can't represent a function that returns a
pointer to the same type of function as itself. You have to cast at the
return. C people: please advise. Is there a better way to write/fudge
these function pointer types?)


The usual technique for doing this is to wrap the function pointer in a
struct and have the various state functions return a struct (disclaimer:
code not compiled!):

typedef struct state_changer_s {
struct state_changer_s (*f)( struct state * );
} StateChanger;

void
run_fsm( struct state *state, StateChanger f_initial ){
StateChanger f = f_initial;
while( f.f && f.f != stop_function ) f = f.f( state );
}

This makes the types work out, but the individual state functions
become a little bit clunky because of having to return a struct.
So it's compound literals to the rescue:

typedef StateChanger FSM_function( struct state* );
FSM_function l0, l1, l2, l3;

StateChanger
l0( struct state *sp ){
sp->i = 0;
return (StateChanger){ l1 };
}

/* ... etc ... */

No casts, all type safe.


OT trivia question: Give a definition for type 'StateChanger'
so we could write

void
run_fsm( struct state *state, StateChanger f_initial ){
StateChanger f = f_initial;
while( f && f != stop_function ) f = f( state );
}

and

StateChanger
l0( struct state *sp ){
sp->i = 0;
return l1;
}

/* ... etc ... */

while still preserving type correctness, etc.

(Hint: there's a hint.)
 
S

Shao Miller

Tim said:
... ... ...

OT trivia question: Give a definition for type 'StateChanger'
so we could write

void
run_fsm( struct state *state, StateChanger f_initial ){
StateChanger f = f_initial;
while( f && f != stop_function ) f = f( state );
}

and

StateChanger
l0( struct state *sp ){
sp->i = 0;
return l1;
}

/* ... etc ... */

while still preserving type correctness, etc.

(Hint: there's a hint.)

That seems to be chicken-and-egg, to me.

Perhaps in a similar spirit, how could one have a pointer type 'foo'
which is a 'typedef' for 'foo(*)[1]'; a pointer to an array whose
element(s) have the very same pointer type.

I don't know which "hint" you mean (perhaps you are hinting that it's
noted as impossible somewhere), but do you have a trick for
accomplishing this? If so, please do share as that would be exciting!
 
E

Ersek, Laszlo

OT trivia question: Give a definition for type 'StateChanger' so we
could write

void
run_fsm( struct state *state, StateChanger f_initial ){
StateChanger f = f_initial;
while( f && f != stop_function ) f = f( state );
}

and

StateChanger
l0( struct state *sp ){
sp->i = 0;
return l1;
}

/* ... etc ... */

while still preserving type correctness, etc.

(Hint: there's a hint.)

I don't get the hint, but I believe this is covered by the C FAQ 1.22
<http://c-faq.com/decl/recurfuncp.html>. (Hm, perhaps the hint is: "trivia
-> look it up in the FAQ!")

I wanted to see *why* it is impossible ("I'm not smart enough" is
insufficient). I'm looking at 6.7.6 "Type names", and the production rule
for the "type-name" non-terminal looks like this in the Syntax section:

type-name:
specifier-qualifier-list abstract-declarator_opt

abstract-declarator:
pointer
pointer_opt direct-abstract-declarator

[...]

Hence it seems that the left side of any type name must be a
"specifier-qualifier-list". A.2.2 "Declarations" to the rescue:

6.7.2.1p1

specifier-qualifier-list:
type-specifier specifier-qualifier-list_opt
type-qualifier specifier-qualifier-list_opt

6.7.3p1

type-qualifier:
const
restrict
volatile

6.7.2p1

type-specifier:
void
char
short
int
long
float
double
signed
unsigned
_Bool
_Complex
_Imaginary
struct-or-union-specifier
enum-specifier
typedef-name

To define the type "StateChanger", we would need this last list of
alternatives to allow a "function type". Generally, that's possible with
the "typedef-name" alternative, but a "typedef declaration does not
introduce a new type, only a synonym for the type so specified" (6.7.7p3),
so we're back to square one. We'd get an infinite parse tree with or
without typedefs. The only difference is in the production rules and
"levels of semantics" involved (like 6.7.7p3).

(Still talking to myself -- 'cause it's obvious you know this: in general,
it is possible to declare a function that retuns a pointer-to-function,
even without typedefs, if there is no circular dependency between the
types; 6.7.5.3p19 is an example. My eternal favorite is signal(),
7.14.1.1p1.)

lacos
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,042
Latest member
icassiem

Latest Threads

Top