forward declarations in C

Till Crueger

Hi,

I am trying to implement a tree in C and I have the following code:

struct inner {
    struct node *left;
    struct node *right;
};

struct leaf {
    /* data goes here */
};

struct node {
    union {
        struct inner inner;
        struct leaf leaf;
    } data;

    enum {
        INNER, LEAF
    } type;
};

I was surprised that this actually compiles, because at the point where I
declare the struct inner there is no struct node. When I add a simple

struct node;

at the beginning it also compiles. My question now is: which is the
correct way to do this kind of thing?

Also, if I wanted to add typedefs for inner, leaf, and node, what would be
the right way to achieve the same thing?

Thanks,
Till
 
Barry Schwarz

Till said:
I was surprised that this actually compiles, because at the point where I
declare the struct inner there is no struct node. When I add a simple

Are you sure you had your warning set to max?

One reason your compiler may choose to accept it is the standard
requires all pointers to any type of struct to have the same
representation. Therefore, when processing struct inner, while the
compiler may not know what a struct node is, it does know everything
it needs to know about a struct node *.

Till said:
struct node;

at the beginning it also compiles. My question is now, which is the
correct way to do this kind of thing.

The latter, if for no other reason than maybe the lack of diagnostic
is a compiler error and gets fixed in the next update.

Till said:
Also if I wanted to add typedefs for inner leaf and node, what would be
the right way to achieve the same thing.

One easy way is to insert the word "typedef" at the beginning of each
declaration and the typedef name just before the semicolon, as in

typedef struct inner{...}inner_t;

If you have structure types that point to each other and want to use
the typedef names, you can use

typedef struct inner inner_t;

use inner_t* to your heart's content, and later declare what a struct
inner really is.
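
As a rough sketch of the "typedef first, define later" approach for two structure types that point to each other: the type names parent_t and child_t, the member names, and the helper adopt() are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Introduce both typedef names while the struct types are still incomplete. */
typedef struct parent parent_t;
typedef struct child  child_t;

/* Each definition can now use the other's typedef name through a pointer. */
struct parent {
    child_t *first_child;
};

struct child {
    parent_t *owner;
    int id;               /* hypothetical payload */
};

/* Link a parent and a child in both directions. */
void adopt(parent_t *p, child_t *c)
{
    assert(p != NULL && c != NULL);
    p->first_child = c;
    c->owner = p;
}
```

Only pointers to the incomplete types are used before the definitions; a member of type child_t (rather than child_t *) inside struct parent would need struct child to be complete first.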


 
aegis

Till said:
I was surprised that this actually compiles, because at the point where I
declare the struct inner there is no struct node. When I add a simple

struct node;

at the beginning it also compiles. My question is now, which is the
correct way to do this kind of thing.

The former works because you can have pointers to
incomplete types. This, consequently, makes the latter
superfluous.
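
A minimal sketch of what "pointers to incomplete types" allows and what it does not. The member value and the helper functions are hypothetical additions for illustration; they are not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* struct node has not been defined yet; mentioning it here declares it
   as an incomplete type, which is enough to form pointers to it. */
struct inner {
    struct node *left;
    struct node *right;
};

/* The size of the pointer is known even while struct node is incomplete.
   sizeof(struct node) itself would not compile at this point. */
size_t node_pointer_size(void)
{
    return sizeof(struct node *);
}

/* Completing the type; "value" is a hypothetical payload member. */
struct node {
    struct inner kids;
    int value;
};

/* Dereferencing the pointer needs the complete type, so this function
   has to come after the definition of struct node. */
int left_value(const struct inner *in)
{
    assert(in != NULL && in->left != NULL);
    return in->left->value;
}
```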

Also if I wanted to add typedefs for inner leaf and node, what would be
the right way to achieve the same thing.

Just add:

typedef struct leaf LEAF;
typedef struct inner INNER;
typedef struct node NODE;

(The last one is redundant though, because we can consider a node to
be external or internal.)
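
One caveat, offered as a sketch rather than a definitive layout: typedef names and enumeration constants share the name space of ordinary identifiers, so typedef names spelled LEAF and INNER at file scope would clash with the enum constants INNER and LEAF from the original struct node. The version below uses differently spelled typedef names (Leaf, Inner, Node, chosen only for illustration) and a hypothetical int payload in struct leaf so that there is something to compute.

```c
#include <assert.h>
#include <stddef.h>

/* Typedef names that do not collide with the enum constants below. */
typedef struct leaf  Leaf;
typedef struct inner Inner;
typedef struct node  Node;

struct inner {
    Node *left;
    Node *right;
};

struct leaf {
    int value;            /* hypothetical payload */
};

struct node {
    union {
        Inner inner;
        Leaf  leaf;
    } data;

    enum { INNER, LEAF } type;
};

/* Sum the payloads of all leaves at or below n. */
int sum_leaves(const Node *n)
{
    assert(n != NULL);
    if (n->type == LEAF)
        return n->data.leaf.value;
    return sum_leaves(n->data.inner.left) + sum_leaves(n->data.inner.right);
}
```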
 
Chris Torek

Till said:
I was surprised that this actually compiles, because at the point where I
declare the struct inner there is no struct node.

This is in fact OK. Structure names (in the "struct" namespace)
simply "spring into being" the first time they are mentioned like
this. Where this goes wrong is when the current scope is narrower
than you intended.

When you first mention a structure name, the scope at which the
name "suddenly exists now" is the *current* scope, *whatever that
is*. Thus, at file scope, the struct name has file scope -- which
is the maximum possible scope, so everyone will be able to see it
from here onward. But other scopes exist:

void f(void) {
    struct foo { char *p; };
    ...
}

Here, the scope of "struct foo" is limited to the block that
encloses it. Once we finish with f(), the name vanishes. We can
then create a new and different "struct foo":

void g(void) {
    struct foo { double x; };
    ...
}

and in this case we have done it in another block, so f()'s
"struct foo" and g()'s "struct foo" are entirely separate. This
should be quite familiar, since we can do the same scope tricks
with ordinary variables:

void f2(void) {
    double foo;
    ...
}
void g2(void) {
    int foo;
}

Here f2()'s "foo" is completely independent of g2()'s "foo".
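
A small compilable sketch of the f()/g() case, assuming each function simply builds one value of its own "struct foo" and returns something derived from it (the return values and initializers are made up for illustration):

```c
#include <assert.h>
#include <string.h>

size_t f(void)
{
    struct foo { const char *p; };   /* this struct foo is visible only inside f() */
    struct foo v = { "abc" };

    assert(v.p != NULL);
    return strlen(v.p);
}

double g(void)
{
    struct foo { double x; };        /* a separate, unrelated struct foo */
    struct foo v = { 1.5 };

    return v.x * 2;
}
```

Neither function can see the other's "struct foo"; the member p does not exist in g()'s type, and x does not exist in f()'s.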

Where "the name suddenly springs into being" goes terribly wrong
is when we deal with function prototype scope. Consider, for
instance:

void setname(char *name);
int resolve(int name);

Clearly setname() takes a different kind of "name" from resolve().
The scope of the identifier "name" is limited to each prototype,
so the two "name"s name different variables. But suppose we do this
with structures:

void op1(struct foo *p);
void op2(struct foo *p);

Not only are the two names "p" completely independent, so are the
two "struct foo"s -- just as in the example with f() and g().

The cure for this problem is as Barry Schwarz suggests below:
declare the existence of the structure name in advance, at an
outer scope, so that the inner scope (prototype-scope) names
simply refer back to the already-existing outer-scope name. For
instance:

struct foo;
void op1(struct foo *);
void op2(struct foo *);

Now op1() and op2() both take the same kind of argument.
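
Fleshed out into something compilable, this might look like the sketch below. The member count and the function bodies are hypothetical. Without the first line, each prototype would declare its own prototype-scope "struct foo", and compilers such as GCC typically warn that the type will not be visible outside the declaration.

```c
#include <assert.h>
#include <stddef.h>

struct foo;                  /* declare the tag at file scope first */

void op1(struct foo *p);     /* both prototypes now refer to the same struct foo */
void op2(struct foo *p);

struct foo {                 /* the definition can come later */
    int count;               /* hypothetical member */
};

void op1(struct foo *p)
{
    assert(p != NULL);
    p->count += 1;
}

void op2(struct foo *p)
{
    assert(p != NULL);
    p->count *= 10;
}
```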

Barry Schwarz said:
Are you sure you had your warning set to max?

Since the construct above is legal *and* does what is desired,
it seems appropriate that the compiler did not warn.

Barry Schwarz said (on which of the two ways is correct):
The latter, if for no other reason than maybe the lack of diagnostic
is a compiler error and gets fixed in the next update.

It is not a compiler error, but declaring the structure in advance
like this is certainly harmless, and possibly wise. :)

One other note: a newfangled invention in C89 was to say that, in
an inner scope, a "vacuous" structure declaration like the above
had a new, special property. Suppose you had (for some reason)
code like this:

struct foo { int val; ... other members as needed ... };

void f3(void) {
    struct foo { char *p; double q; ... };
    ...
}

Here, the inner-scope "struct foo" inside f3() overrides the outer
one, so that in f3(), "struct foo" means "f3's user-defined type
foo", not "the file's user-defined type foo". Well, this is all
well and good; but what happens if, for some reason, you want
mutually-referential structures whose scope is limited to f3()?
You might start with:

void f3(void) {
    struct bar { struct foo *ref; ... };
    struct foo { struct bar *ref; char *p; double q; ... };
    ...
}

But now "struct bar" refers to the *outer* scope user-defined "foo"
type, rather than the inner-scope one. We can fix this by reversing
the two declarations, but that will fail if there is *also* an
outer-scope user-defined "bar" too. So in C89, as a new committee
invention, they said:

If you write a vacuous declaration for a struct (or union) type
in an inner scope, that "clears the decks" of the outer instance
so that you can then declare a new, inner-scope version.

So, now we can write f3() in complete safety:

void f3(void) {
    struct foo; /* vacuous declaration, makes foo local */
    struct bar { struct foo *ref; ... };
    struct foo { struct bar *ref; char *p; double q; ... };
    ...
}

Now the user-defined type "bar" in f3() refers to the user-defined
type "foo" in f3(), and the user-defined type "foo" in f3() refers
to the user-defined type "bar" in f3(), just as desired. (We could
be even more "belt and suspenders" about this and write vacuous
declarations for both, but since we *define* a new block-scope
"bar" right away, we do not *have* to declare it here.)
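
Here is a compilable sketch of the safe f3(), with the elided members replaced by a few hypothetical ones (n in "bar", and only p in the inner "foo") and a made-up return value so the result can be checked:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct foo { int val; };     /* file-scope foo */

/* Uses the file-scope foo, which has the member val. */
int outer_val(void)
{
    struct foo o = { 42 };
    return o.val;
}

size_t f3(void)
{
    struct foo;              /* vacuous declaration: a new, block-scope foo */
    struct bar { struct foo *ref; size_t n; };
    struct foo { struct bar *ref; const char *p; };

    struct bar b = { NULL, 4 };
    struct foo f = { &b, "inner" };

    b.ref = &f;              /* compiles because bar refers to the inner foo */
    assert(b.ref->ref == &b);
    return b.ref->ref->n + strlen(b.ref->p);
}
```

If the vacuous declaration is removed, the member ref of "struct bar" points to the file-scope foo instead, and the assignment b.ref = &f mixes two incompatible pointer types.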

Barry Schwarz said:
One easy way is to insert the word "typedef" at the beginning of each
declaration and the typedef name just before the semicolon, as in
typedef struct inner{...}inner_t;

I dislike this method because of the limitation you mention:

Barry Schwarz said:
If you have structure types that point to each other and want to use
the typedef names, you can use
typedef struct inner inner_t;
use inner_t* to your heart's content and later declare what a struct
inner really is.

If you are going to use typedefs at all (and I prefer not to), I
suggest the following sequence:

1) declare that the user-defined type exists, using the
"vacuous declaration" method defined by C89:

struct foo;

2) define the typedef, using the user-defined type just declared:

typedef struct foo StructFoo;

3) repeat for the rest of your types.

Of course, at file scope -- but not always at block scope! -- we
can make use of the fact that user-defined types ("struct"s) simply
"spring into being" at the current scope to combine steps 1 and 2:

typedef struct bar StructBar;

Clearly there are only two possibilities for this line: either
"struct bar" has already been declared at the current (file) scope,
so that StructBar is now an alias for this type; or "struct bar"
has *not* already been declared at this current (file) scope, so
that it springs into being here, and StructBar is now an alias for
this type.

If you engage in the practice of declaring and defining new
user-defined types inside blocks, however, you may get the wrong
type aliased:

void h(void) {
    struct foo { int a; };

    if (somecond) {
        typedef struct foo zog;
        struct foo { char *p; };

        ... code section A ...
    }
    ... code section B ...
}

In "code section A", the name "zog" is an alias for the *outer*
user-defined "foo" type, with one member named "a", while the name
"struct foo" is the name of the *inner* user-defined "foo" type,
with one member named "p". That is, in this code, "zog" and
"struct foo" name two *different* types.

Compare that to the two-part method:

void h(void) {
    struct foo { int a; };

    if (somecond) {
        struct foo;
        typedef struct foo zog;
        struct foo { char *p; };

        ... code section A ...
    }
    ... code section B ...
}

Here, in code section "A", "zog" is an alias for the *inner*
user-defined "foo" type, with one member named "p". That is,
in this code, both "zog" and "struct foo" name the *same* type.
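
The two-part method can be tried out with a sketch like the following. Here somecond is turned into a parameter, and the variable names, initializers, and return values are hypothetical, added only so that there is something to check.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

size_t h(int somecond)
{
    struct foo { int a; };
    struct foo outer = { 1 };

    if (somecond) {
        struct foo;                      /* new, block-scope foo */
        typedef struct foo zog;          /* alias for that inner foo */
        struct foo { const char *p; };

        zog z = { "hello" };             /* zog has the member p */
        struct foo *same = &z;           /* same type, so no cast is needed */

        assert(same->p != NULL);
        return strlen(same->p);
    }
    return (size_t)outer.a;
}
```

With the "struct foo;" line deleted, zog would alias the outer type, and the initializer and the pointer assignment would no longer match the types involved.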
 
