forward declarations in C

Discussion in 'C Programming' started by Till Crueger, Jun 3, 2006.

  1. Till Crueger

    Till Crueger Guest

    Hi,

    I am trying to implement a tree in C and I have the following code:

    struct inner {
        struct node *left;
        struct node *right;
    };

    struct leaf {
        /* data goes here */
    };

    struct node {
        union {
            struct inner inner;
            struct leaf leaf;
        } data;

        enum {
            INNER, LEAF
        } type;
    };

    I was surprised that this actually compiles, because at the point where I
    declare the struct inner there is no struct node. When I add a simple

    struct node;

    at the beginning it also compiles. My question is now, which is the
    correct way to do this kind of thing.

    Also, if I wanted to add typedefs for inner, leaf and node, what would be
    the right way to achieve the same thing?

    Thanks,
    Till

    --
    Please add "Salt and Pepper" to the subject line to bypass my spam filter
     
    Till Crueger, Jun 3, 2006
    #1

  2. On Sat, 03 Jun 2006 16:36:56 +0200, Till Crueger <>
    wrote:

    >Hi,
    >
    >I am trying to implement a tree in C and I have the following code:
    >
    >struct inner {
    > struct node *left;
    > struct node *right;
    >};
    >
    >struct leaf {
    > /*data goes here */
    >};
    >
    >struct node {
    > union {
    > struct inner inner;
    > struct leaf leaf;
    > } data;
    >
    > enum {
    > INNER,LEAF
    > }type;
    >};
    >
    >I was surprised that this actually compiles, because at the point where I
    >declare the struct inner there is no struct node. When I add a simple


    Are you sure you had your warnings set to max?

    One reason your compiler may choose to accept it is the standard
    requires all pointers to any type of struct to have the same
    representation. Therefore, when processing struct inner, while the
    compiler may not know what a struct node is, it does know everything
    it needs to know about a struct node *.
    >
    >struct node;
    >
    >at the beginning it also compiles. My question is now, which is the
    >correct way to do this kind of thing.


    The latter, if for no other reason than maybe the lack of diagnostic
    is a compiler error and gets fixed in the next update.

    >
    >Also if I wanted to add typedefs for inner leaf and node, what would be
    >the right way to achieve the same thing.


    One easy way is to insert the word "typedef" at the beginning of each
    declaration and the typedef name just before the semicolon, as in
    typedef struct inner{...}inner_t;

    If you have structure types that point to each other and want to use
    the typedef names, you can use
    typedef struct inner inner_t;
    use inner_t* to your heart's content and later declare what a struct
    inner really is.


    Remove del for email
     
    Barry Schwarz, Jun 3, 2006
    #2

  3. aegis

    aegis Guest

    Till Crueger wrote:
    > Hi,
    >
    > I am trying to implement a tree in C and I have the following code:
    >
    > struct inner {
    > struct node *left;
    > struct node *right;
    > };
    >
    > struct leaf {
    > /*data goes here */
    > };
    >
    > struct node {
    > union {
    > struct inner inner;
    > struct leaf leaf;
    > } data;
    >
    > enum {
    > INNER,LEAF
    > }type;
    > };
    >
    > I was surprised that this actually compiles, because at the point where I
    > declare the struct inner there is no struct node. When I add a simple
    >
    > struct node;
    >
    > at the beginning it also compiles. My question is now, which is the
    > correct way to do this kind of thing.
    >


    The former works because you can have pointers to
    incomplete types. This, consequently, makes the latter
    superfluous.
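
    A minimal sketch of the point about incomplete types, using the OP's
    tag names. The members kids and is_leaf and the two helper functions
    are hypothetical, added only so that the fragment compiles on its own.

```c
#include <stddef.h>

/* No "struct node;" line: the tag is first mentioned inside struct inner,
   which declares struct node as an incomplete type at file scope. */
struct inner {
    struct node *left;    /* pointer to an incomplete type: allowed */
    struct node *right;
};

/* The size of struct inner is known even though struct node is not. */
size_t inner_size(void)
{
    return sizeof(struct inner);
}

/* This definition completes the type; from here on its members can be used. */
struct node {
    struct inner kids;    /* hypothetical member */
    int is_leaf;          /* hypothetical member */
};

int node_is_leaf(const struct node *n)
{
    return n->is_leaf;
}
```

    What would not work before the completing definition is anything that
    needs the size or the members of struct node, such as sizeof(struct node)
    or n->is_leaf.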


    > Also if I wanted to add typedefs for inner leaf and node, what would be
    > the right way to achieve the same thing.
    >


    Just add:

    typedef struct leaf LEAF;
    typedef struct inner INNER;
    typedef struct node NODE; (redundant though because we can
    consider a node to be external or internal)
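
    One thing to watch with those particular names: in C, typedef names and
    enumeration constants share the name space of ordinary identifiers, and
    the enum inside struct node already declares INNER and LEAF at file
    scope. Placed after the original code, "typedef struct leaf LEAF;" would
    therefore be rejected as a redeclaration. The sketch below sidesteps
    that with different typedef names; Inner, Leaf and Node, the value
    member and the node_is_leaf helper are all hypothetical choices.

```c
struct inner {
    struct node *left;
    struct node *right;
};

struct leaf {
    int value;                 /* hypothetical payload */
};

struct node {
    union {
        struct inner inner;
        struct leaf leaf;
    } data;

    enum {
        INNER, LEAF            /* ordinary identifiers with file scope */
    } type;
};

/* Typedef names chosen so they do not collide with INNER and LEAF. */
typedef struct leaf  Leaf;
typedef struct inner Inner;
typedef struct node  Node;

int node_is_leaf(const Node *n)
{
    return n->type == LEAF;
}
```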

    --
    aegis
     
    aegis, Jun 3, 2006
    #3
  4. Chris Torek

    Chris Torek Guest

    >On Sat, 03 Jun 2006 16:36:56 +0200, Till Crueger <>
    >wrote:
    >>struct inner {
    >> struct node *left;
    >> struct node *right;
    >>};
    >>
    >>struct leaf {
    >> /*data goes here */
    >>};
    >>
    >>struct node {
    >> union {
    >> struct inner inner;
    >> struct leaf leaf;
    >> } data;
    >>
    >> enum {
    >> INNER,LEAF
    >> }type;
    >>};
    >>
    >>I was surprised that this actually compiles, because at the point where I
    >>declare the struct inner there is no struct node.


    This is in fact OK. Structure names (in the "struct" namespace)
    simply "spring into being" the first time they are mentioned like
    this. Where this goes wrong is when the current scope is narrower
    than you intended.

    When you first mention a structure name, the scope at which the
    name "suddenly exists now" is the *current* scope, *whatever that
    is*. Thus, at file scope, the struct name has file scope -- which
    is the maximum possible scope, so everyone will be able to see it
    from here onward. But other scopes exist:

    void f(void) {
        struct foo { char *p; };
        ...
    }

    Here, the scope of "struct foo" is limited to the block that
    encloses it. Once we finish with f(), the name vanishes. We can
    then create a new and different "struct foo":

    void g(void) {
        struct foo { double x; };
        ...
    }

    and in this case we have done it in another block, so f()'s
    "struct foo" and g()'s "struct foo" are entirely separate. This
    should be quite familiar, since we can do the same scope tricks
    with ordinary variables:

    void f2(void) {
        double foo;
        ...
    }
    void g2(void) {
        int foo;
    }

    Here f2()'s "foo" is completely independent of g2()'s "foo".

    Where "the name suddenly springs into being" goes terribly wrong
    is when we deal with function prototype scope. Consider, for
    instance:

    void setname(char *name);
    int resolve(int name);

    Clearly setname() takes a different kind of "name" from resolve().
    The scope of the identifier "name" is limited to each prototype,
    so the two "name"s name different variables. But suppose we do this
    with structures:

    void op1(struct foo *p);
    void op2(struct foo *p);

    Not only are the two names "p" completely independent, so are the
    two "struct foo"s -- just as in the example with f() and g().

    The cure for this problem is as Barry Schwarz suggests below:
    declare the existence of the structure name in advance, at an
    outer scope, so that the inner scope (prototype-scope) names
    simply refer back to the already-existing outer-scope name. For
    instance:

    struct foo;
    void op1(struct foo *);
    void op2(struct foo *);

    Now op1() and op2() both take the same kind of argument.
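
    As a rough, compilable sketch of that cure: the member count, the
    function bodies and the run_ops() helper below are hypothetical, made
    up only to show both functions operating on one and the same type.

```c
struct foo;                     /* file-scope declaration comes first */

void op1(struct foo *p);        /* both prototypes now refer to the */
void op2(struct foo *p);        /* file-scope struct foo            */

struct foo { int count; };      /* completes the type (hypothetical member) */

void op1(struct foo *p) { p->count += 1; }
void op2(struct foo *p) { p->count *= 2; }

/* Passes the same object to both functions: (3 + 1) * 2. */
int run_ops(void)
{
    struct foo f;
    f.count = 3;
    op1(&f);
    op2(&f);
    return f.count;
}
```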

    In article <>,
    Barry Schwarz <> wrote:
    >Are you sure you had your warning set to max?


    Since the construct above is legal *and* does what is desired,
    it seems appropriate that the compiler did not warn.

    [OP writes: with]
    >>struct node;
    >>
    >>at the beginning it also compiles. My question is now, which is the
    >>correct way to do this kind of thing.

    >
    >The latter, if for no other reason than maybe the lack of diagnostic
    >is a compiler error and gets fixed in the next update.


    It is not a compiler error but declaring the structure in advance
    like this is certainly harmless, and possibly wise. :)

    One other note: a newfangled invention in C89 was to say that, in
    an inner scope, a "vacuous" structure declaration like the above
    had a new, special property. Suppose you had (for some reason)
    code like this:

    struct foo { int val; ... other members as needed ... };

    void f3(void) {
        struct foo { char *p; double q; ... };
        ...
    }

    Here, the inner-scope "struct foo" inside f3() overrides the outer
    one, so that in f3(), "struct foo" means "f3's user-defined type
    foo", not "the file's user-defined type foo". Well, this is all
    well and good; but what happens if, for some reason, you want
    mutually-referential structures whose scope is limited to f3()?
    You might start with:

    void f3(void) {
        struct bar { struct foo *ref; ... };
        struct foo { struct bar *ref; char *p; double q; ... };
        ...
    }

    But now "struct bar" refers to the *outer* scope user-defined "foo"
    type, rather than the inner-scope one. We can fix this by reversing
    the two declarations, but that will fail if there is *also* an
    outer-scope user-defined "bar" too. So in C89, as a new committee
    invention, they said:

    If you write a vacuous declaration for a struct (or union) type
    in an inner scope, that "clears the decks" of the outer instance
    so that you can then declare a new, inner-scope version.

    So, now we can write f3() in complete safety:

    void f3(void) {
        struct foo; /* vacuous declaration, makes foo local */
        struct bar { struct foo *ref; ... };
        struct foo { struct bar *ref; char *p; double q; ... };
        ...
    }

    Now the user-defined type "bar" in f3() refers to the user-defined
    type "foo" in f3(), and the user-defined type "foo" in f3() refers
    to the user-defined type "bar" in f3(), just as desired. (We could
    be even more "belt and suspenders" about this and write vacuous
    declarations for both, but since we *define* a new block-scope
    "bar" right away, we do not *have* to declare it here.)

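    For anyone who wants to try this, here is a small self-contained variant
    of f3(). The id member, the int return type and the values stored are
    hypothetical; the elided members from the fragments above are simply
    left out.

```c
struct foo { int val; };        /* file-scope foo, unrelated to the one below */

int f3(void)
{
    struct foo;                 /* vacuous declaration: a new, block-scope foo */
    struct bar { struct foo *ref; int id; };
    struct foo { struct bar *ref; char *p; double q; };

    struct bar b;
    struct foo f;

    b.ref = &f;                 /* accepted: bar.ref points at the local foo */
    b.id = 42;
    f.ref = &b;
    f.p = 0;
    f.q = 1.5;

    /* b.ref->ref exists only in the block-scope foo; the file-scope
       foo has no member named ref. */
    return b.ref->ref->id;
}
```

    If the "struct foo;" line is removed, bar.ref points at the file-scope
    type instead, and the last line no longer compiles because that type
    has no member ref.
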
    >>Also if I wanted to add typedefs for inner leaf and node, what would be
    >>the right way to achieve the same thing.

    >
    >One easy way is to insert the word "typedef" at the beginning of each
    >declaration and the typedef name just before the semicolon, as in
    > typedef struct inner{...}inner_t;


    I dislike this method because of the limitation you mention:

    >If you have structure types that point to each other and want to use
    >the typedef names, you can use
    > typedef struct inner inner_t;
    >use inner_t* to your heart's content and later declare what a struct
    >inner really is.


    If you are going to use typedefs at all (and I prefer not to), I
    suggest the following sequence:

    1) declare that the user-defined type exists, using the
    "vacuous declaration" method defined by C89:

    struct foo;

    2) define the typedef, using the user-defined type just declared:

    typedef struct foo StructFoo;

    3) repeat for the rest of your types.

    Of course, at file scope -- but not always at block scope! -- we
    can make use of the fact that user-defined types ("struct"s) simply
    "spring into being" at the current scope to combine steps 1 and 2:

    typedef struct bar StructBar;

    Clearly there are only two possibilities for this line: either
    "struct bar" has already been declared at the current (file) scope,
    so that StructBar is now an alias for this type; or "struct bar"
    has *not* already been declared at this current (file) scope, so
    that it springs into being here, and StructBar is now an alias for
    this type.
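
    Applied to the tree from the original question, the sequence might look
    like the sketch below. The typedef names Inner, Leaf and Node, the value
    member and the count_leaves() helper are hypothetical, not something
    the OP wrote.

```c
/* Steps 1 and 2, repeated for each type: declare the tag, then alias it. */
struct inner;
typedef struct inner Inner;

struct leaf;
typedef struct leaf Leaf;

struct node;
typedef struct node Node;

/* The definitions can now use the typedef names in any order. */
struct inner {
    Node *left;
    Node *right;
};

struct leaf {
    int value;                  /* hypothetical payload */
};

struct node {
    union {
        Inner inner;
        Leaf leaf;
    } data;

    enum {
        INNER, LEAF
    } type;
};

/* Counts the leaves below n; assumes every inner node has two children. */
int count_leaves(const Node *n)
{
    if (n->type == LEAF)
        return 1;
    return count_leaves(n->data.inner.left)
         + count_leaves(n->data.inner.right);
}
```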

    If you engage in the practice of declaring and defining new
    user-defined types inside blocks, however, you may get the wrong
    type aliased:

    void h(void) {
        struct foo { int a; };

        if (somecond) {
            typedef struct foo zog;
            struct foo { char *p; };

            ... code section A ...
        }
        ... code section B ...
    }

    In "code section A", the name "zog" is an alias for the *outer*
    user-defined "foo" type, with one member named "a", while the name
    "struct foo" is the name of the *inner* user-defined "foo" type,
    with one member named "p". That is, in this code, "zog" and
    "struct foo" name two *different* types.

    Compare that to the two-part method:

    void h(void) {
        struct foo { int a; };

        if (somecond) {
            struct foo;
            typedef struct foo zog;
            struct foo { char *p; };

            ... code section A ...
        }
        ... code section B ...
    }

    Here, in code section "A", "zog" is an alias for the *inner*
    user-defined "foo" type, with one member named "p". That is,
    in this code, both "zog" and "struct foo" name the *same* type.
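
    A compilable sketch of the two-part version, with the code sections
    filled in by hypothetical statements. Turning somecond into a parameter,
    the int return type and the variables outer, z and same are assumptions
    made only for this illustration.

```c
int h(int somecond)
{
    struct foo { int a; };
    struct foo outer;

    outer.a = 1;

    if (somecond) {
        struct foo;                  /* vacuous declaration: new inner foo */
        typedef struct foo zog;      /* alias for the inner foo */
        struct foo { char *p; };     /* completes the inner foo */

        zog z;
        struct foo *same = &z;       /* accepted: zog and struct foo agree */

        z.p = "inner";
        return same->p[0];           /* code section A */
    }
    return outer.a;                  /* code section B */
}
```

    Without the "struct foo;" line, zog would alias the outer type, so
    z.p would not compile and &z would not match "struct foo *".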
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
     
    Chris Torek, Jun 3, 2006
    #4
