How is it possible to typedef a struct before it has been declared/defined?

Discussion in 'C Programming' started by Shriramana Sharma, Apr 17, 2013.

  1. Hello. The following code compiles (and links with a dummy main) quite well on both GCC 4.6.3 and Clang 3.0:

    Code:
    typedef struct MyStruct_Tag MyStruct ;
    struct MyStruct_Tag { int x, y ; } ;
    MyStruct a ;
    
    and this surprises me very much because I would have expected the compiler to complain if the first type in a typedef statement (or the second in a C++11 using-style typedef) is not yet even declared.

    Can anyone please explain how this is legal?

    Thanks.
    Shriramana Sharma, Apr 17, 2013
    #1

  2. Eric Sosman Guest

    On 4/17/2013 8:47 AM, Shriramana Sharma wrote:
    > Hello. The following code compiles (and links with a dummy main) quite well on both GCC 4.6.3 and Clang 3.0:
    >
    >
    Code:
    > typedef struct MyStruct_Tag MyStruct ;
    > struct MyStruct_Tag { int x, y ; } ;
    > MyStruct a ;
    > 
    >
    > and this surprises me very much because I would have expected the compiler to complain if the first type in a typedef statement (or the second in a C++11 using-style typedef) is not yet even declared.
    >
    > Can anyone please explain how this is legal?


    C (I won't speak for C++) has the notion of an "incomplete
    type," a type that has been specified only in part. Early in
    the first line you have `struct MyStruct_Tag', and this -- all
    by itself -- is enough to inform the compiler of the existence
    of the `struct MyStruct_Tag' type. The type is "incomplete" at
    this point: The compiler knows the tag name, and knows that it
    is a struct type, but doesn't know any of the details.

    Despite the name, `typedef' doesn't actually define a new
    type: It just registers an alias for a type already known. So
    by the end of the first line, the compiler knows that `MyStruct'
    is an alias for the `struct MyStruct_Tag' type -- the type is
    still incomplete, either under its "official" name or under
    its `MyStruct' alias.
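
    As a minimal sketch of what the compiler accepts before and after the type is completed (the names `last_seen`, `sum_fields`, and `alias_demo` are made up for illustration and are not part of the original example):

```c
#include <assert.h>

/* Line 1 of the original example: names the type, leaves it incomplete. */
typedef struct MyStruct_Tag MyStruct;

/* Legal while the type is still incomplete: pointers and prototypes. */
MyStruct *last_seen;                    /* pointer to an incomplete type */
int sum_fields(const MyStruct *p);      /* prototype that mentions it    */

/* Not legal yet (each would be rejected if written at this point):
 *   MyStruct early[2];            array of incomplete element type
 *   sizeof(MyStruct)              size is not known yet
 */

/* Line 2 of the original example: completes the type. */
struct MyStruct_Tag { int x, y; };

/* From here on the tag name and the alias denote the same complete type. */
MyStruct a = { 3, 4 };

int sum_fields(const MyStruct *p)
{
    return p->x + p->y;
}

/* Hypothetical self-check: the alias and the tag are interchangeable. */
int alias_demo(void)
{
    struct MyStruct_Tag *via_tag = &a;  /* no cast needed */
    last_seen = via_tag;
    assert(sizeof(MyStruct) == sizeof(struct MyStruct_Tag));
    return sum_fields(last_seen);
}
```

    Calling `alias_demo()` adds the two fields of `a` through a pointer that was declared while the type was still incomplete.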

    The second line "completes" the incomplete declaration,
    filling in the missing details. This ability to separate "naming"
    the type from "describing" it has two useful consequences:

    - As soon as the type name is known you have the ability to write
    the names of other related types, most especially pointers to the
    named type. That's why you can do things like:

    typedef struct node_tag {
        struct node_tag *left;
        struct node_tag *right;
        int keyValue;
    } TreeNode;

    The type has not been "completely" declared at the point where
    `left' and `right' are described, but since the compiler already
    knows that `struct node_tag' is a type, it also knows that
    `struct node_tag *' is a type. You could separate it further:

    typedef struct node_tag TreeNode;
    struct node_tag {
        TreeNode *left;
        TreeNode *right;
        int keyValue;
    };

    .... with exactly the same effect.

    - You can leave the type incomplete, telling the compiler only
    that a struct type with such-and-such tag name exists but never
    describing its innards. This is useful for libraries that want
    to have "opaque" or "abstract" types; their headers can write

    /* opaque.h */
    typedef struct opaque_tag OpaqueData;
    OpaqueData *opaqueFactory(void);
    void opaqueDestructor(OpaqueData *ptr);
    void opaqueSetName(OpaqueData *ptr, const char *name);
    const char *opaqueGetName(const OpaqueData *ptr);
    // ... and so on

    .... and then the library's private implementation files can
    write, internally

    /* opaque.c */
    #include "opaque.h"
    struct opaque_tag {
        const char *name;
        double trouble;
        int whatnot;
    };

    With a setup like this, clients can deal with pointers to the
    incompletely-described type, but can never see or mess with
    its innards; a new version of the library can rearrange
    or expand those innards without disturbing the clients.
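
    A rough single-file sketch of that arrangement follows. The function names and struct members are the ones from the header and implementation above; the function bodies, the `client_demo` function, and the choice to store the name pointer without copying it are assumptions made only for illustration. In a real library the three marked parts would live in separate files.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- what opaque.h would contain ---- */
typedef struct opaque_tag OpaqueData;
OpaqueData *opaqueFactory(void);
void opaqueDestructor(OpaqueData *ptr);
void opaqueSetName(OpaqueData *ptr, const char *name);
const char *opaqueGetName(const OpaqueData *ptr);

/* ---- client code: compiles with the type still incomplete ---- */
int client_demo(void)
{
    OpaqueData *obj = opaqueFactory();
    if (obj == NULL)
        return -1;
    opaqueSetName(obj, "widget");
    int ok = strcmp(opaqueGetName(obj), "widget") == 0;
    /* obj->whatnot = 1;   would be rejected here: the members are unknown */
    opaqueDestructor(obj);
    return ok;
}

/* ---- what opaque.c would contain ---- */
struct opaque_tag {
    const char *name;
    double trouble;
    int whatnot;
};

OpaqueData *opaqueFactory(void)
{
    OpaqueData *p = malloc(sizeof *p);  /* size is known only in this part */
    if (p != NULL) {
        p->name = "";
        p->trouble = 0.0;
        p->whatnot = 0;
    }
    return p;
}

void opaqueDestructor(OpaqueData *ptr)
{
    free(ptr);
}

void opaqueSetName(OpaqueData *ptr, const char *name)
{
    assert(ptr != NULL);
    ptr->name = name;   /* assumption: caller keeps the string alive */
}

const char *opaqueGetName(const OpaqueData *ptr)
{
    return ptr->name;
}
```

    Because `client_demo` appears before the struct is completed, it can only reach the object through the five declared functions, which is the position every client of the header is in.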

    The same thing can be done with `union' types, too. Also,
    an array of unknown size is incomplete:

    extern int array[]; // incomplete, remains so
    // ...
    double vector[]; // incomplete
    // ...
    double vector[] = { 1.2, 2.3, 3.4 }; // completion
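
    The `vector` case can be sketched as a complete translation unit like this (the helper functions `first_element`, `vector_length`, and `element_at` are hypothetical names added for the illustration):

```c
#include <assert.h>
#include <stddef.h>

double vector[];                        /* tentative definition, size unknown */

/* Legal while the array type is incomplete: indexing an element. */
double first_element(void)
{
    return vector[0];
}

/* 'sizeof vector' would be rejected at this point: the size is not known. */

double vector[] = { 1.2, 2.3, 3.4 };    /* completion: now an array of 3 */

size_t vector_length(void)
{
    return sizeof vector / sizeof vector[0];   /* fine after completion */
}

double element_at(size_t i)
{
    assert(i < vector_length());        /* bounds check, usable only now */
    return vector[i];
}
```

    Only the code after the completing declaration can ask for the array's size; the code before it can still index into the array.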

    --
    Eric Sosman
    Eric Sosman, Apr 17, 2013
    #2

  3. James Kuyper Guest

    On 04/17/2013 08:47 AM, Shriramana Sharma wrote:
    > Hello. The following code compiles (and links with a dummy main) quite well on both GCC 4.6.3 and Clang 3.0:
    >
    >
    Code:
    > typedef struct MyStruct_Tag MyStruct ;
    > struct MyStruct_Tag { int x, y ; } ;
    > MyStruct a ;
    > 
    >
    > and this surprises me very much because I would have expected the compiler to complain if the first type in a typedef statement (or the second in a C++11 using-style typedef) is not yet even declared.
    >
    > Can anyone please explain how this is legal?


    The typedef is actually irrelevant to this issue. You could remove the
    'typedef' keyword, and declare 'a' as "struct MyStruct_Tag", and you'd
    still have the same issues.

    The declaration

    struct MyStruct_Tag MyStruct;

    declares "struct MyStruct_Tag" to be an incomplete struct type. The
    contents of that type are unspecified, and therefore so is the size of
    the type. You can't define an object of an incomplete type, nor declare
    an array of it. However, it can still be used in limited ways. In
    particular, you can declare an object to be a pointer to that type - the
    standard requires that all pointers to struct types have the same
    representation and same alignment requirements, so it's possible to pass
    around struct pointers without knowing anything about the contents of
    the struct they refer to.

    A declaration of an incomplete type can be completed at a later point,
    and that's exactly what your code does on the very next line. From that
    point onward, the type can be used just as if it had been complete from
    the beginning.
    --
    James Kuyper
    James Kuyper, Apr 17, 2013
    #3
  4. Wow thanks people, that explanation was very useful!
    Shriramana Sharma, Apr 17, 2013
    #4
  5. Les Cargill Guest

    Shriramana Sharma wrote:
    > Hello. The following code compiles (and links with a dummy main)
    > quite well on both GCC 4.6.3 and Clang 3.0:
    >
    >
    Code:
    > typedef struct MyStruct_Tag MyStruct ;
    > struct MyStruct_Tag { int x, y ; } ;
    > MyStruct a ;
    >
    > and this surprises me very much because I would have expected the
    > compiler to complain if the first type in a typedef statement (or the
    > second in a C++11 using-style typedef) is not yet even declared.
    >
    > Can anyone please explain how this is legal?
    >
    > Thanks.
    >


    'C' is not inherently a single-pass compiled language.

    Makes perfect sense to me - otherwise a self-referential struct
    would require a void * and cast.

    it solves a catch-22.

    --
    Les Cargill
    Les Cargill, Apr 17, 2013
    #5
  6. Les Cargill writes:
    > Shriramana Sharma wrote:
    >> Hello. The following code compiles (and links with a dummy main)
    >> quite well on both GCC 4.6.3 and Clang 3.0:
    >>
    >>
    Code:
    >> typedef struct MyStruct_Tag MyStruct ;
    >> struct MyStruct_Tag { int x, y ; } ;
    >> MyStruct a ;
    >>
    >> and this surprises me very much because I would have expected the
    >> compiler to complain if the first type in a typedef statement (or the
    >> second in a C++11 using-style typedef) is not yet even declared.
    >>
    >> Can anyone please explain how this is legal?

    >
    > 'C' is not inherently a single-pass compiled language.


    Well, sort of. It's designed for single-pass compilation, but in a few
    special cases later declarations can cause earlier declarations to be
    "completed". For example:

    struct foo;
    /* Compiler creates a symbol table entry for "struct foo" as an
    incomplete type. */

    struct foo { int n; };
    /* Makes "struct foo" a complete type */

    The compiler doesn't do a second pass over the source file; it just goes
    back and updates a symbol that it has previously partially processed.
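
    A small sketch of the same idea in use (the functions `foo_value`, `twice`, and `foo_demo` are invented for this illustration). Declaring `struct foo;` at file scope first also matters for the prototypes: a struct tag that first appears inside a parameter list is scoped to that prototype only and would not name the same type.

```c
#include <assert.h>

struct foo;                             /* incomplete type at file scope */

int foo_value(const struct foo *p);     /* refers to the file-scope struct foo */

/* Compiled while struct foo is still incomplete: it only passes the
   pointer along and never looks inside the object. */
int twice(const struct foo *p)
{
    return 2 * foo_value(p);
}

struct foo { int n; };                  /* makes "struct foo" a complete type */

int foo_value(const struct foo *p)
{
    return p->n;                        /* member access needs the complete type */
}

/* Hypothetical self-check tying the pieces together. */
int foo_demo(void)
{
    struct foo f = { 21 };
    assert(foo_value(&f) == 21);
    return twice(&f);
}
```

    Everything is handled in one pass: `twice` is compiled against the partial symbol table entry, and `foo_value` against the completed one.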

    It's a special-case tweak to the single-pass model, needed because, as
    you say:

    > Makes perfect sense to me - otherwise a self-referential struct
    > would require a void * and cast.
    >
    > it solves a catch-22.


    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Working, but not speaking, for JetHead Development, Inc.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Apr 17, 2013
    #6