Tagged unions

Discussion in 'C Programming' started by Johan Tibell, Jul 21, 2006.

  1. Johan Tibell

    Johan Tibell Guest

    I use a tagged union to represent different expression types in one of
    my programs.

    struct exp {
    enum {
    LIT,
    VAR
    } type;
    union {
    int lit;
    char *var;
    } form;
    };

    In my implementation I've put the enum outside of the struct and given
    it a name, "exp_type".

    enum exp_type { /* ... */ };

    struct my_struct {
    enum exp_type type;
    /* ... */
    };

    What would be the pros and cons of having it unnamed inside the struct
    versus named outside the struct respectively? I can think of a few:

    Pros:
    * Less pollution of the namespace. I currently have two different
    structs so I have to prefix my enum type names with "structname_" (e.g.
    exp_type).
    * Saves me some typing.
    * Avoid repetition of the name "type" in the variable declaration
    inside the struct (e.g. exp_type type).

    Cons:
    * Can't create a variable of the enum type since the type can't be
    referred to. (Would it even be possible to refer to the enum type if it
    was named _and_ declared inside the struct?). This can also be a good
    thing if no more variables of the enum type will ever be created but it
    can be a bit difficult to predict in advance.

    This is probably more of a stylistic question than anything else (and
    hence I expect 10^100 replies).
     
    Johan Tibell, Jul 21, 2006
    #1

  2. On Fri, 21 Jul 2006, Johan Tibell wrote:

    > I use a tagged union to represent different expression types in one of
    > my programs.
    >
    > struct exp {
    > enum {
    > LIT,
    > VAR
    > } type;
    > union {
    > int lit;
    > char *var;
    > } form;
    > };
    >
    > In my implementation I've put the enum outside of the struct and given
    > it a name, "exp_type".
    >
    > enum exp_type { /* ... */ };
    >
    > struct my_struct {
    > enum exp_type type;
    > /* ... */
    > };
    >
    > What would be the pros and cons of having it unnamed inside the struct
    > versus named outside the struct respectively? I can think of a few:
    >
    > Pros:
    > * Less pollution of the namespace. I currently have two different
    > structs so I have to prefix my enum type names with "structname_" (e.g.
    > exp_type).
    > * Saves me some typing.
    > * Avoid repetition of the name "type" in the variable declaration
    > inside the struct (e.g. exp_type type).
    >
    > Cons:
    > * Can't create a variable of the enum type since the type can't be
    > referred to. (Would it even be possible to refer to the enum type if it
    > was named _and_ declared inside the struct?). This can also be a good
    > thing if no more variables of the enum type will ever be created but it
    > can be a bit difficult to predict in advance.


    Why would you ever need to create such variables? That is
    bad programming practice in my book (creating unnecessary
    couplings).

    > This is probably more of a stylistic question than anything else (and
    > hence I expect 10^100 replies).


    My preference is to use anonymous enums in this situation.

    Tak-Shing
     
    Tak-Shing Chan, Jul 21, 2006
    #2

  3. Johan Tibell

    Johan Tibell Guest

    Tak-Shing Chan wrote:
    > > I use a tagged union to represent different expression types in one of
    > > my programs.

    >
    > Why would you ever need to create such variables? That is
    > bad programming practice in my book (creating unnecessary
    > couplings).


    In this case I'm representing an abstract syntax tree (AST) created
    using lex/yacc which will be passed to a function eval for evaluation.
    In functional languages one usually uses an algebraic data type to
    represent such ASTs and in OO languages a class hierarchy is often
    (always?) used. I assumed that a tagged union would be the
    corresponding representation in C. If you know of a better alternative
    please let me know, I'm not an experienced C programmer.

    Or perhaps I don't quite understand what part of my implementation you
    think is bad. Are you referring to the whole tagged union thing?
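
    For illustration, here is a minimal sketch of how eval could dispatch
    on the tag of such a tagged union. The binding struct and the lookup
    helper are hypothetical (nothing like them appears in this thread) and
    only stand in for whatever environment a real evaluator would use:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Tagged union as posted at the top of the thread. */
struct exp {
    enum { LIT, VAR } type;
    union {
        int lit;
        char *var;
    } form;
};

/* Hypothetical environment entry: binds a variable name to a value. */
struct binding {
    const char *name;
    int value;
};

/* Hypothetical helper: stores the bound value in *out and returns 1,
   or returns 0 if name is not bound in env[0..n-1]. */
int lookup(const struct binding *env, size_t n, const char *name, int *out)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(env[i].name, name) == 0) {
            *out = env[i].value;
            return 1;
        }
    }
    return 0;
}

/* Evaluate e by switching on its tag. Returns 1 on success,
   0 for an unbound variable or an unknown tag. */
int eval(const struct exp *e, const struct binding *env, size_t n, int *out)
{
    assert(e != NULL && out != NULL);
    switch (e->type) {
    case LIT:
        *out = e->form.lit;
        return 1;
    case VAR:
        return lookup(env, n, e->form.var, out);
    }
    return 0;
}
```

    Only the union member selected by the tag is read in each case, which
    is the whole point of carrying the tag alongside the union.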
     
    Johan Tibell, Jul 21, 2006
    #3
  4. On Fri, 21 Jul 2006, Johan Tibell wrote:

    > Tak-Shing Chan wrote:
    >>> I use a tagged union to represent different expression types in one of
    >>> my programs.

    >>
    >> Why would you ever need to create such variables? That is
    >> bad programming practice in my book (creating unnecessary
    >> couplings).


    [You are quoting me out of context here. By ``such
    variables'' I am referring to reused enums, not tagged unions.]

    > In this case I'm representing an abstract syntax tree (AST) created
    > using lex/yacc which will be passed to a function eval for evaluation.
    > In functional languages one usually uses an algebraic data type to
    > represent such ASTs and in OO languages a class hierarchy is often
    > (always?) used. I assumed that a tagged union would be the
    > corresponding representation in C. If you know of a better alternative
    > please let me know, I'm not an experienced C programmer.
    >
    > Or perhaps I don't quite understand what part of my implementation you
    > think is bad. Are you referring to the whole tagged union thing?


    You have misread my post. What I said was, tagged unions
    are fine but reused enums are bad (in the context of this
    thread). IMHO. YMMV.

    Tak-Shing
     
    Tak-Shing Chan, Jul 21, 2006
    #4
  5. Ben Pfaff

    Ben Pfaff Guest

    "Johan Tibell" writes:

    > struct exp {
    > enum {
    > LIT,
    > VAR
    > } type;
    > union {
    > int lit;
    > char *var;
    > } form;
    > };


    ...versus...

    > enum exp_type { /* ... */ };
    >
    > struct my_struct {
    > enum exp_type type;
    > /* ... */
    > };


    > Cons:
    > * Can't create a variable of the enum type since the type can't be
    > referred to. (Would it even be possible to refer to the enum type if it
    > was named _and_ declared inside the struct?).


    (Yes, it would.)

    > This can also be a good thing if no more variables of the enum
    > type will ever be created but it can be a bit difficult to
    > predict in advance.


    I often use the former style, where the enum is declared without
    a tag inside the struct. If later it becomes necessary to refer
    to its type explicitly (which is fairly rare), it's only the
    matter of a moment's work to add a tag.
    --
    "When in doubt, treat ``feature'' as a pejorative.
    (Think of a hundred-bladed Swiss army knife.)"
    --Kernighan and Plauger, _Software Tools_
     
    Ben Pfaff, Jul 21, 2006
    #5
  6. Rob Thorpe

    Rob Thorpe Guest

    Johan Tibell wrote:
    > I use a tagged union to represent different expression types in one of
    > my programs.
    >
    > struct exp {
    > enum {
    > LIT,
    > VAR
    > } type;
    > union {
    > int lit;
    > char *var;
    > } form;
    > };
    >
    > In my implementation I've put the enum outside of the struct and given
    > it a name, "exp_type".
    >
    > enum exp_type { /* ... */ };
    >
    > struct my_struct {
    > enum exp_type type;
    > /* ... */
    > };
    >
    > What would be the pros and cons of having it unnamed inside the struct
    > versus named outside the struct respectively? I can think of a few:
    >
    > Pros:
    > * Less pollution of the namespace. I currently have two different
    > structs so I have to prefix my enum type names with "structname_" (e.g.
    > exp_type).
    > * Saves me some typing.
    > * Avoid repetition of the name "type" in the variable declaration
    > inside the struct (e.g. exp_type type).
    > Cons:
    > * Can't create a variable of the enum type since the type can't be
    > referred to. (Would it even be possible to refer to the enum type if it
    > was named _and_ declared inside the struct?). This can also be a good
    > thing if no more variables of the enum type will ever be created but it
    > can be a bit difficult to predict in advance.
    >
    > This is probably more of a stylistic question than anything else (and
    > hence I expect 10^100 replies).


    If something like this is for use inside a programming language
    implementation, it is likely to be used a lot. In this case I'd
    recommend separating the meaning of the data from its structure. You
    could, for example, create a set of inline functions that access parts
    of the struct (e.g. get_type, set_type, get_lit). That way it becomes
    much easier to change the inside of the struct without breaking other
    things.

    This type of pseudo-OO in C isn't always a good idea, but it's useful
    for something like this.
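
    A rough sketch of what such accessors might look like. It assumes the
    named-enum variant (enum exp_type), since get_type needs a type it can
    return. The set_lit, get_var and set_var names are my own additions
    for illustration, and the asserts are just one way to catch reads of
    the wrong union member:

```c
#include <assert.h>

enum exp_type { LIT, VAR };

struct exp {
    enum exp_type type;
    union {
        int lit;
        char *var;
    } form;
};

/* Read the tag without exposing the struct layout to callers. */
static inline enum exp_type get_type(const struct exp *e)
{
    return e->type;
}

/* Read the literal; only meaningful when the tag is LIT. */
static inline int get_lit(const struct exp *e)
{
    assert(e->type == LIT);
    return e->form.lit;
}

/* Set tag and payload together so they cannot get out of sync. */
static inline void set_lit(struct exp *e, int value)
{
    e->type = LIT;
    e->form.lit = value;
}

/* Read the variable name; only meaningful when the tag is VAR. */
static inline char *get_var(const struct exp *e)
{
    assert(e->type == VAR);
    return e->form.var;
}

/* Set tag and payload together; the caller keeps ownership of name. */
static inline void set_var(struct exp *e, char *name)
{
    e->type = VAR;
    e->form.var = name;
}
```

    Setting the tag inside set_lit and set_var, rather than through a
    separate set_type, is a design choice: it keeps the tag and the active
    union member consistent by construction.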
     
    Rob Thorpe, Jul 21, 2006
    #6
  7. Johan Tibell

    Johan Tibell Guest

    Tak-Shing Chan wrote:
    > [You are quoting me out of context here. By ``such
    > variables'' I am referring to reused enums, not tagged unions.]


    I was a bit unsure about which part you were addressing (since you
    quoted almost my entire message) and therefore I included the caveat at
    the very end of my message. Thank you for the clarification.
     
    Johan Tibell, Jul 21, 2006
    #7
  8. Chris Torek

    Chris Torek Guest

    Johan Tibell wrote:

    [vertically compressed]

    >struct exp {
    > enum { LIT, VAR } type;
    > union { int lit; char *var; } form;
    >};


    [vs
    enum exp_type { LIT, VAR };
    struct exp {
    enum exp_type type;
    union { int lit; char *var; } form;
    };
    ]

    >What would be the pros and cons of having it unnamed inside the struct
    >versus named outside the struct respectively? I can think of a few:
    >
    >Pros:
    >* Less pollution of the namespace. I currently have two different
    >structs so I have to prefix my enum type names with "structname_" (e.g.
    >exp_type).


    This is a smaller pro than it may look: enumeration members are
    in the ordinary namespace, at the same scope as the overall definition
    of the structure type, so LIT and VAR can appear anywhere up to
    the end of the current scope and must be unique. That is:

    struct expression {
    enum { LIT, VAR } type;
    ...
    };
    struct fuse {
    enum { UNLIT, LIT } type;
    ...
    };

    is no good -- the two "LIT"s conflict. (In C++ each struct has its
    own little sub-namespace, but C is not C++.)

    Thus, the only namespace you avoid polluting is the "tag" namespace.
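
    One common way around the clash, sketched here with prefixes of my own
    choosing (EXP_ and FUSE_ are not from the original post), is to prefix
    each enumerator with the name of its struct. The fuse_is_lit helper is
    likewise hypothetical and only shows the enumerators in use:

```c
#include <assert.h>
#include <stddef.h>

struct expression {
    enum { EXP_LIT, EXP_VAR } type;
    union {
        int lit;
        char *var;
    } form;
};

struct fuse {
    /* FUSE_LIT no longer collides with EXP_LIT in the ordinary namespace. */
    enum { FUSE_UNLIT, FUSE_LIT } type;
};

/* Hypothetical helper: returns nonzero if the fuse has been lit. */
int fuse_is_lit(const struct fuse *f)
{
    assert(f != NULL);
    return f->type == FUSE_LIT;
}
```

    Both enums stay untagged, so the tag namespace is still untouched; the
    prefixes only keep the enumerators themselves unique.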

    >* Saves me some typing.


    Not much, since you can also write:

    struct exp {
    enum exp_type { LIT, VAR } type;
    union { ... } form;
    };

    >* Avoid repetition of the name "type" in the variable declaration
    >inside the struct (e.g. exp_type type).
    >
    >Cons:
    >* Can't create a variable of the enum type since the type can't be
    >referred to. (Would it even be possible to refer to the enum type if it
    >was named _and_ declared inside the struct?).


    Easily fixed by adding an enum tag, as above. Yes, you can refer to
    "embedded" types afterward in C:

    struct foo {
    enum zot { ZOT_A, ZOT_B } zot;
    struct bar {
    int i;
    } b;
    char *p;
    };
    enum zot zed;
    struct bar bar;

    (Again, C++ is different, which is another reason not to try to compile
    C code with a C++ compiler: valid C code is sometimes invalid in C++,
    and sometimes valid but with a different meaning.)
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
     
    Chris Torek, Jul 22, 2006
    #8
