Union and pointer casts?

Discussion in 'C Programming' started by Jef Driesen, Feb 24, 2011.

  1. Jef Driesen

    Jef Driesen Guest

    Hi,

    Suppose I have two distinct data structures:

    typedef struct foo_t {
    ...
    } foo_t;

    typedef struct bar_t {
    ...
    } bar_t;

    and a function that receives a pointer to such a structure, together with a type
    to indicate which structure is being passed:

    typedef enum data_type_t {
    DATA_TYPE_FOO,
    DATA_TYPE_BAR
    } data_type_t;

    void myfunction (data_type_t type, void *data)
    {
        foo_t *foo = data;
        bar_t *bar = data;

        switch (type) {
        case DATA_TYPE_FOO:
            /* Use foo here */
            break;
        case DATA_TYPE_BAR:
            /* Use bar here */
            break;
        default:
            return;
        }
    }

    A typical usage would be like this:

    int main(void)
    {
    foo_t foo;
    bar_t bar;

    myfunction (DATA_TYPE_FOO, &foo);
    myfunction (DATA_TYPE_BAR, &bar);

    return 0;
    }

    Is it portable to replace the separate variables and explicit casts with a union?

    typedef union foobar_t {
    bar_t bar;
    foo_t foo;
    } foobar_t;

    void myfunction (data_type_t type, void *data)
    {
        foobar_t *foobar = data;

        switch (type) {
        case DATA_TYPE_FOO:
            /* Use foobar->foo here */
            break;
        case DATA_TYPE_BAR:
            /* Use foobar->bar here */
            break;
        default:
            return;
        }
    }

    I think this is a portable construct, but I'm not 100% sure. Note that it's not
    my intent to try to interpret a foo_t as a bar_t. The main purpose of the union
    is to improve the readability of the code (my real code has many more foo and
    bar structs).

    Jef
     
    Jef Driesen, Feb 24, 2011
    #1

  2. Jef Driesen <> wrote:
    > Suppose I have two distinct data structures:

    <snip>
    > and a function that receives a pointer to such a structure, together with a type to indicate which structure is being passed:

    <snip>
    > Is it portable to replace the separate variables and explicit casts with a union?


    That is the main use of unions. You might consider this pattern, called a “tagged union”:

    struct foo {…};
    struct bar {…};

    struct foobar {
    enum {
    T_FOO,
    T_BAR,
    } tag;
    union {
    struct foo foo;
    struct bar bar;
    } data;
    };

    void myfunction(struct foobar *foobar) {
        switch (foobar->tag) {
        case T_FOO:
            /* use foobar->data.foo here */
            break;
        case T_BAR:
            /* use foobar->data.bar here */
            break;
        default:
            fprintf(stderr, "bad type\n");
            abort();
        }
    }

    There are extensions to C (MSVC, Plan 9, gcc with -fms-extensions) that allow
    you to not name the union and to refer to foobar->foo or foobar->bar directly;
    a version of this will be in the C1x standard.
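
    For illustration, here is a minimal sketch of the unnamed-union form as it
    was eventually standardized in C11. The member names `a` and `b` and the
    helper `foobar_value` are made up for this example; they are not from the
    code under discussion.

```c
#include <assert.h>

struct foo { int a; };      /* hypothetical member */
struct bar { double b; };   /* hypothetical member */

struct foobar {
    enum { T_FOO, T_BAR } tag;
    union {                 /* anonymous union: members are reached directly */
        struct foo foo;
        struct bar bar;
    };
};

/* Returns the payload of whichever member the tag says is active. */
static double foobar_value(const struct foobar *fb)
{
    switch (fb->tag) {
    case T_FOO:
        return fb->foo.a;   /* no intermediate ".data" needed */
    case T_BAR:
        return fb->bar.b;
    }
    assert(!"bad tag");
    return 0.0;
}
```

    Compared with the named `data` member, the only difference at the point of
    use is that `fb->data.foo` becomes `fb->foo`.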

    (Plan 9’s compiler also allowed

    typedef struct foo {int bas;} foo;
    typedef struct bar {int quux;} bar;

    struct foobar {
    foo;
    bar;
    };

    void func(struct foobar f) {
    assert(f.bas == f.quux);
    }

    i.e., using the typedef name to declare an anonymous structure. The current
    C1x draft allows that, but N1549 makes clear that this was *not* intended, &
    will be removed. Shame, that; it’s a cool & useful feature.)

    —Joel

    N1549: <http://open-std.org/jtc1/sc22/wg14/www/docs/n1549.pdf>
     
    Joel C. Salomon, Feb 24, 2011
    #2

  3. Jef Driesen

    Tim Rentsch Guest

    Jef Driesen <> writes:

    > Hi,
    >
    > Suppose I have two distinct data structures:
    >
    > typedef struct foo_t {
    > ...
    > } foo_t;
    >
    > typedef struct bar_t {
    > ...
    > } bar_t;
    >
    > and a function that receives a pointer to such a structure, together
    > with a type to indicate which structure is being passed:
    >
    > typedef enum data_type_t {
    > DATA_TYPE_FOO,
    > DATA_TYPE_BAR
    > } data_type_t;
    >
    > void myfunction (data_type_t type, void *data)
    > {
    > foo_t *foo = data;
    > foo_t *bar = data;
    >
    > switch (type) {
    > case DATA_TYPE_FOO:
    > /* Use foo here */
    > break:
    > case DATA_TYPE_BAR:
    > /* Use bar here */
    > break:
    > default:
    > return;
    > }
    > }
    >
    > A typical usage would be like this:
    >
    > int main(void)
    > {
    > foo_t foo;
    > bar_t bar;
    >
    > myfunction (DATA_TYPE_FOO, &foo);
    > myfunction (DATA_TYPE_BAR, &bar);
    >
    > return 0;
    > }
    >
    > Is it portable to replace the separate variables and explicit casts with a union?
    >
    > typedef union foobar_t {
    > bar_t bar;
    > foo_t foo;
    > } foobar_t;
    >
    > void myfunction (data_type_t type, void *data)
    > {
    > foobar_t *foobar = data;
    >
    > switch (type) {
    > case DATA_TYPE_FOO:
    > /* Use foobar->foo here */
    > break:
    > case DATA_TYPE_BAR:
    > /* Use foobar->bar here */
    > break:
    > default:
    > return;
    > }
    > }
    >
    > I think this is a portable construct, but I'm not 100% sure. Note that
    > it's not my intent to try to interpret a foo_t as a bar_t. The main
    > purpose of the union is to improve the readability of the code (my
    > real code has many more foo and bar structs).


    If called from your example main() function above, technically
    this last function crosses over into undefined behavior. In
    fact the undefined behavior happens even before getting to
    the switch() statement.

    To see why this is true, remember what we did: we took a
    pointer to a foo_t or bar_t, and converted that to a 'void *'.
    Okay, nothing wrong with that. But then, in the revised
    myfunction(), we took the 'void *' pointer value and converted
    it to a pointer to a foobar_t (the union type). The union type
    may have (ie, the Standard allows it to have) a more restrictive
    alignment requirement than the struct types. Hence, upon doing
    the conversion of a struct pointer (in the guise of a 'void *',
    but still pointing to one of the structs), we could get a pointer
    that is not correctly aligned for access to the union type. The
    Standard says clearly that if the resulting pointer value is not
    correctly aligned for the target type then the behavior is
    undefined.

    If I had to take a bet at even money on this, I would bet that
    this code would actually work on a platform chosen at random.
    But if what you're looking for is code that stays within the bounds
    of what the Standard requires to work portably, this approach isn't it.
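
    To make the alignment argument concrete, here is a small sketch using
    C11's `_Alignof`. The struct members are invented so that the two structs
    plausibly differ in alignment, and the `uintptr_t` check is only a
    heuristic, since the pointer-to-integer mapping is implementation-defined.

```c
#include <stdint.h>

typedef struct foo_t { char tag[3]; } foo_t;   /* hypothetical: often alignment 1 */
typedef struct bar_t { double value; } bar_t;  /* hypothetical: often alignment 8 */
typedef union foobar_t { bar_t bar; foo_t foo; } foobar_t;

/* The union must be aligned at least as strictly as each of its members. */
static int union_is_at_least_as_strict(void)
{
    return _Alignof(foobar_t) >= _Alignof(foo_t) &&
           _Alignof(foobar_t) >= _Alignof(bar_t);
}

/* Heuristic check: is this address suitably aligned for a foobar_t? */
static int aligned_for_union(const void *p)
{
    return (uintptr_t)p % _Alignof(foobar_t) == 0;
}
```

    On an implementation where `foo_t` has a weaker alignment than the union,
    the address of a plain `foo_t` object may fail `aligned_for_union`, and that
    is exactly the case in which converting it to `foobar_t *` is undefined.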
     
    Tim Rentsch, Feb 24, 2011
    #3
  4. Jef Driesen

    Jef Driesen Guest

    On 24/02/11 22:02, Joel C. Salomon wrote:
    > Jef Driesen<> wrote:
    >> Suppose I have two distinct data structures:

    > <snip>
    >> and a function that receives a pointer to such a structure, together with a type to indicate which structure is being passed:

    > <snip>
    >> Is it portable to replace the separate variables and explicit casts with a union?

    >
    > That is the main use of unions. You might consider this pattern, called a “tagged union”:
    >
    > struct foo {…};
    > struct bar {…};
    >
    > struct foobar {
    > enum {
    > T_FOO,
    > T_BAR,
    > } tag;
    > union {
    > struct foo foo;
    > struct bar bar;
    > } data;
    > };
    >
    > void myfunction(struct foobar *foobar) {
    > switch(foobar->type) {
    > case T_FOO:
    > /* use foobar->data->foo here */
    > break;
    > case T_BAR:
    > /* use foobar->data->bar here */
    > break;
    > default:
    > fprintf(stderr, "bad type\n");
    > abort();
    > }


    This is something I don't want to do, because the foo and bar data types are
    part of a library where I want to be able to add new data types without breaking
    backwards compatibility. But adding new structs to the union may change its size
    and hence break backwards compatibility.

    If the union is not part of the public api and used only internally, that's not
    an issue. The union would just be a convenient way to avoid doing explicit casts.
     
    Jef Driesen, Feb 24, 2011
    #4
  5. Jef Driesen

    Paul N Guest

    On Feb 24, 8:14 pm, Jef Driesen <>
    wrote:
    > Hi,
    >
    > Suppose I have two distinct data structures:
    >
    > typedef struct foo_t {
    >     ...
    >
    > } foo_t;
    >
    > typedef struct bar_t {
    >     ...
    >
    > } bar_t;
    >
    > and a function that receives a pointer to such a structure, together with a type
    > to indicate which structure is being passed:
    >
    > typedef enum data_type_t {
    >     DATA_TYPE_FOO,
    >     DATA_TYPE_BAR
    >
    > } data_type_t;
    >
    > void myfunction (data_type_t type, void *data)
    > {
    >     foo_t *foo = data;
    >     foo_t *bar = data;
    >
    >     switch (type) {
    >     case DATA_TYPE_FOO:
    >        /* Use foo here */
    >        break:
    >     case DATA_TYPE_BAR:
    >        /* Use bar here */
    >        break:
    >     default:
    >        return;
    >     }
    >
    > }
    >
    > A typical usage would be like this:
    >
    > int main(void)
    > {
    >     foo_t foo;
    >     bar_t bar;
    >
    >     myfunction (DATA_TYPE_FOO, &foo);
    >     myfunction (DATA_TYPE_BAR, &bar);
    >
    >     return 0;
    >
    > }
    >
    > Is it portable to replace the separate variables and explicit casts with a union?
    >
    > typedef union foobar_t {
    >     bar_t bar;
    >     foo_t foo;
    >
    > } foobar_t;
    >
    > void myfunction (data_type_t type, void *data)
    > {
    >     foobar_t *foobar = data;
    >
    >     switch (type) {
    >     case DATA_TYPE_FOO:
    >        /* Use foobar->foo here */
    >        break:
    >     case DATA_TYPE_BAR:
    >        /* Use foobar->bar here */
    >        break:
    >     default:
    >        return;
    >     }
    >
    > }
    >
    > I think this is a portable construct, but I'm not 100% sure. Note that it's not
    > my intent to try to interpret a foo_t as a bar_t. The main purpose of the union
    > is to improve the readability of the code (my real code has many more foo and
    > bar structs).


    As an alternative suggestion, why not have a union consisting of a
    foo_t * and a bar_t * ?
     
    Paul N, Feb 24, 2011
    #5
  6. Jonathan Leffler <> writes:

    > On 2/24/11 12:14 PM, Jef Driesen wrote:
    >> Suppose I have two distinct data structures:
    >>
    >> typedef struct foo_t {
    >> ...
    >> } foo_t;
    >>
    >> typedef struct bar_t {
    >> ...
    >> } bar_t;
    >>
    >> and a function that receives a pointer to such a structure, together
    >> with a type to indicate which structure is being passed:
    >>
    >> typedef enum data_type_t {
    >> DATA_TYPE_FOO,
    >> DATA_TYPE_BAR
    >> } data_type_t;
    >>
    >> void myfunction (data_type_t type, void *data)
    >> {
    >> foo_t *foo = data;
    >> foo_t *bar = data;
    >>
    >> switch (type) {
    >> case DATA_TYPE_FOO:
    >> /* Use foo here */
    >> break:
    >> case DATA_TYPE_BAR:
    >> /* Use bar here */
    >> break:
    >> default:
    >> return;
    >> }
    >> }
    >>
    >> A typical usage would be like this:
    >>
    >> int main(void)
    >> {
    >> foo_t foo;
    >> bar_t bar;
    >>
    >> myfunction (DATA_TYPE_FOO, &foo);
    >> myfunction (DATA_TYPE_BAR, &bar);
    >>
    >> return 0;
    >> }
    >>
    >> Is it portable to replace the separate variables and explicit casts with
    >> a union?


    Your code probably does have explicit casts but they've gone from the
    example you posted.

    >>
    >> typedef union foobar_t {
    >> bar_t bar;
    >> foo_t foo;
    >> } foobar_t;
    >>
    >> void myfunction (data_type_t type, void *data)
    >> {
    >> foobar_t *foobar = data;
    >>
    >> switch (type) {
    >> case DATA_TYPE_FOO:
    >> /* Use foobar->foo here */
    >> break:
    >> case DATA_TYPE_BAR:
    >> /* Use foobar->bar here */
    >> break:
    >> default:
    >> return;
    >> }
    >> }
    >>
    >> I think this is a portable construct, but I'm not 100% sure. Note that
    >> it's not my intent to try to interpret a foo_t as a bar_t. The main
    >> purpose of the union is to improve the readability of the code (my real
    >> code has many more foo and bar structs).

    >
    > I believe it would be portable. You could reasonably change the second
    > parameter of myfunction() to a 'foobar_t *', of course.


    But that would require a whole lot more casts. The program has no data
    that is actually of the union type (it's a figment designed to simplify
    the code) so the conversion from the struct pointer to the union pointer
    will require a cast (though it may be simply a cast to void *).

    To the OP: Have you considered function pointers? Your myfunction
    function would reduce to

    dispatch[type](data);

    and each of functions in the dispatch table would look like this:

    void myfunction_foo(void *data)
    {
    foo_t *foo = data;
    /* whatever the switch case did */
    }

    It means writing a function per case, but there is not that much more
    noise in the functions than there is in the switch statement.
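
    A rough sketch of how such a dispatch table might look. The struct
    members, the `DATA_TYPE_COUNT` sentinel, the second handler and the int
    return value are all assumptions added to make the example complete.

```c
#include <stddef.h>

typedef struct foo_t { int value; } foo_t;      /* hypothetical member */
typedef struct bar_t { double value; } bar_t;   /* hypothetical member */

typedef enum data_type_t {
    DATA_TYPE_FOO,
    DATA_TYPE_BAR,
    DATA_TYPE_COUNT     /* assumed sentinel: number of known types */
} data_type_t;

static void myfunction_foo(void *data)
{
    foo_t *foo = data;  /* converted only to the type it really is */
    foo->value += 1;
}

static void myfunction_bar(void *data)
{
    bar_t *bar = data;
    bar->value *= 2.0;
}

/* One handler per type, indexed by the enum value. */
static void (*const dispatch[DATA_TYPE_COUNT])(void *) = {
    [DATA_TYPE_FOO] = myfunction_foo,
    [DATA_TYPE_BAR] = myfunction_bar,
};

/* Returns 0 on success, -1 if the type has no handler. */
static int myfunction(data_type_t type, void *data)
{
    if ((unsigned)type >= DATA_TYPE_COUNT || dispatch[type] == NULL)
        return -1;
    dispatch[type](data);
    return 0;
}
```

    Each handler converts the `void *` only to the struct type it was written
    for, so no pointer to the "wrong" struct type is ever formed.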

    --
    Ben.
     
    Ben Bacarisse, Feb 24, 2011
    #6
  7. Jef Driesen

    Jef Driesen Guest

    On 24/02/11 22:02, Tim Rentsch wrote:
    > Jef Driesen<> writes:
    >
    >> Hi,
    >>
    >> Suppose I have two distinct data structures:
    >>
    >> typedef struct foo_t {
    >> ...
    >> } foo_t;
    >>
    >> typedef struct bar_t {
    >> ...
    >> } bar_t;
    >>
    >> and a function that receives a pointer to such a structure, together
    >> with a type to indicate which structure is being passed:
    >>
    >> typedef enum data_type_t {
    >> DATA_TYPE_FOO,
    >> DATA_TYPE_BAR
    >> } data_type_t;
    >>
    >> void myfunction (data_type_t type, void *data)
    >> {
    >> foo_t *foo = data;
    >> foo_t *bar = data;
    >>
    >> switch (type) {
    >> case DATA_TYPE_FOO:
    >> /* Use foo here */
    >> break:
    >> case DATA_TYPE_BAR:
    >> /* Use bar here */
    >> break:
    >> default:
    >> return;
    >> }
    >> }
    >>
    >> A typical usage would be like this:
    >>
    >> int main(void)
    >> {
    >> foo_t foo;
    >> bar_t bar;
    >>
    >> myfunction (DATA_TYPE_FOO,&foo);
    >> myfunction (DATA_TYPE_BAR,&bar);
    >>
    >> return 0;
    >> }
    >>
    >> Is it portable to replace the separate variables and explicit casts with a union?
    >>
    >> typedef union foobar_t {
    >> bar_t bar;
    >> foo_t foo;
    >> } foobar_t;
    >>
    >> void myfunction (data_type_t type, void *data)
    >> {
    >> foobar_t *foobar = data;
    >>
    >> switch (type) {
    >> case DATA_TYPE_FOO:
    >> /* Use foobar->foo here */
    >> break:
    >> case DATA_TYPE_BAR:
    >> /* Use foobar->bar here */
    >> break:
    >> default:
    >> return;
    >> }
    >> }
    >>
    >> I think this is a portable construct, but I'm not 100% sure. Note that
    >> it's not my intent to try to interpret a foo_t as a bar_t. The main
    >> purpose of the union is to improve the readability of the code (my
    >> real code has many more foo and bar structs).

    >
    > If called from your example main() function above, technically
    > this last function crosses over into undefined behavior. In
    > fact the undefined behavior happens even before getting to
    > the switch() statement.
    >
    > To see why this is true, remember what we did: we took a
    > pointer to a foo_t or bar_t, and converted that to a 'void *'.
    > Okay, nothing wrong with that. But then, in the revised
    > myfunction(), we took the 'void *' pointer value and converted
    > it to a pointer to a foobar_t (the union type). The union type
    > may have (ie, the Standard allows it to have) a more restrictive
    > alignment requirement than the struct types. Hence, upon doing
    > the conversion of a struct pointer (in the guise of a 'void *',
    > but still pointing to one of the structs), we could get a pointer
    > that is not correctly aligned for access to the union type. The
    > Standard says clearly that if the resulting pointer value is not
    > correctly aligned for the target type then the behavior is
    > undefined.


    If this is indeed a potential problem, then why doesn't the same logic apply to
    my first example too? Here I did cast the data pointer to all possible struct
    types, while only one of them will be the correct one:

    foo_t *foo = data;
    bar_t *bar = data;

    Assuming the real type was of type foo_t, the bar variable may now point to a
    struct which may have different alignment requirements. Or am I seeing this wrong?

    I suppose the correct way would be to cast only to the correct type:

    void myfunction (data_type_t type, void *data)
    {
        foo_t *foo = NULL;
        bar_t *bar = NULL;

        switch (type) {
        case DATA_TYPE_FOO:
            foo = data;
            /* Use foo here */
            break;
        case DATA_TYPE_BAR:
            bar = data;
            /* Use bar here */
            break;
        default:
            return;
        }
    }

    or get rid of the foo and bar variables and cast the data pointer everywhere
    where it is accessed:

    ((foo_t *) data)->member

    But this is ugly and error-prone, especially when you have to do this cast often.

    > If I had to take a bet at even money on this, I would bet that
    > this code would actually work on a platform chosen at random.
    > But, if what you're looking for is code that is within the bounds
    > of what the Standard requires to work portably, this approach isn't it.


    I prefer portable code, but I don't want to take it to the extreme either.
     
    Jef Driesen, Feb 25, 2011
    #7
  8. Jef Driesen <> writes:
    > On 24/02/11 22:02, Tim Rentsch wrote:

    <snip>
    >> To see why this is true, remember what we did: we took a
    >> pointer to a foo_t or bar_t, and converted that to a 'void *'.
    >> Okay, nothing wrong with that. But then, in the revised
    >> myfunction(), we took the 'void *' pointer value and converted
    >> it to a pointer to a foobar_t (the union type). The union type
    >> may have (ie, the Standard allows it to have) a more restrictive
    >> alignment requirement than the struct types. Hence, upon doing
    >> the conversion of a struct pointer (in the guise of a 'void *',
    >> but still pointing to one of the structs), we could get a pointer
    >> that is not correctly aligned for access to the union type. The
    >> Standard says clearly that if the resulting pointer value is not
    >> correctly aligned for the target type then the behavior is
    >> undefined.

    >
    > If this is indeed a potential problem, then why doesn't the same logic
    > apply to my first example too?


    The key information is in the text I've left quoted: union types may
    require stricter alignment than pointer types. All pointers to
    structure types have the same alignment requirements as do all pointers
    to union types, but they don't have the same alignment requirements as
    each other.

    I agree with what Tim says (in a part I snipped) that it is a reasonable
    bet that this will work but it is not guaranteed.

    > possible struct types, while only one of them will be the correct one:
    >
    > foo_t *foo = data;
    > bar_t *bar = data;
    >
    > Assuming the real type was of type foo_t, the bar variable may now
    > point to a struct which may have different alignment requirements. Or
    > am I seeing this wrong?
    >
    > I suppose the correct way would be to cast only to the correct type:
    >
    > void myfunction (data_type_t type, void *data)
    > {
    > foo_t *foo = NULL;
    > bar_t *bar = NULL;
    >
    > switch (type) {
    > case DATA_TYPE_FOO:
    > foo = data;
    > /* Use foo here */
    > break:
    > case DATA_TYPE_BAR:
    > foo = data;
    > /* Use bar here */
    > break:
    > default:
    > return;
    > }
    > }
    >
    > or get rid of the foo and bar variables and cast the data pointer
    > everywhere where it is accessed:


    What you originally had was fine because you can convert the void * to any
    structure type without undefined behaviour (they all have the same
    alignment requirements after all) provided that you don't access the
    "wrong" structure, and your original code ensured that that did not
    happen.

    > ((foo_t *) data)->member
    >
    > But this is ugly and error-prone, especially when you have to do this
    > cast often.
    >
    >> If I had to take a bet at even money on this, I would bet that
    >> this code would actually work on a platform chosen at random.
    >> But, if what you're looking for is code that is within the bounds
    >> of what the Standard requires to work portably, this approach isn't it.

    >
    > I prefer portable code, but I don't want to take it to the extreme
    > either.


    That's a tough call. Someone suggested putting the pointers into a
    union instead of the structures. That works but it is not very
    convenient unless you can use C99's compound literals at the point of
    call. Of course, using compound literals has portability implications
    too.
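
    A sketch of what the union-of-pointers variant could look like. The type
    name `foobar_ptr_t`, the struct members and the double return value are
    invented for the example.

```c
typedef struct foo_t { int value; } foo_t;      /* hypothetical member */
typedef struct bar_t { double value; } bar_t;   /* hypothetical member */

typedef enum data_type_t { DATA_TYPE_FOO, DATA_TYPE_BAR } data_type_t;

/* The union holds pointers, not the structs themselves. */
typedef union foobar_ptr_t {
    foo_t *foo;
    bar_t *bar;
} foobar_ptr_t;

/* Reads only the pointer member that the caller stored. */
static double myfunction(data_type_t type, foobar_ptr_t p)
{
    switch (type) {
    case DATA_TYPE_FOO:
        return p.foo->value;
    case DATA_TYPE_BAR:
        return p.bar->value;
    default:
        return -1.0;
    }
}
```

    With C99 compound literals a call can be written inline, for example
    `myfunction(DATA_TYPE_FOO, (foobar_ptr_t){ .foo = &foo })`. Without them,
    the caller has to declare and fill a `foobar_ptr_t` variable first.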

    --
    Ben.
     
    Ben Bacarisse, Feb 25, 2011
    #8
  9. Jef Driesen wrote:
    > I suppose the correct way would be to cast only to the correct type:
    >
    > void myfunction (data_type_t type, void *data)
    > {
    > foo_t *foo = NULL;
    > bar_t *bar = NULL;
    >
    > switch (type) {
    > case DATA_TYPE_FOO:
    > foo = data;
    > /* Use foo here */
    > break:
    > case DATA_TYPE_BAR:
    > foo = data;
    > /* Use bar here */
    > break:
    > default:
    > return;
    > }
    > }


    Why not try this:

    void myfunction (data_type_t type, void *data) {
    switch (type) {
    case DATA_TYPE_FOO:
    foo_t *foo = data;
    /* use foo here */
    break;
    case DATA_TYPE_BAR:
    bar_t *bar = data;
    /* Use bar here */
    break;
    default:
    return;
    }
    }

    i.e., only even defining `foo` & `bar` where they are needed.
    (Well actually, `foo` is in-scope but uninitialized in the `bar` case. The
    compiler might well catch it if you accidentally use it there, or you can add
    blocks, e.g.,

    …
    case DATA_TYPE_BAR: {
    bar_t *bar = data;
    /* Use bar here */
    } break;
    …

    and you can use this technique in a pre-C99 compiler, too.)

    —Joel
     
    Joel C. Salomon, Feb 25, 2011
    #9
  10. Jef Driesen

    Tim Rentsch Guest

    Jef Driesen <> writes:

    > On 24/02/11 22:02, Tim Rentsch wrote:
    >> Jef Driesen<> writes:
    >>
    >>> Hi,
    >>>
    >>> Suppose I have two distinct data structures:
    >>>
    >>> typedef struct foo_t {
    >>> ...
    >>> } foo_t;
    >>>
    >>> typedef struct bar_t {
    >>> ...
    >>> } bar_t;
    >>>
    >>> and a function that receives a pointer to such a structure, together
    >>> with a type to indicate which structure is being passed:
    >>>
    >>> typedef enum data_type_t {
    >>> DATA_TYPE_FOO,
    >>> DATA_TYPE_BAR
    >>> } data_type_t;
    >>>
    >>> void myfunction (data_type_t type, void *data)
    >>> {
    >>> foo_t *foo = data;
    >>> foo_t *bar = data;
    >>>
    >>> switch (type) {
    >>> case DATA_TYPE_FOO:
    >>> /* Use foo here */
    >>> break:
    >>> case DATA_TYPE_BAR:
    >>> /* Use bar here */
    >>> break:
    >>> default:
    >>> return;
    >>> }
    >>> }
    >>>
    >>> A typical usage would be like this:
    >>>
    >>> int main(void)
    >>> {
    >>> foo_t foo;
    >>> bar_t bar;
    >>>
    >>> myfunction (DATA_TYPE_FOO,&foo);
    >>> myfunction (DATA_TYPE_BAR,&bar);
    >>>
    >>> return 0;
    >>> }
    >>>
    >>> Is it portable to replace the separate variables and explicit casts with a union?
    >>>
    >>> typedef union foobar_t {
    >>> bar_t bar;
    >>> foo_t foo;
    >>> } foobar_t;
    >>>
    >>> void myfunction (data_type_t type, void *data)
    >>> {
    >>> foobar_t *foobar = data;
    >>>
    >>> switch (type) {
    >>> case DATA_TYPE_FOO:
    >>> /* Use foobar->foo here */
    >>> break:
    >>> case DATA_TYPE_BAR:
    >>> /* Use foobar->bar here */
    >>> break:
    >>> default:
    >>> return;
    >>> }
    >>> }
    >>>
    >>> I think this is a portable construct, but I'm not 100% sure. Note that
    >>> it's not my intent to try to interpret a foo_t as a bar_t. The main
    >>> purpose of the union is to improve the readability of the code (my
    >>> real code has many more foo and bar structs).

    >>
    >> If called from your example main() function above, technically
    >> this last function crosses over into undefined behavior. In
    >> fact the undefined behavior happens even before getting to
    >> the switch() statement.
    >>
    >> To see why this is true, remember what we did: we took a
    >> pointer to a foo_t or bar_t, and converted that to a 'void *'.
    >> Okay, nothing wrong with that. But then, in the revised
    >> myfunction(), we took the 'void *' pointer value and converted
    >> it to a pointer to a foobar_t (the union type). The union type
    >> may have (ie, the Standard allows it to have) a more restrictive
    >> alignment requirement than the struct types. Hence, upon doing
    >> the conversion of a struct pointer (in the guise of a 'void *',
    >> but still pointing to one of the structs), we could get a pointer
    >> that is not correctly aligned for access to the union type. The
    >> Standard says clearly that if the resulting pointer value is not
    >> correctly aligned for the target type then the behavior is
    >> undefined.

    >
    > If this is indeed a potential problem, then why doesn't the same logic
    > apply to my first example too? Here I did cast the data pointer to all
    > possible struct types, while only one of them will be the correct one:
    >
    > foo_t *foo = data;
    > bar_t *bar = data;
    >
    > Assuming the real type was of type foo_t, the bar variable may now
    > point to a struct which may have different alignment requirements. Or
    > am I seeing this wrong?


    You're right, this usage is also undefined behavior. I didn't
    notice earlier because I was focused on the question about using
    unions.


    > I suppose the correct way would be to cast only to the correct type:
    >
    > void myfunction (data_type_t type, void *data)
    > {
    > foo_t *foo = NULL;
    > bar_t *bar = NULL;
    >
    > switch (type) {
    > case DATA_TYPE_FOO:
    > foo = data;
    > /* Use foo here */
    > break:
    > case DATA_TYPE_BAR:
    > foo = data;
    > /* Use bar here */
    > break:
    > default:
    > return;
    > }
    > }
    >
    > or get rid of the foo and bar variables and cast the data pointer
    > everywhere where it is accessed:
    >
    > ((foo_t *) data)->member
    >
    > But this is ugly and error-prone, especially when you have to do this cast often.
    >
    >> If I had to take a bet at even money on this, I would bet that
    >> this code would actually work on a platform chosen at random.
    >> But, if what you're looking for is code that is within the bounds
    >> of what the Standard requires to work portably, this approach isn't it.

    >
    > I prefer portable code, but I don't want to take it to the extreme either.


    How about this way (please excuse a minor reformatting):

    void
    myfunction( data_type_t type, void *data ){
        switch (type) {

        case DATA_TYPE_FOO: {
            foo_t *foo = data;
            /* Use foo here */
            break;
        }

        case DATA_TYPE_BAR: {
            bar_t *bar = data;
            /* Use bar here */
            break;
        }

        }
    }

    Not too bad aesthetics-wise, and completely portable (assuming of
    course the calls are right).
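
    Filled in with placeholder members, the block-scoped version might look
    like the sketch below. The members `count` and `scale` and the int return
    value are assumptions made only so the example does something observable.

```c
typedef struct foo_t { int count; } foo_t;      /* hypothetical member */
typedef struct bar_t { double scale; } bar_t;   /* hypothetical member */

typedef enum data_type_t { DATA_TYPE_FOO, DATA_TYPE_BAR } data_type_t;

/* Returns 0 if the type was handled, -1 otherwise. */
static int myfunction(data_type_t type, void *data)
{
    switch (type) {

    case DATA_TYPE_FOO: {
        foo_t *foo = data;      /* conversion happens only in this case */
        foo->count++;
        return 0;
    }

    case DATA_TYPE_BAR: {
        bar_t *bar = data;      /* likewise, only when data is a bar_t */
        bar->scale *= 2.0;
        return 0;
    }

    default:
        return -1;
    }
}
```

    Because each pointer is declared inside its own block, it cannot be used
    by accident in the other case.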
     
    Tim Rentsch, Feb 25, 2011
    #10
  11. Jef Driesen

    Tim Rentsch Guest

    Ben Bacarisse <> writes:

    > Jef Driesen <> writes:
    >> On 24/02/11 22:02, Tim Rentsch wrote:

    > <snip>
    >>> To see why this is true, remember what we did: we took a
    >>> pointer to a foo_t or bar_t, and converted that to a 'void *'.
    >>> Okay, nothing wrong with that. But then, in the revised
    >>> myfunction(), we took the 'void *' pointer value and converted
    >>> it to a pointer to a foobar_t (the union type). The union type
    >>> may have (ie, the Standard allows it to have) a more restrictive
    >>> alignment requirement than the struct types. Hence, upon doing
    >>> the conversion of a struct pointer (in the guise of a 'void *',
    >>> but still pointing to one of the structs), we could get a pointer
    >>> that is not correctly aligned for access to the union type. The
    >>> Standard says clearly that if the resulting pointer value is not
    >>> correctly aligned for the target type then the behavior is
    >>> undefined.

    >>
    >> If this is indeed a potential problem, then why doesn't the same logic
    >> apply to my first example too?

    >
    > The key information is in the text I've left quoted: union types may
    > require stricter alignment than pointer types. All pointers to
    > structure types have the same alignment requirements as do all pointers
    > to union types, but they don't have the same alignment requirements as
    > each other.
    >
    > I agree with what Tim says (in a part I snipped) that it is a reasonable
    > bet that this will work but it is not guaranteed.
    >
    >> possible struct types, while only one of them will be the correct one:
    >>
    >> foo_t *foo = data;
    >> bar_t *bar = data;
    >>
    >> Assuming the real type was of type foo_t, the bar variable may now
    >> point to a struct which may have different alignment requirements. Or
    >> am I seeing this wrong?
    >>
    >> I suppose the correct way would be to cast only to the correct type:
    >>
    >> void myfunction (data_type_t type, void *data)
    >> {
    >> foo_t *foo = NULL;
    >> bar_t *bar = NULL;
    >>
    >> switch (type) {
    >> case DATA_TYPE_FOO:
    >> foo = data;
    >> /* Use foo here */
    >> break:
    >> case DATA_TYPE_BAR:
    >> foo = data;
    >> /* Use bar here */
    >> break:
    >> default:
    >> return;
    >> }
    >> }
    >>
    >> or get rid of the foo and bar variables and cast the data pointer
    >> everywhere where it is accessed:

    >
    > What you originally had was fine because you can convert the void * to any
    > structure type without undefined behaviour (they all have the same
    > alignment requirements after all) provided that you don't access the
    > "wrong" structure, and your original code ensured that that did not
    > happen.


    This isn't right. The two struct pointer _variables_ have the same
    alignment requirements, but the pointer _values_ are pointing to
    struct types that may have different alignment requirements. The
    relevant requirement statement (from 6.3.2.3p7) is

    If the resulting pointer is not correctly aligned for the
    pointed-to type, the behavior is undefined.

    The 'pointed-to' type is a structure type, and the converted
    pointer values might not be correctly aligned for a struct
    type other than that of the actual argument.
     
    Tim Rentsch, Feb 25, 2011
    #11
  12. Tim Rentsch <> writes:

    > Ben Bacarisse <> writes:

    <snip>
    >> What you originally had was fine because you can convert the void * to any
    >> structure type without undefined behaviour (they all have the same
    >> alignment requirements after all) provided that you don't access the
    >> "wrong" structure, and your original code ensured that that did not
    >> happen.

    >
    > This isn't right. The two struct pointer _variables_ have the same
    > alignment requirements, but the pointer _values_ are pointing to
    > struct types that may have different alignment requirements. The
    > relevant requirement statement (from 6.3.2.3p7) is
    >
    > If the resulting pointer is not correctly aligned for the
    > pointed-to type, the behavior is undefined.
    >
    > The 'pointed-to' type is a structure type, and the converted
    > pointer values might not be correctly aligned for a struct
    > type other than that of the actual argument.


    Yes, you are right. I'd read this:

    "[a]ll pointers to structure types shall have the same representation
    and alignment requirements as each other" (6.2.5 p27)

    as referring to the alignment of the pointed-to object rather than the
    pointer object itself. The wording is slightly ambiguous because
    whether a pointer is aligned or not *does* refer to the pointed-to type.
    Thus a pointer may have its alignment requirements met (as per 6.2.5
    p27) and yet not be correctly aligned (as per 6.3.2.3 p7)!
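
    The two notions of alignment can be compared directly. This is
    only a sketch, assuming a C11 compiler (alignof from <stdalign.h>
    was not available in C99) and hypothetical struct members:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical member layouts, chosen so the two structs are
 * likely to have different alignment requirements. */
typedef struct foo_t { char tag; } foo_t;
typedef struct bar_t { double value; } bar_t;

/* Alignment of the pointer objects themselves: guaranteed equal
 * for all pointers to structure types. */
static_assert(alignof(foo_t *) == alignof(bar_t *),
              "pointers to structs share alignment requirements");

/* Alignment of the pointed-to types: these may differ, which is
 * what 6.3.2.3p7 is concerned with. */
size_t foo_alignment(void) { return alignof(foo_t); }
size_t bar_alignment(void) { return alignof(bar_t); }
```

    On many implementations foo_alignment() is smaller than
    bar_alignment(), so the address of a foo_t need not be a valid
    address for a bar_t, even though a foo_t * and a bar_t * are
    stored identically.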

    --
    Ben.
     
    Ben Bacarisse, Feb 25, 2011
    #12
  13. Jef Driesen

    Jef Driesen Guest

    On 24/02/11 21:14, Jef Driesen wrote:
    > Hi,
    >
    > Suppose I have two distinct data structures:
    >
    > typedef struct foo_t {
    > ...
    > } foo_t;
    >
    > typedef struct bar_t {
    > ...
    > } bar_t;
    >
    > and a function that receives a pointer to such a structure, together with a type
    > to indicate which structure is being passed:
    >
    > typedef enum data_type_t {
    > DATA_TYPE_FOO,
    > DATA_TYPE_BAR
    > } data_type_t;
    >
    > void myfunction (data_type_t type, void *data)
    > {
    > foo_t *foo = data;
    > bar_t *bar = data;
    >
    > switch (type) {
    > case DATA_TYPE_FOO:
    > /* Use foo here */
    > break;
    > case DATA_TYPE_BAR:
    > /* Use bar here */
    > break;
    > default:
    > return;
    > }
    > }
    >
    > A typical usage would be like this:
    >
    > int main(void)
    > {
    > foo_t foo;
    > bar_t bar;
    >
    > myfunction (DATA_TYPE_FOO,&foo);
    > myfunction (DATA_TYPE_BAR,&bar);
    >
    > return 0;
    > }
    >
    > Is it portable to replace the separate variables and explicit casts with a union?
    >
    > typedef union foobar_t {
    > bar_t bar;
    > foo_t foo;
    > } foobar_t;
    >
    > [...]


    How about the reverse: casting a pointer to union to a pointer to one of the
    structs. Like in this code snippet:

    int main(void)
    {
    foobar_t foobar;

    foobar.foo = ...;
    myfunction (DATA_TYPE_FOO, &foobar);

    foobar.bar = ...;
    myfunction (DATA_TYPE_BAR, &foobar);

    return 0;
    }

    I think this is a portable construct, although I'm not 100% sure.
     
    Jef Driesen, Apr 7, 2011
    #13
  14. Jef Driesen <> writes:

    > On 24/02/11 21:14, Jef Driesen wrote:
    >> Hi,
    >>
    >> Suppose I have two distinct data structures:
    >>
    >> typedef struct foo_t {
    >> ...
    >> } foo_t;
    >>
    >> typedef struct bar_t {
    >> ...
    >> } bar_t;
    >>
    >> and a function that receives a pointer to such a structure, together with a type
    >> to indicate which structure is being passed:
    >>
    >> typedef enum data_type_t {
    >> DATA_TYPE_FOO,
    >> DATA_TYPE_BAR
    >> } data_type_t;
    >>
    >> void myfunction (data_type_t type, void *data)
    >> {
    >> foo_t *foo = data;
    >> bar_t *bar = data;
    >>
    >> switch (type) {
    >> case DATA_TYPE_FOO:
    >> /* Use foo here */
    >> break;
    >> case DATA_TYPE_BAR:
    >> /* Use bar here */
    >> break;
    >> default:
    >> return;
    >> }
    >> }
    >>
    >> A typical usage would be like this:
    >>
    >> int main(void)
    >> {
    >> foo_t foo;
    >> bar_t bar;
    >>
    >> myfunction (DATA_TYPE_FOO,&foo);
    >> myfunction (DATA_TYPE_BAR,&bar);
    >>
    >> return 0;
    >> }
    >>
    >> Is it portable to replace the separate variables and explicit casts with a union?
    >>
    >> typedef union foobar_t {
    >> bar_t bar;
    >> foo_t foo;
    >> } foobar_t;
    >>
    >> [...]

    >
    > How about the reverse: casting a pointer to union to a pointer to one
    > of the structs. Like in this code snippet:
    >
    > int main(void)
    > {
    > foobar_t foobar;
    >
    > foobar.foo = ...;
    > myfunction (DATA_TYPE_FOO, &foobar);
    >
    > foobar.bar = ...;
    > myfunction (DATA_TYPE_BAR, &foobar);
    >
    > return 0;
    > }
    >
    > I think this is a portable construct, although I'm not 100% sure.


    Yes, that's fine, but I'd re-word your description of it. Casts are used
    to perform conversions, but a conversion without a cast is just a
    conversion. Your code has no cast: &foobar is converted to void *
    implicitly.

    I'd be tempted to write

    myfunction (DATA_TYPE_FOO, &foobar.foo);

    just because it is so much more explicit, but I don't think it makes
    much difference.
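
    A short sketch of both call styles, assuming hypothetical struct
    members and a hypothetical read_value function standing in for
    myfunction (it returns the stored number so the result can be
    checked):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical member layouts; the thread elides the real ones. */
typedef struct foo_t { int id; } foo_t;
typedef struct bar_t { double value; } bar_t;

typedef union foobar_t {
    bar_t bar;
    foo_t foo;
} foobar_t;

typedef enum data_type_t {
    DATA_TYPE_FOO,
    DATA_TYPE_BAR
} data_type_t;

/* Hypothetical stand-in for myfunction: reads the member selected
 * by type and returns it as a double, or -1.0 for an unknown type. */
double read_value(data_type_t type, void *data)
{
    assert(data != NULL);

    switch (type) {
    case DATA_TYPE_FOO: {
        foo_t *foo = data;
        return foo->id;
    }
    case DATA_TYPE_BAR: {
        bar_t *bar = data;
        return bar->value;
    }
    default:
        return -1.0;
    }
}
```

    After foobar.foo.id = 7, both read_value (DATA_TYPE_FOO, &foobar)
    and read_value (DATA_TYPE_FOO, &foobar.foo) read the same object,
    since a pointer to a union, suitably converted, points to each of
    its members. The second form simply states which member is meant.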

    --
    Ben.
     
    Ben Bacarisse, Apr 8, 2011
    #14