Address of a union member

Discussion in 'C Programming' started by Noob, Aug 13, 2009.

  1. Noob

    Noob Guest

    Hello,

    My compiler complains when I take the address of a member in a union.

    $ cat mu.c
    union foo
    {
    int i;
    double d;
    };

    int main(void)
    {
    union foo bar = { 0 };
    int *p = &(bar.i);
    return *p;
    }

    $ cc mu.c
    w "mu.c",L10/C12(#241): Address of a union member is being used as a
    | pointer. This may violate an assumption made by the optimizer.
    | To be safe, you should recompile your program at a lower
    | optimization level; or else, turn off the BEHAVED toggle.
    No errors 1 warning

    I don't see what the problem is, and gcc did not seem to mind.

    $ gcc -O2 -std=c89 -Wall -Wextra mu.c
    /* NO OUTPUT */

    Is this an aliasing problem?

    What am I doing wrong?

    Regards.
     
    Noob, Aug 13, 2009
    #1

  2. Noob

    Noob Guest

    Eric Sosman wrote:
    > Noob wrote:
    >> Hello,
    >>
    >> My compiler complains when I take the address of a member in a union.
    >>
    >> $ cat mu.c
    >> union foo
    >> {
    >> int i;
    >> double d;
    >> };
    >>
    >> int main(void)
    >> {
    >> union foo bar = { 0 };
    >> int *p = &(bar.i);
    >> return *p;
    >> }
    >>
    >> $ cc mu.c
    >> w "mu.c",L10/C12(#241): Address of a union member is being used as a
    >> | pointer. This may violate an assumption made by the optimizer.
    >> | To be safe, you should recompile your program at a lower
    >> | optimization level; or else, turn off the BEHAVED toggle.
    >> No errors 1 warning
    >>
    >> I don't see what the problem is, and gcc did not seem to mind.
    >>
    >> $ gcc -O2 -std=c89 -Wall -Wextra mu.c
    >> /* NO OUTPUT */
    >>
    >> Is this an aliasing problem?
    >>
    >> What am I doing wrong?

    >
    > Nothing that I can see. My guess is that the compiler wants
    > to assume that an int* and a double* point to different objects
    > since they point to different types. That is, if you pass both
    > &bar.i and &bar.d as arguments to
    >
    > void silly(int *ip, double *dp) {
    > *ip = 42;
    > *dp = 42.0;
    > *ip = 42;
    > }
    >
    > ... the compiler might assume that the two differently-typed
    > parameters point to two differently-typed and distinct objects,
    > so the third assignment could be omitted.
    >
    > ... but if I were you, I'd read my compiler's documentation
    > starting with what it says about "the BEHAVED toggle."


    Good advice :)

    """
    When it assumes that code is well-behaved, the compiler can be less conservative
    in generating code for pointer-based objects. Well-behaved code follows these rules:

    o The address of a union member is never assigned to a pointer.
    o A value of a pointer type is never cast to an incompatible pointer type.

    Given these assumptions, the compiler might be able to generate substantially
    better code in referencing pointer-based variables. The compiler issues an
    appropriate warning if either of these assumptions is violated in such a way as
    to affect assumptions made by the optimizer. You must decide whether the
    warnings can be safely ignored or whether the program should be compiled at a
    lower optimization level.

    CAUTION: The compiler might not catch all instances of misbehaved code.
    For example, a pointer-to-char might be passed to an undeclared
    (unprototyped) external function expecting a pointer-to-int.
    Therefore, it is possible for a program to compile at optimization
    level 6 without warnings (and run incorrectly), but run correctly
    when compiled at a lower optimization level.
    """

    I'm not sure what they mean by "You must decide whether the warnings can be
    safely ignored". How do I tell whether it is safe? :)
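For the second rule in the documentation quoted above (no casts to an incompatible pointer type), one common way to stay "well-behaved" is to copy bytes with memcpy instead of casting the pointer. This is only a sketch: read_int and write_int are hypothetical helper names, not something from this compiler's documentation.

```c
#include <string.h>

/* Hypothetical helper: store an int into a byte buffer without
   casting the buffer pointer to int*. */
void write_int(unsigned char *buf, int value)
{
    memcpy(buf, &value, sizeof value);
}

/* Hypothetical helper: read the int back, again without the cast
   that *(const int *)buf would need. */
int read_int(const unsigned char *buf)
{
    int value;

    memcpy(&value, buf, sizeof value);
    return value;
}
```

The buffer must hold at least sizeof(int) bytes. Since no int* ever points into the buffer, the optimizer has no differently-typed pointer to reason about.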

    Regards.
     
    Noob, Aug 13, 2009
    #2

  3. Noob <root@127.0.0.1> writes:

    > My compiler complains when I take the address of a member in a union.
    >
    > $ cat mu.c
    > union foo
    > {
    > int i;
    > double d;
    > };
    >
    > int main(void)
    > {
    > union foo bar = { 0 };
    > int *p = &(bar.i);
    > return *p;
    > }
    >
    > $ cc mu.c
    > w "mu.c",L10/C12(#241): Address of a union member is being used as a
    > | pointer. This may violate an assumption made by the optimizer.
    > | To be safe, you should recompile your program at a lower
    > | optimization level; or else, turn off the BEHAVED toggle.
    > No errors 1 warning
    >
    > I don't see what the problem is, and gcc did not seem to mind.
    >
    > $ gcc -O2 -std=c89 -Wall -Wextra mu.c
    > /* NO OUTPUT */
    >
    > Is this an aliasing problem?


    I think so, yes. In section 6.5 you will find:

    7 An object shall have its stored value accessed only by an lvalue
    expression that has one of the following types[76]

    — a type compatible with the effective type of the object,

    — a qualified version of a type compatible with the effective type
    of the object,

    — a type that is the signed or unsigned type corresponding to the
    effective type of the object,

    — a type that is the signed or unsigned type corresponding to a
    qualified version of the effective type of the object,

    — an aggregate or union type that includes one of the aforementioned
    types among its members (including, recursively, a member of a
    subaggregate or contained union), or

    — a character type.

    Footnote 76 is: "The intent of this list is to specify those
    circumstances in which an object may or may not be aliased."

    The compiler is warning you that it can assume that the union (in
    particular the other member) will not change as a result of changes
    through the pointer you have just obtained.

    There is no problem in taking the address, but the compiler can
    assume that *p does not change when bar.d changes (and vice versa).
    This is often called the "strict aliasing rule".
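The last item in the 6.5p7 list (a character type) is the usual escape hatch: any object may be inspected through an unsigned char pointer. A minimal sketch, where count_nonzero_bytes is a hypothetical name chosen for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: walk the bytes of any object through an
   unsigned char pointer, which 6.5p7 allows for every object type,
   and count how many of them are non-zero. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *bytes = obj;
    size_t count = 0;
    size_t i;

    for (i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            count++;
        }
    }
    return count;
}
```

It could be called as count_nonzero_bytes(&bar, sizeof bar) on the union from the original post, whichever member was stored last.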

    --
    Ben.
     
    Ben Bacarisse, Aug 13, 2009
    #3
  4. Noob

    Guest

    On Thu, 13 Aug 2009 18:30:29 +0200 Noob <root@127.0.0.1> wrote:

    | I'm not sure what they mean by "You must decide whether the warnings can be
    | safely ignored". How do I tell whether it is safe? :)

    You will have to understand what the optimization does ... on each platform
    ... and decide if that optimization will conflict with the behaviour being
    coded. For example, in the silly() function shown earlier, is it OK for the
    3rd assignment to be skipped?

    --
    -----------------------------------------------------------------------------
    | Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
    | (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
    -----------------------------------------------------------------------------
     
    , Aug 13, 2009
    #4
  5. Noob <root@127.0.0.1> writes:
    > My compiler complains when I take the address of a member in a union.
    >
    > $ cat mu.c
    > union foo
    > {
    > int i;
    > double d;
    > };
    >
    > int main(void)
    > {
    > union foo bar = { 0 };
    > int *p = &(bar.i);
    > return *p;
    > }
    >
    > $ cc mu.c
    > w "mu.c",L10/C12(#241): Address of a union member is being used as a
    > | pointer. This may violate an assumption made by the optimizer.
    > | To be safe, you should recompile your program at a lower
    > | optimization level; or else, turn off the BEHAVED toggle.
    > No errors 1 warning
    >
    > I don't see what the problem is, and gcc did not seem to mind.

    [...]

    Compilers are allowed to warn about anything they like.

    In this case, the compiler appears to be warning about a *potential*
    problem, not one that actually occurs in the code you posted.

    Consider, for example:

    union foo bar;
    int *p = &bar.i;
    *p = 10;
    bar.d = 12.34;
    printf("*p = %d\n", *p);

    The optimizer might assume that, since you just stored the value 10 in
    *p, the value 10 must still be there, and optimize the printf to
    something like puts("p = 10").

    (The standard has more to say about whether this behavior is defined
    or undefined and whether an optimizer is allowed to make this
    assumption; I don't have my copy of the standard handy right now and
    I'm too lazy to look it up.)
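A variation on the fragment above that sidesteps the question: after the store to bar.d, read the member that was stored last, through the union itself, rather than going back through p. The wrapper name store_then_read is hypothetical and this is only a sketch.

```c
union foo {
    int i;
    double d;
};

/* Hypothetical wrapper around the fragment above. */
double store_then_read(union foo *bar)
{
    int *p = &bar->i;

    *p = 10;         /* store through the member pointer */
    bar->d = 12.34;  /* overwrites the storage shared with bar->i */

    /* *p is no longer a reliable way to see 10 here, so read the
       member that was stored last, through the union itself. */
    return bar->d;
}
```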

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Aug 13, 2009
    #5
  6. Eric Sosman <> wrote:
    > Noob wrote:
    > > Hello,
    > > My compiler complains when I take the address of a member
    > > in a union.

    <snip sample>
    > > Is this an aliasing problem?

    >
    > > What am I doing wrong?

    >
    >      Nothing that I can see.  My guess is that the compiler
    > wants to assume that an int* and a double* point to different
    > objects since they point to different types.  That is, if you
    > pass both &bar.i and &bar.d as arguments to
    >
    >         void silly(int *ip, double *dp) {
    >             *ip = 42;
    >             *dp = 42.0;
    >             *ip = 42;
    >         }
    >
    > ... the compiler might assume that the two differently-typed
    > parameters point to two differently-typed and distinct objects,
    > so the third assignment could be omitted.


    The first perhaps, but not the third. Strictly conforming code
    could detect the difference...

    union { int i; double d; } u;
    silly(&u.i, &u.d);
    printf("%d\n", u.i);
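For anyone who wants to try this on their own compiler, here is the snippet combined with silly() into something compilable. The union tag u_t and the wrapper call_silly are hypothetical names added for illustration. Whether an optimizer may drop any of the stores is exactly what is being debated here, so take it as a test case and not as a ruling.

```c
union u_t {
    int i;
    double d;
};

void silly(int *ip, double *dp)
{
    *ip = 42;
    *dp = 42.0;
    *ip = 42;   /* the store under discussion */
}

/* Hypothetical wrapper: pass both member addresses, then read u.i. */
int call_silly(void)
{
    union u_t u;

    silly(&u.i, &u.d);
    return u.i;
}
```

If the third store is carried out, the last value stored in u is an int, and call_silly() reads it back as an int.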

    --
    Peter
     
    Peter Nilsson, Aug 13, 2009
    #6
  7. Peter Nilsson <> writes:

    > Eric Sosman <> wrote:
    >> Noob wrote:
    >> > Hello,
    >> > My compiler complains when I take the address of a member
    >> > in a union.

    > <snip sample>
    >> > Is this an aliasing problem?

    >>
    >> > What am I doing wrong?

    >>
    >>      Nothing that I can see.  My guess is that the compiler
    >> wants to assume that an int* and a double* point to different
    >> objects since they point to different types.  That is, if you
    >> pass both &bar.i and &bar.d as arguments to
    >>
    >>         void silly(int *ip, double *dp) {
    >>             *ip = 42;
    >>             *dp = 42.0;
    >>             *ip = 42;
    >>         }
    >>
    >> ... the compiler might assume that the two differently-typed
    >> parameters point to two differently-typed and distinct objects,
    >> so the third assignment could be omitted.

    >
    > The first perhaps, but not the third. Strictly conforming code
    > could detect the difference...
    >
    > union { int i; double d; } u;
    > silly(&u.i, &u.d);
    > printf("%d\n", u.i);


    I disagree but given your history of being correct, I currently
    suspect that I have missed something here. Does not your snippet of
    code violate the "shall" from section 6.5 paragraph 7?

    --
    Ben.
     
    Ben Bacarisse, Aug 14, 2009
    #7
  8. Noob

    Nobody Guest

    On Thu, 13 Aug 2009 18:30:29 +0200, Noob wrote:

    > I'm not sure what they mean by "You must decide whether the warnings can be
    > safely ignored". How do I tell whether it is safe? :)


    Examine the resulting assembler output (or disassembly) to see if it does
    what you want.

    If you don't understand assembler, either reduce the optimisation level or
    avoid constructs which the compiler complains about.

    In this case, it looks like the compiler is being overly conservative
    about checking for potential aliasing bugs. If you are actually
    referencing both union members for the same object, that may well be a bug.
     
    Nobody, Aug 14, 2009
    #8
  9. Noob

    Richard Bos Guest

    Keith Thompson <> wrote:

    > Consider, for example:
    >
    > union foo bar;
    > int *p = &bar.i;
    > *p = 10;
    > bar.d = 12.34;
    > printf("*p = %d\n", *p);
    >
    > The optimizer might assume that, since you just stored the value 10 in
    > *p, the value 10 must still be there, and optimize the printf to
    > something like puts("p = 10").
    >
    > (The standard has more to say about whether this behavior is defined
    > or undefined and whether an optimizer is allowed to make this
    > assumption; I don't have my copy of the standard handy right now and
    > I'm too lazy to look it up.)


    You're reading a member of a union which is not the last member that has
    been assigned to. You're reading it indirectly, but you're still reading
    it. This means that its bytes have unspecified values, and therefore
    that its value may be a trap value[1]; hence, in theory undefined
    behaviour, but most likely to result in nonsense values. And AFAICT it's
    _allowed_ to result in the same nonsense value no matter what you store
    in bar.d, or even in different nonsense values even if you store the
    same value in bar.d more than once.

    Richard

    [1] Of the member not last assigned to, _not_ of the union as a, dare I
    say it, thing-in-itself.
     
    Richard Bos, Aug 16, 2009
    #9
  10. On Aug 14, 9:36 am, Ben Bacarisse <> wrote:
    > Peter Nilsson <> writes:
    > > Eric Sosman <> wrote:
    > > >      ... My guess is that the compiler
    > > > wants to assume that an int* and a double* point to
    > > > different objects since they point to different types.
    > > > That is, if you pass both &bar.i and &bar.d as arguments
    > > > to
    > > >
    > > >         void silly(int *ip, double *dp) {
    > > >             *ip = 42;
    > > >             *dp = 42.0;
    > > >             *ip = 42;
    > > >         }
    > > >
    > > > ... the compiler might assume that the two differently-
    > > > typed parameters point to two differently-typed and
    > > > distinct objects, so the third assignment could be
    > > > omitted.

    > >
    > > The first perhaps, but not the third. Strictly conforming
    > > code could detect the difference...
    > >
    > >   union { int i; double d; } u;
    > >   silly(&u.i, &u.d);
    > >   printf("%d\n", u.i);

    >
    > I disagree but given your history of being correct,


    It may have happened. I think it was a Tuesday. ;)

    > I currently suspect that I have missed something here.  Does
    > not your snippet of code violate the "shall" from section 6.5
    > paragraph 7?


    I don't see how. The last u.i accesses an object that was last
    assigned via an int lvalue. That assignment imposed the
    effective type. [6.5p6]

    --
    Peter
     
    Peter Nilsson, Aug 17, 2009
    #10
  11. Peter Nilsson <> writes:

    > On Aug 14, 9:36 am, Ben Bacarisse <> wrote:
    >> Peter Nilsson <> writes:
    >> > Eric Sosman <> wrote:
    >> > >      ... My guess is that the compiler
    >> > > wants to assume that an int* and a double* point to
    >> > > different objects since they point to different types.
    >> > > That is, if you pass both &bar.i and &bar.d as arguments
    >> > > to
    >> > >
    >> > >         void silly(int *ip, double *dp) {
    >> > >             *ip = 42;
    >> > >             *dp = 42.0;
    >> > >             *ip = 42;
    >> > >         }
    >> > >
    >> > > ... the compiler might assume that the two differently-
    >> > > typed parameters point to two differently-typed and
    >> > > distinct objects, so the third assignment could be
    >> > > omitted.
    >> >
    >> > The first perhaps, but not the third. Strictly conforming
    >> > code could detect the difference...
    >> >
    >> >   union { int i; double d; } u;
    >> >   silly(&u.i, &u.d);
    >> >   printf("%d\n", u.i);

    >>
    >> I disagree but given your history of being correct,

    >
    > It may have happened. I think it was a Tuesday. ;)
    >
    >> I currently suspect that I have missed something here.  Does
    >> not your snippet of code violate the "shall" from section 6.5
    >> paragraph 7?

    >
    > I don't see how. The last u.i accesses an object that was last
    > assigned via an int lvalue. That assigment imposed the
    > effective type. [6.5p6]


    Duh! I was reading the earlier quote as if the programmer were
    permitted to remove the third line, not the compiler.

    --
    Ben.
     
    Ben Bacarisse, Aug 17, 2009
    #11
  12. Ben Bacarisse <> writes:
    <snip>
    > Duh! I was reading the earlier quote as if the programmer were
    > permitted to remove the third line, not the compiler.


    OK, even that does not make sense. Take it that I misread everything!

    --
    Ben.
     
    Ben Bacarisse, Aug 18, 2009
    #12
  13. Noob

    Eric Sosman Guest

    Peter Nilsson wrote:
    > Eric Sosman <> wrote:
    >> Noob wrote:
    >>> Hello,
    >>> My compiler complains when I take the address of a member
    >>> in a union.

    > <snip sample>
    >>> Is this an aliasing problem?
    >>> What am I doing wrong?

    >> Nothing that I can see. My guess is that the compiler
    >> wants to assume that an int* and a double* point to different
    >> objects since they point to different types. That is, if you
    >> pass both &bar.i and &bar.d as arguments to
    >>
    >> void silly(int *ip, double *dp) {
    >> *ip = 42;
    >> *dp = 42.0;
    >> *ip = 42;
    >> }
    >>
    >> ... the compiler might assume that the two differently-typed
    >> parameters point to two differently-typed and distinct objects,
    >> so the third assignment could be omitted.

    >
    > The first perhaps, but not the third. Strictly conforming code
    > could detect the difference...
    >
    > union { int i; double d; } u;
    > silly(&u.i, &u.d);
    > printf("%d\n", u.i);


    Yes, that was my point: The compiler's assumption that
    differently typed pointers point to distinct objects can be
    incorrect. Hence (I guess) the compiler's warning that it
    might be a good idea to run the optimizer at a less aggressive
    level, because at high levels it's a bit over-optimistic.

    --
    Eric Sosman
     
    Eric Sosman, Aug 18, 2009
    #13
  14. Noob

    Tim Rentsch Guest

    (Richard Bos) writes:

    > Keith Thompson <> wrote:
    >
    >> Consider, for example:
    >>
    >> union foo bar;
    >> int *p = &bar.i;
    >> *p = 10;
    >> bar.d = 12.34;
    >> printf("*p = %d\n", *p);
    >>
    >> The optimizer might assume that, since you just stored the value 10 in
    >> *p, the value 10 must still be there, and optimize the printf to
    >> something like puts("p = 10").
    >>
    >> (The standard has more to say about whether this behavior is defined
    >> or undefined and whether an optimizer is allowed to make this
    >> assumption; I don't have my copy of the standard handy right now and
    >> I'm too lazy to look it up.)

    >
    > You're reading a member of a union which is not the last member that has
    > been assigned to. You're reading it indirectly, but you're still reading
    > it. This means that its bytes have unspecified values,


    Probably you are misremembering. It's only bytes /other than/
    the bytes of the member last stored that take unspecified
    values. Bytes that overlap the member last stored take on
    the values that were stored as a result of assigning to
    that member.

    > and therefore
    > that its value may be a trap value[1]; hence, in theory undefined
    > behaviour, but most likely to result in nonsense values. And AFAICT it's
    > _allowed_ to result in the same nonsense value no matter what you store
    > in bar.d, or even in different nonsense values even if you store the
    > same value in bar.d more than once.


    There's a common misconception that reading a (non-character type)
    union member other than the last member stored is automatically
    undefined behavior. It isn't. Of course, it's possible to
    get undefined behavior if there's a trap representation, but
    if trap representations can be ruled out, the result is only
    implementation defined behavior. For example,

    union {
        int i;
        unsigned u;
    } x;
    x.i = 5;
    return x.u;

    must return the value 5.
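The same example as a complete function, for experimenting. The names int_or_unsigned and read_other_member are hypothetical. It relies on C99 6.2.5p9, which says a non-negative value has the same representation in a signed type and in the corresponding unsigned type.

```c
union int_or_unsigned {
    int i;
    unsigned u;
};

/* Hypothetical helper: store through the int member, read through the
   unsigned member. Intended for non-negative values, where both types
   share the same representation. */
unsigned read_other_member(int value)
{
    union int_or_unsigned x;

    x.i = value;
    return x.u;
}
```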
     
    Tim Rentsch, Sep 29, 2009
    #14
  15. Noob

    Guest

    Tim Rentsch <> wrote:
    >
    > There's a common misconception that reading a (non-character type)
    > union member other than the last member stored is automatically
    > undefined behavior. It isn't.


    It was, prior to C99.
    --
    Larry Jones

    Like I'm going to get any sleep NOW. -- Calvin
     
    , Oct 12, 2009
    #15
  16. Noob

    Tim Rentsch Guest

    writes:

    > Tim Rentsch <> wrote:
    >>
    >> There's a common misconception that reading a (non-character type)
    >> union member other than the last member stored is automatically
    >> undefined behavior. It isn't.

    >
    > It was, prior to C99.


    That's good to know. Was this deliberate or accidental?
    (I expect it was deliberate, but it seems right to ask.)
    If it was deliberate, what prompted the change?
     
    Tim Rentsch, Oct 12, 2009
    #16
  17. Noob

    Tim Rentsch Guest

    Richard Heathfield <> writes:

    > In <>,
    > wrote:
    >
    >> Tim Rentsch <> wrote:
    >>>
    >>> There's a common misconception that reading a (non-character type)
    >>> union member other than the last member stored is automatically
    >>> undefined behavior. It isn't.

    >>
    >> It was, prior to C99.

    >
    > That's a common misconception. From C89 3.3.2.3:
    >
    > " With one exception, if a member of a union object is accessed after
    > a value has been stored in a different member of the object, the
    > behavior is implementation-defined."


    I don't have a C89 document readily available -- can you
    find out what the exception was?
     
    Tim Rentsch, Oct 14, 2009
    #17
  18. Noob

    Tim Rentsch Guest

    Richard Heathfield <> writes:

    > In <>, Tim Rentsch wrote:
    >
    >> Richard Heathfield <> writes:
    >>
    >>> In <>,
    >>> wrote:
    >>>
    >>>> Tim Rentsch <> wrote:
    >>>>>
    >>>>> There's a common misconception that reading a (non-character
    >>>>> type) union member other than the last member stored is
    >>>>> automatically
    >>>>> undefined behavior. It isn't.
    >>>>
    >>>> It was, prior to C99.
    >>>
    >>> That's a common misconception. From C89 3.3.2.3:
    >>>
    >>> " With one exception, if a member of a union object is accessed
    >>> after a value has been stored in a different member of the object,
    >>> the behavior is implementation-defined."

    >>
    >> I don't have a C89 document readily available -- can you
    >> find out what the exception was?

    >
    > Common initial sequence (for a union made up of several structures).
    > The exception is /well/-defined (not undefined).


    Ahhh, that makes sense. Thank you for the followup.
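A small sketch of the common initial sequence case, for reference. The types circle, rect and shape are hypothetical, made up for this illustration. Both structures start with an int, so that leading member may be inspected through either structure wherever the complete union type is visible.

```c
struct circle {
    int kind;       /* part of the common initial sequence */
    double radius;
};

struct rect {
    int kind;       /* same type and position as circle.kind */
    double width;
    double height;
};

union shape {
    struct circle c;
    struct rect r;
};

/* Reads kind through the circle member even if the union currently
   holds a rect. The common initial sequence rule permits this. */
int shape_kind(const union shape *s)
{
    return s->c.kind;
}
```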
     
    Tim Rentsch, Oct 14, 2009
    #18