Re: Aliasing in C99

Discussion in 'C Programming' started by Eric Sosman, May 31, 2012.

  1. Eric Sosman

    Eric Sosman Guest

    On 5/31/2012 10:01 AM, David Brown wrote:
    > I am trying to figure out how to get aliasing to work correctly according
    > to the C99 rules. For example, converting between a float and its binary
    > representation.
    >
    > float negPCast(float x) {
    > uint32_t u = *((uint32_t *)&x);
    > u ^= 0x80000000u;
    > return *((float *)&u);
    > }
    >
    > In the absence of type-based aliasing, this will negate a float using
    > just a simple xor operation (ignore any issues with endianness, int
    > sizes, NaNs, etc., since this is just an example).
    >
    >
    > The pointer typecasting here will break strict aliasing rules, and is
    > therefore not valid C99. (I'm guessing that in this case, most compilers
    > will generate code that works as desired - but I'm looking for strictly
    > conforming methods.)
    >
    >
    > It is possible to re-implement it using type-punning unions:
    >
    > float negUnion(float x) {
    > union { float f; uint32_t u; } uf;
    > uf.f = x;
    > uf.u ^= 0x80000000;
    > return uf.f;
    > }
    >
    > This doesn't use pointer typecasting, but I believe type-punning unions
    > are undefined in C but implemented "properly" in most compilers.


    6.5p7 lists the kinds of lvalues that can legitimately access
    an object's stored value, and says in a footnote that the list is
    meant to indicate when aliasing is and is not proper. One of the
    proper lvalues is a union containing a member of a suitable type,
    so it looks like the aliasing is permitted here (barring trap
    representations and the like).

    HOWEVER: Although `uf' is certainly an lvalue expression for
    a suitable union, `uf.u' is not: It is an lvalue expression of the
    type `uint32_t', which would not be a legitimate accessor for the
    stored `float' value. There may have been an intent to allow
    this, but I'm not convinced the language of 7.5p7 actually does so.
    (BTW, I'm looking at N1256, not the final Standard.) Moving on...

    6.5.2.3p3 has a footnote stating "If the member used to access
    the contents of a union object is not the same as the member last
    used to store a value in the object, the appropriate part of the
    object representation of the value is reinterpreted as an object
    representation in the new type [...]," which is exactly what you
    want. Unfortunately, footnotes are non-normative, so you need to
    find something in the normative text from which you can deduce what
    the footnote is "clarifying." Perhaps a language lawyer can find
    the right argument.
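
    As a minimal sketch of what that footnote describes (the function
    name is illustrative and not from the thread), the following stores
    a float through one union member and reads the same bytes back
    through the other. It assumes float and uint32_t have the same size;
    on an implementation using IEEE 754 binary32, 1.0f would come back
    as 0x3F800000.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the thread: returns the bits of a float
   by writing one union member and reading the other. */
uint32_t float_bits_via_union(float x)
{
    union { float f; uint32_t u; } uf;

    /* Assumption: float occupies exactly 32 bits on this implementation. */
    assert(sizeof(float) == sizeof(uint32_t));

    uf.f = x;      /* last member stored is f */
    return uf.u;   /* read through u: the representation is reinterpreted */
}
```

    Whether the normative text guarantees this result is the question
    under discussion; the sketch only shows the behavior the footnote
    describes.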

    > It is also possible to use pointers to unions in casts:
    >
    > float negUnionPCast(float x) {
    > typedef union { float f; uint32_t u; } UF;
    > uint32_t u = ((UF*)&x)->u;
    > u ^= 0x80000000u;
    > return ((UF*)&u)->f;
    > }
    >
    > I /think/ pointer casts like this are not subject to strict aliasing
    > rules, but I don't know if the union usage is valid.


    Looks like it should work if the plain-union approach works,
    provided there aren't alignment issues and such.

    > Does anyone know of other ways that are strictly valid and defined in
    > C99, and that also are efficient in use (I'd like to avoid things like
    > casting back and forth between char or char pointers, or volatile
    > accesses, etc.)?


    Using `char*' (or, better, `unsigned char*') certainly works,
    per 6.5p7. I don't think `volatile' helps or hinders: It constrains
    the sequence of operations, but not their validity.
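
    A minimal sketch of the `unsigned char*' route, assuming float and
    uint32_t have the same size and byte order and that the sign is the
    top bit (the function name is illustrative, not from the thread).
    Each object is only ever read or written through its own type or
    through a character type, which 6.5p7 permits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the thread: negates a float by copying
   its object representation byte by byte into a uint32_t and back. */
float neg_via_bytes(float x)
{
    uint32_t u;
    const unsigned char *src;
    unsigned char *dst;
    size_t i;

    /* Assumption: both types are 32 bits wide. */
    assert(sizeof u == sizeof x);

    /* Copy the bytes of x into u through character-type lvalues. */
    src = (const unsigned char *)&x;
    dst = (unsigned char *)&u;
    for (i = 0; i < sizeof u; i++)
        dst[i] = src[i];

    u ^= 0x80000000u;   /* flip the (assumed) sign bit */

    /* Copy the modified bytes back into x the same way. */
    src = (const unsigned char *)&u;
    dst = (unsigned char *)&x;
    for (i = 0; i < sizeof x; i++)
        dst[i] = src[i];

    return x;
}
```

    Optimizing compilers often reduce small fixed-size copies like
    these to a register move, but that is an implementation detail and
    not something the Standard promises.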

    --
    Eric Sosman
     
    Eric Sosman, May 31, 2012
    #1

  2. Xavier Roche

    Xavier Roche Guest

    On 05/31/2012 04:52 PM, Eric Sosman wrote:
    > Using `char*' (or, better, `unsigned char*') certainly works,
    > per 6.5p7.


    Humm, my understanding was that two variables could have the same
    address if of compatible types, which includes "char" and "anything
    else". But here we would have three variables, two violating strict
    aliasing rules. Am I missing something ?
     
    Xavier Roche, May 31, 2012
    #2

  3. Eric Sosman

    Eric Sosman Guest

    On 5/31/2012 10:55 AM, Xavier Roche wrote:
    > On 05/31/2012 04:52 PM, Eric Sosman wrote:
    >> Using `char*' (or, better, `unsigned char*') certainly works,
    >> per 6.5p7.

    >
    > Humm, my understanding was that two variables could have the same
    > address if of compatible types, which includes "char" and "anything
    > else". But here we would have three variables, two violating strict
    > aliasing rules. Am I missing something ?


    I'm not sure what you mean. "Two variables" cannot have the
    same address unless their lifetimes are disjoint, regardless of
    type compatibility:

    int i, j;
    double x, y;
    assert(&i != &j && &x != &y && (void*)&i != (void*)&x && ...);

    On the other hand, two members of the same union always have
    the same address:

    union { int i; double x; } u;
    assert ((void*)&u.i == (void*)&u.x && (void*)&u == (void*)&u.x && ...);

    On the gripping hand, "two variables" can share an address if
    one is an array and the other is the array's [0] element, if you
    consider the [0] element a "variable." (As far as I can see the
    Standard doesn't use "variable" as a noun in normative text, so we
    don't have a formal definition to help us out.)
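
    The array case can be sketched as follows (the function name is
    illustrative, not from the thread). The two pointers have different
    types, so the comparison goes through `void*'.

```c
#include <assert.h>

/* Hypothetical helper, not from the thread: returns 1 if an array and
   its [0] element start at the same address. */
int array_starts_at_first_element(void)
{
    int a[4] = {0};

    /* &a has type int (*)[4] and &a[0] has type int *, so convert both
       to void * before comparing. */
    int same = ((void *)&a == (void *)&a[0]);

    /* The two objects share an address but not a size. */
    assert(sizeof a == 4 * sizeof a[0]);

    return same;
}
```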

    6.5p7 lists the kinds of expressions that can access the
    stored value of an object (whether it's a "variable" or not).
    That's where "compatibility" comes in: You can access a stored
    value via an lvalue of type compatible with the stored value's
    effective type (see 6.5p6 for "effective type," which needs all
    that explanation because of things like memcpy() which move
    representations around without regard to types). And you can
    access a stored value with lvalues whose types are "mildly"
    incompatible, as in

    unsigned int ui = 42;
    int *ip = (int*) &ui;
    assert (*ip == 42); // different signedness is OK
    const unsigned int *uip = &ui;
    assert (*uip == 42); // different constness is OK

    Finally, by long tradition you can always access the "object
    representation" (6.2.6.1p4) of an addressable object by viewing
    it as an array of bytes, so 6.5p7 lists "a character type" among
    the kinds of lvalues permitted to access a stored value.
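
    A minimal sketch of viewing object representations as arrays of
    bytes (both function names are illustrative, not from the thread).
    It assumes IEEE 754 floats, where +0.0f and -0.0f compare equal as
    values yet differ in their sign bit, so value and representation
    can be seen to be different things.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the thread: compares the object
   representations of two objects of n bytes each. */
int same_bytes(const void *p, const void *q, size_t n)
{
    const unsigned char *a = p;   /* character-type access is allowed */
    const unsigned char *b = q;
    size_t i;

    assert(p != NULL && q != NULL);

    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}

/* Returns 1 if the two zeros are equal as values but their bytes differ
   (expected under the IEEE 754 assumption stated above). */
int zeros_equal_but_bytes_differ(void)
{
    float pz = 0.0f;
    float nz = -0.0f;

    return pz == nz && !same_bytes(&pz, &nz, sizeof pz);
}
```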

    --
    Eric Sosman
     
    Eric Sosman, May 31, 2012
    #3
  4. Tim Rentsch

    Tim Rentsch Guest

    Eric Sosman writes:

    > On 5/31/2012 10:01 AM, David Brown wrote:
    >> I am trying to figure out how to get aliasing to work correctly according
    >> to the C99 rules. For example, converting between a float and its binary
    >> representation.
    >>
    >> float negPCast(float x) {
    >> uint32_t u = *((uint32_t *)&x);
    >> u ^= 0x80000000u;
    >> return *((float *)&u);
    >> }
    >>
    >> In the absence of type-based aliasing, this will negate a float using
    >> just a simple xor operation (ignore any issues with endianness, int
    >> sizes, NaNs, etc., since this is just an example).
    >>
    >>
    >> The pointer typecasting here will break strict aliasing rules, and is
    >> therefore not valid C99. (I'm guessing that in this case, most compilers
    >> will generate code that works as desired - but I'm looking for strictly
    >> conforming methods.)
    >>
    >>
    >> It is possible to re-implement it using type-punning unions:
    >>
    >> float negUnion(float x) {
    >> union { float f; uint32_t u; } uf;
    >> uf.f = x;
    >> uf.u ^= 0x80000000;
    >> return uf.f;
    >> }
    >>
    >> This doesn't use pointer typecasting, but I believe type-punning unions
    >> are undefined in C but implemented "properly" in most compilers.

    >
    > 6.5p7 lists the kinds of lvalues that can legitimately access
    > an object's stored value, and says in a footnote that the list is
    > meant to indicate when aliasing is and is not proper. One of the
    > proper lvalues is a union containing a member of a suitable type,
    > so it looks like the aliasing is permitted here (barring trap
    > representations and the like).
    >
    > HOWEVER: Although `uf' is certainly an lvalue expression for
    > a suitable union, `uf.u' is not: It is an lvalue expression of the
    > type `uint32_t', which would not be a legitimate accessor for the
    > stored `float' value. There may have been an intent to allow
    > this, but I'm not convinced the language of 7.5p7 actually does so.
    > (BTW, I'm looking at N1256, not the final Standard.) Moving on...


    Presumably you meant 6.5p7 in that penultimate sentence. (And these
    paragraphs are unchanged between N1256 and N1570.)

    The effective type of an object is defined in 6.5p6. The expression
    doing the access is 'uf.u'. This expression falls under the first
    sentence of 6.5p6, that is, 'uf.u' has a declared type, and the declared
    type of 'uf.u' is the effective type for this access. Obviously the
    declared type of 'uf.u' matches the type of the expression 'uf.u'
    under 6.5p7, since they are the same type. Therefore the effective
    type rules allow this access.


    > 6.5.2.3p3 has a footnote stating "If the member used to access
    > the contents of a union object is not the same as the member last
    > used to store a value in the object, the appropriate part of the
    > object representation of the value is reinterpreted as an object
    > representation in the new type [...]," which is exactly what you
    > want. Unfortunately, footnotes are non-normative, so you need to
    > find something in the normative text from which you can deduce what
    > the footnote is "clarifying." Perhaps a language lawyer can find
    > the right argument.


    I've posted on unions quite a few times over the last six months
    or so, including citations to sections that define the behavior
    (i.e., in normative text) corresponding to this footnote. It isn't
    hard to find the relevant passages if one is of a mind to do so.
    Or simply use Google Groups search to find my earlier postings on
    the matter.
     
    Tim Rentsch, May 31, 2012
    #4
