union arrangement

Discussion in 'C Programming' started by tedu, Jan 24, 2006.

  1. tedu

    tedu Guest

    does anyone know of a platform/compiler which will place union elements
    to not overlap?
    as in
    union u {
    int a;
    long b;
    size_t c;
    };
    in my limited experience, writing to any of (a, b, or c) will affect
    the value read from any other. i understand this is UB, but i'm
    curious if there are any real platforms where this is not the case.
     
    tedu, Jan 24, 2006
    #1

  2. Union members are meant to overlap. That's exactly what they're there
    for. Having them behave otherwise would make the compiler
    non-conforming.

    What you're describing is a structure. Look up the struct keyword in
    your C manual.

    Cheers

    Vladimir
     
    Vladimir S. Oka, Jan 24, 2006
    #2

  3. tedu

    Mark B Guest

    All of them... just replace the 'union' keyword with 'struct'.
    no, not undefined, "Implementation defined"
     
    Mark B, Jan 24, 2006
    #3
  4. tedu

    Lew Pitcher Guest

    IIRC, it is the definition of a union that /requires/ the elements to
    overlap. ("A union type describes an overlapping nonempty set of member
    objects, each of which has an optionally specified name and possibly
    distinct type.")
    I would hope not, as by definition in the standard, union elements are
    required to overlap.


    --

    Lew Pitcher, IT Specialist, Enterprise Data Systems
    Enterprise Technology Solutions, TD Bank Financial Group

    (Opinions expressed here are my own, not my employer's)
     
    Lew Pitcher, Jan 24, 2006
    #4
  5. tedu

    tedu Guest

    perhaps i'm misremembering the standard (don't have it atm), but i was
    pretty sure "write to union field A, read from union field B" was not
    defined. is that not correct?
     
    tedu, Jan 24, 2006
    #5
  6. tedu

    Lew Pitcher Guest

    Defined as "implementation defined", I believe.

    However, that does not mean that
    - union elements do not overlap (they do, they are supposed to), or
    - write to element A, read from element B will not work (it usually
      does, but just in an "implementation defined" manner)


    --

    Lew Pitcher, IT Specialist, Enterprise Data Systems
    Enterprise Technology Solutions, TD Bank Financial Group

    (Opinions expressed here are my own, not my employer's)
     
    Lew Pitcher, Jan 24, 2006
    #6
  7. 6.2.6.1.7
    When a value is stored in a member of an object of union type, the bytes
    of the object representation that do not correspond to that member but
    do correspond to other members take unspecified values, but the value
    of the union object shall not thereby become a trap representation.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    6.7.2.1.4
    As discussed in 6.2.5, a structure is a type consisting of a sequence of
    members, whose storage is allocated in an ordered sequence, and a union
    is a type consisting of a sequence of members whose storage overlap.
    ^^^^^^^

    Cheers

    Vladimir
     
    Vladimir S. Oka, Jan 24, 2006
    #7
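
The overlap that 6.7.2.1 describes can be checked directly. The following is a minimal sketch, assuming a hosted C11 compiler (for static_assert); `struct s` and `members_share_address` are names invented for this illustration and do not come from the thread.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */

/* The union from the original post. */
union u {
    int a;
    long b;
    size_t c;
};

/* The same members in a struct, for contrast (hypothetical name). */
struct s {
    int a;
    long b;
    size_t c;
};

/* A union must be large enough to hold its largest member. */
static_assert(sizeof(union u) >= sizeof(long), "union holds a long");
static_assert(sizeof(union u) >= sizeof(size_t), "union holds a size_t");

/* Struct members are allocated in order, so offsets strictly increase. */
static_assert(offsetof(struct s, b) > offsetof(struct s, a), "b follows a");
static_assert(offsetof(struct s, c) > offsetof(struct s, b), "c follows b");

/* Returns 1 if every member of *p starts at the address of *p itself. */
int members_share_address(union u *p)
{
    return (void *)&p->a == (void *)p
        && (void *)&p->b == (void *)p
        && (void *)&p->c == (void *)p;
}
```

Calling members_share_address on any `union u` object should return 1 on a conforming implementation, since a pointer to a union, suitably converted, points to each of its members.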
  8. No. This is what union members do. It's what they're /supposed/ to do.
    (of writing through one member and reading through another)
    Actually, I recall that it's implementation defined.
    Mark McIntyre
     
    Mark McIntyre, Jan 24, 2006
    #8
  9. tedu

    tedu Guest

    thanks. now consider:
    union u {
    char c[8];
    float f;
    };
    according to the above, on a machine with potential float trap
    representations, the c and f fields cannot completely overlap,
    otherwise i would be able to write in the trap representation bits. is
    that correct?
     
    tedu, Jan 24, 2006
    #9
  10. No. The value *of the union object* cannot become a trap
    representation; the value of a member of the union object can be.

    The point of the restriction is to make sure that referring to the
    value of a union (assigning it, passing it to a function, whatever)
    doesn't invoke undefined behavior just because one of its members has
    a trap representation. In effect, any operations that work on the
    union as a whole rather than on one of its members should treat the
    union as an uninterpreted bag of bits.

    That's probably not *quite* true, though. The standard doesn't say
    that unions can't have trap representations; it merely says that a
    union's value can't become a trap representation as a result of
    storing a valid value into one of its members. It's at least
    conceivable that, given:

    union f {
    float x;
    float y;
    } obj;

    storing a trap representation into obj.x (perhaps via memcpy()) could
    cause obj itself to have a trap representation. But I'd be surprised
    if any implementation other than the DS9K actually worked this way.

    Digressing a bit, the OP's question wasn't entirely silly. The
    question was more or less equivalent to:

    Could a lazy C implementation that simply treats "union" as a
    synonym for "struct" be conforming?

    If all instances of storing a value in one union member and then
    reading the value of a different member invoked undefined behavior,
    and if the standard didn't specifically say that the members overlap,
    the answer would be yes. (Programs that try to use unions for type
    punning would fail, but we're assuming that would be undefined
    behavior.) But they don't, and it does, so it isn't.
     
    Keith Thompson, Jan 25, 2006
    #10
  11. tedu

    tedu Guest

    ok, let me try to pin this down a bit more.
    considering:
    union u {
    int x;
    int y;
    };
    (u.x == u.y) should always evaluate true?
    and with
    union u {
    int x;
    short y;
    };
    {
    int x = u.x;
    u.y = u.y + 4; /* anything to change value of y */
    x != u.x; /* this must evaluate true? */
    }
    or, in the last case, the value of u.x must change (because u.y must
    overlap it), but there's no way to determine which bits of u.x changed.
     
    tedu, Jan 25, 2006
    #11
  12. I believe so, yes. At least I can't think of a way a conforming
    implementation could avoid it. (I'll assume u is an object of type
    "union u".)
    That's likely to be true on any real-world implementation, but not on
    the DS9K. If adding 4 to u.y affects only bits that happen to be
    padding bits of u.x, the representation of x will change, but its
    value might not -- or it might become a trap representation.

    For that matter, it's not inconceivable that sizeof(short) > sizeof(int).
    This could happen if short has more padding bits than int. If adding
    4 to u.y affects only bits that aren't part of u.x, both the
    representation and value of u.x could be unchanged. (The standard
    requires the range of int to include the range of short; it doesn't
    actually say anything about their sizes.) But this should happen only
    in a deliberately perverse implementation.
     
    Keith Thompson, Jan 25, 2006
    #12
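
The two cases discussed above can be probed without reading a possibly invalid member value, by comparing object representations byte by byte. This is a rough sketch only: the type and function names are made up for illustration, and the result of the second function depends on the implementation (sizes, padding bits, byte order), so it shows what one compiler does rather than what the standard guarantees.

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>   /* memcpy, memset, size_t */

union same  { int x; int y;   };   /* two members of identical type */
union mixed { int x; short y; };   /* members of different types    */

/* The union is at least as large as its int member. */
static_assert(sizeof(union mixed) >= sizeof(int), "union holds an int");

/* Store through x, read through y. Both members are int and share the
 * same bytes, so the value read is expected to equal the value stored. */
int read_other_int(int v)
{
    union same u;
    u.x = v;
    return u.y;
}

/* Count the bytes of the union's object representation that differ
 * before and after storing yval into the short member. */
size_t bytes_changed_by_short_store(int xval, short yval)
{
    union mixed u;
    unsigned char before[sizeof u], after[sizeof u];
    size_t i, changed = 0;

    memset(&u, 0, sizeof u);        /* start from a known byte pattern */
    u.x = xval;
    memcpy(before, &u, sizeof u);   /* snapshot as raw bytes */
    u.y = yval;
    memcpy(after, &u, sizeof u);

    for (i = 0; i < sizeof u; i++)
        if (before[i] != after[i])
            changed++;
    return changed;
}
```

On a typical implementation without padding bits, bytes_changed_by_short_store(0, 4) reports at least one changed byte. Which byte it is depends on byte order, which matches the point that there is no portable way to say which bits of u.x changed.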
  13. tedu

    boa Guest

    Are you sure about this?

    From C99, §6.2.6.1 #7:
    When a value is stored in a member of an object of union type, the bytes
    of the object representation that do not correspond to that member but
    do correspond to other members take unspecified values, but the value of
    the union object shall not thereby become a trap representation.

    Boa
     
    boa, Jan 25, 2006
    #13
  14. Reasonably sure, yes. The value *of the union object* cannot become a
    trap representation. The value of a member of the union can.

    If you think about it, there's no way to avoid that possibility.
    Given:

    union u {
    some_type foo;
    unsigned char bar[sizeof(some_type)];
    };
    union u u_obj;

    if some_type has any trap representations at all, it's possible to
    create such a trap representation by assigning appropriate values to
    u_obj.bar.

    The requirement you quoted basically says that any reference to the
    union as a whole (rather than to one of its members) should treat it
    as an uninterpreted bag of bits.
     
    Keith Thompson, Jan 25, 2006
    #14
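
The "uninterpreted bag of bits" reading can be sketched as follows, with float standing in for some_type. The names `fbits` and `copy_preserves_bytes` are invented for this example. The code deliberately never reads the float member, because the bytes written might not form a valid float on some implementation.

```c
#include <assert.h>   /* assert */
#include <string.h>   /* memcpy, memcmp, size_t */

/* float plays the role of some_type from the post above. */
union fbits {
    float foo;
    unsigned char bar[sizeof(float)];
};

/* Write arbitrary bytes through bar, copy the union as a whole, and
 * return 1 if the copy carries the same bytes. foo is never read. */
int copy_preserves_bytes(const unsigned char *bytes, size_t n)
{
    union fbits src, dst;

    assert(n == sizeof src.bar);   /* caller must supply exactly one float's worth */
    memcpy(src.bar, bytes, n);     /* may or may not be a valid float pattern */
    dst = src;                     /* whole-union assignment */
    return memcmp(dst.bar, src.bar, sizeof src.bar) == 0;
}
```

Any byte pattern can be passed in. Only an access through foo would interpret those bytes as a float, and that is where a trap representation, if the platform has any, would matter.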
  15. tedu

    boa Guest

    Thanks. For some reason, I read that as "of the union member object".
    Why? No idea.

    boasema
    [snip]
     
    boa, Jan 26, 2006
    #15
  16. tedu

    S.Tobias Guest

    # 6.7.2.1
    # 14 [...] The value of at most one of the members can be stored in
    # a union object at any time. [...]

    It says that after you store a value into a union member, all other
    members don't have a value, so it must be UB to read through them.
    In the absence of other rules, I believe it gives the compiler license
    to assume that for value reading, different member-access expressions
    cannot alias. This means that a union could be treated as a struct, with
    the exception that for representation purposes the members must lie
    at the beginning of the object.

    I'm not sure how relevant this is (ie. how much the Readers are aware of
    this): In the draft n869.txt in 6.5.2.3#5, the first sentence "With one
    exception, if the value of a member of a union object is used when the
    most recent store to the object was to a different member, the behavior
    is implementation-defined.70)" is *not* in the C99 Standard (but I see
    it in C89 draft, so it must have been in C89.).

    I'm aware of a point in Annex J.1, which lists as unspecified:
    # -- The value of a union member other than the last one stored
    # into (6.2.6.1).
    IMHO it is wrong: 6.2.6.1p7 is talking about the representation of a
    union object, and does not give the complete semantics of member access.

    I have looked through a number of posts from recent years. I won't
    give specific references; it's enough to say that opinions varied to
    the extremes. E.g. C.Feather, discussing exactly the same case as the
    one above, said it was defined (it's undefined if union members are
    incompatible); on another occasion Dan Pop said one could read only the
    last written-to member, with the exception of character members; at yet
    another time Doug Gwyn said reading a not-last-written-to union member
    was meant to be undefined (but his remark was in a context where two
    fields were incompatible, so I couldn't judge how far conclusions could
    be drawn).

    (I suggest, can we add c.s.c. to the discussion?)
     
    S.Tobias, Jan 30, 2006
    #16
  17. tedu

    Netocrat Guest

    [omit listing of specific opinions]

    Did you come across Tim Rentsch's post to c.l.c of 6 December 2005, where
    he points out DR283? Its TC is not yet present in n1124, but it reads:
    | Attach a new footnote 78a to the words "named member" in 6.5.2.3#3:
    |
    | 78a If the member used to access the contents of a union object is
    | not the same as the member last used to store a value in the object,
    | the appropriate part of the object representation of the value is
    | reinterpreted as an object representation in the new type as
    | described in 6.2.6 (a process sometimes called "type punning"). This
    | might be a trap representation.
    I haven't added c.s.c as it seems from that DR and the associated DR257
    that this has been pretty well discussed already.
     
    Netocrat, Jan 30, 2006
    #17
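
The reinterpretation that footnote 78a describes can be illustrated with a float and a same-sized unsigned integer. This is a sketch under stated assumptions: it requires C11 for static_assert, it assumes float is 32 bits wide (checked at compile time), and the names `pun`, `float_bits_via_union` and `float_bits_via_memcpy` are hypothetical. uint32_t is used for the integer side because exact-width types have no padding bits, so the reinterpreted value cannot be a trap representation.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* The sketch only makes sense if both members have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "assumes 32-bit float");

union pun {
    float f;
    uint32_t bits;
};

/* Store through f, read through bits: the object representation of the
 * float is reinterpreted as a uint32_t ("type punning"). */
uint32_t float_bits_via_union(float f)
{
    union pun p;
    p.f = f;
    return p.bits;
}

/* The same reinterpretation done by copying bytes, for comparison. */
uint32_t float_bits_via_memcpy(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Both functions should return the same value for any input, since each one simply reinterprets the float's bytes. What that value is depends on the implementation's float format.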