union arrangement

Discussion in 'C Programming' started by tedu, Jan 24, 2006.

  1. tedu

    tedu Guest

    does anyone know of a platform/compiler which will place union elements
    to not overlap?
    as in
    union u {
    int a;
    long b;
    size_t c;
    };
    in my limited experience, writing to any of (a, b, or c) will affect
    the value read from any other. i understand this is UB, but i'm
    curious if there are any real platforms where this is not the case.
     
    tedu, Jan 24, 2006
    #1

  2. tedu wrote:

    > does anyone know of a platform/compiler which will place union
    > elements to not overlap?
    > as in
    > union u {
    > int a;
    > long b;
    > size_t c;
    > };
    > in my limited experience, writing to any of (a, b, or c) will affect
    > the value read from any other. i understand this is UB, but i'm
    > curious if there are any real platforms where this is not the case.


    Union members are meant to overlap. That's exactly what they're there
    for. Having them behave otherwise would make the compiler
    non-conforming.

    What you're describing are the structures. Look up the struct keyword in
    your C manual.

    Cheers

    Vladimir

    --
    World War Three can be averted by adherence to a strictly enforced
    dress code!
     
    Vladimir S. Oka, Jan 24, 2006
    #2

  3. tedu

    Mark B Guest

    "tedu" <> wrote in message
    news:...
    > does anyone know of a platform/compiler which will place union elements
    > to not overlap?


    All of them... just replace the 'union' keyword with 'struct'.

    > as in
    > union u {
    > int a;
    > long b;
    > size_t c;
    > };
    > in my limited experience, writing to any of (a, b, or c) will affect
    > the value read from any other. i understand this is UB,


    no, not undefined, "Implementation defined"
     
    Mark B, Jan 24, 2006
    #3
  4. tedu

    Lew Pitcher Guest

    tedu wrote:
    > does anyone know of a platform/compiler which will place union elements
    > to not overlap?


    IIRC, it is the definition of a union that /requires/ the elements to
    overlap. ("A union type describes an overlapping nonempty set of member
    objects, each of which has an optionally specified name and possibly
    distinct type.")

    > as in
    > union u {
    > int a;
    > long b;
    > size_t c;
    > };
    > in my limited experience, writing to any of (a, b, or c) will affect
    > the value read from any other. i understand this is UB, but i'm
    > curious if there are any real platforms where this is not the case.


    I would hope not, as by definition in the standard, union elements are
    required to overlap.


    --

    Lew Pitcher, IT Specialist, Enterprise Data Systems
    Enterprise Technology Solutions, TD Bank Financial Group

    (Opinions expressed here are my own, not my employer's)
     
    Lew Pitcher, Jan 24, 2006
    #4
  5. tedu

    tedu Guest

    Lew Pitcher wrote:
    > tedu wrote:
    > > in my limited experience, writing to any of (a, b, or c) will affect
    > > the value read from any other. i understand this is UB, but i'm
    > > curious if there are any real platforms where this is not the case.

    >
    > I would hope not, as by definition in the standard, union elements are
    > required to overlap.


    perhaps i'm misremembering the standard (don't have it atm), but i was
    pretty sure "write to union field A, read from union field B" was not
    defined. is that not correct?
     
    tedu, Jan 24, 2006
    #5
  6. tedu

    Lew Pitcher Guest

    tedu wrote:
    > Lew Pitcher wrote:
    >
    >>tedu wrote:
    >>
    >>>in my limited experience, writing to any of (a, b, or c) will affect
    >>>the value read from any other. i understand this is UB, but i'm
    >>>curious if there are any real platforms where this is not the case.

    >>
    >>I would hope not, as by definition in the standard, union elements are
    >>required to overlap.

    >
    >
    > perhaps i'm misremembering the standard (don't have it atm), but i was
    > pretty sure "write to union field A, read from union field B" was not
    > defined. is that not correct?


    Defined as "implementation defined", I believe.

    However, that does not mean that
    - union elements do not overlap (they do, they are supposed to), or
    - write to element A, read from element B will not work (it usually
      does, but just in an "implementation defined" manner)


    --

    Lew Pitcher, IT Specialist, Enterprise Data Systems
    Enterprise Technology Solutions, TD Bank Financial Group

    (Opinions expressed here are my own, not my employer's)
     
    Lew Pitcher, Jan 24, 2006
    #6
  7. tedu wrote:

    > Lew Pitcher wrote:
    >> tedu wrote:
    >> > in my limited experience, writing to any of (a, b, or c) will
    >> > affect
    >> > the value read from any other. i understand this is UB, but i'm
    >> > curious if there are any real platforms where this is not the case.

    >>
    >> I would hope not, as by definition in the standard, union elements
    >> are required to overlap.

    >
    > perhaps i'm misremembering the standard (don't have it atm), but i was
    > pretty sure "write to union field A, read from union field B" was not
    > defined. is that not correct?


    6.2.6.1.7
    When a value is stored in a member of an object of union type, the bytes
    of the object representation that do not correspond to that member but
    do correspond to other members take unspecified values, but the value
    of the union object shall not thereby become a trap representation.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    6.7.2.1.4
    As discussed in 6.2.5, a structure is a type consisting of a sequence of
    members, whose storage is allocated in an ordered sequence, and a union
    is a type consisting of a sequence of members whose storage overlap.
    ^^^^^^^

    Cheers

    Vladimir

    --
    In Lexington, Kentucky, it's illegal to carry an ice cream cone in your
    pocket.
     
    Vladimir S. Oka, Jan 24, 2006
    #7
  8. On 24 Jan 2006 10:12:04 -0800, in comp.lang.c , "tedu"
    <> wrote:

    >does anyone know of a platform/compiler which will place union elements
    >to not overlap?


    No. This is what union members do. It's what they're /supposed/ to do.

    >i understand this is UB,


    (of writing through one member and reading through another)
    Actually, I recall that it's implementation-defined.
    Mark McIntyre
    --
    "Debugging is twice as hard as writing the code in the first place.
    Therefore, if you write the code as cleverly as possible, you are,
    by definition, not smart enough to debug it."
    --Brian Kernighan

     
    Mark McIntyre, Jan 24, 2006
    #8
  9. tedu

    tedu Guest

    Vladimir S. Oka wrote:
    > 6.2.6.1.7
    > When a value is stored in a member of an object of union type, the bytes
    > of the object representation that do not correspond to that member but
    > do correspond to other members take unspecified values, but the value
    > of the union object shall not thereby become a trap representation.
    > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    >
    > 6.7.2.1.4
    > As discussed in 6.2.5, a structure is a type consisting of a sequence of
    > members, whose storage is allocated in an ordered sequence, and a union
    > is a type consisting of a sequence of members whose storage overlap.
    > ^^^^^^^


    thanks. now consider:
    union u {
    char c[8];
    float f;
    };
    according to the above, on a machine with potential float trap
    representations, the c and f fields cannot completely overlap,
    otherwise i would be able to write in the trap representation bits. is
    that correct?
     
    tedu, Jan 24, 2006
    #9
  10. "tedu" <> writes:
    > Vladimir S. Oka wrote:
    >> 6.2.6.1.7
    >> When a value is stored in a member of an object of union type, the bytes
    >> of the object representation that do not correspond to that member but
    >> do correspond to other members take unspecified values, but the value
    >> of the union object shall not thereby become a trap representation.
    >> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    >>
    >> 6.7.2.1.4
    >> As discussed in 6.2.5, a structure is a type consisting of a sequence of
    >> members, whose storage is allocated in an ordered sequence, and a union
    >> is a type consisting of a sequence of members whose storage overlap.
    >> ^^^^^^^

    >
    > thanks. now consider:
    > union u {
    > char c[8];
    > float f;
    > };
    > according to the above, on a machine with potential float trap
    > representations, the c and f fields cannot completely overlap,
    > otherwise i would be able to write in the trap representation bits. is
    > that correct?


    No. The value *of the union object* cannot become a trap
    representation; the value of a member of the union object can be.

    The point of the restriction is to make sure that referring to the
    value of a union (assigning it, passing it to a function, whatever)
    doesn't invoke undefined behavior just because one of its members has
    a trap representation. In effect, any operations that work on the
    union as a whole rather than on one of its members should treat the
    union as an uninterpreted bag of bits.

    That's probably not *quite* true, though. The standard doesn't say
    that unions can't have trap representations; it merely says that a
    union's value can't become a trap representation as a result of
    storing a valid value into one of its members. It's at least
    conceivable that, given:

    union f {
    float x;
    float y;
    } obj;

    storing a trap representation into obj.x (perhaps via memcpy()) could
    cause obj itself to have a trap representation. But I'd be surprised
    if any implementation other than the DS9K actually worked this way.

    Digressing a bit, the OP's question wasn't entirely silly. The
    question was more or less equivalent to:

    Could a lazy C implementation that simply treats "union" as a
    synonym for "struct" be conforming?

    If all instances of storing a value in one union member and then
    reading the value of a different member invoked undefined behavior,
    and if the standard didn't specifically say that the members overlap,
    the answer would be yes. (Programs that try to use unions for type
    punning would fail, but we're assuming that would be undefined
    behavior.) But they don't, and it does, so it isn't.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Jan 25, 2006
    #10
  11. tedu

    tedu Guest

    Keith Thompson wrote:
    > Could a lazy C implementation that simply treats "union" as a
    > synonym for "struct" be conforming?
    >
    > If all instances of storing a value in one union member and then
    > reading the value of a different member invoked undefined behavior,
    > and if the standard didn't specifically say that the members overlap,
    > the answer would be yes. (Programs that try to use unions for type
    > punning would fail, but we're assuming that would be undefined
    > behavior.) But they don't, and it does, so it isn't.


    ok, let me try to pin this down a bit more.
    considering:
    union u {
    int x;
    int y;
    };
    (u.x == u.y) should always evaluate true?
    and with
    union u {
    int x;
    short y;
    };
    {
    int x = u.x;
    u.y = u.y + 4; /* anything to change value of y */
    x != u.x; /* this must evaluate true? */
    }
    or, in the last case, the value of u.x must change (because u.y must
    overlap it), but there's no way to determine which bits of u.x changed.
     
    tedu, Jan 25, 2006
    #11
  12. "tedu" <> writes:
    > Keith Thompson wrote:
    >> Could a lazy C implementation that simply treats "union" as a
    >> synonym for "struct" be conforming?
    >>
    >> If all instances of storing a value in one union member and then
    >> reading the value of a different member invoked undefined behavior,
    >> and if the standard didn't specifically say that the members overlap,
    >> the answer would be yes. (Programs that try to use unions for type
    >> punning would fail, but we're assuming that would be undefined
    >> behavior.) But they don't, and it does, so it isn't.

    >
    > ok, let me try to pin this down a bit more.
    > considering:
    > union u {
    > int x;
    > int y;
    > };
    > (u.x == u.y) should always evaluate true?


    I believe so, yes. At least I can't think of a way a conforming
    implementation could avoid it. (I'll assume u is an object of type
    "union u".)

    > and with
    > union u {
    > int x;
    > short y;
    > };
    > {
    > int x = u.x;
    > u.y = u.y + 4; /* anything to change value of y */
    > x != u.x; /* this must evaluate true? */
    > }
    > or, in the last case, the value of u.x must change (because u.y must
    > overlap it), but there's no way to determine which bits of u.x changed.


    That's likely to be true on any real-world implementation, but not on
    the DS9K. If adding 4 to u.y affects only bits that happen to be
    padding bits of u.x, the representation of x will change, but its
    value might not -- or it might become a trap representation.

    For that matter, it's not inconceivable that sizeof(short) > sizeof(int).
    This could happen if short has more padding bits than int. If adding
    4 to u.y affects only bits that aren't part of u.x, both the
    representation and value of u.x could be unchanged. (The standard
    requires the range of int to include the range of short; it doesn't
    actually say anything about their sizes.) But this should happen only
    in a deliberately perverse implementation.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Jan 25, 2006
    #12
  13. tedu

    boa Guest

    Keith Thompson wrote:
    > "tedu" <> writes:

    [snip]

    >> and with
    >> union u {
    >> int x;
    >> short y;
    >> };
    >> {
    >> int x = u.x;
    >> u.y = u.y + 4; /* anything to change value of y */
    >> x != u.x; /* this must evaluate true? */
    >> }
    >> or, in the last case, the value of u.x must change (because u.y must
    >> overlap it), but there's no way to determine which bits of u.x changed.

    >
    > That's likely to be true on any real-world implementation, but not on
    > the DS9K. If adding 4 to u.y affects only bits that happen to be
    > padding bits of u.x, the representation of x will change, but its
    > value might not -- or it might become a trap representation.


    Are you sure about this?

    From C99, §6.2.6.1 #7:
    When a value is stored in a member of an object of union type, the bytes
    of the object representation that do not correspond to that member but
    do correspond to other members take unspecified values, but the value of
    the union object shall not thereby become a trap representation.

    Boa
     
    boa, Jan 25, 2006
    #13
  14. boa <> writes:
    > Keith Thompson wrote:
    >> "tedu" <> writes:

    > [snip]
    >
    >>> and with
    >>> union u {
    >>> int x;
    >>> short y;
    >>> };
    >>> {
    >>> int x = u.x;
    >>> u.y = u.y + 4; /* anything to change value of y */
    >>> x != u.x; /* this must evaluate true? */
    >>> }
    >>> or, in the last case, the value of u.x must change (because u.y must
    >>> overlap it), but there's no way to determine which bits of u.x changed.

    >> That's likely to be true on any real-world implementation, but not on
    >> the DS9K. If adding 4 to u.y affects only bits that happen to be
    >> padding bits of u.x, the representation of x will change, but its
    >> value might not -- or it might become a trap representation.

    >
    > Are you sure about this?
    >
    > From C99, §6.2.6.1 #7:
    > When a value is stored in a member of an object of union type, the bytes
    > of the object representation that do not correspond to that member but
    > do correspond to other members take unspecified values, but the value
    > of the union object shall not thereby become a trap representation.


    Reasonably sure, yes. The value *of the union object* cannot become a
    trap representation. The value of a member of the union can.

    If you think about it, there's no way to avoid that possibility.
    Given:

    union u {
    some_type foo;
    unsigned char bar[sizeof(some_type)];
    };
    union u u_obj;

    if some_type has any trap representations at all, it's possible to
    create such a trap representation by assigning appropriate values to
    u_obj.bar.

    The requirement you quoted basically says that any reference to the
    union as a whole (rather than to one of its members) should treat it
    as an uninterpreted bag of bits.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Jan 25, 2006
    #14
  15. tedu

    boa Guest

    Keith Thompson wrote:
    > boa <> writes:
    >> Keith Thompson wrote:
    >>> "tedu" <> writes:

    >> [snip]
    >>
    >>>> and with
    >>>> union u {
    >>>> int x;
    >>>> short y;
    >>>> };
    >>>> {
    >>>> int x = u.x;
    >>>> u.y = u.y + 4; /* anything to change value of y */
    >>>> x != u.x; /* this must evaluate true? */
    >>>> }
    >>>> or, in the last case, the value of u.x must change (because u.y must
    >>>> overlap it), but there's no way to determine which bits of u.x changed.
    >>> That's likely to be true on any real-world implementation, but not on
    >>> the DS9K. If adding 4 to u.y affects only bits that happen to be
    >>> padding bits of u.x, the representation of x will change, but its
    >>> value might not -- or it might become a trap representation.

    >> Are you sure about this?
    >>
    >> From C99, §6.2.6.1 #7:
    >> When a value is stored in a member of an object of union type, the bytes
    >> of the object representation that do not correspond to that member but
    >> do correspond to other members take unspecified values, but the value
    >> of the union object shall not thereby become a trap representation.

    >
    > Reasonably sure, yes. The value *of the union object* cannot become a
    > trap representation. The value of a member of the union can.


    Thanks. For some reason, I read that as "of the union member object".
    Why? No idea.

    boasema
    [snip]
     
    boa, Jan 26, 2006
    #15
  16. tedu

    S.Tobias Guest

    Keith Thompson <> wrote:
    > "tedu" <> writes:


    >> ok, let me try to pin this down a bit more.
    >> considering:
    >> union u {
    >> int x;
    >> int y;
    >> };
    >> (u.x == u.y) should always evaluate true?

    >
    > I believe so, yes. At least I can't think of a way a conforming
    > implementation could avoid it. (I'll assume u is an object of type
    > "union u".)
    >

    # 6.7.2.1
    # 14 [...] The value of at most one of the members can be stored in
    # a union object at any time. [...]

    It says that after you store a value into a union member, all other
    members don't have a value, so it must be UB to read through them.
    In the absence of other rules, I believe it gives the compiler license
    to assume that for value reading, different member-access expressions
    cannot alias. This means that a union could be treated as a struct, with
    the exception that for representation purposes the members must lie
    at the beginning of the object.

    I'm not sure how relevant this is (ie. how much the Readers are aware of
    this): In the draft n869.txt in 6.5.2.3#5, the first sentence "With one
    exception, if the value of a member of a union object is used when the
    most recent store to the object was to a different member, the behavior
    is implementation-defined.70)" is *not* in the C99 Standard (but I see
    it in C89 draft, so it must have been in C89.).

    I'm aware of a point in Annex J.1, which lists as unspecified:
    # -- The value of a union member other than the last one stored
    # into (6.2.6.1).
    IMHO it is wrong: 6.2.6.1p7 is talking about the representation of a
    union object, and does not give the complete semantics of member access.

    I have looked through a number of posts from recent years. I won't
    give any specific references; it's enough to say that opinions varied to
    the extremes. E.g. C.Feather, in exactly the same case as the one
    discussed above, said it was defined (it's undefined if union members are
    incompatible); on another occasion Dan Pop said one could read only the
    last written-to member, with an exception of character members; at yet
    another time Doug Gwyn said reading a not-last-written-to union member
    was meant to be undefined (but his remark was in a context where two
    fields were incompatible, so I couldn't judge how far conclusions could
    be drawn).

    (I suggest, can we add c.s.c. to the discussion?)

    --
    Stan Tobias
    mailx `echo LID | sed s/[[:upper:]]//g`
     
    S.Tobias, Jan 30, 2006
    #16
  17. tedu

    Netocrat Guest

    On Mon, 30 Jan 2006 11:17:30 +0000, S.Tobias wrote:
    > Keith Thompson <> wrote:
    >> "tedu" <> writes:

    >
    >>> ok, let me try to pin this down a bit more. considering:
    >>> union u {
    >>> int x;
    >>> int y;
    >>> };
    >>> (u.x == u.y) should always evaluate true?

    >>
    >> I believe so, yes. At least I can't think of a way a conforming
    >> implementation could avoid it. (I'll assume u is an object of type
    >> "union u".)
    >>

    > # 6.7.2.1
    > # 14 [...] The value of at most one of the members can be stored in
    > # a union object at any time. [...]
    >
    > It says that after you store a value into a union member, all other
    > members don't have a value, so it must be UB to read through them. In
    > the absence of other rules, I believe it gives the compiler license to
    > assume that for value reading, different member-access expressions
    > cannot alias. This means that a union could be treated as a struct,
    > with the exception that for representation purposes the members must lie
    > at the beginning of the object.
    >
    > I'm not sure how relevant this is (ie. how much the Readers are aware of
    > this): In the draft n869.txt in 6.5.2.3#5, the first sentence "With one
    > exception, if the value of a member of a union object is used when the
    > most recent store to the object was to a different member, the behavior
    > is implementation-defined.70)" is *not* in the C99 Standard (but I see
    > it in C89 draft, so it must have been in C89.).
    >
    > I'm aware of a point in Annex J.1, which lists as unspecified:
    > # -- The value of a union member other than the last one stored
    > # into (6.2.6.1).
    > IMHO it is wrong: 6.2.6.1p7 is talking about the representation of a
    > union object, and does not give the complete semantics of member access.
    >
    > I have looked through a number of posts from recent years. I won't
    > give any specific references; it's enough to say that opinions varied to
    > the extremes.

    [omit listing of specific opinions]

    Did you come across Tim Rentsch's post to c.l.c of 6 December 2005, where
    he points out DR283? Its TC is not yet present in n1124, but it reads:
    | Attach a new footnote 78a to the words "named member" in 6.5.2.3#3:
    |
    | 78a If the member used to access the contents of a union object is
    | not the same as the member last used to store a value in the object,
    | the appropriate part of the object representation of the value is
    | reinterpreted as an object representation in the new type as
    | described in 6.2.6 (a process sometimes called "type punning"). This
    | might be a trap representation.

    > (I suggest, can we add c.s.c. to the discussion?)


    I haven't added c.s.c as it seems from that DR and the associated DR257
    that this has been pretty well discussed already.

    --
    http://members.dodo.com.au/~netocrat
     
    Netocrat, Jan 30, 2006
    #17