question on union

Discussion in 'C Programming' started by Morris Dovey, Feb 13, 2008.

  1. Morris Dovey

    Morris Dovey Guest

    Roman Mashak wrote:

    > What I don't get is how come that un.c[0] and un.c[1] both contain what has
    > been un.s initialized, i.e. 0x0102. Is it a feature of 'union'?
    > Why could not we use 'struct' to check how bytes are placed in memory ?


    The elements in a structure all occupy separate and distinct
    "pieces" of memory, but the elements of a union all occupy a
    common "piece" of memory.

    --
    Morris Dovey
    DeSoto Solar
    DeSoto, Iowa USA
    http://www.iedu.com/DeSoto
    Morris Dovey, Feb 13, 2008
    #1

  2. Morris Dovey

    user923005 Guest

    On Feb 13, 12:43 pm, "Roman Mashak" <> wrote:
    > Hello, Morris!
    > You wrote  on Tue, 12 Feb 2008 20:32:28 -0500:
    >
    >  ??>> What I don't get is how come that un.c[0] and un.c[1] both contain
    >  ??>> what has been un.s initialized, i.e. 0x0102. Is it a feature of
    >  ??>> 'union'? Why could not we use 'struct' to check how bytes are placed
    >  ??>> in memory ?
    >
    >  MD> The elements in a structure all occupy separate and distinct
    >  MD> "pieces" of memory, but the elements of a union all occupy a
    >  MD> common "piece" of memory.
    > What is the mechanics behind that? Say, in posted example at run-time un.s =
    > 0x0102 and this value (0x0102) occupies some common memory. Is it CPU who is
    > in charge to lay out un.c value in the memory according to architecture?


    The mechanics don't matter. But one possible implementation is to
    simply alias several distinct data type addresses to the same address
    in memory, with a pointer for each desired type. In the real world,
    it probably won't happen that way, since it would be wasteful. More
    likely, there is a table in memory somewhere describing the layout for
    each distinct union type. That table will have the different ways to
    interpret the different members of the union recorded.

    In C, you don't have to worry about the physical mechanics of such a
    thing. Trust the compiler writers to create a good implementation of
    it. And if they don't, switch compilers.

    > Then why do both values look differently, in debugger:
    >
    > (gdb) p/x un
    > $3 = {s = 0x102, c = {0x2, 0x1}}
    > (gdb)


    Because you are interpreting one value as one type and the other value
    as a different type.
    user923005, Feb 13, 2008
    #2

  3. Morris Dovey

    Arthur Guest

    On Feb 14, 4:43 am, "Roman Mashak" <> wrote:
    > Hello, Morris!
    > You wrote on Tue, 12 Feb 2008 20:32:28 -0500:
    >
    > ??>> What I don't get is how come that un.c[0] and un.c[1] both contain
    > ??>> what has been un.s initialized, i.e. 0x0102. Is it a feature of
    > ??>> 'union'? Why could not we use 'struct' to check how bytes are placed
    > ??>> in memory ?
    >
    > MD> The elements in a structure all occupy separate and distinct
    > MD> "pieces" of memory, but the elements of a union all occupy a
    > MD> common "piece" of memory.
    > What is the mechanics behind that? Say, in posted example at run-time un.s =
    > 0x0102 and this value (0x0102) occupies some common memory. Is it CPU who is
    > in charge to lay out un.c value in the memory according to architecture?
    > Then why do both values look differently, in debugger:
    >
    > (gdb) p/x un
    > $3 = {s = 0x102, c = {0x2, 0x1}}
    > (gdb)
    >
    > With best regards, Roman Mashak. E-mail:


    Hello! The CPU doesn't lay out a union according to the
    architecture; in fact, your compiler does so when it compiles
    your program.

    Suppose you define:
        union u_tag {
            int i;
            char c[sizeof(int)];
        } u;
    and refer to it using
        u.i = 0x12;
        u.c[0] = 0x12;
    the compiler will simply convert them into instructions like this:
        movl $0x12, _u
        movb $0x12, _u
    The compiler uses the same symbol for u.i and u.c[0].

    The reason why they look different in your debugger is that
    Intel CPUs are little-endian.
    un.s is placed in memory like this:
        0x02 0x01
    When referred to as un.s, it means a short int 0x0102, i.e. s = 0x102.
    When referred to as un.c, it means an array of char, {0x02, 0x01}.

    Please correct me if I made any mistakes. Have a good day!
    Arthur, Feb 13, 2008
    #3
  4. Morris Dovey

    Arthur Guest

    On Feb 13, 1:19 pm, user923005 <> wrote:
    > On Feb 13, 12:43 pm, "Roman Mashak" <> wrote:
    >
    > > Hello, Morris!
    > > You wrote on Tue, 12 Feb 2008 20:32:28 -0500:

    >
    > > ??>> What I don't get is how come that un.c[0] and un.c[1] both contain
    > > ??>> what has been un.s initialized, i.e. 0x0102. Is it a feature of
    > > ??>> 'union'? Why could not we use 'struct' to check how bytes are placed
    > > ??>> in memory ?

    >
    > > MD> The elements in a structure all occupy separate and distinct
    > > MD> "pieces" of memory, but the elements of a union all occupy a
    > > MD> common "piece" of memory.
    > > What is the mechanics behind that? Say, in posted example at run-time un.s =
    > > 0x0102 and this value (0x0102) occupies some common memory. Is it CPU who is
    > > in charge to lay out un.c value in the memory according to architecture?

    >
    > The mechanics don't matter. But one possible implementation is to
    > simply alias several distinct data type addresses to the same address
    > in memory, with a pointer for each desired type. In the real world,
    > it probably won't happen that way, since it would be wasteful. More
    > likely, there is a table in memory somewhere describing the layout for
    > each distinct union type. That table will have the different ways to
    > interpret the different members of the union recorded.
    >
    > In C, you don't have to worry about the physical mechanics of such a
    > thing. Trust the compiler writers to create a good implementation of
    > it. And if they don't, switch compilers.
    >
    > > Then why do both values look differently, in debugger:

    >
    > > (gdb) p/x un
    > > $3 = {s = 0x102, c = {0x2, 0x1}}
    > > (gdb)

    >
    > Because you are interpreting one value as one type and the other value
    > as a different type.
    Arthur, Feb 13, 2008
    #4
  5. Morris Dovey

    Arthur Guest

    On Feb 14, 7:32 am, "Roman Mashak" <> wrote:
    > Hello, Arthur!
    > You wrote on Tue, 12 Feb 2008 21:59:55 -0800 (PST):
    >
    > [skip]
    > Thanks for your explanations.
    >
    > A> and refers it using
    > A> u.i = 0x12;
    > A> u.c[0] = 0x12;
    > A> the compiler will simply convert them into instructions like this:
    > A> movl $0x12, _u
    > A> movb $0x12, _u
    > A> The compiler uses same symbols for u.i and u.c[0].
    >
    > A> The reason why they look different in your debugger is that
    > A> Intel CPUs use little-endian.
    > A> un.s is placed in memory like this:
    > A> 0x02 0x01
    > A> when referred as u.s, it means a short int 0x0102, i.e. s = 0x102
    > A> when referred as u.c, it means an array of char, {0x02, 0x01}
    > But both u.i and u.c are placed in memory on the same little-endian machine,
    > why do they look differently? I can't catch how it is done.
    >
    > With best regards, Roman Mashak. E-mail:


    Hello! Your compiler stores the information that un.s is a short int
    and un.c[] is an array of char. When you compile your program with
    -g, it passes that information to your debugger, so the debugger
    knows it too.

    To understand why they look different, keep in mind that both un.c
    and un.s are symbols that are simply addresses in memory (and the
    two addresses are the same).

    Suppose the union 'un' has been placed at address 0x80490d4. When
    you execute 'un.s = 0x102;', the processor sets the byte at
    0x80490d4 to 0x02 and the byte at 0x80490d5 to 0x01, since the
    Intel CPU is little-endian.

    When you refer to un.s, since sizeof(short) is 2 (on most 32-bit
    systems), the CPU fetches the two bytes at 0x80490d4 (0x02) and
    0x80490d5 (0x01) and combines them in little-endian order. That
    gives 0x0102, i.e. 0x102, just as your debugger reports.

    But when you refer to un.c, since sizeof(char) is 1, the CPU fetches
    the byte at 0x80490d4 (0x02) and presents it to the debugger, and
    then fetches the next one (0x01). It doesn't combine them, so they
    appear in the order in which they sit in memory, 0x02, 0x01, just as
    your debugger reports.

    My explanation is lengthy, sorry.
    Arthur, Feb 13, 2008
    #5
  6. Roman Mashak wrote:
    > Hello,
    >
    > I'm going through the "UNIX network programming" by R.Stevens and stuck with
    > the following code, determining the endianness of a host it is running on:
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    >
    > #define CPU_VENDOR_OS "i686-pc-linux-gnu"
    >
    > int main(void)
    > {
    > union {
    > short s;
    > char c[sizeof(short)];
    > } un;
    >
    > un.s = 0x0102;
    > printf("%s: ", CPU_VENDOR_OS);
    > if (sizeof(short) == 2) {
    > if (un.c[0] == 1 && un.c[1] == 2)
    > printf("big-endian\n");
    > else if (un.c[0] == 2 && un.c[1] == 1)
    > printf("little-endian\n");
    > else
    > printf("unknown\n");
    > } else
    > printf("sizeof(short) = %d\n", sizeof(short));
    >
    > exit(0);
    > }
    >
    > What I don't get is how come that un.c[0] and un.c[1] both contain what has
    > been un.s initialized, i.e. 0x0102. Is it a feature of 'union'?
    > Why could not we use 'struct' to check how bytes are placed in memory ?


    The program is doing a very bad thing. The folks who "explained" why
    this code "works" are doing you a disservice. The value of any union
    member other than the last stored into is unspecified. _Never_ store
    into one member of a union and attempt to access its value through
    another except when accessing an identical common initial segment of
    struct members. This is a special exception to the general rule that a
    union can contain only one of its component values at a time. Storing
    into one member and accessing another is attempting to have the union
    contain more than one component value at a time.

    You can accomplish the above with a non-union array into which you
    memmove a value.
    Martin Ambuhl, Feb 13, 2008
    #6
  7. Morris Dovey

    Mark Bluemel Guest

    Martin Ambuhl wrote:
    > Roman Mashak wrote:
    >> Hello,
    >>
    >> I'm going through the "UNIX network programming" by R.Stevens and
    >> stuck with the following code, determining the endianness of a host it
    >> is running on:
    >>
    >> #include <stdio.h>
    >> #include <stdlib.h>
    >>
    >> #define CPU_VENDOR_OS "i686-pc-linux-gnu"
    >>
    >> int main(void)
    >> {
    >> union {
    >> short s;
    >> char c[sizeof(short)];
    >> } un;
    >>
    >> un.s = 0x0102;
    >> printf("%s: ", CPU_VENDOR_OS);
    >> if (sizeof(short) == 2) {
    >> if (un.c[0] == 1 && un.c[1] == 2)
    >> printf("big-endian\n");
    >> else if (un.c[0] == 2 && un.c[1] == 1)
    >> printf("little-endian\n");
    >> else
    >> printf("unknown\n");
    >> } else
    >> printf("sizeof(short) = %d\n", sizeof(short));
    >>
    >> exit(0);
    >> }


    > The program is doing a very bad thing.


    I _think_ that's a bit of an overstatement. The program is not intended
    to be totally portable C, given that it's included in a book on Unix
    programming. I've just glanced at my copy of the book and from context
    and comments in the text, it's clear that the gcc compiler is assumed.
    Mark Bluemel, Feb 13, 2008
    #7
  8. Mark Bluemel wrote:
    > Martin Ambuhl wrote:


    >
    >> The program is doing a very bad thing.

    >
    > I _think_ that's a bit of an overstatement. The program is not intended
    > to be totally portable C, given that it's included in a book on Unix
    > programming.


    Then it belongs in a Unix newsgroup, not in comp.lang.c

    > I've just glanced at my copy of the book and from context
    > and comments in the text, it's clear that the gcc compiler is assumed.


    And if gcc is significant, and if you are allergic to posting in the
    Unix newsgroups, your second choice is a gnu newsgroup.

    The people "answering" your question did you two gross disservices
    1) they told you that undefined behavior was defined and
    2) they led you to believe that off-topic posts were OK.
    Martin Ambuhl, Feb 13, 2008
    #8
  9. Morris Dovey

    Mark Bluemel Guest

    Martin Ambuhl wrote:
    > Mark Bluemel wrote:
    >> Martin Ambuhl wrote:

    >
    >>
    >>> The program is doing a very bad thing.

    >>
    >> I _think_ that's a bit of an overstatement. The program is not intended
    >> to be totally portable C, given that it's included in a book on Unix
    >> programming.

    >
    > Then it belongs in a Unix newsgroup, not in comp.lang.c


    Perhaps, but the OP didn't realise that. (Note that I am not the OP).

    >> I've just glanced at my copy of the book and from context
    >> and comments in the text, it's clear that the gcc compiler is assumed.


    > The people "answering" your question did you two gross disservices


    Again, it was not my question.

    > 1) they told you that undefined behavior was defined and
    > 2) they led you to believe that off-topic posts were OK.


    I'm not convinced it was an off-topic post.
    * The OP saw a piece of code, didn't quite "get it" and asked for
    clarification.
    * Some people answered his query inaccurately.
    * You answered somewhat harshly, but accurately, as far as I know.
    * I chose to add some further clarification.

    To the Original Poster:

    The program depends on behaviour which is not required by the C
    standard, but which appears to be dependable in the context in
    which the original author wrote it.

    Following Martin's suggestion, the program could perhaps be better
    written as:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define CPU_VENDOR_OS "i686-pc-linux-gnu"

    int main(void)
    {
        short s;
        char c[sizeof(short)];

        s = 0x0102;
        printf("%s: ", CPU_VENDOR_OS);
        if (sizeof(short) == 2) {
            /* copy the bytes of s into c, keeping their order in memory */
            memcpy(c, &s, sizeof(short));
            if (c[0] == 1 && c[1] == 2)
                printf("big-endian\n");
            else if (c[0] == 2 && c[1] == 1)
                printf("little-endian\n");
            else
                printf("unknown\n");
        } else
            /* sizeof yields a size_t, so print it with %zu (C99) */
            printf("sizeof(short) = %zu\n", sizeof(short));

        exit(0);
    }
    Mark Bluemel, Feb 13, 2008
    #9
  10. Morris Dovey

    Martin Guest

    On Feb 13, 8:50 am, Martin Ambuhl <> wrote:
    > _Never_ store into one member of a union and attempt to access its value through
    > another except when accessing an identical common initial segment of
    > struct members.  This is a special exception to the general rule that a
    > union can contain only one of its component values at a time.  Storing
    > into one member and accessing another is attempting to have the union
    > contain more than one component value at a time.


    Does that mean that the answer to Summit's "C Programming FAQs"
    Question 20.9 is wrong? Viz.:

    union {
        int i;
        char c[sizeof(int)];
    } x;

    x.i = 1;

    if (x.c[0] == 1)
        printf("little-endian\n");
    else
        printf("big-endian\n");
    Martin, Feb 13, 2008
    #10
  11. Martin wrote:
    > On Feb 13, 8:50 am, Martin Ambuhl <> wrote:
    >> _Never_ store into one member of a union and attempt to access its value through
    >> another except when accessing an identical common initial segment of
    >> struct members. This is a special exception to the general rule that a
    >> union can contain only one of its component values at a time. Storing
    >> into one member and accessing another is attempting to have the union
    >> contain more than one component value at a time.

    >
    > Does that mean that the answer to Summit's "C Programming FAQs"
    > Question 20.9 is wrong? Viz.:
    >
    > union {
    > int i;
    > char c[sizeof(int)];
    > } x;
    >
    > x.i = 1;
    >
    > if (x.c[0] == 1)
    > printf("little-endian\n");
    > else
    > printf("big-endian\n");
    >


    Yes, it does. Notice that this snippet corresponds to Harbison &
    Steele's example program in 6.1.2 "Byte Ordering". H&S correctly
    introduce it with this text: "Here is a program that determines a
    computer's byte ordering by using a union in a nonportable fashion."
    The FAQ's reference is to an older edition of H&S, so I don't know if
    that text was there. If that text was there, Steve ought not to have
    suppressed it. In any case, if he keeps that example he ought to add
    such a disclaimer. Nonportable uses of the language in the FAQ ought to
    be flagged. Interestingly, the nonportable use of the language makes
    this code worthless, since it is designed to tell you something about
    nonportable aspects of an implementation.
    Martin Ambuhl, Feb 13, 2008
    #11
  12. Morris Dovey

    Roman Mashak Guest

    Hello,

    I'm going through "UNIX Network Programming" by R. Stevens and am stuck on
    the following code, which determines the endianness of the host it is running on:

    #include <stdio.h>
    #include <stdlib.h>

    #define CPU_VENDOR_OS "i686-pc-linux-gnu"

    int main(void)
    {
        union {
            short s;
            char c[sizeof(short)];
        } un;

        un.s = 0x0102;
        printf("%s: ", CPU_VENDOR_OS);
        if (sizeof(short) == 2) {
            if (un.c[0] == 1 && un.c[1] == 2)
                printf("big-endian\n");
            else if (un.c[0] == 2 && un.c[1] == 1)
                printf("little-endian\n");
            else
                printf("unknown\n");
        } else
            /* sizeof yields a size_t, so print it with %zu (C99) */
            printf("sizeof(short) = %zu\n", sizeof(short));

        exit(0);
    }

    What I don't get is how un.c[0] and un.c[1] come to contain the value that
    un.s was initialized with, i.e. 0x0102. Is that a feature of 'union'?
    Why couldn't we use a 'struct' to check how bytes are placed in memory?

    Thanks in advance!


    With best regards, Roman Mashak. E-mail:
    Roman Mashak, Feb 13, 2008
    #12
  13. Morris Dovey

    Roman Mashak Guest

    Hello, Morris!
    You wrote on Tue, 12 Feb 2008 20:32:28 -0500:

    ??>> What I don't get is how come that un.c[0] and un.c[1] both contain
    ??>> what has been un.s initialized, i.e. 0x0102. Is it a feature of
    ??>> 'union'? Why could not we use 'struct' to check how bytes are placed
    ??>> in memory ?

    MD> The elements in a structure all occupy separate and distinct
    MD> "pieces" of memory, but the elements of a union all occupy a
    MD> common "piece" of memory.
    What are the mechanics behind that? Say, in the posted example, at run time
    un.s = 0x0102 and this value (0x0102) occupies some common memory. Is it the
    CPU that is in charge of laying out the un.c value in memory according to the
    architecture? Then why do the two values look different in the debugger:

    (gdb) p/x un
    $3 = {s = 0x102, c = {0x2, 0x1}}
    (gdb)

    With best regards, Roman Mashak. E-mail:
    Roman Mashak, Feb 13, 2008
    #13
  14. Martin Ambuhl <> writes:

    > Roman Mashak wrote:
    >> I'm going through the "UNIX network programming" by R.Stevens and
    >> stuck with the following code, determining the endianness of a host
    >> it is running on:
    >>
    >> #include <stdio.h>
    >> #include <stdlib.h>
    >>
    >> #define CPU_VENDOR_OS "i686-pc-linux-gnu"
    >>
    >> int main(void)
    >> {
    >> union {
    >> short s;
    >> char c[sizeof(short)];
    >> } un;
    >>
    >> un.s = 0x0102;
    >> printf("%s: ", CPU_VENDOR_OS);
    >> if (sizeof(short) == 2) {
    >> if (un.c[0] == 1 && un.c[1] == 2)
    >> printf("big-endian\n");
    >> else if (un.c[0] == 2 && un.c[1] == 1)
    >> printf("little-endian\n");
    >> else
    >> printf("unknown\n");
    >> } else
    >> printf("sizeof(short) = %d\n", sizeof(short));
    >>
    >> exit(0);
    >> }
    >>
    >> What I don't get is how come that un.c[0] and un.c[1] both contain
    >> what has been un.s initialized, i.e. 0x0102. Is it a feature of
    >> union'?
    >> Why could not we use 'struct' to check how bytes are placed in memory ?

    >
    > The program is doing a very bad thing. The folks who "explained" why
    > this code "works" are doing you a disservice. The value of any union
    > member other than the last stored into is unspecified.


    Can you cite the prohibition? I thought it had been removed. There
    is a footnote (yes, I know, non-normative) that states:

    If the member used to access the contents of a union object is not
    the same as the member last used to store a value in the object, the
    appropriate part of the object representation of the value is
    reinterpreted as an object representation in the new type as
    described in 6.2.6 (a process sometimes called "type punning"). This
    might be a trap representation.

    (6.2.6 is the section on the representations of types.) Since unsigned
    char can't have trap representations, I think the code above could be
    re-written to stay within the letter of C99. The intent seems clear:
    to allow type punning using a union.

    --
    Ben.
    Ben Bacarisse, Feb 13, 2008
    #14
  15. Morris Dovey

    Michael Mair Guest

    Ben Bacarisse wrote:
    > Martin Ambuhl <> writes:
    >>Roman Mashak wrote:
    >>
    >>>I'm going through the "UNIX network programming" by R.Stevens and
    >>>stuck with the following code, determining the endianness of a host
    >>>it is running on:
    >>>
    >>>#include <stdio.h>
    >>>#include <stdlib.h>
    >>>
    >>>#define CPU_VENDOR_OS "i686-pc-linux-gnu"
    >>>
    >>>int main(void)
    >>>{
    >>> union {
    >>> short s;
    >>> char c[sizeof(short)];
    >>> } un;
    >>>
    >>> un.s = 0x0102;
    >>> printf("%s: ", CPU_VENDOR_OS);
    >>> if (sizeof(short) == 2) {
    >>> if (un.c[0] == 1 && un.c[1] == 2)
    >>> printf("big-endian\n");
    >>> else if (un.c[0] == 2 && un.c[1] == 1)
    >>> printf("little-endian\n");
    >>> else
    >>> printf("unknown\n");
    >>> } else
    >>> printf("sizeof(short) = %d\n", sizeof(short));
    >>>
    >>> exit(0);
    >>>}
    >>>
    >>>What I don't get is how come that un.c[0] and un.c[1] both contain
    >>>what has been un.s initialized, i.e. 0x0102. Is it a feature of
    >>>union'?
    >>>Why could not we use 'struct' to check how bytes are placed in memory ?

    >>
    >>The program is doing a very bad thing. The folks who "explained" why
    >>this code "works" are doing you a disservice. The value of any union
    >>member other than the last stored into is unspecified.

    >
    > Can you cite the prohibition? I thought it had been removed. There
    > is a footnote (yes, I know, non-normative) that states:
    >
    > If the member used to access the contents of a union object is not
    > the same as the member last used to store a value in the object, the
    > appropriate part of the object representation of the value is
    > reinterpreted as an object representation in the new type as
    > described in 6.2.6 (a process sometimes called "type punning"). This
    > might be a trap representation.
    >
    > (6.2.6 is the section on the representations of types.) Since unsigned
    > char can't have trap representations, I think the code above could be
    > re-written to stay within the letter of C99. The intent seems clear:
    > to allow type punning using a union.


    In the thread starting at
    <>
    Tim Rentsch pointed out

    ,- From <> --
    My understanding is that the storing one member of a union in
    different memory than another member was the result of unclear
    language in the standard, and that the unclear language is
    expected to be addressed through a TC. See:
    http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm
    `----

    Cheers
    Michael
    --
    E-Mail: Mine is an /at/ gmx /dot/ de address.
    Michael Mair, Feb 13, 2008
    #15
  16. Morris Dovey

    Roman Mashak Guest

    Hello, Arthur!
    You wrote on Tue, 12 Feb 2008 21:59:55 -0800 (PST):

    [skip]
    Thanks for your explanations.

    A> and refers it using
    A> u.i = 0x12;
    A> u.c[0] = 0x12;
    A> the compiler will simply convert them into instructions like this:
    A> movl $0x12, _u
    A> movb $0x12, _u
    A> The compiler uses same symbols for u.i and u.c[0].

    A> The reason why they look different in your debugger is that
    A> Intel CPUs use little-endian.
    A> un.s is placed in memory like this:
    A> 0x02 0x01
    A> when referred as u.s, it means a short int 0x0102, i.e. s = 0x102
    A> when referred as u.c, it means an array of char, {0x02, 0x01}
    But both u.i and u.c are placed in memory on the same little-endian machine,
    so why do they look different? I can't grasp how it is done.

    With best regards, Roman Mashak. E-mail:
    Roman Mashak, Feb 13, 2008
    #16
  17. Ben Bacarisse wrote:
    > Martin Ambuhl <> writes:

    [...]
    >> The value of any union
    >> member other than the last stored into is unspecified.

    >
    > Can you cite the prohibition?


    Annex J is "informative", but it explicitly includes:

    J.1 Unspecified behavior
    1 The following are unspecified:
    [...]
    -- The value of a union member other than the last one stored into
    (6.2.6.1).
    Martin Ambuhl, Feb 14, 2008
    #17
  18. Martin Ambuhl <> writes:

    > Ben Bacarisse wrote:
    >> Martin Ambuhl <> writes:

    > [...]
    >>> The value of any union
    >>> member other than the last stored into is unspecified.

    >>
    >> Can you cite the prohibition?

    >
    > Annex J is "informative", but it explicitly includes:
    >
    > J.1 Unspecified behavior
    > 1 The following are unspecified:
    > [...]
    > -- The value of a union member other than the last one stored into
    > (6.2.6.1).


    Ah, right. I misunderstood your rather strong prohibition on
    doing this type punning with a union. The behaviour is unspecified,
    but so is the behaviour of your suggested alternative. Using memcpy
    and inspecting the result will be no more specified than doing the
    union trick. Is your objection to the union method stronger than
    this?

    --
    Ben.
    Ben Bacarisse, Feb 14, 2008
    #18
  19. Morris Dovey

    Martin Guest

    On Feb 13, 5:49 pm, Martin Ambuhl <> wrote:
    > Yes, it does.  Notice that this snippet corresponds
    > to Harbison & Steele's example program in 6.1.2 "Byte Ordering".
    > H&S correctly introduce it with this text: "Here is a program
    > that determines a computer's byte ordering by using a union in
    > a nonportable fashion." The FAQ's reference is to an older
    > edition of H&S, so I don't know if that text was there.  If that
    > text was there, Steve ought not to have suppressed it.  In any
    > case, if he keeps that example he ought to add such a disclaimer.
    > Nonportable uses of the language in the FAQ ought to be flagged.
    > Interestingly, the nonportable use of the language makes this
    > code worthless, since it is designed to tell you something about
    > nonportable aspects of an implementation.


    My copy of the book is dated 1996. I don't think there is a later
    version.

    In the book, as well as the union example I posted, there is also the
    example as provided in the online FAQ, which uses a pointer. The
    online FAQ and my edition of the book also cross-reference Harbison
    & Steele Sec. 6.1.2 pp. 163-4.

    The introductory text you quote is not in my edition of the book.

    --
    Martin
    Martin, Feb 14, 2008
    #19
  20. Morris Dovey

    Roman Mashak Guest

    Hello, Martin!
    You wrote on Wed, 13 Feb 2008 05:31:07 -0500:

    ??>> I _think_ that's a bit of an overstatement. The program is not
    ??>> intended to be totally portable C, given that it's included in a book
    ??>> on Unix programming.

    MA> Then it belongs in a Unix newsgroup, not in comp.lang.c

    I thought the code belonged in a C forum, because there were no
    Unix-specific calls. And this turned out to be true: I learned that such
    behavior is undefined and not portable.
    Thanks for your explanations.

    ??>> I've just glanced at my copy of the book and from context
    ??>> and comments in the text, it's clear that the gcc compiler is assumed.

    MA> And if gcc is significant, and if you are allergic to posting in the
    MA> Unix newsgroups, your second choice is a gnu newsgroup.


    With best regards, Roman Mashak. E-mail:
    Roman Mashak, Feb 14, 2008
    #20
