Pre-offsetof() question

Discussion in 'C Programming' started by Arthur J. O'Dwyer, Oct 6, 2003.

  1. As far as I know, C89/C90 did not contain the
    now-standard offsetof() macro.

    Did C89 mandate that structs had to have a consistent
    layout? For example, consider the typical layout of
    the following structure:

    struct weird
    {
        int x;    /* sizeof(int)==4 here */
        double y; /* sizeof(double)==8 here */
        int z;
    };

    Now, let's suppose that the target architecture has typical
    80x86 alignment requirements, where 'int' aligns on 4-byte
    boundaries and 'double' on 8-byte boundaries.
    A C99 compiler might produce a layout that looked like
    this:

    |_x__|####|___y____|_z__|####|

    sizeof (struct weird) == 24 bytes


    But could a C89, pre-offsetof() compiler decide to make
    the layout of the struct vary, like this:

    |_x__|####|___y____|_z__| on 8-byte alignment

    |_x__|___y____|####|_z__| on 4-byte alignment

    sizeof (struct weird) == 20 bytes


    Note that the relative ordering of the members is
    preserved; each 'struct weird' has the same size in
    bytes; and all objects are properly aligned for their
    type. But the "weird" ordering has saved us 4 bytes
    per structure!

    Does C89 allow this, or is it disallowed by something
    in that standard? If so, what?

    TIA,
    -Arthur
    Arthur J. O'Dwyer, Oct 6, 2003
    #1

  2. Eric Sosman

    "Arthur J. O'Dwyer" wrote:
    >
    > As far as I know, C89/C90 did not contain the
    > now-standard offsetof() macro.


    Full stop: C89 invented the <stddef.h> header, and specified
    that it must provide offsetof().
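
    As an illustration of the macro mentioned here, the following is a
    minimal sketch that uses offsetof() from <stddef.h> to report where
    each member of struct weird lands. The function names
    print_weird_layout and weird_layout_ok are hypothetical. The actual
    offsets are implementation-defined, so the check covers only the
    ordering and size relationships, not specific numbers.

```c
#include <stddef.h> /* offsetof, size_t */
#include <stdio.h>

struct weird
{
    int x;
    double y;
    int z;
};

/* Print this implementation's layout for struct weird. */
void print_weird_layout(void)
{
    printf("offsetof x = %lu\n", (unsigned long)offsetof(struct weird, x));
    printf("offsetof y = %lu\n", (unsigned long)offsetof(struct weird, y));
    printf("offsetof z = %lu\n", (unsigned long)offsetof(struct weird, z));
    printf("sizeof     = %lu\n", (unsigned long)sizeof(struct weird));
}

/* Hypothetical helper: returns 1 if the members appear in declaration
   order without overlapping and the first member sits at offset 0. */
int weird_layout_ok(void)
{
    return offsetof(struct weird, x) == 0
        && offsetof(struct weird, y) >= offsetof(struct weird, x) + sizeof(int)
        && offsetof(struct weird, z) >= offsetof(struct weird, y) + sizeof(double)
        && sizeof(struct weird) >= offsetof(struct weird, z) + sizeof(int);
}
```

    Calling print_weird_layout() on a given compiler shows the one
    layout that compiler has chosen for every struct weird object.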

    > Did C89 mandate that structs had to have a consistent
    > layout? For example, consider the typical layout of
    > the following structure:
    >
    > struct weird
    > {
    > int x; /* sizeof(int)==4 here */
    > double y; /* sizeof(double)==8 here */
    > int z;
    > };
    >
    > Now, let's suppose that the target architecture has typical
    > 80x86 alignment requirements, where 'int' aligns on 4-byte
    > boundaries and 'double' on 8-byte boundaries.
    > A C99 compiler might produce a layout that looked like
    > this:
    >
    > |_x__|####|___y____|_z__|####|
    >
    > sizeof (struct weird) == 24 bytes
    >
    > But could a C89, pre-offsetof() compiler decide to make
    > the layout of the struct vary, like this:
    >
    > |_x__|####|___y____|_z__| on 8-byte alignment
    >
    > |_x__|___y____|####|_z__| on 4-byte alignment
    >
    > sizeof (struct weird) == 20 bytes
    >
    > Note that the relative ordering of the members is
    > preserved; each 'struct weird' has the same size in
    > bytes; and all objects are properly aligned for their
    > type. But the "weird" ordering has saved us 4 bytes
    > per structure!
    >
    > Does C89 allow this, or is it disallowed by something
    > in that standard? If so, what?


    No version of the Standard describes what alignments
    are to be enforced. However, the rules for compatibility
    of types guarantee that the same struct type will have the
    same arrangement of padding bytes in all translation units.

    Could this arrangement be different depending on flags
    calling for different "strictnesses" of alignment? Yes, of
    course -- but this isn't a contradiction, because using a
    different set of compiler flags gives you a different
    implementation of C, and the Standard makes no requirement
    that translation units compiled by different implementations
    must interoperate.

    By the way, note that your 8-byte alignment example is
    faulty. If a double must be aligned to an 8-byte boundary,
    the sizeof a struct containing a double must be a multiple
    of 8 bytes. Otherwise, you would not be able to malloc()
    an array of two such structs:

    struct weird *p = malloc(2 * sizeof *p); // assume 40

    0    4    8        16   20   24   28       36   40
    |_x__|####|___y____|_z__|_x__|####|___y____|_z__|
    ^                       ^
    |                       |
    p                       p+1

    Note that (p+1)->y is mis-aligned.
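
    The arithmetic behind this can be sketched in C. The code below is
    an illustration under stated assumptions, not a portable alignment
    query: struct align_probe, DOUBLE_ALIGN, y_offset_in_array and
    y_is_aligned are hypothetical names, and the probe-struct trick
    yields the alignment the compiler actually uses for a double member
    on mainstream implementations (C11 later added _Alignof for this).

```c
#include <stddef.h> /* offsetof, size_t */

struct weird
{
    int x;
    double y;
    int z;
};

/* Probe struct: the offset of d is the padding the compiler inserts
   to align a double that follows a single char. */
struct align_probe
{
    char c;
    double d;
};

#define DOUBLE_ALIGN offsetof(struct align_probe, d)

/* Byte offset of element [i].y from the start of an array of struct weird. */
size_t y_offset_in_array(size_t i)
{
    return i * sizeof(struct weird) + offsetof(struct weird, y);
}

/* Returns 1 if [i].y is aligned, assuming the array itself starts at a
   suitably aligned address (as memory from malloc() does). */
int y_is_aligned(size_t i)
{
    return y_offset_in_array(i) % DOUBLE_ALIGN == 0;
}
```

    With the hypothetical 20-byte struct and 8-byte alignment, p[1].y
    would start at byte 20 + 8 = 28, and 28 % 8 == 4, which is the
    misalignment shown in the diagram.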

    --
    Eric Sosman, Oct 6, 2003
    #2

  3. On Mon, 6 Oct 2003, Eric Sosman wrote:
    >
    > Arthur J. O'Dwyer wrote:
    > >
    > > As far as I know, C89/C90 did not contain the
    > > now-standard offsetof() macro.

    >
    > Full stop: C89 invented the <stddef.h> header, and specified
    > that it must provide offsetof().


    Oops. I guess the point is moot, then.

    > > struct weird
    > > {
    > > int x; /* sizeof(int)==4 here */
    > > double y; /* sizeof(double)==8 here */
    > > int z;
    > > };


    > > But could a C89, pre-offsetof() compiler decide to make
    > > the layout of the struct vary, like this:
    > >
    > > |_x__|####|___y____|_z__| on 8-byte alignment
    > >
    > > |_x__|___y____|####|_z__| on 4-byte alignment
    > >
    > > sizeof (struct weird) == 20 bytes
    > >
    > > Note that the relative ordering of the members is
    > > preserved; each 'struct weird' has the same size in
    > > bytes; and all objects are properly aligned for their
    > > type. But the "weird" ordering has saved us 4 bytes
    > > per structure!



    > No version of the Standard describes what alignments
    > are to be enforced. However, the rules for compatibility
    > of types guarantee that the same struct type will have the
    > same arrangement of padding bytes in all translation units.


    How so? (Obviously, the existence of 'offsetof' assumes
    that all 'struct weird's will have the same layout -- but
    would that rule be explicitly stated anywhere if 'offsetof'
    didn't exist?)


    > By the way, note that your 8-byte alignment example is
    > faulty. If a double must be aligned to an 8-byte boundary,
    > the sizeof a struct containing a double must be a multiple
    > of 8 bytes.


    Why? (Other than the paragraph which in N869 is 7.17#3,
    that is.)

    > Otherwise, you would not be able to malloc()
    > an array of two such structs:
    >
    > struct weird *p = malloc(2 * sizeof *p); // assume 40
    >
    > 0    4    8        16   20   24   28       36   40
    > |_x__|####|___y____|_z__|_x__|####|___y____|_z__|


    Ah -- your diagram is incorrect. :) The "correct" layout
    for two optimized (but apparently non-conforming) 'struct
    weird's is:

    > 0    4    8        16   20   24   28       36   40
      |_x__|####|___y____|_z__|_x__|___y____|####|_z__|
    > ^                       ^
    > |                       |
    > p                       p+1
    >
    > Note that (p+1)->y is mis-aligned.


    Not anymore -- not if we remove 7.17#3. I had thought
    that C89 didn't have offsetof(); apparently I was
    wrong. Never mind, then.

    -Arthur
    Arthur J. O'Dwyer, Oct 6, 2003
    #3
  4. Eric Sosman

    "Arthur J. O'Dwyer" wrote:
    >
    > On Mon, 6 Oct 2003, Eric Sosman wrote:
    > >
    > > By the way, note that your 8-byte alignment example is
    > > faulty. If a double must be aligned to an 8-byte boundary,
    > > the sizeof a struct containing a double must be a multiple
    > > of 8 bytes.

    >
    > Why? (Other than the paragraph which in N869 is 7.17#3,
    > that is.)
    >
    > > Otherwise, you would not be able to malloc()
    > > an array of two such structs:
    > >
    > > struct weird *p = malloc(2 * sizeof *p); // assume 40
    > >
    > > 0    4    8        16   20   24   28       36   40
    > > |_x__|####|___y____|_z__|_x__|####|___y____|_z__|

    >
    > Ah -- your diagram is incorrect. :) The "correct" layout
    > for two optimized (but apparently non-conforming) 'struct
    > weird's is:
    >
    > > 0    4    8        16   20   24   28       36   40
    >   |_x__|####|___y____|_z__|_x__|___y____|####|_z__|
    > > ^                       ^
    > > |                       |
    > > p                       p+1
    > >
    > > Note that (p+1)->y is mis-aligned.

    >
    > Not anymore -- not if we remove 7.17#3. I had thought
    > that C89 didn't have offsetof(); apparently I was
    > wrong. Never mind, then.


    Aha! Finally, the mystery of why offsetof intruded itself
    into an apparently unrelated question becomes clear. Just to
    be sure I've understood you: You're wondering whether different
    instances of struct weird in the same program could arrange
    their padding differently. Clearly, this cannot be the case
    if offsetof(struct weird, y) is single-valued.

    But even without offsetof I think you can rule out such
    shenanigans. True, direct assignment of struct objects might
    perhaps be clever enough to play games. But memcpy() must
    also work:

    struct weird *p = malloc(2 * sizeof *p);
    p[0].x = ...; p[0].y = ...; p[0].z = ...;
    memcpy (p+1, p, sizeof *p);
    assert (p[1].x == p[0].x);
    assert (p[1].y == p[0].y); // the crucial point
    assert (p[1].z == p[0].z);

    Since memcpy() knows only the size of the data being copied
    and nothing about the nature of the object those data bytes
    represent, it cannot possibly know enough to "slide" the
    `y' element while copying the bag of bytes from one place
    to another. Similar remarks apply to realloc() and to
    fwrite()/fread(), and to other type-oblivious ways of moving
    data from place to place.
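
    A compilable version of the fragment above might look like the
    following sketch. The function name memcpy_preserves_members and the
    member values are assumptions made for illustration (the original
    elides them with "..."), and the allocation check is an addition.

```c
#include <stdlib.h> /* malloc, free */
#include <string.h> /* memcpy */

struct weird
{
    int x;
    double y;
    int z;
};

/* Returns 1 if a bytewise copy of p[0] into p[1] preserves every
   member, 0 if it does not, and -1 if the allocation fails. */
int memcpy_preserves_members(void)
{
    struct weird *p = malloc(2 * sizeof *p);
    int ok;

    if (p == NULL)
        return -1;

    /* Arbitrary values chosen for the example. */
    p[0].x = 1;
    p[0].y = 2.5;
    p[0].z = 3;

    /* memcpy sees only a run of bytes, not the struct's members. */
    memcpy(p + 1, p, sizeof *p);

    ok = p[1].x == p[0].x
      && p[1].y == p[0].y /* the crucial point */
      && p[1].z == p[0].z;

    free(p);
    return ok;
}
```

    If p[0] and p[1] placed y at different offsets, the comparison of
    the y members could not be expected to hold after the bytewise copy.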

    --
    Eric Sosman, Oct 6, 2003
    #4
  5. On Mon, 6 Oct 2003, Eric Sosman wrote:
    >
    > Aha! Finally, the mystery of why offsetof intruded itself
    > into an apparently unrelated question becomes clear. Just to
    > be sure I've understood you: You're wondering whether different
    > instances of struct weird in the same program could arrange
    > their padding differently. Clearly, this cannot be the case
    > if offsetof(struct weird, y) is single-valued.


    Yes! You've hit the nail on the head.

    > But even without offsetof I think you can rule out such
    > shenanigans. True, direct assignment of struct objects might
    > perhaps be clever enough to play games. But memcpy() must
    > also work:
    >
    > struct weird *p = malloc(2 * sizeof *p);
    > p[0].x = ...; p[0].y = ...; p[0].z = ...;
    > memcpy (p+1, p, sizeof *p);
    > assert (p[1].x == p[0].x);
    > assert (p[1].y == p[0].y); // the crucial point
    > assert (p[1].z == p[0].z);


    Yes, but *must* these 'assert(...)'s succeed? (Obviously
    they needn't succeed if p[0].y is a trap representation,
    or one of p[0],p[1] is volatile, for instance.)

    Where does it say that

    foo x = ...;
    foo y = ...;
    memcpy(&x, &y, sizeof (foo));
    assert (x==y);

    must necessarily succeed? I don't see anywhere, except perhaps
    footnote 38 (which says that struct assignment may be done
    "element-at-a-time or via memcpy"). And I don't think footnotes
    are normative, even if the intent of the footnote were clearer.

    -Arthur
    [Remember, the whole question is moot.] ;-)
    Arthur J. O'Dwyer, Oct 7, 2003
    #5
  6. Jack Klein

    On Mon, 6 Oct 2003 19:08:06 -0400 (EDT), "Arthur J. O'Dwyer"
    wrote in comp.lang.c:

    >
    > On Mon, 6 Oct 2003, Eric Sosman wrote:
    > >
    > > Aha! Finally, the mystery of why offsetof intruded itself
    > > into an apparently unrelated question becomes clear. Just to
    > > be sure I've understood you: You're wondering whether different
    > > instances of struct weird in the same program could arrange
    > > their padding differently. Clearly, this cannot be the case
    > > if offsetof(struct weird, y) is single-valued.

    >
    > Yes! You've hit the nail on the head.
    >
    > > But even without offsetof I think you can rule out such
    > > shenanigans. True, direct assignment of struct objects might
    > > perhaps be clever enough to play games. But memcpy() must
    > > also work:
    > >
    > > struct weird *p = malloc(2 * sizeof *p);
    > > p[0].x = ...; p[0].y = ...; p[0].z = ...;
    > > memcpy (p+1, p, sizeof *p);
    > > assert (p[1].x == p[0].x);
    > > assert (p[1].y == p[0].y); // the crucial point
    > > assert (p[1].z == p[0].z);

    >
    > Yes, but *must* these 'assert(...)'s succeed? (Obviously
    > they needn't succeed if p[0].y is a trap representation,
    > or one of p[0],p[1] is volatile, for instance.)
    >
    > Where does it say that
    >
    > foo x = ...;
    > foo y = ...;
    > memcpy(&x, &y, sizeof (foo));
    > assert (x==y);
    >
    > must necessarily succeed? I don't see anywhere, except perhaps
    > footnote 38 (which says that struct assignment may be done
    > "element-at-a-time or via memcpy"). And I don't think footnotes
    > are normative, even if the intent of the footnote were clearer.
    >
    > -Arthur
    > [Remember, the whole question is moot.] ;-)


    What you missed is:

    ========
    6.2.6 Representations of types

    6.2.6.1 General

    1 The representations of all types are unspecified except as stated in
    this subclause.

    2 Except for bit-fields, objects are composed of contiguous sequences
    of one or more bytes, the number, order, and encoding of which are
    either explicitly specified or implementation-defined.

    3 Values stored in unsigned bit-fields and objects of type unsigned
    char shall be represented using a pure binary notation.

    4 Values stored in non-bit-field objects of any other object type
    consist of n × CHAR_BIT bits, where n is the size of an object of that
    type, in bytes. The value may be copied into an object of type
    unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
    called the object representation of the value. Values stored in
    bit-fields consist of m bits, where m is the size specified for the
    bit-field. The object representation is the set of m bits the
    bit-field comprises in the addressable storage unit holding it. Two
    values (other than NaNs) with the same object representation compare
    equal, but values that compare equal may have different object
    representations.
    ========

    From C99, and note the last sentence in paragraph 4.

    Even without this, it would be impossible to pass or return structures or
    pointers to structures to functions in separate translation units if
    an identical structure definition did not result in identically laid
    out objects.
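
    The copy described in paragraph 4 can be sketched for a single
    double. This is only an illustration: the function name
    roundtrip_equal is hypothetical, and, as the quoted text notes, the
    equality does not hold for NaNs, which never compare equal.

```c
#include <string.h> /* memcpy */

/* Copy a double's object representation into an unsigned char array,
   then from that array into a second double. Returns 1 if the second
   double compares equal to the first, 0 otherwise. */
int roundtrip_equal(double value)
{
    unsigned char bytes[sizeof(double)];
    double copy;

    memcpy(bytes, &value, sizeof value); /* value -> object representation */
    memcpy(&copy, bytes, sizeof copy);   /* object representation -> value */

    return copy == value;
}
```

    Because both doubles end up with the same object representation,
    the last sentence of paragraph 4 requires them to compare equal
    unless the value is a NaN.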

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
    Jack Klein, Oct 7, 2003
    #6
  7. On Tue, 7 Oct 2003, Jack Klein wrote:
    >
    > Arthur J. O'Dwyer wrote:
    > > On Mon, 6 Oct 2003, Eric Sosman wrote:
    > > >
    > > > Aha! Finally, the mystery of why offsetof intruded itself
    > > > into an apparently unrelated question becomes clear. Just to
    > > > be sure I've understood you: You're wondering whether different
    > > > instances of struct weird in the same program could arrange
    > > > their padding differently. Clearly, this cannot be the case
    > > > if offsetof(struct weird, y) is single-valued.

    > >
    > > Yes! You've hit the nail on the head.
    > >
    > > > But even without offsetof I think you can rule out such
    > > > shenanigans. True, direct assignment of struct objects might
    > > > perhaps be clever enough to play games. But memcpy() must
    > > > also work:


    > > Where does it say that
    > >
    > > foo x = ...;
    > > foo y = ...;
    > > memcpy(&x, &y, sizeof (foo));
    > > assert (x==y);
    > >
    > > must necessarily succeed? I don't see anywhere, except perhaps
    > > footnote 38


    > What you missed is:
    >
    > ========
    > 6.2.6 Representations of types
    >
    > 6.2.6.1 General
    >
    > 1 The representations of all types are unspecified except as stated in
    > this subclause.
    >
    > 2 Except for bit-fields, objects are composed of contiguous sequences
    > of one or more bytes, the number, order, and encoding of which are
    > either explicitly specified or implementation-defined.


    Okay, no problems here. The "weird" layout can be defined easily
    by the implementation.

    > 3 Values stored in unsigned bit-fields and objects of type unsigned
    > char shall be represented using a pure binary notation.
    >
    > 4 Values stored in non-bit-field objects of any other object type
    > consist of n × CHAR_BIT bits, where n is the size of an object of that
    > type, in bytes. The value may be copied into an object of type
    > unsigned char [n] (e.g., by memcpy); the resulting set of bytes is
    > called the object representation of the value. Values stored in
    > bit-fields consist of m bits, where m is the size specified for the
    > bit-field. The object representation is the set of m bits the
    > bit-field comprises in the addressable storage unit holding it. Two
    > values (other than NaNs) with the same object representation compare
    > equal,


    Okay, this is the part I assume you mean. Well,
    <devil's-advocate>
    what exactly does it mean for two structs to "compare equal"?
    I mean, you can't use the == operator on structs, right? And if
    we can only talk about member-by-member equality, well then we'll
    have to consider a *member-by-member* memcpy -- which works fine!
    </devil's-advocate>
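
    For concreteness, the member-by-member view can be sketched as
    below. The names weird_copy and weird_equal are hypothetical; C
    defines no == for struct types, so any such comparison has to be
    written by hand, one member at a time.

```c
struct weird
{
    int x;
    double y;
    int z;
};

/* Member-by-member copy: never touches padding, so it would work
   wherever the implementation chose to place each member. */
void weird_copy(struct weird *dst, const struct weird *src)
{
    dst->x = src->x;
    dst->y = src->y;
    dst->z = src->z;
}

/* Member-by-member equality: returns 1 if all members compare equal. */
int weird_equal(const struct weird *a, const struct weird *b)
{
    return a->x == b->x && a->y == b->y && a->z == b->z;
}
```

    Neither function depends on where padding sits inside the struct,
    which is why this reading leaves the question open.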

    > but values that compare equal may have different object
    > representations.
    > ========
    >
    > From C99, and note the last sentence in paragraph 4.


    (And not in N869, right?)

    > Even without this, it would be impossible to pass or return structures or
    > pointers to structures to functions in separate translation units if
    > an identical structure definition did not result in identically laid
    > out objects.


    Debatable. But irrelevant. ;-)
    Remember, the "weird" layout is perfectly consistent between t.u.'s.
    A compiler could say, "Okay, this struct is a candidate for
    weirdification," and generate appropriate code across all t.u.'s,
    easily enough.

    -Arthur
    [Remember, still moot.]

    P.S.-- As a small on-topic note, am I completely mistaken in my
    prior belief that 'offsetof' was a relatively recent addition to
    C? If so, why do we get so many variations on FAQ 2.14? :)
    Arthur J. O'Dwyer, Oct 7, 2003
    #7