Structure byte padding rule

Discussion in 'C Programming' started by Shivanand Kadwadkar, Feb 24, 2011.

  1. Can anyone give the rule behind how structure byte padding works?

    Does it depend on the machine word size, the size of the largest data type, or
    something else?
    Shivanand Kadwadkar, Feb 24, 2011
    #1

  2. Shivanand Kadwadkar

    bert Guest

    On Feb 24, 10:11 am, Shivanand Kadwadkar wrote:
    > can any one give rule behind the how structure byte padding works
    >
    > Is it depends on machine word size or size of the largest data type or
    > something else.


    No. Whatever one implementation does,
    another one can do quite differently,
    so long as well-defined code (that is,
    code that does not depend on how the
    padding bytes are implemented) works
    as it is expected to.
    --
    bert, Feb 24, 2011
    #2

  3. Shivanand Kadwadkar

    Eric Sosman Guest

    On 2/24/2011 5:11 AM, Shivanand Kadwadkar wrote:
    > can any one give rule behind the how structure byte padding works


    There can be padding after any element, including the last.
    (Bit-field elements are special, and complicated, so let's just
    ignore them -- besides, you can't point at them anyhow, so their
    position within the larger struct doesn't matter much.)

    > Is it depends on machine word size or size of the largest data type or
    > something else.


    The implementation is free to use as much or as little padding
    as it wants, and to arrange the padding any way it wants, provided
    any padding bytes come after struct elements (that is, there can be
    no padding before the first element). Usually, an implementation
    will insert the smallest amount of padding necessary to satisfy the
    alignment requirements of the element's own type. For example, on
    a system where a `double' is eight bytes long and must be aligned
    on a four-byte boundary, the struct

    struct s { char x; double y; char z; };

    ... will probably have six padding bytes: three after `x' so that
    `y' begins four bytes in (and will be four-byte-aligned if the
    struct itself is), and another three after `z' so that in an array
    of `struct s' objects the second array element will be four-byte-
    aligned if the array itself is. If you want to discover how a given
    implementation has padded a given struct, you can use the offsetof()
    macro from <stddef.h>:

    printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));

    However, the alignment requirements for various data types are
    also entirely up to the implementation. Thus, different compilers
    may pad the same source-code struct differently to satisfy their
    differing alignment needs, and the values printed by this code may
    differ from one system to another. (Except that the offset of `x'
    will always be zero; no padding before the first element.)

    --
    Eric Sosman
    Eric Sosman, Feb 24, 2011
    #3
  4. Shivanand Kadwadkar

    BGB Guest

    On 2/24/2011 3:11 AM, Shivanand Kadwadkar wrote:
    > can any one give rule behind the how structure byte padding works
    >
    > Is it depends on machine word size or size of the largest data type or
    > something else.


    as others have noted, the specifics are somewhat compiler/target specific.


    however, there are a few common "rules of thumb" (for compilers/targets
    which use padding):
    most base types have a power-of-2 size (note 1);
    most base types require an alignment which is the same as their size
    (note 2) often up to a certain limit (note 3);
    ....

    note 1: except "long double", which even on x86, differs widely between
    compilers and CPU mode. 80, 96, and 128 bit storage sizes exist, as well
    as some compilers which simply treat them as double.

    note 2: this is not always consistent, as targets may require an
    alignment smaller than the size for some types. an example is in 32-bit
    x86, where sometimes "long long" will only require a 32-bit alignment
    despite being a 64 bit type, and other compilers will still align it to
    64 bits out of principle.

    note 3: often an architecture will only care about alignment up to a
    certain point (such as the native word size, address size, or bus
    width), and past this point no greater alignment is needed (even if
    larger sizes may exist). for example, on x86 at present such limit is 16
    bytes (128 bits), but this may change later if/when larger CPU registers
    are added...


    so, usual strategy:
    for each struct member, it figures out the needed alignment, and the
    current offset within the struct (directly following the prior member);
    if the offset is not aligned, it is padded up to the needed alignment;
    following the last member, the struct may be in-turn padded up to its
    own needed alignment (so they can go nicely into arrays), which is
    usually that of the greatest needed alignment within the struct.


    or such...
    BGB, Feb 24, 2011
    #4
  5. Shivanand Kadwadkar

    sandeep Guest

    Eric Sosman writes:
    > The implementation is free to use as much or as little padding
    > as it wants, and to arrange the padding any way it wants, provided any
    > padding bytes come after struct elements (that is, there can be no
    > padding before the first element). Usually, an implementation will
    > insert the smallest amount of padding necessary to satisfy the alignment
    > requirements of the element's own type. For example, on a system where
    > a `double' is eight bytes long and must be aligned on a four-byte
    > boundary, the struct
    >
    > struct s { char x; double y; char z; };
    >
    > ... will probably have six padding bytes: three after `x' so that `y'
    > begins four bytes in (and will be four-byte-aligned if the struct itself
    > is), and another three after `z' so that in an array of `struct s'
    > objects the second array element will be four-byte-aligned if the array
    > itself is. If you want to discover how a given implementation has
    > padded a given struct, you can use the offsetof() macro from <stddef.h>:
    >
    > printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    > printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    > printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    > printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));


    Unfortunately though, this code will invoke an undefined behavior on an
    implementation where sizeof(struct s) is bigger than INTMAX. I would
    advise using the %z argument to printf, this matches the return type of
    sizeof() and offsetof() so no explicit casts will be needed.
    sandeep, Feb 24, 2011
    #5
  6. sandeep writes:
    > Eric Sosman writes:
    >> The implementation is free to use as much or as little padding
    >> as it wants, and to arrange the padding any way it wants, provided any
    >> padding bytes come after struct elements (that is, there can be no
    >> padding before the first element). Usually, an implementation will
    >> insert the smallest amount of padding necessary to satisfy the alignment
    >> requirements of the element's own type. For example, on a system where
    >> a `double' is eight bytes long and must be aligned on a four-byte
    >> boundary, the struct
    >>
    >> struct s { char x; double y; char z; };
    >>
    >> ... will probably have six padding bytes: three after `x' so that `y'
    >> begins four bytes in (and will be four-byte-aligned if the struct itself
    >> is), and another three after `z' so that in an array of `struct s'
    >> objects the second array element will be four-byte-aligned if the array
    >> itself is. If you want to discover how a given implementation has
    >> padded a given struct, you can use the offsetof() macro from <stddef.h>:
    >>
    >> printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    >> printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    >> printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    >> printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));

    >
    > Unfortunately though, this code will invoke an undefined behavior on an
    > implementation where sizeof(struct s) is bigger than INTMAX. I would
    > advise using the %z argument to printf, this matches the return type of
    > sizeof() and offsetof() so no explicit casts will be needed.


    A struct containing a char, a double, and a char is vanishingly
    unlikely to exceed INT_MAX bytes.

    But yes, using "%zu" would make the code a bit cleaner (assuming your
    implementation supports it; not all do).

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Feb 24, 2011
    #6
  7. sandeep writes:

    > Eric Sosman writes:

    <snip>
    >> struct s { char x; double y; char z; };

    <snip>
    >> printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    >> printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    >> printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    >> printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));

    >
    > Unfortunately though, this code will invoke an undefined behavior on an
    > implementation where sizeof(struct s) is bigger than INTMAX.


    It's not undefined behaviour -- it's implementation-defined.

    > I would
    > advise using the %z argument to printf,


    Presumably you mean %zu. 'z' is just a length modifier.

    > this matches the return type of
    > sizeof() and offsetof() so no explicit casts will be needed.


    If you don't have a C99 version of printf, the most portable solution is
    to cast to unsigned long (so there is not even any implementation-
    defined behaviour) and use %lu as the format.

    However (as I am sure you know) even this advice is over the top for the
    code in question!

    --
    Ben.
    Ben Bacarisse, Feb 24, 2011
    #7
  8. Ben Bacarisse writes:
    > sandeep writes:
    >
    >> Eric Sosman writes:

    > <snip>
    >>> struct s { char x; double y; char z; };

    > <snip>
    >>> printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    >>> printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    >>> printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    >>> printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));

    >>
    >> Unfortunately though, this code will invoke an undefined behavior on an
    >> implementation where sizeof(struct s) is bigger than INTMAX.

    >
    > It's not undefined behaviour -- it's implementation-defined.

    [...]

    An overflowing conversion to a signed type either yields an
    implementation-defined result or raises an implementation-defined signal
    (C99 6.3.1.3p3). The consequences of raising an implementation-defined
    signal are (at least potentially) undefined.

    The permission to raise a signal is new in C99, and I've never
    heard of any compiler taking advantage of it.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Feb 24, 2011
    #8
  9. Shivanand Kadwadkar

    BGB Guest

    On 2/24/2011 3:33 PM, Keith Thompson wrote:
    > sandeep writes:
    >> Eric Sosman writes:
    >>> The implementation is free to use as much or as little padding
    >>> as it wants, and to arrange the padding any way it wants, provided any
    >>> padding bytes come after struct elements (that is, there can be no
    >>> padding before the first element). Usually, an implementation will
    >>> insert the smallest amount of padding necessary to satisfy the alignment
    >>> requirements of the element's own type. For example, on a system where
    >>> a `double' is eight bytes long and must be aligned on a four-byte
    >>> boundary, the struct
    >>>
    >>> struct s { char x; double y; char z; };
    >>>
    >>> ... will probably have six padding bytes: three after `x' so that `y'
    >>> begins four bytes in (and will be four-byte-aligned if the struct itself
    >>> is), and another three after `z' so that in an array of `struct s'
    >>> objects the second array element will be four-byte-aligned if the array
    >>> itself is. If you want to discover how a given implementation has
    >>> padded a given struct, you can use the offsetof() macro from <stddef.h>:
    >>>
    >>> printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    >>> printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    >>> printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    >>> printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));

    >>
    >> Unfortunately though, this code will invoke an undefined behavior on an
    >> implementation where sizeof(struct s) is bigger than INTMAX. I would
    >> advise using the %z argument to printf, this matches the return type of
    >> sizeof() and offsetof() so no explicit casts will be needed.

    >
    > A struct containing a char, a double, and a char is vanishingly
    > unlikely to exceed INT_MAX bytes.
    >
    > But yes, using "%zu" would make the code a bit cleaner (assuming your
    > implementation supports it; not all do).
    >


    a struct exceeding INT_MAX bytes on any "reasonable" architecture seems
    itself exceedingly unlikely...

    on a 16-bit target, having a struct this large would be itself a problem
    (yes, yes, say on DOS one could have a far pointer and a 64kB struct,
    but how likely is this?...).

    on most 32-bit systems, this can't practically happen (would need a 2GB
    struct, which would have problems fitting into most address spaces).

    on 64-bit systems, it could happen, but seriously, how likely is it in
    the near future that there will be >=2GB structs?...

    unless, maybe:
    struct foo_s
    {
    int arr[1000][1000][1000];
    };


    more subtly, there is the question of whether existing 64-bit systems have
    memory managers which allow objects this large (such as via
    malloc/free...).

    or, additionally, the last time I did a multi-GB memory allocation (on
    64-bit Windows, via "VirtualAlloc()"...), the computer lagged so hard
    (due to swapping) that I worried a crash was likely (although, I changed
    it to not use COMMIT on the memory, and problem fixed...).


    OT:

    mostly though this was for a region for my "code/data/bss heap":
    basically, for dynamically generated machine code, which has a +-2GB limit.
    x86-64 doesn't allow direct 64-bit memory addressing or jumps, meaning
    one either has to load addresses into a register and use indirect
    addressing, or use the new RIP-relative addressing and live with a +-2GB
    limit, or have all code/data/bss sections within the lower 4GB.

    but, if one uses a single 2GB region, one can ensure that any local
    accesses will be within the +-2GB window, and thus use the cheaper
    direct addressing (non-local calls then being handled via trampoline
    thunks, and non-local global variables being assumed to be invalid).
    BGB, Feb 24, 2011
    #9
  10. Keith Thompson writes:

    > Ben Bacarisse writes:
    >> sandeep writes:
    >>
    >>> Eric Sosman writes:

    >> <snip>
    >>>> struct s { char x; double y; char z; };

    >> <snip>
    >>>> printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    >>>> printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    >>>> printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    >>>> printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));
    >>>
    >>> Unfortunately though, this code will invoke an undefined behavior on an
    >>> implementation where sizeof(struct s) is bigger than INTMAX.

    >>
    >> It's not undefined behaviour -- it's implementation-defined.

    > [...]
    >
    > An overflowing conversion to a signed type either yields an
    > implementation-defined result or raises an implementation-defined signal
    > (C99 6.3.1.3p3). The consequences of raising an implementation-defined
    > signal are (at least potentially) undefined.


    I don't see how except as a rather extreme reading of the standard. The
    implementation-defined signal must be "set" to either SIG_IGN or
    SIG_DFL. The SIG_IGN case is well-defined; that of SIG_DFL says that
    "default handling for that signal will occur". That's maybe a bit vague
    but J.3.2 says of implementation-defined behaviour that "[t]he set of
    signals, their semantics, and their default handling" must be
    documented.

    Of course, you could say that the implementation may document the
    default handling as being "undefined behaviour" but that seems to me to
    be a perverse interpretation. In effect it requires that implementation-
    defined behaviour may be defined as undefined!

    <snip>
    --
    Ben.
    Ben Bacarisse, Feb 24, 2011
    #10
  11. Shivanand Kadwadkar

    Tim Rentsch Guest

    Keith Thompson writes:

    > Ben Bacarisse writes:
    >> sandeep writes:
    >>
    >>> Eric Sosman writes:

    >> <snip>
    >>>> struct s { char x; double y; char z; };

    >> <snip>
    >>>> printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    >>>> printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    >>>> printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    >>>> printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));
    >>>
    >>> Unfortunately though, this code will invoke an undefined behavior on an
    >>> implementation where sizeof(struct s) is bigger than INTMAX.

    >>
    >> It's not undefined behaviour -- it's implementation-defined.

    > [...]
    >
    > An overflowing conversion to a signed type [...snip...]


    Nit: an out-of-range conversion. "Overflow", as used in
    the Standard, is something else (admittedly similar but
    still something else).
    Tim Rentsch, Mar 11, 2011
    #11
  12. Shivanand Kadwadkar

    Tim Rentsch Guest

    Ben Bacarisse writes:

    > Keith Thompson writes:
    >
    >> Ben Bacarisse writes:
    >>> sandeep writes:
    >>>
    >>>> Eric Sosman writes:
    >>> <snip>
    >>>>> struct s { char x; double y; char z; };
    >>> <snip>
    >>>>> printf ("struct s takes %d bytes\n", (int)sizeof(struct s));
    >>>>> printf ("x starts %d bytes in\n", (int)offsetof(struct s, x));
    >>>>> printf ("y starts %d bytes in\n", (int)offsetof(struct s, y));
    >>>>> printf ("z starts %d bytes in\n", (int)offsetof(struct s, z));
    >>>>
    >>>> Unfortunately though, this code will invoke an undefined behavior on an
    >>>> implementation where sizeof(struct s) is bigger than INTMAX.
    >>>
    >>> It's not undefined behaviour -- it's implementation-defined.

    >> [...]
    >>
    >> An overflowing conversion to a signed type either yields an
    >> implementation-defined result or raises an implementation-defined signal
    >> (C99 6.3.1.3p3). The consequences of raising an implementation-defined
    >> signal are (at least potentially) undefined.

    >
    > I don't see how except as a rather extreme reading the standard.
    > [snip elaboration]


    Because, for example, an implementation can choose to specify
    the behavior of the default signal handler by giving a
    function body that would exhibit undefined behavior in some
    code paths under some conditions (such as trying to convert
    the bit pattern corresponding to negative zero on a machine
    that uses ones' complement but doesn't support negative
    zeroes).
    Tim Rentsch, Mar 11, 2011
    #12