Padding involved

Discussion in 'C Programming' started by anish singh, Mar 7, 2014.

  1. anish singh

    anish singh Guest

    Struct abbcd{
    Char c;
    Int b;
    Short d;
    };

    What will be the size of abbcd, with padding and without padding? Suppose that the processor has only 4-byte registers.

    What will the size be if the processor has 1-byte, 2-byte and 4-byte registers?

    Note: the size of char is 1, and the sizes of int and short are 4 and 2 respectively.
     
    anish singh, Mar 7, 2014
    #1

  2. anish singh

    jacob navia Guest

    On 07/03/2014 22:13, anish singh wrote:
    Yes, that is very easy. Just send us the email of your teacher and we
    will mail the answers to him directly, OK?
     
    jacob navia, Mar 7, 2014
    #2

  3. anish singh

    Eric Sosman Guest

    Impossible to say, on the basis of the information available.
    The code as shown requires that the compiler issue a diagnostic
    message because of the four undeclared identifiers, and after
    issuing the diagnostic the compiler is likely to announce that it
    could not compile the code. (Or it might make some guesses and
    adjustments and compile something a little different -- but we have
    no way to know what that different something might be.)

    The code you've shown could be made compilable if preceded by
    suitable preprocessor macros and/or other declarations, but again:
    We haven't seen those parts, so we can't tell what they do.
     
    Eric Sosman, Mar 7, 2014
    #3
  4. anish singh

    James Kuyper Guest

    The keywords "struct", "char", "int" and "short" are all lower case.
    This may seem like quibbling, but C is a case sensitive language - if
    you plan to use C, you need to learn to be careful about case.
    The only answer that works across all implementations is "sizeof(struct
    abbcd)".

    Without padding, the size will be 7 bytes, though it depends upon the
    implementation whether you even have the option of avoiding padding.
    With padding, it will be larger than 7, and almost certainly smaller
    than SIZE_MAX. The actual value within that range depends upon the
    implementation. If you want a more specific answer, the information
    you've provided is insufficient to answer it. You need to fully specify
    which implementation of C you're using: identify which compiler you're
    using, and what compiler options you've chosen, including the target
    platform. Once you've specified those things, the easiest way to find
    out is to print out the value of sizeof(struct abbcd). That's a lot
    quicker than asking us.
     
    James Kuyper, Mar 7, 2014
    #4
  5. anish singh

    anish singh Guest

    I have not given compilable code. I am just
    asking for the size of the struct given.
     
    anish singh, Mar 7, 2014
    #5
  7. anish singh

    BartC Guest

    Do you have access to a C compiler? Then you can do your own experiments
    with programs such as the following. (Note the #pragma line, to turn off
    padding for alignment, will vary between compilers.)

    The offsets and padding will depend more on the memory alignments needed for
    the machine than on the sizes of the registers.

    #include <stdio.h>
    #include <stddef.h>

    int main(void) {

        /* Default layout: the compiler inserts whatever padding it needs. */
        struct abbcd {
            char c;
            int b;
            short d;
        };

        /* Non-standard: asks the compiler to pack members on 1-byte boundaries. */
    #pragma pack(1)
        struct abbcd_packed {
            char c;
            int b;
            short d;
        };

        /* sizeof and offsetof yield size_t, so they are printed with %zu. */
        printf("Size of char = %zu\n", sizeof(char));
        printf("Size of int = %zu\n", sizeof(int));
        printf("Size of short = %zu\n", sizeof(short));
        puts("");

        puts("Normal padding:");
        printf("Offset of c = %zu\n", offsetof(struct abbcd, c));
        printf("Offset of b = %zu\n", offsetof(struct abbcd, b));
        printf("Offset of d = %zu\n", offsetof(struct abbcd, d));
        printf("Size of abbcd = %zu\n", sizeof(struct abbcd));
        puts("");

        puts("Without padding:");
        printf("Offset of c = %zu\n", offsetof(struct abbcd_packed, c));
        printf("Offset of b = %zu\n", offsetof(struct abbcd_packed, b));
        printf("Offset of d = %zu\n", offsetof(struct abbcd_packed, d));
        printf("Size of abbcd_packed = %zu\n", sizeof(struct abbcd_packed));

        return 0;
    }
     
    BartC, Mar 7, 2014
    #7
  8. anish singh

    anish kumar Guest

    Are you sure that memory alignments have nothing to do with register size
    or the address/data bus size?
     
    anish kumar, Mar 7, 2014
    #8
  9. anish singh

    Eric Sosman Guest

    How big is this array:

    int array<7>;

    ? In other words, if the code describing your struct won't even
    compile, then you have not "given" a struct at all. If there is
    no struct, it has no size and no padding -- and no existence.
     
    Eric Sosman, Mar 7, 2014
    #9
  10. [...]

    The OP may not be aware that #pragma pack is non-standard. It's an
    extension implemented by gcc (and probably other C compilers).
     
    Keith Thompson, Mar 7, 2014
    #10
  11. anish singh

    anish kumar Guest

    Understood. How about below:
    int main(void) {
    struct test {
    char a;
    int b;
    short c;
    };
    printf("%d\n", sizeof(struct test));
    return 0;
    }
    I completely understand what the size of the struct will be with and
    without padding, but the question is: what parameters decide the padding
    involved? Such as the size of the registers or the size of the address/data
    bus of the processor?
     
    anish kumar, Mar 7, 2014
    #11
  12. anish singh

    Joe Pfeiffer Guest

    I know the answers you're getting are frustrating to you, but better
    ones really aren't possible: it depends on the choices made by the
    compiler writer. Consider a very simple case: a 32-bit bus,
    with unaligned loads and stores allowed but slow (because they take two
    memory accesses). If the compiler writer optimizes for size (so it's as
    small as possible), it'll be seven bytes. If he optimizes for speed,
    it might be ten (three bytes inserted between a and b).
     
    Joe Pfeiffer, Mar 7, 2014
    #12
  13. anish singh

    anish kumar Guest

    Wow, you understood my question pretty well. Thanks, but are there any
    resources for understanding this in detail?
     
    anish kumar, Mar 7, 2014
    #13
  14. anish singh

    James Kuyper Guest

    On 03/07/2014 06:41 PM, anish kumar wrote:
    ....
    Ask your compiler vendor for detailed documentation for that compiler.
    It might or might not be available for public consumption. What you
    learn about the details for that compiler might (or might not) be of
    some use when using a different compiler.
     
    James Kuyper, Mar 7, 2014
    #14
  15. anish singh

    BartC Guest

    There will be a relationship between memory organisation and register width,
    but the memory layout will be more important.

    In your example, there are three kinds of alignment, but there might be only
    one width of register.

    Or you can have the same register model, but another version of the
    processor might arrange the memory and data bus differently.

    Have you been looking at real processors, or made-up ones?

    In the example you gave of one-byte registers and 4-byte ints, such a
    machine could have an 8-bit data bus (so alignment is not important), but
    could also have a 16, 32 or 64-bit one, if the processor could use that to
    advantage.
     
    BartC, Mar 7, 2014
    #15
  16. And, more generally, the smallest it can be is the sum of
    the sizeof of the members.
    Wouldn't it have to be 12 in that case? I thought that sizeof
    a struct had to be big enough that an array of them would result
    in all elements being aligned.

    I know that a pointer to a struct has to, with appropriate
    casting, equal a pointer to its first member, but I am not sure of
    the requirements after that.

    Is the compiler allowed to keep the char at the beginning, but
    move the short before the int, such that sizeof would be 8?

    Is there a reason why sizeof can't be 16 or 32, if that happens
    to be faster on a certain processor?

    -- glen
     
    glen herrmannsfeldt, Mar 8, 2014
    #16
  17. One more nitpick: sizeof yields a result of type size_t; the "%d"
    format requires an argument of type int. Use the "%zu" format, or
    convert the sizeof result to int (or to unsigned long and use "%lu").

    The amount of padding is determined by the alignment requirement
    for each member type. The amount of padding at the end is also
    influenced by the maximum alignment requirement for any member.
    And there might be an additional alignment requirement for the
    structure as a whole.

    Alignment requirements are, to some extent, up to the whim of the
    compiler developers, but typically they're determined by an ABI
    for a given platform. On some CPUs misaligned accesses (e.g.,
    reading or writing a 4-byte integer at an odd address) are merely
    slower than aligned accesses; on others they can crash your program
    (or, worse, quietly give you incorrect results).

    The comp.lang.c FAQ is at http://www.c-faq.com/; question 2.12
    discusses structure padding.
     
    Keith Thompson, Mar 8, 2014
    #17
  18. anish singh

    Kaz Kylheku Guest

    Padding involved is determined by the "ABI" rules for the given architecture.
    It has a grave impact on the interoperability of programs, especially in a mixed
    environment either with multiple compilers for C or C-related dialects, and
    other languages that need to "bind" to C interfaces.

    The width of the address or data bus of the processor is a very low-level
    implementation detail on the actual silicon die, and is largely irrelevant.
    Moreover, there is more than one bus. Are you talking about the connection
    between the L1 cache and L2 cache? A processor may read an entire cache line (or
    several of them in burst mode) at a time from main memory nowadays; that
    doesn't mean we align every structure member to a cache line.

    ABI rules also span multiple implementations of an architecture. If we are
    compiling for 32 bit x86, we might tell the compiler to optimize for a 386,
    486, Pentium, i7 or whatever, but the layout of the structures should be
    interoperable across the family.

    I think this is how it will work on GCC targeting 32-bit Intel.

    The int will be padded so that it is aligned to an offset divisible by 4,
    and the short will be padded so that it is aligned to an offset divisible by 2.

    The reason for this is not that the alignment must be there, because processors
    in this family support unaligned reads. It's for efficiency of access.
    Even processors that can read a word at any byte address still read it faster
    if the address is aligned.

    So there is a byte for "a", then three padding bytes. Then "b" is placed,
    occupying four bytes, bringing us to offset 8. This is divisible by two, so "c"
    is placed there taking two bytes, for a total of ten.

    If we add a second "char a2" after "char a", it should go into the padding.

    Generally, if a type with a weaker alignment is placed after a type with a
    stronger alignment, it shouldn't need padding. Conversely,
    if a type with stronger alignment is placed after one with weaker alignment,
    then it may require padding up to an offset that is a multiple of its
    alignment.

    Furthermore, there may be additional padding at the end of a structure, to
    support the notion that structures can be combined together to form an array,
    whereby the padding at the end of element [n] establishes the alignment of the
    first member of element[n+1].

    Thus a structure which is like this { int a; char b; } might have,
    depending on architectural details, some bytes of padding after the b, so that
    the overall size is divisible by sizeof(int). On Intel x86, I might expect
    no padding between a and b, and three bytes after b.

    These are very general concepts and the details vary quite a lot among
    architectures.

    For some architectures, the vendors or other organizations who standardize the
    architectures develop a set of documents which specify the ABI. The documents
    dictate everything from how structures are laid out, to what stack frames look
    like, what registers are used for what, how arguments are passed between
    functions and so on. If there is such a body of standards, then generally the
    compiler implementor follows that.
     
    Kaz Kylheku, Mar 8, 2014
    #18
  19. anish singh

    anish kumar Guest

    Rightly said and quoting from here
    http://www.x86-64.org/documentation/abi.pdf?
    Aggregates and union section:

    "structures and unions assume the alignment of their most strictly
    aligned component. Each member is assigned to the lowest available
    offset with the appropriate alignment. The size of the object is always
    a multiple of the object's alignment."

    I couldn't understand the above statement.
     
    anish kumar, Mar 8, 2014
    #19
  20. anish singh

    Kaz Kylheku Guest

    This simply means that an N byte type has an alignment requirement to be on an
    offset divisible by N. And when a structure member of size N is being
    allocated, the next available offset that is divisible by N is chosen (the
    "lowest available offset with the appropriate alignment").

    The "structures and unions assume the alignment of their most strictly aligned
    component" means that if the structure contains an element of size N, then
    there is enough padding at the end of the structure so that this element
    will be correctly aligned as an array member.

    For instance

    struct foo { char a; long long b; char c; };

    char a is at offset zero. Then 64 bit wide b goes to an address divisible
    by 8, leaving a padding of 7, bringing us to 16 bytes. Now c is
    allocated to the 17th byte. 17 cannot be the final size, because
    in an array struct foo x[2], x[1].b will end up on a funny address!

    The "most strictly aligned component" is b: aligned to 8 byte boundaries. And
    so, the structure must be padded so its size is divisible by 8: matching the
    alignment requirement of the most strictly aligned component. The size will be
    the next available multiple of 8 after 17: 24. Seven bytes of padding, again.

    Also, the malloc function is required to return pointers that are at
    least as strictly aligned as any basic data type.

    In the structure layout, offset 0 is assumed to be suitably aligned
    for anything. Regardless of type, the first member is placed at offset
    zero without any padding at the start. The allocator has to make that
    assumption true.

    Not only malloc, but the compiler and linker: how they lay out objects
    in static storage and in automatic storage (the stack).
     
    Kaz Kylheku, Mar 8, 2014
    #20