Re: what determines the size of a struct?

Discussion in 'C++' started by Thomas Matthews, Aug 18, 2003.

  1. Charles Wilkins wrote:

    > Given the following example...
    >
    > #include <iostream>
    > using namespace std;
    >
    > struct SIZE
    > {
    > char a;
    > bool b;
    > int c;
    > long d;
    > float e;
    > double f;
    > };
    > struct EMPTY {};
    >
    > int main()
    > {
    > struct SIZE size;
    > struct EMPTY empty;
    > cout << "sizeof(char) = " << sizeof(char) << endl;
    > cout << "sizeof(bool) = " << sizeof(bool) << endl;
    > cout << "sizeof(int) = " << sizeof(int) << endl;
    > cout << "sizeof(long) = " << sizeof(long) << endl;
    > cout << "sizeof(float) = " << sizeof(float) << endl;
    > cout << "sizeof(double) = " << sizeof(double) << endl;
    > cout << "total size = " << sizeof(char) + sizeof(bool) +
    > sizeof(int) + sizeof(long) +
    > sizeof(float) + sizeof(double)
    > << endl;
    > cout << "sizeof(empty) = " << sizeof(empty) << endl;
    > cout << "sizeof(size) = " << sizeof(size) << endl;
    > return 0;
    > }
    >
    > ...which generates, on my system, the following output:
    >
    > sizeof(char) = 1
    > sizeof(bool) = 1
    > sizeof(int) = 4
    > sizeof(long) = 4
    > sizeof(float) = 4
    > sizeof(double) = 8
    > total size = 22
    > sizeof(empty) = 1
    > sizeof(size) = 24
    >
    > Just curious, where are the extra bytes coming from in the sizeof(size)?
    > Why 24 bytes and not 22 bytes?
    >
    > Charles


    Welcome to the world of structures, unions and classes.
    The sum of the sizes of the members may not equal the
    size of the structure.

    The compiler implementor is allowed to add padding:
    1. between fields
    2. at the end.
    Many compiler vendors like to have the fields aligned on
    boundaries, such as 16-bit, 32-bit, etc. This alignment
    may make accessing the fields easier on the processor.
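
    To see where the padding lands on a particular compiler, one option is to print the member offsets. The following is only a sketch: it reuses the members of the SIZE struct from the question under the name Size, and the helper names member_bytes, padding_bytes and print_layout are made up for illustration. All values it reports are implementation-specific.

```cpp
#include <cassert>
#include <cstddef>    // offsetof, std::size_t
#include <iostream>

// Same members, in the same order, as the SIZE struct from the question.
struct Size {
    char a;
    bool b;
    int c;
    long d;
    float e;
    double f;
};

// Sum of the member sizes, ignoring any padding.
constexpr std::size_t member_bytes() {
    return sizeof(char) + sizeof(bool) + sizeof(int) +
           sizeof(long) + sizeof(float) + sizeof(double);
}

// Bytes the compiler added between members and at the end.
constexpr std::size_t padding_bytes() {
    return sizeof(Size) - member_bytes();
}

// Print where each member starts; a gap between the end of one member
// and the start of the next is padding.
void print_layout(std::ostream& out) {
    out << "offsetof(a) = " << offsetof(Size, a) << '\n'
        << "offsetof(b) = " << offsetof(Size, b) << '\n'
        << "offsetof(c) = " << offsetof(Size, c) << '\n'
        << "offsetof(d) = " << offsetof(Size, d) << '\n'
        << "offsetof(e) = " << offsetof(Size, e) << '\n'
        << "offsetof(f) = " << offsetof(Size, f) << '\n'
        << "sizeof(Size) = " << sizeof(Size) << '\n'
        << "padding bytes = " << padding_bytes() << '\n';
}
```

    Calling print_layout(std::cout) from main shows the layout. With the sizes quoted in the question and natural alignment, c would typically start at offset 4, leaving two bytes of padding after b, which accounts for 24 rather than 22.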

    Some compiler vendors may also have a #pragma for
    packing (i.e. removing padding between fields). Some
    may provide this feature but still generate code that
    has a problem with misaligned fields.

    If padding is an issue for you, then I suggest that
    you use unsigned char buffers for I/O (including
    writing to devices). Create functions that load
    the fields of your union, struct or class from the
    unsigned char buffer. Similarly with storing (writing).
    Let your program work with the fields the way the
    compiler has organized them.

    This is a big pain in the embedded systems arena.

    --
    Thomas Matthews

    C++ newsgroup welcome message:
    http://www.slack.net/~shiva/welcome.txt
    C++ Faq: http://www.parashift.com/c++-faq-lite
    C Faq: http://www.eskimo.com/~scs/c-faq/top.html
    alt.comp.lang.learn.c-c++ faq:
    http://www.raos.demon.uk/acllc-c++/faq.html
    Other sites:
    http://www.josuttis.com -- C++ STL Library book
     
    Thomas Matthews, Aug 18, 2003
    #1

  2. Thomas Matthews <> wrote in message news:<Yeb0b.26776$>...
    > If padding is an issue for you, then I suggest that
    > you use unsigned char buffers for I/O (including
    > writing to devices). Create functions that load
    > the fields of your union, struct or class from the
    > unsigned char buffer. Similarly with storing (writing).
    > Let your program work with the fields the way the
    > compiler has organized them.
    >
    > This is a big pain in the embedded systems arena.


    If you are designing the structure, it's easy enough to arrange for
    each item to be aligned on a natural boundary, hence no padding.
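
    As a rough illustration of this suggestion, the members of the struct from the original question can be ordered from largest to smallest. This is a sketch, not a guarantee: the names Reordered, member_bytes and reordered_has_interior_padding are hypothetical, and the result depends on the implementation's alignment rules.

```cpp
#include <cassert>
#include <cstddef>    // offsetof, std::size_t

// Members of the original struct, ordered from largest to smallest so
// that each one is expected to start where the previous one ends.
struct Reordered {
    double f;
    long d;
    int c;
    float e;
    char a;
    bool b;
};

// Sum of the member sizes, ignoring any padding.
constexpr std::size_t member_bytes =
    sizeof(double) + sizeof(long) + sizeof(int) +
    sizeof(float) + sizeof(char) + sizeof(bool);

// True if the compiler inserted padding before the last member ends.
constexpr bool reordered_has_interior_padding() {
    return offsetof(Reordered, b) + sizeof(bool) != member_bytes;
}

// Padding added after the last member so that array elements stay aligned.
constexpr std::size_t reordered_tail_padding() {
    return sizeof(Reordered) - (offsetof(Reordered, b) + sizeof(bool));
}
```

    On common implementations the reordered struct has no padding between members, but reordered_tail_padding() is usually still non-zero, because the total size is rounded up to a multiple of the struct's alignment.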

    Sam
     
    Samuel Barber, Aug 19, 2003
    #2

  3. Samuel Barber wrote:

    > Thomas Matthews <> wrote in message news:<Yeb0b.26776$>...
    >
    >>If padding is an issue for you, then I suggest that
    >>you use unsigned char buffers for I/O (including
    >>writing to devices). Create functions that load
    >>the fields of your union, struct or class from the
    >>unsigned char buffer. Similarly with storing (writing).
    >>Let your program work with the fields the way the
    >>compiler has organized them.
    >>
    >>This is a big pain in the embedded systems arena.

    >
    >
    > If you are designing the structure, it's easy enough to arrange for
    > each item to be aligned on a natural boundary, hence no padding.
    >
    > Sam

    Actually no, depends on the nature of the data.
    If I have a message:
    message ID -- 1 byte
    message length -- 4 bytes
    message data -- unknown
    And place it into a structure:
    struct Message
    {
    unsigned char id;
    unsigned int length;
    unsigned char * data;
    };
    many compilers will add padding between the
    "id" and "length" fields. The "unsigned char"
    causes the "length" field to start on an odd
    byte, which many processors don't relish.

    To make the processor fetches smoother, the
    compiler will pad the structure:
    struct Actual_Message
    {
    unsigned char id;
    unsigned char padding_from_compiler[3];
    unsigned int length;
    unsigned char * data;
    };

    The Actual_Message structure does not match the
    layout of the original message. Thus one cannot
    use any of the binary stream functions (write
    and read) because the layouts are different.

    In many applications, the data layout cannot be changed,
    so one has to code around it. In the above case, the
    solution would be to load the structure from a buffer:
    struct Actual_Message
    {
    // As above
    void load_from_buffer(unsigned char *& bufptr)
    {
    id = *bufptr++;
    // assume big endian
    length = *bufptr++;
    length <<= 8;
    length += *bufptr++;
    length <<= 8;
    length += *bufptr++;
    length <<= 8;
    length += *bufptr++;
    data = bufptr;
    return;
    }
    };
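
    The storing direction can be handled the same way. Below is a minimal sketch of a matching pair of functions for the 5-byte header (1-byte id, 4-byte big-endian length). The names Header, kHeaderBytes and store_to_buffer are assumptions for illustration, and free functions with fixed-width integer types are used here instead of the member function above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size of the header on the wire: 1-byte id plus 4-byte length.
constexpr std::size_t kHeaderBytes = 5;

// In-memory form; the compiler is free to pad this however it likes.
struct Header {
    std::uint8_t id;
    std::uint32_t length;
};

// Write the header into buf (at least kHeaderBytes long),
// length stored most significant byte first (big endian).
void store_to_buffer(const Header& h, unsigned char* buf) {
    buf[0] = h.id;
    buf[1] = static_cast<unsigned char>((h.length >> 24) & 0xFF);
    buf[2] = static_cast<unsigned char>((h.length >> 16) & 0xFF);
    buf[3] = static_cast<unsigned char>((h.length >> 8) & 0xFF);
    buf[4] = static_cast<unsigned char>(h.length & 0xFF);
}

// Rebuild the header from kHeaderBytes bytes written by store_to_buffer.
Header load_from_buffer(const unsigned char* buf) {
    Header h;
    h.id = buf[0];
    h.length = (static_cast<std::uint32_t>(buf[1]) << 24) |
               (static_cast<std::uint32_t>(buf[2]) << 16) |
               (static_cast<std::uint32_t>(buf[3]) << 8) |
               static_cast<std::uint32_t>(buf[4]);
    return h;
}
```

    Because each byte is placed explicitly, the result does not depend on the padding or the byte order of the machine running the code.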

    In summary, the sum of the field sizes may not equal the
    size of the structure, so a structure is not guaranteed
    to map 1:1 onto an external data layout. When the two
    differ, one must read and write the structure by some
    method other than the write and read methods of ostream
    and istream.

    However, if the data is organized around the padding
    concept, the binary I/O methods _may_ be used. Again,
    the processor may change, the compiler may change and
    the structure padding rules may change.
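
    If the binary I/O route is taken anyway, the layout assumption can at least be checked at compile time, so that a change of compiler or processor breaks the build rather than the data. A sketch, assuming a hypothetical Record type made of two 32-bit fields and helper names write_record and read_record:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical record laid out so that no padding should be needed.
struct Record {
    std::uint32_t id;
    std::uint32_t length;
};

// Refuse to compile if the assumptions behind raw binary I/O do not hold.
static_assert(std::is_trivially_copyable<Record>::value,
              "Record must be trivially copyable for raw I/O");
static_assert(sizeof(Record) == 2 * sizeof(std::uint32_t),
              "unexpected padding in Record");

// Write the object representation of r to a binary stream.
void write_record(std::ostream& out, const Record& r) {
    out.write(reinterpret_cast<const char*>(&r), sizeof r);
}

// Read a record back; returns false if the stream ran short or failed.
bool read_record(std::istream& in, Record& r) {
    in.read(reinterpret_cast<char*>(&r), sizeof r);
    return static_cast<bool>(in);
}
```

    The checks cover padding only. The bytes are still written in the host's byte order, so data produced this way is not portable between machines with different endianness.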

    --
    Thomas Matthews

     
    Thomas Matthews, Aug 19, 2003
    #3
  4. How about using

    #pragma pack(push,1)

    //struct definition

    #pragma pack(pop)

    ??



    (Samuel Barber) wrote in message news:<>...
    > Thomas Matthews <> wrote in message news:<Yeb0b.26776$>...
    > > If padding is an issue for you, then I suggest that
    > > you use unsigned char buffers for I/O (including
    > > writing to devices). Create functions that load
    > > the fields of your union, struct or class from the
    > > unsigned char buffer. Similarly with storing (writing).
    > > Let your program work with the fields the way the
    > > compiler has organized them.
    > >
    > > This is a big pain in the embedded systems arena.

    >
    > If you are designing the structure, it's easy enough to arrange for
    > each item to be aligned on a natural boundary, hence no padding.
    >
    > Sam
     
    pradeep raghavan, Aug 19, 2003
    #4
  5. Thomas Matthews <> wrote in message news:<Lmq0b.27252$>...
    > Samuel Barber wrote:
    >
    > > Thomas Matthews <> wrote in message news:<Yeb0b.26776$>...
    > >
    > >>If padding is an issue for you, then I suggest that
    > >>you use unsigned char buffers for I/O (including
    > >>writing to devices). Create functions that load
    > >>the fields of your union, struct or class from the
    > >>unsigned char buffer. Similarly with storing (writing).
    > >>Let your program work with the fields the way the
    > >>compiler has organized them.
    > >>
    > >>This is a big pain in the embedded systems arena.

    > >
    > >
    > > If you are designing the structure, it's easy enough to arrange for
    > > each item to be aligned on a natural boundary, hence no padding.
    > >

    > Actually no, depends on the nature of the data.
    > If I have a message:
    > message ID -- 1 byte
    > message length -- 4 bytes
    > message data -- unknown
    > And place it into a structure:
    > struct Message
    > {
    > unsigned char id;
    > unsigned int length;
    > unsigned char * data;
    > };
    > many compilers will add padding between the
    > "id" and "length" fields.


    Easy to eliminate. Either expand the id to 4-bytes, or add 3 bytes of
    explicit padding. The compiler will never pad as long as everything is
    aligned on a natural boundary (2-byte objects on 2-byte boundaries;
    4-byte objects on 4-byte boundaries; 8-byte objects on 8-byte
    boundaries...)
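
    The explicit-padding variant might look like the sketch below. PaddedMessage, reserved and has_hidden_padding are illustrative names, fixed-width types stand in for unsigned char and unsigned int, and the static_asserts document the expected layout instead of relying on it silently.

```cpp
#include <cassert>
#include <cstddef>    // offsetof
#include <cstdint>

// Message header with the padding written out by hand, so that
// length starts on a 4-byte boundary without help from the compiler.
struct PaddedMessage {
    std::uint8_t id;
    std::uint8_t reserved[3];   // explicit padding
    std::uint32_t length;
};

// Fail the build if the compiler lays the struct out differently.
static_assert(offsetof(PaddedMessage, length) == 4,
              "length is expected at offset 4");
static_assert(sizeof(PaddedMessage) == 8,
              "PaddedMessage is expected to occupy 8 bytes");

// True if the struct is larger than the sum of its declared members.
constexpr bool has_hidden_padding() {
    return sizeof(PaddedMessage) !=
           4 * sizeof(std::uint8_t) + sizeof(std::uint32_t);
}
```

    Note that this changes the header from 5 bytes to 8, so it only applies when the data format itself can be chosen.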

    Of course, I said "If you are designing the structure...". If somebody
    else designed the structure, and you have to adapt to it, that's
    another matter.

    Sam
     
    Samuel Barber, Aug 19, 2003
    #5
  6. pradeep raghavan wrote:

    > How about using
    >
    > #pragma pack(push,1)
    >
    > //struct definition
    >
    > #pragma pack(pop)
    >
    > ??
    >


    On this group please do not

    * Top post
    * Post non-standard "solutions"

    Review the welcome message and section 5 of the FAQ.

    http://www.slack.net/~shiva/welcome.txt
    http://www.parashift.com/c++-faq-lite/

    -Kevin
    --
    My email address is valid, but changes periodically.
    To contact me please use the address from a recent posting.
     
    Kevin Goodsell, Aug 20, 2003
    #6
  7. > How about using
    >
    > #pragma pack(push,1)
    >
    > //struct definition
    >
    > #pragma pack(pop)


    That may work on some compilers, and do nothing on others. The behaviour of
    the #pragma directive is by definition implementation defined.
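
    For compilers that do accept this pragma, a sketch along the following lines shows the effect and a way to detect whether it was honoured. The struct names and packing_took_effect are hypothetical, and nothing here is guaranteed by the language standard.

```cpp
#include <cassert>
#include <cstddef>    // offsetof
#include <cstdint>

// Default layout: the compiler may pad between id and length.
struct PlainHeader {
    std::uint8_t id;
    std::uint32_t length;
};

// Non-standard: a compiler may honour, ignore or reject this pragma.
#pragma pack(push, 1)
struct PackedHeader {
    std::uint8_t id;
    std::uint32_t length;
};
#pragma pack(pop)

// True only if the packed struct really is 1 + 4 bytes with no padding.
constexpr bool packing_took_effect() {
    return sizeof(PackedHeader) == 5;
}
```

    Code that depends on the packed layout could guard itself with static_assert(packing_took_effect(), "..."). Even where packing works, length may then sit at a misaligned address, which is the problem with misaligned fields mentioned earlier in the thread.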

    --
    Peter van Merkerk
    peter.van.merkerk(at)dse.nl
     
    Peter van Merkerk, Aug 23, 2003
    #7