structures and alignment issues

Discussion in 'C Programming' started by silpau@gmail.com, Jun 14, 2007.

  1. Guest

    struct a
    {
        int b;
        char a;
        int c;
    };

    On i386 the size of this structure would be 12 bytes, as 3 bytes of
    padding are added after a so that c is aligned on a 4-byte boundary.

    So the doubt that I have is:

    1) Does the portability issue come into play only when I persist this
    structure on one architecture (e.g. i386) and try to read the
    structure back on a different architecture (e.g. the Motorola series)?
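
    A minimal sketch of how to check the layout instead of assuming it. It
    uses only sizeof and offsetof from standard C; the helper names
    padding_before_c and print_layout are made up for illustration, and the
    numbers it reports depend on the compiler and options in use.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

struct a {
    int  b;
    char a;
    int  c;
};

/* Bytes of padding between the end of member a and the start of member c,
   as chosen by the implementation compiling this file. */
size_t padding_before_c(void)
{
    return offsetof(struct a, c) - (offsetof(struct a, a) + sizeof(char));
}

/* Print the layout that the current compiler and options actually chose. */
void print_layout(void)
{
    printf("sizeof(struct a) = %zu\n", sizeof(struct a));
    printf("offset of b      = %zu\n", offsetof(struct a, b));
    printf("offset of a      = %zu\n", offsetof(struct a, a));
    printf("offset of c      = %zu\n", offsetof(struct a, c));
    printf("padding before c = %zu\n", padding_before_c());
}
```

    Calling print_layout() from main on each target shows whether the
    12-byte, 3-bytes-of-padding layout described above really applies there.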
     
    , Jun 14, 2007
    #1

  2. Guest

    On Jun 14, 8:30 am, wrote:
    > struct a
    > {
    > int b;
    > char a;
    > int c;
    >
    > }
    >
    > On i386 the sizeof this structure would be 12 bytes,as 3 bytes are
    > padded after a so that c is aligned on 4byte boundary.
    >
    > So the doubts that i have is
    >
    > 1) Does the poratbility issue come into play only when i persist this
    > structure on one architecture ( for ex i386) and try to read the
    > structure back on a different architecture(for ex motorola series)


    One more thing I want to get clarified: do all compilers align
    structure members using natural alignment, or does this differ
    from architecture to architecture?
     
    , Jun 14, 2007
    #2

  3. Ian Collins Guest

    wrote:
    > struct a
    > {
    > int b;
    > char a;
    > int c;
    > }
    >
    > On i386 the sizeof this structure would be 12 bytes,as 3 bytes are
    > padded after a so that c is aligned on 4byte boundary.
    >

    That's one possible alignment.

    > So the doubts that i have is
    >
    > 1) Does the poratbility issue come into play only when i persist this
    > structure on one architecture ( for ex i386) and try to read the
    > structure back on a different architecture(for ex motorola series)
    >

    Or the same architecture with a different compiler, or the same compiler
    with different options. Or...

    --
    Ian Collins.
     
    Ian Collins, Jun 14, 2007
    #3
  4. Morris Dovey Guest

    wrote:

    | 1) Does the poratbility issue come into play only when i persist
    | this structure on one architecture ( for ex i386) and try to read
    | the structure back on a different architecture(for ex motorola
    | series)

    No, it can always be an issue. Consider the differences in how a
    single compiler on a single architecture might treat this structure
    when told to optimize for speed vs when told to optimize for size...
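
    A sketch of the same effect within a single source file, assuming a
    compiler that supports the non-standard #pragma pack extension (GCC,
    Clang and MSVC do; it is not part of standard C). The struct and function
    names are made up for illustration.

```c
#include <stddef.h>

/* Default layout: the compiler is free to insert padding. */
struct natural {
    int  b;
    char a;
    int  c;
};

/* Same members, but the compiler is asked to pack them with 1-byte
   alignment. #pragma pack is a compiler extension, not standard C. */
#pragma pack(push, 1)
struct packed {
    int  b;
    char a;
    int  c;
};
#pragma pack(pop)

size_t natural_size(void) { return sizeof(struct natural); }
size_t packed_size(void)  { return sizeof(struct packed); }
```

    On a compiler that honours the pragma, packed_size() is
    2 * sizeof(int) + 1, while natural_size() is whatever the default rules
    give. Same compiler, same machine, two layouts.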

    --
    Morris Dovey
    DeSoto Solar
    DeSoto, Iowa USA
    http://www.iedu.com/DeSoto/
     
    Morris Dovey, Jun 14, 2007
    #4
  5. <> wrote in message
    news:...
    > On Jun 14, 8:30 am, wrote:
    >> struct a
    >> {
    >> int b;
    >> char a;
    >> int c;
    >>
    >> }
    >>
    >> On i386 the sizeof this structure would be 12 bytes,as 3 bytes are
    >> padded after a so that c is aligned on 4byte boundary.
    >>
    >> So the doubts that i have is
    >>
    >> 1) Does the poratbility issue come into play only when i persist this
    >> structure on one architecture ( for ex i386) and try to read the
    >> structure back on a different architecture(for ex motorola series)

    >
    > One more thing i want to get clarified is do all the compilers align
    > structure members using natural alignement or does this all differ
    > from architecture to architecture
    >

    Obviously compiler designers don't insert padding for fun. It is because
    memory accesses to aligned members are more efficient. However it is always
    possible and usually not very inefficient to access non-aligned members. The
    question is where to make the trade off, and opinions differ.
     
    Malcolm McLean, Jun 14, 2007
    #5
  6. Taran Guest

    On Jun 14, 10:23 am, "Malcolm McLean" <>
    wrote:
    > <> wrote in message
    >
    > news:...
    >
    >
    >
    > > On Jun 14, 8:30 am, wrote:
    > >> struct a
    > >> {
    > >> int b;
    > >> char a;
    > >> int c;

    >
    > >> }

    >
    > >> On i386 the sizeof this structure would be 12 bytes,as 3 bytes are
    > >> padded after a so that c is aligned on 4byte boundary.

    >
    > >> So the doubts that i have is

    >
    > >> 1) Does the poratbility issue come into play only when i persist this
    > >> structure on one architecture ( for ex i386) and try to read the
    > >> structure back on a different architecture(for ex motorola series)


    There shouldn't be any issues if you use the structure name to
    reference its members, like struct_var.a, struct_var.b and so on.
    The byte padding is transparent, and the compiler will take care that
    you get the same value when you read struct_var.a as you stored using
    struct_var.a = value.

    But what the compiler doesn't guarantee is the result when you take a
    pointer to this struct and then try this:

    struct a * ptr = &struct_a_var;
    int byte_padding = 3;
    if( &struct_a_var.c == (ptr + sizeof(struct_a_var.a)
    +sizeof(struct_a_var.b) + byte_padding))
    {
    .........
    }

    The above if condition may fail or may succeed; it is
    architecture dependent and non-portable.

    I have a piece of code which manipulates many structures and works
    seamlessly whether it is run on Intel or PowerPC.


    > > One more thing i want to get clarified is do all the compilers align
    > > structure members using natural alignement or does this all differ
    > > from architecture to architecture



    This also differs from architecture to architecture. If an
    architecture has faster access to memory on double-word boundaries
    then there would be more byte padding. If the architecture has faster
    access to addresses on byte boundaries then there will not be any
    padding.

    HTH.
    ----
    Regards,
    Taran
     
    Taran, Jun 14, 2007
    #6
  7. On Thu, 14 Jun 2007 03:30:03 -0000, in comp.lang.c ,
    wrote:

    >struct a
    >{
    > int b;
    > char a;
    > int c;
    >}
    >
    >On i386 the sizeof this structure would be 12 bytes,as 3 bytes are
    >padded after a so that c is aligned on 4byte boundary.
    >
    >So the doubts that i have is
    >
    >1) Does the poratbility issue come into play only when i persist this
    >structure on one architecture ( for ex i386) and try to read the
    >structure back on a different architecture(for ex motorola series)


    No - even on one h/w platform you can see different padding depending
    on compiler settings. See the FAQ for further discussion.
    --
    Mark McIntyre

    "Debugging is twice as hard as writing the code in the first place.
    Therefore, if you write the code as cleverly as possible, you are,
    by definition, not smart enough to debug it."
    --Brian Kernighan
     
    Mark McIntyre, Jun 14, 2007
    #7
  8. On Thu, 14 Jun 2007 03:34:25 -0000, in comp.lang.c ,
    wrote:

    >One more thing i want to get clarified is do all the compilers align
    >structure members using natural alignement or does this all differ
    >from architecture to architecture


    It's platform-dependent.
    --
    Mark McIntyre

    "Debugging is twice as hard as writing the code in the first place.
    Therefore, if you write the code as cleverly as possible, you are,
    by definition, not smart enough to debug it."
    --Brian Kernighan
     
    Mark McIntyre, Jun 14, 2007
    #8
  9. On Thu, 14 Jun 2007 00:09:06 -0700, in comp.lang.c , Taran
    <> wrote:

    >> >> 1) Does the poratbility issue come into play only when i persist this
    >> >> structure on one architecture ( for ex i386) and try to read the
    >> >> structure back on a different architecture(for ex motorola series)

    >
    >There should't be any issues if you use the structure name to
    >reference the members of it.


    Remember he is talking about persisting the data ie storing it to disk
    or similar. When you read it back in, you will have to account for the
    padding properly, in order to read in the data to the right members.
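
    One common way to sidestep the padding question when persisting is to
    write each member separately into a byte layout you define yourself. A
    minimal sketch, assuming a made-up 9-byte record format (4 + 1 + 4 bytes,
    integers stored most significant byte first) and int values that fit in
    32 bits; encode_a, decode_a and RECORD_SIZE are names invented here.

```c
#include <stddef.h>
#include <stdint.h>

struct a {
    int  b;
    char a;
    int  c;
};

enum { RECORD_SIZE = 9 };  /* 4 + 1 + 4 bytes: a format chosen for this sketch */

/* Store a 32-bit value most significant byte first, whatever the host order. */
static void put_u32_be(unsigned char *out, uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Rebuild a 32-bit value from four bytes stored most significant byte first. */
static uint32_t get_u32_be(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Write the members one by one; struct padding never reaches the output. */
void encode_a(const struct a *s, unsigned char out[RECORD_SIZE])
{
    put_u32_be(out, (uint32_t)s->b);
    out[4] = (unsigned char)s->a;
    put_u32_be(out + 5, (uint32_t)s->c);
}

/* Read the members back one by one. Converting values above INT_MAX back
   to int is implementation-defined, so negative values are an assumption. */
void decode_a(struct a *s, const unsigned char in[RECORD_SIZE])
{
    s->b = (int)get_u32_be(in);
    s->a = (char)in[4];
    s->c = (int)get_u32_be(in + 5);
}
```

    The buffer filled by encode_a can be passed to fwrite, and a buffer
    filled by fread can be passed to decode_a. The file layout then depends
    only on this code, not on how any compiler pads struct a.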

    >I have a piece of code which manipulates lt many strcutures and work
    >semalessly well whether it is run on intel or powerpc.


    Yes -it'll work fine provided you don't store binary data to disk,
    copy the file to a different platform, and try to read it in again.

    The FAQ talks about this in section 20.

    --
    Mark McIntyre

    "Debugging is twice as hard as writing the code in the first place.
    Therefore, if you write the code as cleverly as possible, you are,
    by definition, not smart enough to debug it."
    --Brian Kernighan
     
    Mark McIntyre, Jun 14, 2007
    #9
  10. Malcolm McLean <> wrote:

    > Obviously compiler designers don't insert padding for fun. It is because
    > memory accesses to aligned members are more efficient. However it is always
    > possible and usually not very inefficient to access non-aligned members.


    Always possible? I'm sure many folks who have had to deal with "bus
    error" and its friends would beg to differ.

    --
    C. Benson Manica | I *should* know what I'm talking about - if I
    cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
     
    Christopher Benson-Manica, Jun 14, 2007
    #10
  11. wrote:
    >
    > struct a
    > {
    > int b;
    > char a;
    > int c;
    > }
    >
    > On i386 the sizeof this structure would be 12 bytes,as 3 bytes are
    > padded after a so that c is aligned on 4byte boundary.


    Well, that's one possibility, but not the only one.

    > So the doubts that i have is
    >
    > 1) Does the poratbility issue come into play only when i persist this
    > structure on one architecture ( for ex i386) and try to read the
    > structure back on a different architecture(for ex motorola series)


    Not only do you have to be concerned about padding, but byte order
    as well. The i386 series is little-endian, and the Motorola chips
    that I am familiar with are big-endian. Even if the compiler were
    to use the same padding, the values of b and c won't be interpreted
    the same way. For example, the i386 writes 1 as 01/00/00/00 which
    will be seen by the big-endian CPU as 0x01000000, or 16,777,216.
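
    The arithmetic can be sketched in portable C by decoding the same four
    bytes under both conventions. read_le32 and read_be32 are illustrative
    helper names, not library functions.

```c
#include <stdint.h>

/* Interpret four bytes as a little-endian 32-bit value (first byte is the
   least significant), as an i386 would. */
uint32_t read_le32(const unsigned char p[4])
{
    return (uint32_t)p[0]         | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Interpret the same four bytes as a big-endian 32-bit value (first byte
   is the most significant). */
uint32_t read_be32(const unsigned char p[4])
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

    For the bytes 01 00 00 00, read_le32 gives 1 and read_be32 gives
    16,777,216 (0x01000000).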

    BTDTGTTS.

    I support a cross-platform database which writes such things to the
    data file. (Please note that this app doesn't live in the strict C
    world, but rather C plus POSIX plus some limited system-specific code
    world.) While the source is 99% platform independent, the data files
    are not, and a utility is included to massage the data from one
    platform to another, should you wish to move the data files to
    another system.

    --
    +-------------------------+--------------------+-----------------------+
    | Kenneth J. Brody | www.hvcomputer.com | #include |
    | kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
    +-------------------------+--------------------+-----------------------+
    Don't e-mail me at: <mailto:>
     
    Kenneth Brody, Jun 14, 2007
    #11
  12. On Thu, 14 Jun 2007 00:09:06 -0700, Taran <>
    wrote:

    >On Jun 14, 10:23 am, "Malcolm McLean" <>
    >wrote:
    >> <> wrote in message
    >>
    >> news:...
    >>
    >>
    >>
    >> > On Jun 14, 8:30 am, wrote:
    >> >> struct a
    >> >> {
    >> >> int b;
    >> >> char a;
    >> >> int c;

    >>
    >> >> }

    >>
    >> >> On i386 the sizeof this structure would be 12 bytes,as 3 bytes are
    >> >> padded after a so that c is aligned on 4byte boundary.

    >>
    >> >> So the doubts that i have is

    >>
    >> >> 1) Does the poratbility issue come into play only when i persist this
    >> >> structure on one architecture ( for ex i386) and try to read the
    >> >> structure back on a different architecture(for ex motorola series)

    >
    >There should't be any issues if you use the structure name to
    >reference the members of it.
    >like struct_var.a, struct_var.b and so on. The byte padding is
    >transparent and the compiler will take care that when you get the same
    >value when you read struct_var.a as you would have stored using
    >struct_var.a = value.
    >
    >But what the compiler doesn't gaurantee is that you take a pointer to
    >this struct and then try this
    >
    >struct a * ptr = &struct_a_var;
    >int byte_padding = 3;
    >if( &struct_a_var.c == (ptr + sizeof(struct_a_var.a)
    >+sizeof(struct_a_var.b) + byte_padding))
    >{
    > .........
    >}
    >
    >The above if condition may fail or may succeed and is really
    >architecture dependent and non-portable.


    The above condition is guaranteed to fail (actually evaluate to 0)
    since pointer arithmetic is performed in units that match the sizeof
    the object pointed to. ptr+1 is not the address of the second byte in
    the structure but the address one byte beyond the end of the
    structure.

    If you cast the left expression to (char*) and change the right
    expression to ((char*)ptr + ... + byte_padding) you at least have
    something to discuss. If byte_padding is the sum of all the padding
    prior to member c, why do you think the expression would ever be
    false?
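
    A sketch of the comparison with the casts applied, next to the
    offsetof form that needs no guessing. The function names are made up for
    illustration.

```c
#include <stddef.h>

struct a {
    int  b;
    char a;
    int  c;
};

/* Byte-wise comparison using offsetof: holds on any implementation,
   because offsetof reports wherever the compiler actually put c. */
int c_is_at_its_offset(const struct a *p)
{
    return (const char *)&p->c == (const char *)p + offsetof(struct a, c);
}

/* The hand-computed version with the (char *) casts applied. It is true
   only where the padding before c is exactly assumed_padding bytes.
   The caller must keep the computed address inside the object. */
int c_is_at_guessed_offset(const struct a *p, size_t assumed_padding)
{
    return (const char *)&p->c ==
           (const char *)p + sizeof p->b + sizeof p->a + assumed_padding;
}
```

    Without the casts, ptr + n advances by n * sizeof(struct a) bytes, which
    is why the original condition cannot hold.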

    >
    >I have a piece of code which manipulates lt many strcutures and work
    >semalessly well whether it is run on intel or powerpc.
    >
    >
    >> > One more thing i want to get clarified is do all the compilers align
    >> > structure members using natural alignement or does this all differ
    >> > from architecture to architecture

    >
    >
    >This also differes from architecture to architecture. If an
    >architecture has faster access to memories on double word boundary
    >then the byte padding would be more. If the architecure has faster
    >access to addresses on byte boundaries then there will not be any
    >padding.


    The compiler aligns members according to the way the compiler writer
    decided it should. It may do it the way you describe. It may do it
    some other way the compiler designer decided was more important (or
    easier to implement or to simplify debugging etc). It may do it
    differently depending on the options the user specified in that
    particular run.

    While most of us probably hope the compiler writer takes the
    architecture into serious consideration when making these decisions,
    he is not required to nor is he required to give the various aspects
    of the architecture the same weight any of us would. It is entirely
    possible for two compilers for the same architecture to do things
    completely differently. It is even possible for different versions of
    the same compiler to do it differently.


    Remove del for email
     
    Barry Schwarz, Jun 14, 2007
    #12
  13. In article <f4rfp6$884$>,
    Christopher Benson-Manica <> wrote:

    >> Obviously compiler designers don't insert padding for fun. It is because
    >> memory accesses to aligned members are more efficient. However it is always
    >> possible and usually not very inefficient to access non-aligned members.


    >Always possible? I'm sure many folks who have had to deal with "bus
    >error" and its friends would beg to differ.


    It's always possible for the implementation. However, it's likely to be
    slower, even if the hardware has support for it.

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
     
    Richard Tobin, Jun 14, 2007
    #13
  14. Christopher Benson-Manica wrote:
    >
    > Malcolm McLean <> wrote:
    >
    > > Obviously compiler designers don't insert padding for fun. It is because
    > > memory accesses to aligned members are more efficient. However it is always
    > > possible and usually not very inefficient to access non-aligned members.

    >
    > Always possible? I'm sure many folks who have had to deal with "bus
    > error" and its friends would beg to differ.


    And for "not very inefficiently", I've used a system which would
    allow you to access non-aligned values by catching the hardware
    fault, reading the properly-aligned values containing the non-
    aligned address you wanted, and (for read operations) extract the
    bits you accessed, or (for write operations) store the bits you
    were writing into the aligned values and write them back out.

    --
    +-------------------------+--------------------+-----------------------+
    | Kenneth J. Brody | www.hvcomputer.com | #include |
    | kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
    +-------------------------+--------------------+-----------------------+
    Don't e-mail me at: <mailto:>
     
    Kenneth Brody, Jun 14, 2007
    #14
  15. Richard Tobin <> wrote:

    > > (MMcL wrote:)
    > >> Obviously compiler designers don't insert padding for fun. It is because
    > >> memory accesses to aligned members are more efficient. However it is always
    > >> possible and usually not very inefficient to access non-aligned members.


    > It's always possible for the implementation. However, it's likely to be
    > slower, even if the hardware has support for it.


    I read the quoted text as "It is always possible for the implementor",
    i.e. the developer (definitely not on the DS9K) or the compiler writer
    (I would think that DS9K hardware could be sufficiently evil to
    make it impossible). Of course anything is possible for the hardware
    designer, and if that was what Malcolm actually intended I accept the
    correction.

    --
    C. Benson Manica | I *should* know what I'm talking about - if I
    cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
     
    Christopher Benson-Manica, Jun 14, 2007
    #15
  16. In article <f4s7st$brb$>,
    Christopher Benson-Manica <> wrote:

    >> >> Obviously compiler designers don't insert padding for fun. It is because
    >> >> memory accesses to aligned members are more efficient. However it is always
    >> >> possible and usually not very inefficient to access non-aligned members.


    >> It's always possible for the implementation. However, it's likely to be
    >> slower, even if the hardware has support for it.


    >I read the quoted text as "It is always possible for the implementor",
    >i.e. the developer (definitely not on the DS9K) or the compiler writer
    >(I would think that DS9K hardware could be sufficiently evil to
    >make it impossible).


    Even on the DS9K it must be possible.

    For example - and unfortunately I don't have the DS9K manual handy, so
    I will use a fictitious machine - suppose we can only do 4-byte
    aligned reads, but we want to read a 4-byte int at address 4n+1. Just
    generate code to read the two ints at 4n and 4n+4 and extract and
    combine the relevant bytes. I believe that some compilers on machines
    with alignment restrictions have had the option to use just such code.
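
    A sketch of that idea on a simulated machine. The "memory" is an array
    of 32-bit words that can only be loaded whole, and byte k of a word is
    taken to be bits 8k to 8k+7 (a little-endian numbering chosen for this
    example). load_unaligned is a made-up name, and code a real compiler
    emits would be machine instructions, not C.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit value starting at byte address byte_addr, using only
   whole-word (aligned) loads from the simulated memory words[]. */
uint32_t load_unaligned(const uint32_t *words, size_t byte_addr)
{
    size_t   n     = byte_addr / 4;                 /* word holding the first byte */
    unsigned shift = (unsigned)(byte_addr % 4) * 8; /* bit offset inside that word */

    if (shift == 0)
        return words[n];  /* already aligned: a single load is enough */

    /* Two aligned loads: the upper bytes of word n become the low part of
       the result, the lower bytes of word n+1 become the high part. */
    return (words[n] >> shift) | (words[n + 1] << (32 - shift));
}
```

    With words {0x44332211, 0x88776655}, a read at byte address 1 picks up
    bytes 22 33 44 55 and yields 0x55443322.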

    -- Richard
    --
    "Consideration shall be given to the need for as many as 32 characters
    in some alphabets" - X3.4, 1963.
     
    Richard Tobin, Jun 14, 2007
    #16
  17. Richard Tobin wrote:
    > In article <f4s7st$brb$>,
    > Christopher Benson-Manica <> wrote:
    >
    >>> >> Obviously compiler designers don't insert padding for fun. It is
    >>> >> because memory accesses to aligned members are more efficient.
    >>> >> However it is always possible and usually not very inefficient to
    >>> >> access non-aligned members.

    >
    >>> It's always possible for the implementation. However, it's likely to be
    >>> slower, even if the hardware has support for it.

    >
    >>I read the quoted text as "It is always possible for the implementor",
    >>i.e. the developer (definitely not on the DS9K) or the compiler writer
    >>(I would think that DS9K hardware could be sufficiently evil to
    >>make it impossible).

    >
    > Even on the DS9K it must be possible.


    Yes, even if the only possibility is to use memcpy() or equivalent, that's
    one possibility.
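
    At the source level, the memcpy() route looks like the following sketch.
    read_int_at and write_int_at are illustrative names. Each function copies
    sizeof(int) bytes to or from an arbitrary offset in a byte buffer, so no
    int object is ever accessed through a misaligned pointer.

```c
#include <string.h>  /* memcpy, size_t */

/* Fetch an int stored at any byte offset, aligned or not. */
int read_int_at(const unsigned char *buf, size_t offset)
{
    int value;
    memcpy(&value, buf + offset, sizeof value);
    return value;
}

/* Store an int at any byte offset, aligned or not. */
void write_int_at(unsigned char *buf, size_t offset, int value)
{
    memcpy(buf + offset, &value, sizeof value);
}
```

    The caller is responsible for making sure offset + sizeof(int) stays
    within the buffer. The bytes are in host order, so this addresses
    alignment only, not byte order.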

    > For example - and unfortunately I don't have the DS9K manual handy, so
    > I will use a fictitious machine - suppose we can only do 4-byte
    > aligned reads, but we want to read a 4-byte int at address 4n+1. Just
    > generate code to read the two ints at 4n and 4n+4 and extract and
    > combine the relevant bytes. I believe that some compilers on machines
    > with alignment restrictions have had the option to use just such code.


    That would read before the start and beyond the end of the object, which
    causes problems, even if the machine words are partially accessible, on
    some current real-world (debugging) implementations.
     
    Harald van Dijk, Jun 14, 2007
    #17
  18. In article <f4s9a1$2uja$>,
    Richard Tobin <> wrote:
    >In article <f4s7st$brb$>,
    >Christopher Benson-Manica <> wrote:
    >
    >>> >> Obviously compiler designers don't insert padding for fun. It is because
    >>> >> memory accesses to aligned members are more efficient. However it is always
    >>> >> possible and usually not very inefficient to access non-aligned members.


    >Even on the DS9K it must be possible.


    >For example - and unfortunately I don't have the DS9K manual handy, so
    >I will use a fictitious machine - suppose we can only do 4-byte
    >aligned reads, but we want to read a 4-byte int at address 4n+1. Just
    >generate code to read the two ints at 4n and 4n+4 and extract and
    >combine the relevant bytes.


    If the data to be read is the same width as the bus read size, but the
    data is unaligned, then at least two bus reads would be necessary
    to fetch the unaligned data. Unfortunately, when you use multiple
    reads, you lose internal atomicity, and by the time you get to
    issue the second read, the second part of the data might have changed.
    Or the first might have, leading you to write out the wrong, sliced
    result. It becomes a race condition, even if you don't have multiple
    processors. And if you do have multiple processors... sometimes the
    maximum coherency lock you can assert is for the maximum bus read size,
    leading to problems.

    But you should expect problems with this setup anyhow. To make this
    clear: make the part that matches bus alignment volatile, so
    issuing the extra read or write on the alignment boundary results in
    undesirable behaviour.
    --
    "law -- it's a commodity"
    -- Andrew Ryan (The Globe and Mail, 2005/11/26)
     
    Walter Roberson, Jun 14, 2007
    #18
  19. Richard Tobin <> wrote:

    > For example - and unfortunately I don't have the DS9K manual handy, so
    > I will use a fictitious machine - suppose we can only do 4-byte
    > aligned reads, but we want to read a 4-byte int at address 4n+1. Just
    > generate code to read the two ints at 4n and 4n+4 and extract and
    > combine the relevant bytes. I believe that some compilers on machines
    > with alignment restrictions have had the option to use just such code.


    Aha, yes, I suppose I should have thought of that, although I imagine
    it adds another couple of espressos to the lives of compiler
    implementors. Thanks for the answer.

    --
    C. Benson Manica | I *should* know what I'm talking about - if I
    cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
     
    Christopher Benson-Manica, Jun 14, 2007
    #19
  20. "Harald van Dijk" <> wrote in message
    news:f4sa6f$ceo$1.ov.home.nl...
    > Richard Tobin wrote:
    >> For example - and unfortunately I don't have the DS9K manual
    >> handy, so I will use a fictitious machine - suppose we can only
    >> do 4-byte aligned reads, but we want to read a 4-byte int at
    >> address 4n+1. Just generate code to read the two ints at 4n
    >> and 4n+4 and extract and combine the relevant bytes. I believe
    >> that some compilers on machines with alignment restrictions
    >> have had the option to use just such code.

    >
    > That would read before the start and beyond the end of the object,
    > which causes problems, even if the machine words are partially
    > accessible, on some current real-world (debugging)
    > implementations.


    It's UB to do that in C, but it's entirely possible (and highly likely) that
    the implementation can do that under the hood safely. Code the compiler emits
    is only subject to the rules the architecture sets, not the C standard.

    S

    --
    Stephen Sprunk "Those people who think they know everything
    CCIE #3723 are a great annoyance to those of us who do."
    K5SSS --Isaac Asimov


     
    Stephen Sprunk, Jun 15, 2007
    #20