structures and alignment issues

silpau · Jun 14, 2007

struct a
{
    int b;
    char a;
    int c;
};

On i386, sizeof this structure would be 12 bytes, as 3 bytes of padding
are inserted after a so that c is aligned on a 4-byte boundary.

My questions are:

1) Does the portability issue come into play only when I persist this
structure on one architecture (e.g. i386) and try to read it back on a
different architecture (e.g. the Motorola series)?

2) One more thing I want clarified: do all compilers align structure
members using natural alignment, or does this differ from architecture
to architecture?

Ian Collins · Jun 14, 2007

silpau said:
| struct a
| {
|     int b;
|     char a;
|     int c;
| };
|
| On i386, sizeof this structure would be 12 bytes, as 3 bytes of padding
| are inserted after a so that c is aligned on a 4-byte boundary.

That's one possible alignment.

silpau said:
| My questions are:
|
| 1) Does the portability issue come into play only when I persist this
| structure on one architecture (e.g. i386) and try to read it back on a
| different architecture (e.g. the Motorola series)?

Or the same architecture with a different compiler, or the same compiler
with different options. Or...
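
One way to see which layout a particular compiler and option set actually chose is to print the member offsets. A minimal sketch, assuming a C99 compiler; the helper names print_layout and padding_before_c are made up for illustration:

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>

struct a {
    int  b;
    char a;
    int  c;
};

/* Print the layout this particular compiler and option set chose. */
void print_layout(void)
{
    printf("sizeof(struct a) = %zu\n", sizeof(struct a));
    printf("offset of b      = %zu\n", offsetof(struct a, b));
    printf("offset of a      = %zu\n", offsetof(struct a, a));
    printf("offset of c      = %zu\n", offsetof(struct a, c));
}

/* Number of padding bytes between member a and member c. */
size_t padding_before_c(void)
{
    return offsetof(struct a, c) - (offsetof(struct a, a) + sizeof(char));
}
```

Calling print_layout() under different compilers or flags shows whether the 12-byte layout holds. A typical i386 build would likely report offsets 0, 4 and 8, but nothing in the language requires that.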

Morris Dovey · Jun 14, 2007

silpau wrote:

| 1) Does the portability issue come into play only when I persist
| this structure on one architecture (e.g. i386) and try to read
| it back on a different architecture (e.g. the Motorola
| series)?

No, it can always be an issue. Consider the differences in how a
single compiler on a single architecture might treat this structure
when told to optimize for speed vs when told to optimize for size...

Malcolm McLean · Jun 14, 2007

silpau said:
| One more thing I want clarified: do all compilers align structure
| members using natural alignment, or does this differ from architecture
| to architecture?

Obviously compiler designers don't insert padding for fun. It is because
memory accesses to aligned members are more efficient. However it is always
possible and usually not very inefficient to access non-aligned members. The
question is where to make the trade off, and opinions differ.

Taran · Jun 14, 2007

There shouldn't be any issues if you use the structure name to
reference its members, like struct_var.a, struct_var.b and so on.
The byte padding is transparent, and the compiler will take care that
you get the same value when you read struct_var.a as you stored using
struct_var.a = value.

But what the compiler doesn't guarantee is the result when you take a
pointer to this struct and then try this:

struct a * ptr = &struct_a_var;
int byte_padding = 3;
if( &struct_a_var.c == (ptr + sizeof(struct_a_var.a)
+sizeof(struct_a_var.b) + byte_padding))
{
.........
}

The above if condition may fail or may succeed and is really
architecture dependent and non-portable.

I have a piece of code which manipulates a lot of structures and works
seamlessly whether it is run on Intel or PowerPC.

This also differs from architecture to architecture. If an
architecture has faster access to memory on double-word boundaries
then there will be more padding. If the architecture has faster
access to addresses on byte boundaries then there will not be any
padding.

HTH.

Mark McIntyre · Jun 14, 2007

silpau said:
| struct a
| {
|     int b;
|     char a;
|     int c;
| };
|
| On i386, sizeof this structure would be 12 bytes, as 3 bytes of padding
| are inserted after a so that c is aligned on a 4-byte boundary.
|
| My questions are:
|
| 1) Does the portability issue come into play only when I persist this
| structure on one architecture (e.g. i386) and try to read it back on a
| different architecture (e.g. the Motorola series)?

No - even on one h/w platform you can see different padding depending
on compiler settings. See the FAQ for further discussion.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan

Mark McIntyre · Jun 14, 2007

silpau said:
| One more thing I want clarified: do all compilers align structure
| members using natural alignment, or does this differ from architecture
| to architecture?

It's platform-dependent.

Mark McIntyre · Jun 14, 2007

Taran said:
| There shouldn't be any issues if you use the structure name to
| reference its members.

Remember he is talking about persisting the data, i.e. storing it to disk
or similar. When you read it back in, you will have to account for the
padding properly, in order to read the data into the right members.

Taran said:
| I have a piece of code which manipulates a lot of structures and works
| seamlessly whether it is run on Intel or PowerPC.

Yes - it'll work fine provided you don't store binary data to disk,
copy the file to a different platform, and try to read it in again.

The FAQ talks about this in section 20.
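
A common way to sidestep both padding and compiler differences when persisting is to write each member separately in a fixed format rather than writing the struct's raw bytes. A minimal sketch, assuming 8-bit bytes and that the int members fit in 32 bits; the names a_encode, a_decode and A_WIRE_SIZE are hypothetical, and the big-endian byte order is an arbitrary choice:

```c
#include <stdint.h>

struct a {
    int  b;
    char a;
    int  c;
};

enum { A_WIRE_SIZE = 4 + 1 + 4 };  /* fixed stored size, no padding */

/* Store a 32-bit value most significant byte first. */
static void put_u32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Rebuild a 32-bit value stored most significant byte first. */
static uint32_t get_u32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Convert back to signed without an implementation-defined cast. */
static int32_t to_i32(uint32_t u)
{
    return u <= INT32_MAX ? (int32_t)u : -(int32_t)(UINT32_MAX - u) - 1;
}

/* Write the members one by one; the in-memory padding is never stored. */
void a_encode(const struct a *s, unsigned char out[A_WIRE_SIZE])
{
    put_u32(out, (uint32_t)s->b);
    out[4] = (unsigned char)s->a;
    put_u32(out + 5, (uint32_t)s->c);
}

/* Read the members back in the same fixed order. */
void a_decode(struct a *s, const unsigned char in[A_WIRE_SIZE])
{
    s->b = to_i32(get_u32(in));
    s->a = (char)in[4];
    s->c = to_i32(get_u32(in + 5));
}
```

The 9-byte buffer can then be passed to fwrite() and fread(). Because neither the padding nor the host byte order ever reaches the file, the same bytes decode to the same member values regardless of how the reading compiler lays out struct a.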


Christopher Benson-Manica · Jun 14, 2007

Malcolm McLean said:
| Obviously compiler designers don't insert padding for fun. It is because
| memory accesses to aligned members are more efficient. However it is always
| possible and usually not very inefficient to access non-aligned members.

Always possible? I'm sure many folks who have had to deal with "bus
error" and its friends would beg to differ.

Kenneth Brody · Jun 14, 2007

silpau said:
| struct a
| {
|     int b;
|     char a;
|     int c;
| };
|
| On i386, sizeof this structure would be 12 bytes, as 3 bytes of padding
| are inserted after a so that c is aligned on a 4-byte boundary.

Well, that's one possibility, but not the only one.

silpau said:
| My questions are:
|
| 1) Does the portability issue come into play only when I persist this
| structure on one architecture (e.g. i386) and try to read it back on a
| different architecture (e.g. the Motorola series)?

Not only do you have to be concerned about padding, but byte order
as well. The i386 series is little-endian, and the Motorola chips
that I am familiar with are big-endian. Even if the compiler were
to use the same padding, the values of b and c won't be interpreted
the same way. For example, the i386 writes 1 as 01/00/00/00 which
will be seen by the big-endian CPU as 0x01000000, or 16,777,216.
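
The arithmetic can be checked with two small helpers that assemble a 32-bit value from four bytes in each order. This is only an illustration; load_le32 and load_be32 are names invented here:

```c
#include <stdint.h>

/* Interpret four bytes as a little-endian value (the i386 order). */
uint32_t load_le32(const unsigned char *p)
{
    return  (uint32_t)p[0]        | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Interpret the same four bytes as a big-endian value. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

With the bytes 01 00 00 00, load_le32 gives 1 and load_be32 gives 16777216 (0x01000000), matching the figures above.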

BTDTGTTS.

I support a cross-platform database which writes such things to the
data file. (Please note that this app doesn't live in the strict C
world, but rather C plus POSIX plus some limited system-specific code
world.) While the source is 99% platform independent, the data files
are not, and a utility is included to massage the data from one
platform to another, should you wish to move the data files to
another system.

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>

Barry Schwarz · Jun 14, 2007

Taran said:
| There shouldn't be any issues if you use the structure name to
| reference its members, like struct_var.a, struct_var.b and so on.
| The byte padding is transparent, and the compiler will take care that
| you get the same value when you read struct_var.a as you stored using
| struct_var.a = value.
|
| But what the compiler doesn't guarantee is the result when you take a
| pointer to this struct and then try this:
|
| struct a * ptr = &struct_a_var;
| int byte_padding = 3;
| if( &struct_a_var.c == (ptr + sizeof(struct_a_var.a)
| +sizeof(struct_a_var.b) + byte_padding))
| {
| .........
| }
|
| The above if condition may fail or may succeed and is really
| architecture dependent and non-portable.

The above condition is guaranteed to fail (actually evaluate to 0)
since pointer arithmetic is performed in units of the size of the
object pointed to. ptr+1 is not the address of the second byte in
the structure but the address just past the end of the
structure.

If you cast the left expression to (char*) and change the right
expression to ((char*)ptr + ... + byte_padding) you at least have
something to discuss. If byte_padding is the sum of all the padding
prior to member c, why do you think the expression would ever be
false?
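
A sketch of the corrected comparison, using byte pointers and offsetof from <stddef.h> instead of a hand-counted padding value; the helper names are illustrative only:

```c
#include <stddef.h>  /* offsetof, size_t */

struct a {
    int  b;
    char a;
    int  c;
};

/* Returns 1 when member c sits exactly `offset` bytes past the start of *s. */
int c_is_at_offset(const struct a *s, size_t offset)
{
    return (const char *)&s->c == (const char *)s + offset;
}

/* Shows that s + 1 steps over one whole struct, not over one byte. */
int step_is_whole_struct(const struct a *s)
{
    return (const char *)(s + 1) == (const char *)s + sizeof *s;
}
```

c_is_at_offset(&v, offsetof(struct a, c)) holds for any struct a object v on any conforming compiler, because offsetof already accounts for whatever padding was inserted before c.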

Taran said:
| I have a piece of code which manipulates a lot of structures and works
| seamlessly whether it is run on Intel or PowerPC.
|
| This also differs from architecture to architecture. If an
| architecture has faster access to memory on double-word boundaries
| then there will be more padding. If the architecture has faster
| access to addresses on byte boundaries then there will not be any
| padding.

The compiler aligns members according to the way the compiler writer
decided it should. It may do it the way you describe. It may do it
some other way the compiler designer decided was more important (or
easier to implement or to simplify debugging etc). It may do it
differently depending on the options the user specified in that
particular run.

While most of us probably hope the compiler writer takes the
architecture into serious consideration when making these decisions,
he is not required to nor is he required to give the various aspects
of the architecture the same weight any of us would. It is entirely
possible for two compilers for the same architecture to do things
completely differently. It is even possible for different versions of
the same compiler to do it differently.


Richard Tobin · Jun 14, 2007

Christopher Benson-Manica said:
| Malcolm McLean said:
| | Obviously compiler designers don't insert padding for fun. It is because
| | memory accesses to aligned members are more efficient. However it is always
| | possible and usually not very inefficient to access non-aligned members.
|
| Always possible? I'm sure many folks who have had to deal with "bus
| error" and its friends would beg to differ.

It's always possible for the implementation. However, it's likely to be
slower, even if the hardware has support for it.

-- Richard

Kenneth Brody · Jun 14, 2007

Christopher Benson-Manica said:
| Always possible? I'm sure many folks who have had to deal with "bus
| error" and its friends would beg to differ.

And for "not very inefficiently", I've used a system which would
allow you to access non-aligned values by catching the hardware
fault, reading the properly-aligned values containing the
non-aligned address you wanted, and (for read operations) extracting
the bits you accessed, or (for write operations) storing the bits you
were writing into the aligned values and writing them back out.


Christopher Benson-Manica · Jun 14, 2007

Richard Tobin said:
| It's always possible for the implementation. However, it's likely to be
| slower, even if the hardware has support for it.

I read the quoted text as "It is always possible for the implementor",
i.e. the developer (definitely not on the DS9K) or the compiler writer
(I would think that DS9K hardware could be sufficiently evil to
make it impossible). Of course anything is possible for the hardware
designer, and if that was what Malcolm actually intended I accept the
correction.

Richard Tobin · Jun 14, 2007

Christopher Benson-Manica said:
| Richard Tobin said:
| | It's always possible for the implementation. However, it's likely to be
| | slower, even if the hardware has support for it.
|
| I read the quoted text as "It is always possible for the implementor",
| i.e. the developer (definitely not on the DS9K) or the compiler writer
| (I would think that DS9K hardware could be sufficiently evil to
| make it impossible).

Even on the DS9K it must be possible.

For example - and unfortunately I don't have the DS9K manual handy, so
I will use a fictitious machine - suppose we can only do 4-byte
aligned reads, but we want to read a 4-byte int at address 4n+1. Just
generate code to read the two ints at 4n and 4n+4 and extract and
combine the relevant bytes. I believe that some compilers on machines
with alignment restrictions have had the option to use just such code.
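
A sketch of that idea in C, modelling the fictitious machine as an array of 32-bit words with little-endian byte numbering inside each word (both the model and the function name read_u32_unaligned are assumptions for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Read a 32-bit value that starts at byte address addr, using only
 * aligned word reads. Byte k of a word is taken to be bits 8k..8k+7.
 */
uint32_t read_u32_unaligned(const uint32_t *words, size_t addr)
{
    size_t   idx   = addr / 4;                 /* aligned word holding the first byte */
    unsigned shift = (unsigned)(addr % 4) * 8; /* bit offset inside that word */
    uint32_t lo    = words[idx];

    if (shift == 0)
        return lo;                             /* already aligned: one read is enough */

    uint32_t hi = words[idx + 1];              /* second aligned read */
    return (lo >> shift) | (uint32_t)(hi << (32 - shift));
}
```

For the words 0x44332211 and 0x88776655, a read at byte address 1 yields 0x55443322. When the address is not aligned, the function reads words[idx + 1], so that word must exist and be readable.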

-- Richard

Harald van Dĳk · Jun 14, 2007

Richard Tobin said:
| Christopher Benson-Manica said:
| | I read the quoted text as "It is always possible for the implementor",
| | i.e. the developer (definitely not on the DS9K) or the compiler writer
| | (I would think that DS9K hardware could be sufficiently evil to
| | make it impossible).
|
| Even on the DS9K it must be possible.

Yes, even if the only possibility is to use memcpy() or equivalent, that's
one possibility.
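
A minimal sketch of the memcpy() route; load_int and store_int are illustrative names. Copying the bytes into a properly aligned local avoids ever dereferencing a misaligned int pointer:

```c
#include <string.h>

/* Read an int stored at an arbitrary, possibly unaligned, byte address. */
int load_int(const void *p)
{
    int v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write an int to an arbitrary, possibly unaligned, byte address. */
void store_int(void *p, int v)
{
    memcpy(p, &v, sizeof v);
}
```

For example, store_int(buf + 1, 12345) followed by load_int(buf + 1) returns 12345 for an unsigned char buffer of at least sizeof(int) + 1 bytes, whatever the alignment of buf + 1.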

Richard Tobin said:
| For example - and unfortunately I don't have the DS9K manual handy, so
| I will use a fictitious machine - suppose we can only do 4-byte
| aligned reads, but we want to read a 4-byte int at address 4n+1. Just
| generate code to read the two ints at 4n and 4n+4 and extract and
| combine the relevant bytes. I believe that some compilers on machines
| with alignment restrictions have had the option to use just such code.

That would read before the start and beyond the end of the object, which
causes problems, even if the machine words are partially accessible, on
some current real-world (debugging) implementations.

Walter Roberson · Jun 14, 2007

Richard Tobin said:
| Even on the DS9K it must be possible.
|
| For example - and unfortunately I don't have the DS9K manual handy, so
| I will use a fictitious machine - suppose we can only do 4-byte
| aligned reads, but we want to read a 4-byte int at address 4n+1. Just
| generate code to read the two ints at 4n and 4n+4 and extract and
| combine the relevant bytes.

If the data to be read is the same width as the bus read size, but the
data is unaligned, then at least two bus reads would be necessary
to fetch the unaligned data. Unfortunately, when you use multiple
reads, you lose internal atomicity, and by the time you get to
issue the second read, the second part of the data might have changed.
Or the first might have, leading you to write out a wrongly sliced
result. It becomes a race condition, even if you don't have multiple
processors. And if you do have multiple processors... sometimes the
maximum coherency lock you can assert is for the maximum bus read size,
leading to problems.

But you should expect problems with this setup anyhow. To make this
clear: make the part that matches bus alignment volatile, so
issuing the extra read or write on the alignment boundary results in
undesirable behaviour.

Christopher Benson-Manica · Jun 14, 2007

Richard Tobin said:
| For example - and unfortunately I don't have the DS9K manual handy, so
| I will use a fictitious machine - suppose we can only do 4-byte
| aligned reads, but we want to read a 4-byte int at address 4n+1. Just
| generate code to read the two ints at 4n and 4n+4 and extract and
| combine the relevant bytes. I believe that some compilers on machines
| with alignment restrictions have had the option to use just such code.

Aha, yes, I suppose I should have thought of that, although I imagine
it adds another couple of espressos to the lives of compiler
implementors. Thanks for the answer.

Stephen Sprunk · Jun 15, 2007

Harald van Dĳk said:
| That would read before the start and beyond the end of the object,
| which causes problems, even if the machine words are partially
| accessible, on some current real-world (debugging)
| implementations.

It's UB to do that in C, but it's entirely possible (and highly likely) that
the implementation can do that under the hood safely. Code the compiler emits
is only subject to the rules the architecture sets, not the C standard.

S
