Size of a structure : Structure Padding

Kislay

case 1 :
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
};

case 2:
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
float f; // 4
};

According to the rules of structure padding, shouldn't the size of the
above structure be 28 bytes (for a 32-bit compiler; I am using DEV-C++)?
But I got 32 as the answer, using sizeof. And when I added "float f" to
the structure, I was getting the same answer, 32. If 32 is right instead
of 28 in the first case, then after adding a float, shouldn't it become
36? But even in case 2, it's 32. Is there more to structure padding than
simple clubbing together in groups of 4? Could somebody please shed some
light?
 
borophyll

case 1 :
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4

};

case 2:
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
float f; // 4

};

According to the rules of structure padding shouldn't the size of the
above structure be 28 bytes

The only rule for structure padding is that padding may occur
anywhere within the structure except at its beginning. Even if this
structure were 100 bytes, that would still be OK.

But I got 32 as the answer , using sizeof . And when I added
"float f" to the structure , I was getting the same answer , 32 .
Well , if 32 is right instead of 28 in the first case , then after
adding a float , shouldn't it become 36 ?

Not necessarily. The first structure may have had four bytes of
padding, and the second structure may have none.

Is there more to structure padding than simple clubbing together
in groups of 4 ?

There is no requirement to "club" together any number of bytes. The
implementation is free to do what it wants, but usually it pads based
on the alignment requirements of the machine.

Regards,
B.
 
Eric Sosman

Kislay said:
case 1 :
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
};

case 2:
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
float f; // 4
};

According to the rules of structure padding shouldn't the size of the
above structure be 28 bytes (for a 32 bit compiler , I am using DEV-C+
+) ?

Which of the two is "the above structure?"

But no: it really doesn't matter. The size of the struct
is the amount of memory the compiler chooses to devote to it.
Different compilers make different choices, because they are
concerned with different constraints -- more generally, with
different "figures of merit."

But I got 32 as the answer , using sizeof .

... which was the correct answer; bravo! (Of course, a
different compiler might have given a different answer, which
would *also* have been the correct answer; bravo!)

And when I added
"float f" to the structure , I was getting the same answer , 32 .

Another right answer! You're on a roll.

Well , if 32 is right instead of 28 in the first case , then after
adding a float , shouldn't it become 36 ? But even in case 2 , its
32 . Is there more to structure padding than simple clubbing together
in groups of 4 ? Somebody please shed some light ?

Hypothesis: On the system you are using at the moment, the
compiler tries to ensure that every double begins at a memory
address that is (if considered as a number) divisible by eight.
If the hypothesis is correct, does it imply anything about the
alignment requirement for a struct that has a double as one of
its members? Further, does the struct's alignment requirement
have any influence on its sizeof? Hint: In `struct s array[2];'
ponder the addresses `&array[0]' and `&array[1]'.
 
CBFalconer

Kislay said:
case 1 :
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
};

case 2:
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
float f; // 4
};

According to the rules of structure padding shouldn't the size of the
above structure be 28 bytes (for a 32 bit compiler , I am using DEV-C+
+) ? But I got 32 as the answer , using sizeof . And when I added
"float f" to the structure , I was getting the same answer , 32 .
Well , if 32 is right instead of 28 in the first case , then after
adding a float , shouldn't it become 36 ? But even in case 2 , its
32 . Is there more to structure padding than simple clubbing together
in groups of 4 ? Somebody please shed some light ?

It looks as if your doubles need to be aligned to an address that
is zero modulo 8. Now explain how an array of two case 1 structs
could do that if the structure size is 28? I.e. they need padding
by an additional 4 bytes. Your float f added just uses that
padding space.
 
Jack Klein

case 1 :
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
};

case 2:
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
float f; // 4
};

According to the rules of structure padding shouldn't the size of the
above structure be 28 bytes (for a 32 bit compiler , I am using DEV-C+
+) ? But I got 32 as the answer , using sizeof . And when I added
"float f" to the structure , I was getting the same answer , 32 .
Well , if 32 is right instead of 28 in the first case , then after
adding a float , shouldn't it become 36 ? But even in case 2 , its
32 . Is there more to structure padding than simple clubbing together
in groups of 4 ? Somebody please shed some light ?

No, it shouldn't be 28 bytes for a 32-bit compiler. It should be
whatever the compiler implementer decided it should be. For whatever
reasons the implementer decided.

And some of your assumptions are wrong. You seem to think that the
first member of each struct, "char c1[6]", occupies 8 bytes. I
guarantee you that it does not, never did, never will, and cannot in
C. An array of 6 chars occupies exactly 6 bytes, always has, and
always will.

It may very well be that the second member of your structure, "double
d", has an address that is 8 bytes from the start of the structure.
That does not mean that c1 occupies 8 bytes, it occupies 6 bytes. It
means that the compiler has inserted 2 padding bytes after c1.

Likewise the array c2 does not occupy 4 bytes, but exactly 2 bytes. If
your compiler lays out the structure so that there are 4 bytes between
c2 and i2, that means there are 2 padding bytes after c2.

As for why the size of this structure is 32 bytes, with or without the
final float member, that is because this is what the compiler
implementor(s) decided it should be. Perhaps the compiler always rounds
structure sizes up to a multiple of 16 bytes, or of 8 bytes. Neither 28
nor 36 is an exact multiple of either.

Assuming that you are correct about the sizes other than those for the
two character arrays, the widest member in your structure is an 8-byte
double. It is quite possible that the implementation aligns the size
of the structure to the alignment of its widest member. If doubles
are required to be accessed on 8-byte boundaries, or just more
efficiently accessed that way, the compiler might always add padding
at the end of a structure containing one or more doubles so that the
size of the structure is a multiple of 8.

C guarantees several things about structures:

1. The first member of the structure starts at the same address as
the structure itself, that is there is no padding before the first
member.

2. The members of the structure are allocated in the structure in the
order of their declaration in the structure definition, they are never
rearranged.

3. The size of the structure will be greater than or equal to the sum
of the sizes of its members, never less. There is another language
down the hall where this is not always true.

If you want to know why the implementor(s) of a particular compiler
decided on the alignment and padding strategy that they use, either
ask them (on a gcc mailing list) or ask in a group that supports that
particular compiler.

The behavior that you see is correct under the C standard, and nowhere
does the language define why a particular implementation uses padding
in a particular way.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
 
Kenneth Brody

Kislay said:
case 1 :
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
};

case 2:
struct s
{
char c1[6]; // 8
double d; // 8
int i1; // 4
char c2[2]; // 4
int i2; // 4
float f; // 4
};

According to the rules of structure padding

Well, the only "rule" is that the compiler can't put padding before
the first element. Beyond that, whatever works is valid. Nothing
in the standard (AFAIK) would forbid padding which put every
element of the above structs on 32-byte boundaries.

shouldn't the size of the
above structure be 28 bytes (for a 32 bit compiler , I am using DEV-C+
+) ? But I got 32 as the answer , using sizeof . And when I added
"float f" to the structure , I was getting the same answer , 32 .
Well , if 32 is right instead of 28 in the first case , then after
adding a float , shouldn't it become 36 ? But even in case 2 , its
32 .

Consider the possibility that your compiler places doubles on 8-
byte boundaries, and because your struct includes a double, it
must place the entire struct on an 8-byte boundary. This would
mean that case 1 would place 4 padding bytes at the end of the
struct, for a total size of 32. It would also mean that in case
2, those same 4 padding bytes would be occupied by "float f", and
no additional padding would be necessary, meaning the size would
still be 32.
Is there more to structure padding than simple clubbing together
in groups of 4 ? Somebody please shed some light ?

By "clubbing together in groups of 4", I assume you mean "aligning
things to 4-byte boundaries". First, nothing says that things
must be aligned on 4-byte boundaries.

Consider:

struct s
{
char c1[3];
char c2[3];
};

On my system, this has a size of 6, and no padding is involved at
all. This is because the char arrays have no alignment issues.
There is no need to add any padding, as c1[] and c2[] can be at
any address.

Also, you seem to think that the padding bytes are part of each
element. For example, your comment "8" after "char c1[6]" makes
me think that you believe that c1[] occupies 8 bytes. It does
not. Rather, it occupies 6 bytes, and there are 2 bytes of
padding inserted by the compiler. (Well, 2 bytes may be inserted
as padding by your particular implementation. Nothing requires
that it do so, as far as the standard is concerned.)

So, for your particular implementation, you should really have:

struct s
{
char c1[6]; // 6
// 2 padding bytes
double d; // 8
int i1; // 4
char c2[2]; // 2
// 2 padding bytes
int i2; // 4
// 4 padding bytes
}; // (Total: 32 bytes)

(Disregarding the "// as comments" issue for the sake of clarity.)

+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
 
Kislay

The individual members of the structure occupy the number of bytes they
are supposed to, and the extra bytes, if any, are the padding. So far so
good. But the padding bytes will never be free to be used for anything
else, right? So it can be said that on the whole the structure occupies
the actual bytes plus the padding. The padding bytes go to waste.
Consider the following structure:
struct s
{
char c[2];
double d;
};
c occupies 2 bytes, d occupies 8, and the structure s occupies 16 (6
padding bytes). If these padding bytes cannot be put to any use, isn't
it a big waste of memory? Is structure padding really worth that?
 
santosh

Kislay said:
The individual units of the structure occupy the number of bytes they
are supposed to , and the extra bytes , if any are the padded ones .
So far so good . But the padded bytes will never be free to be used
for anything else , right ?

Not easily. But if you know certain details of your implementation, then
you can "use" the space occupied by the padding bytes by reading and writing
the structure as an array of unsigned char.

This hack however is horribly non-portable and should never be actually
tried.
So it can be said that on the whole the
structure is occupying the actual bytes plus the padded ones . The
padded bytes go waste . Consider the following structure ,
struct s
{
char c[2];
double d;
};
c occupies 2 B , d occupies 8 & the structure s occupies 16 (6 padded
bytes) . If these padded bytes cannot be put to any use , isn't it a
big waste of memory . Is structure padding really worth that ?

It's a matter of trade-off. Aligned objects are often much faster to access
than misaligned ones. Under certain systems objects _have_ to be aligned at
certain boundaries. Yes, some space is wasted, but generally the gain in
execution speed is more important than the space consumed.

Do keep in mind that for cases where you *really* need to conserve space,
most implementations have specific switches or "directives" to turn off
structure padding.
 
Eric Sosman

santosh wrote on 10/02/07 14:38:

Not easily. But if you know certain details of your implementation, then
you can "use" the space occupied by the padding bytes by reading and writing
the structure as an array of unsigned char.

This hack however is horribly non-portable and should never be actually
tried.

It is also extremely fragile, and subject to silent
breakage. Since the padding bytes contain nothing of
value, the compiler "knows" it doesn't need to copy them
when assigning structs, or preserve them when working
with the struct's named elements. (For example, storing
a value in the named element `x' might also write the
padding bytes adjacent to `x'.)
Do keep in mind that for cases where you *really* need to conserve space,
most implementations have specific switches or "directives" to turn off
structure padding.

Still another approach is to use "parallel arrays"
instead of structs. That is, instead of

struct {
double trouble;
char broiled;
/* padding likely here */
} array[HUGE];

which devotes perhaps 3*HUGE or 7*HUGE bytes to padding,
you could use

double trouble[HUGE];
char broiled[HUGE];

The substitution is not entirely free, of course.
You could have used things like qsort() on the array of
structs, but will need a different approach for the pair
of independent (as far as C knows) arrays. Data locality
may (or may not) suffer. Passing values to and from
functions may be a bit clumsier. But if HUGE is large
enough, the savings may be worth the discomfort.
 
J. J. Farrell

The individual units of the structure occupy the number of bytes they
are supposed to , and the extra bytes , if any are the padded ones .
So far so good . But the padded bytes will never be free to be used
for anything else , right ?

More or less.

So it can be said that on the whole the
structure is occupying the actual bytes plus the padded ones .

Yes.

The padded bytes go waste .

Whether or not they are 'waste' is a value judgement. The memory they
occupy is not used for any other purpose than padding, certainly.
Consider the following structure ,
struct s
{
char c[2];
double d;
};

c occupies 2 B , d occupies 8 & the structure s occupies 16 (6 padded
bytes) . If these padded bytes cannot be put to any use , isn't it a
big waste of memory .

If the code is running on a processor which can only access doubles if
they are on an 8-byte boundary, then it's clearly not a waste at all -
the program simply wouldn't work without it.

If the code is running on a processor which can access doubles much
more efficiently if they are on an 8-byte boundary, but is able to
access them more slowly if they are on another boundary, then whether
or not it is a waste (and how big a waste it is) depends entirely on
how important speed of execution is to you compared to memory size.

If the code is running on a processor which can access doubles equally
efficiently on any boundary, then it would be a big waste - but I
wouldn't expect the compiler to use any padding for that processor.
Is structure padding really worth that ?

In the first case, obviously yes - it's a choice between having the
padding or not having a working program.

In the second case it depends on the tradeoff between performance and
size.

In the third case, no, I'd call a compiler buggy if it used padding
like that in this case (unless there were some other reason to do so).
 
CBFalconer

Kislay said:
The individual units of the structure occupy the number of bytes they
are supposed to , and the extra bytes , if any are the padded ones .
So far so good . But the padded bytes will never be free to be used
for anything else , right ? So it can be said that on the whole the
structure is occupying the actual bytes plus the padded ones . The
padded bytes go waste . Consider the following structure ,
struct s {
char c[2];
double d;
};
c occupies 2 B , d occupies 8 & the structure s occupies 16 (6 padded
bytes) . If these padded bytes cannot be put to any use , isn't it a
big waste of memory . Is structure padding really worth that ?

Yes, it is necessary. Without it s.d would not be accessible as a
double. That is assuming the needed double alignment is an address
that is 0 modulo 8.

Since the needed alignment for a double is system sensitive, you
can quickly see why the interior structure of a struct is not
portable.
 
Kenneth Brody

Kislay said:
The individual units of the structure occupy the number of bytes they
are supposed to , and the extra bytes , if any are the padded ones .
So far so good . But the padded bytes will never be free to be used
for anything else , right ? So it can be said that on the whole the
structure is occupying the actual bytes plus the padded ones . The
padded bytes go waste . Consider the following structure ,
struct s
{
char c[2];
double d;
};
c occupies 2 B , d occupies 8 & the structure s occupies 16 (6 padded
bytes) . If these padded bytes cannot be put to any use , isn't it a
big waste of memory . Is structure padding really worth that ?

Define "worth it".

Consider a platform with a 64-bit memory bus. That is, all access
to memory is done 8 bytes* at a time. Consider further that, should
a read to a non-aligned double be performed, this would take two
reads from memory rather than one. Consider further still, that a
write to a non-aligned double would, rather than being just a single
write, need two reads plus two writes. That's at least a 100%
performance hit on reads, and 300% on writes.

Next, consider a similar platform which, rather than allowing the
non-aligned access, instead causes a hardware exception. On such a
platform, this probably means any program using such a construct
would crash. On other platforms, an exception handler takes over
and allows the access to happen, by executing code which reads the
two aligned 8-byte values and pulls out the appropriate bytes to
be assembled and returned. Or, for writes, reads the two values,
combines the bytes as needed, and writes both values back out.
Imagine the performance hit on this platform. It is quite easy to
foresee a 10,000% performance hit or more.

(Note: I have worked on platforms with all three of the above
scenarios, though not necessarily 8-byte alignment. And note, too,
that except for the "crash on non-aligned values" system, the C
compilers I used had an option to change the alignment used.)

So, once again, define "worth it".


* - Okay, so I'm assuming 8-bit bytes here. Sue me.

 
Gordon Burditt

Consider the following structure ,
struct s
{
char c[2];
double d;
};

c occupies 2 B , d occupies 8 & the structure s occupies 16 (6 padded
bytes) . If these padded bytes cannot be put to any use , isn't it a
big waste of memory .

If the code is running on a processor which can only access doubles if
they are on an 8-byte boundary, then it's clearly not a waste at all -
the program simply wouldn't work without it.

I'm not sure I believe there are platforms where it "simply wouldn't
work" in a way the compiler writer couldn't work around (perhaps
very painfully). Accessing a (possibly) misaligned double can be
done by memcpy()ing to a temporary variable that is aligned, and
using the value there. Storing values can be done in reverse: store
to an aligned temporary, then memcpy() to the (possibly) misaligned
location. All this can be done automatically by the compiler. It
will likely cause a HUGE performance penalty, but it can be done.
You also probably get the performance penalty if it MIGHT be
misaligned rather than only if it IS misaligned.

It still comes down to a tradeoff: execution time vs. memory use.
If the code is running on a processor which can access doubles much
more efficiently if they are on an 8-byte boundary, but is able to
access them more slowly if they are on another boundary, then whether
or not it is a waste (and how big a waste it is) depends entirely on
how important speed of execution is to you compared to memory size.

I think all processors that can implement memcpy() fall in this category,
but it may be the compiler, not the hardware, that determines the speed.
 
