size_t in a struct

Edward Rutherford

Hello Group

To avoid padding in structures, where is the best place to put size_t
variables?

According to FAQ question 2.12 (http://c-faq.com/struct/padding.html):

"If you're worried about wasted space, you can minimize the effects of
padding by ordering the members of a structure based on their base
types, from largest to smallest."

So if I have the following:

typedef struct __buffer_t {

char *buffer;
size_t size;

} Buffer_t;

I should have minimized padding, since size_t is an unsigned long.
However, will size_t ever become an unsigned long long? If it does,
then size_t still wouldn't be larger than a pointer on the system,
right? But where should I put size_t in relation to other integer
declarations? Should I always assume size_t is an unsigned long when
building structures?

Now, if I made another structure:

typedef struct __buffer2_t {

Buffer_t name;
char *buffer;
size_t size;

} Buffer2_t;

Is the above still the correct sequence to minimize padding?

Thanks.
 
Shao Miller

Hello Group

Hello. :)
To avoid padding in structures, where is the best place to put size_t
variables?

According the faq question 2.12 (http://c-faq.com/struct/padding.html),
it says:

"If you're worried about wasted space, you can minimize the effects of
padding by ordering the members of a structure based on their base
types, from largest to smallest."

And the sizes of those types can vary per implementation, too, if I
recall correctly.
So if I have the following:

typedef struct __buffer_t {

char *buffer;
size_t size;

} Buffer_t;

I should have mimized padding since size_t is an unsigned long.

It is? How did you determine that?
However, will size_t ever become an unsigned long long?

It might, as far as I know.
If it does,
then size_t still wouldn't be larger than a pointer on the system,
right?

Why not? Some pointer types needn't even be the same size. 'void *'
and 'char *' need to be the same size. 'struct XXX *' and 'struct YYY
*' need to be the same size. 'union XXX *' and 'union YYY *' need to be
the same size. (If I'm not mistaken.)
But where should I put size_t in relation to other integer
declarations? Should I all ways assume size_t is an unsigned long when
building structures?

You might wish to make a different decision for each different
implementation. Are you concerned with portability and multiple
implementations?
Now, if I made another structure:

typedef struct __buffer2_t {

Buffer_t name;
char *buffer;
size_t size;

} Buffer2_t;

Please avoid using a leading double-underscore sequence in your
identifiers; they're reserved. So is using a leading underscore
followed by an upper-case letter. Trailing underscores can be fun.
Is the above still the correct sequence to minimize padding?

I think it depends on your implementation. It's possible that some kind
of macro magic could help you to generate a structure with the least
padding, given _any_ implementation. But you might be stuck either
observing the best order for each implementation and changing
accordingly, or playing the statistics for some favourite set of
implementations' common size decisions. For example, it might be common
that both 'size_t' and 'char *' are 32 bits. A struct with those two
members might commonly have no padding (given some set of favourite
implementations).

Or maybe someone else will have better advice.
 
Keith Thompson

Edward Rutherford said:
To avoid padding in structures, where is the best place to put size_t
variables?

According the faq question 2.12 (http://c-faq.com/struct/padding.html),
it says:

"If you're worried about wasted space, you can minimize the effects of
padding by ordering the members of a structure based on their base
types, from largest to smallest."

Right, but you can't always know the relative sizes of the various
members. Conceivably you could write a program that prints out the
sizes, then generate C code based on that program's output, but it's
unlikely to be worth the effort.

The FAQ offers a rule of thumb. It's entirely possible that a
smaller type could have a stricter alignment requirement than a
larger type.
So if I have the following:

typedef struct __buffer_t {

char *buffer;
size_t size;

} Buffer_t;

Don't use the name "__buffer_t". Names starting with two underscores
are reserved. (And I think names ending with "_t" are reserved by
POSIX, which may or may not be a concern.)

You can use the same identifier for the struct tag and the typedef:

typedef struct Buffer {
...
} Buffer;

Or you can just omit the typedef and refer to the struct as
"struct Buffer".
I should have mimized padding since size_t is an unsigned long.

Correction: size_t is unsigned long on your system.
However, will size_t ever become an unsigned long long? If it does,
then size_t still wouldn't be larger than a pointer on the system,
right? But where should I put size_t in relation to other integer
declarations? Should I all ways assume size_t is an unsigned long when
building structures?

Don't assume that size_t is unsigned long; it's not necessarily correct,
and even if it were it wouldn't really buy you anything.

size_t and char* are probably going to be the same size anyway. (The
only reason they wouldn't be is if the system has a non-monolithic
addressing scheme; char* has to represent the address of any byte within
any object in memory, but size_t only needs to hold the size of a single
object.)
Now, if I made another structure:

typedef struct __buffer2_t {

Buffer_t name;
char *buffer;
size_t size;

} Buffer2_t;

Is the above still the correct sequence to minimize padding?

That structure *probably* won't have any padding. But I suggest that
it's not worth the time you're spending on it. Compilers can, in
principle, insert padding anywhere they like (other than before the
first member), for any arbitrary reason, so you can't guarantee that
there will be *no* padding unless you resort to compiler-specific
extensions. But in reality, compilers will insert padding only
where it's necessary to satisfy alignment requirements.

Will your code break if there's padding between two members of your
struct? If so, can you fix it?
 
Nobody

To avoid padding in structures, where is the best place to put size_t
variables?
I should have mimized padding since size_t is an unsigned long.
However, will size_t ever become an unsigned long long? If it does,
then size_t still wouldn't be larger than a pointer on the system,
right? But where should I put size_t in relation to other integer
declarations? Should I all ways assume size_t is an unsigned long when
building structures?

Not really. There's at least one popular platform where this isn't true:
Win64 has a 32-bit long (for compatibility with the vast amounts of Win32,
and even Win16, code which assumes that long is 32-bit) but 64-bit
pointers and size_t.

The most common cases I know of are:

long  size_t  pointer  architecture
32    32      32       typical 32-bit system
64    64      64       typical 64-bit system
32    64      64       Win64
32    16      32       8086 large/compact memory models
32    16      16       8086 tiny/small/medium memory models
32    32      32       8086 huge memory model

FWIW, I'd assume:

sizeof(long) <= sizeof(size_t) <= sizeof(void*)

This holds for pretty much anything except for 8086 real-mode plus some
architectures you'll probably never hear of, let alone use. Even if it
doesn't hold, you just end up with some extra padding.

If you're writing portable code, it probably isn't worth moving fields
around depending upon the platform just to avoid padding. Unless you'll
be creating a lot of these structures, it probably isn't even worth taking
account of padding when choosing the field placement if that means that it
will be separated from related fields.
 
Edward Rutherford

Shao said:
Hello. :)

And the sizes of those types can vary per implementation, too, if I
recall correctly.


It is? How did you determine that?


It might, as far as I know.

So standard-wise, how do I handle size_t in structures to minimize
padding? Should size_t always follow pointers in structures?
And if I have integers in the structure, where should I put size_t?

<OT>
According to my style(9) man page, the suggestion is:

"When declaring variables in structures, declare them sorted by use,
then by size (largest to smallest), then by alphabetical order."

This would lead to padding, right? But I would have to assume a size
for size_t to follow that style.
</OT>

Thanks.
 
Ian Collins

So standard-wise, how do I handle size_t in structures to minmize
padding? Should size_t all ways follow after pointers in structures?
And if I have integers in the structure, where should I put size_t?

The standard does not go into the specifics of padding.

It is a reasonably safe bet to assume sizeof(size_t) == sizeof(void*).

I would put size_t members after pointer members.
<OT>
According my style(9) man page, the suggestion is to:

"When declaring variables in structures, declare them sorted by use,
then
by size (largest to smallest), then by alphabetical order. "

This would lead to padding, right? But I would have to assume a size
for size_t to follow that style.
</OT>

It would, but in most uses, padding isn't really an issue.
 
Keith Thompson

Edward Rutherford said:
So standard-wise, how do I handle size_t in structures to minmize
padding? Should size_t all ways follow after pointers in structures?
And if I have integers in the structure, where should I put size_t?

Standard-wise, you don't. The standard requires at least enough padding
to satisfy alignment requirements, but it says very little about what
those requirements are.

If you really absolutely need to avoid any possibility of padding,
declare an array of unsigned char and use memcpy() to copy the
individual members in and out.
<OT>
According my style(9) man page, the suggestion is to:

"When declaring variables in structures, declare them sorted by use,
then
by size (largest to smallest), then by alphabetical order. "

This would lead to padding, right? But I would have to assume a size
for size_t to follow that style.
</OT>

It might, though I've never followed that kind of rule myself.

Again, *why* are you so concerned about minimizing padding? What bad
thing would happen if there were a little padding between members in
your structs?

If you'll tell us what your actual goal is, we'll be better able to help
you.
 
Angel

If you're working with 64-bit pointers, does struct packing have significant
impact on your program size and speed?

Padding is there to prevent unaligned access which would impact speed
and can create subtle bugs on some implementations. Eliminating padding
doesn't improve speed as the compiler already makes sure the data is
accessed in the fastest possible way. It just saves some space.

However, we're talking about just a handful of bytes, which is nothing in
this day and age where memory is measured in gigabytes. Unless you're
developing for an embedded system or plan to create a heck of a lot of
instances of your structure, padding is really nothing to be worried about,
and packing your structures is probably a waste of time better spent on
something more worthwhile.
 
Jorgen Grahn

According my style(9) man page, the suggestion is to:

"When declaring variables in structures, declare them sorted by use,
then by size (largest to smallest), then by alphabetical order. "

Is this man page something you wrote, or else where does it come from?
Section 9 is for Unix kernel code, so I assume you're on some BSD?

*googles* Ah, nice -- it's the *BSD kernel style guide. But it should
be read with that in mind; kernel programming is a bit special.

/Jorgen
 
Edward Rutherford

Keith said:
Standard-wise, you don't. The standard requires at least enough padding
to satisfy alignment requirements, but it says very little about what
those requirements are.

If you really absolutely need to avoid any possibility of padding,
declare an array of unsigned char and use memcpy() to copy the
individual members in an out.


It might, though I've never followed that kind of rule myself.

Again, *why* are you so concerned about minimizing padding? What bad
thing would happen if there were a little padding between members in
your structs?

If you'll tell us what your actual goal is, we'll be better able to help
you.

Thanks for that information.

I want to avoid padding in order to minimize the size of my structs and
so prevent cache misses where possible. I would expect the overhead
from using memcpy and unsigned char arrays to outweigh the savings
from optimizing cache usage - happy to be corrected though.
 
Joe Pfeiffer

Edward Rutherford said:
Thanks for that information.

I want to avoid padding in order to minimize the size of my structs in
order to prevent caches misses where possible. I would expect that the
overhead from using memcpy and unsigned char arrays, would outweight the
savings from optimizing cache usage - happy to be corrected though.

Ah. The cost in unaligned accesses is virtually certain to outweigh the
savings in hit rate. The padding is inserted precisely to make the code
run faster, at the expense of space (if smaller were faster, padding
wouldn't be used).
 
Keith Thompson

Edward Rutherford said:
Keith Thompson wrote: [...]
Again, *why* are you so concerned about minimizing padding? What bad
thing would happen if there were a little padding between members in
your structs?

If you'll tell us what your actual goal is, we'll be better able to help
you.

Thanks for that information.

I want to avoid padding in order to minimize the size of my structs in
order to prevent caches misses where possible. I would expect that the
overhead from using memcpy and unsigned char arrays, would outweight the
savings from optimizing cache usage - happy to be corrected though.

Have you performed measurements that indicate that your program's
performance is unacceptably slow, and that making your structs
slightly smaller will actually help? Are the hours you're spending
on this worth the CPU time you expect to save?
 
Dr Nick

Keith Thompson said:
Edward Rutherford said:
Keith Thompson wrote: [...]
Again, *why* are you so concerned about minimizing padding? What bad
thing would happen if there were a little padding between members in
your structs?

If you'll tell us what your actual goal is, we'll be better able to help
you.

Thanks for that information.

I want to avoid padding in order to minimize the size of my structs in
order to prevent caches misses where possible. I would expect that the
overhead from using memcpy and unsigned char arrays, would outweight the
savings from optimizing cache usage - happy to be corrected though.

Have you performed measurements that indicate that your program's
performance is unacceptably slow, and that making your structs
slightly smaller will actually help? Are the hours you're spending
on this worth the CPU time you expect to save?

I've often thought it would be nice to be able to tell the compiler to
"optimise this structure's order for space" or "optimise this
structure's order for speed". It would, of course, only work in modular
code if you never exposed the structure definition and just used opaque
pointers. But you'd be telling it that you never rely on the order of
things, and so could still structure the structure in the way that makes
the code most readable.

But it's very much an "it would be nice", as I suspect the benefits are
pretty small.
 
Ben Pfaff

Shao Miller said:
On 5/22/2011 4:08 AM, Dr Nick wrote:
I guess a code generator could potentially accomplish something close
to this. I'd be interested in an algorithm which figures out the most
size-efficient ordering of the members, if anyone has one (or some
ideas for one).

Just sort the members into descending order of size.
 
Uncle Steve

Just sort the members into descending order of size.

Another way is to put the structure member most likely to be accessed
first at the start of the struct. Prefetching should have the rest of
the struct in-cache shortly after the first access.



Regards,

Uncle Steve
 
Shao Miller

I've often thought it would be nice to be able to tell the compiler to
"optimise this structure's order for space" or "optimise this
structure's order for speed". It would, of course, only work in modular
code if you never exposed the structure definition and just used opaque
pointers. But you'd be telling it that you never rely on the order of
things, and so could still structure the structure in the way that makes
the code most readable.

But it's very much an "it would be nice", as I suspect the benefits are
pretty small.

I guess a code generator could potentially accomplish something close to
this. I'd be interested in an algorithm which figures out the most
size-efficient ordering of the members, if anyone has one (or some ideas
for one).

But maybe such an opaque representation might have comparable overhead
to serialization/deserialization, where Keith's memcpy() could be used
and there could be zero padding? (Along the lines of "...suspect the
benefits are pretty small".)
 
Keith Thompson

Ben Pfaff said:
Just sort the members into descending order of size.

Or in order of alignment, which might not be the same thing.

On some architectures, there could be an advantage to putting
as many small members as possible at the beginning of the struct,
because smaller offsets can be faster.

If C didn't require struct members to follow their declared
order, the compiler could optimize the layout any way that makes
sense. It's too late to change that without breaking existing code,
but perhaps a future standard could add an attribute to specify
that the compiler can alter the layout -- or it could be a
compiler-specific extension.
 
Shao Miller

Just sort the members into descending order of size.

I might be missing something... Suppose an 'int' has both size and
alignment of 4:

sizeof (struct {
char ca1[5]; /* 0 through 4 */
/* 3 bytes of padding; 5 through 7 */
int i; /* 8 through 11 */
/* No padding */
char ca2[3]; /* 12 through 14 */
/* 1 byte of padding; 15 */
}) == 16

The above is most likely, right? Whereas:

sizeof (struct {
int i; /* 0 through 3 */
/* No padding */
char ca1[5]; /* 4 through 8 */
/* No padding */
char ca2[3]; /* 9 through 11 */
/* No padding */
}) == 12

Right?
 
Ben Bacarisse

Shao Miller said:
Just sort the members into descending order of size.

I might be missing something... Suppose an 'int' has both size and
alignment of 4:

sizeof (struct {
char ca1[5]; /* 0 through 4 */
/* 3 bytes of padding; 5 through 7 */
int i; /* 8 through 11 */
/* No padding */
char ca2[3]; /* 12 through 14 */
/* 1 byte of padding; 15 */
}) == 16

The above is most likely, right? Whereas:

sizeof (struct {
int i; /* 0 through 3 */
/* No padding */
char ca1[5]; /* 4 through 8 */
/* No padding */
char ca2[3]; /* 9 through 11 */
/* No padding */
}) == 12

Right?

Yes. The advice has been shortened beyond usefulness. What I remember
being told (it was all fields round here, in those days) was to pack in
order of size (size being used here as a proxy for alignment), treating
arrays as if they were just several fields of the array element type.
 
Ben Bacarisse

Shao Miller said:
Given your amendment to the previously advised strategy, can you think
of a situation where taking the result of that strategy and swapping
the first member out and injecting it as the last member could make a
difference for padding? For example:

struct foo {
int i;
char c;
/* Some padding, perhaps */
};

struct bar {
char c;
/* Some padding, perhaps */
int i;
};

You mean to the total padding, presumably? No, I can't (well, excluding
rather contrived scenarios involving bit fields). Why do you ask?
 
