Memory alignment

Why Tea

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

/Why Tea
 
Fred

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];

} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

s will get one byte of space.
The struct itself may or may not get additional bytes or padding,
but writing to s[1] is an error.
 
Why Tea

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;
Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)
some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));
BTW, this is on a PowerPC architecture.

Fred said:
s will get one byte of space.
The struct itself may or may not get additional bytes or padding,

Why not, if alignment is done at a 16 or 32 bit boundary?

Fred said:
but writing to s[1] is an error.

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...
 
Antoninus Twink

typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

You should be aware that the regulars here aren't interested in your
wish for a pragmatic answer that's true in practise: they'll just
bombard you with hypothetical answers that are true in theory.
 
danmath06

typedef struct some_struct
{
int i;
short k;
int m;
char s[1];

} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

Each element in the structure has the size you specified in the
declaration. Padding adds space between elements, or at the end of
the structure, to satisfy alignment. s has only one character.

For example, there might be 2 bytes of padding between k and m so that m
starts on a 32-bit aligned address. You could avoid this by keeping
the integers together:

typedef struct some_struct
{
    int i;
    int m;
    short k;
    char s[1];
} some_struct_t;

This might result in only one extra byte at the end.

You can always use sizeof(struct some_struct) to know the size of the
structure and guess how many bytes were added as padding.
offsetof(struct some_struct, s) will return the offset of s in the
structure (you will need: #include <stddef.h>). You can use offsetof()
to find out the position of every element in the struct and determine
where the struct has been padded.
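
As a minimal sketch of the sizeof/offsetof approach described above, the following C code reports where each member of the struct from the original question starts and how many padding bytes follow s. The helper names trailing_padding and print_layout are made up for illustration, and the numbers printed depend on the compiler and target, so no particular output should be assumed.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

/* The struct from the original question. */
typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

/* Hypothetical helper: number of padding bytes after the last member s. */
size_t trailing_padding(void)
{
    return sizeof(some_struct_t)
         - (offsetof(some_struct_t, s) + sizeof(((some_struct_t *)0)->s));
}

/* Hypothetical helper: print the layout chosen by this implementation. */
void print_layout(void)
{
    /* Whatever the padding, s itself is always exactly one char. */
    assert(sizeof(((some_struct_t *)0)->s) == 1);

    printf("offsetof(i) = %zu\n", offsetof(some_struct_t, i));
    printf("offsetof(k) = %zu\n", offsetof(some_struct_t, k));
    printf("offsetof(m) = %zu\n", offsetof(some_struct_t, m));
    printf("offsetof(s) = %zu\n", offsetof(some_struct_t, s));
    printf("sizeof      = %zu\n", sizeof(some_struct_t));
    printf("padding after s = %zu\n", trailing_padding());
}
```

Calling print_layout() from main shows the gaps: any difference between one member's offset plus its size and the next member's offset is padding, and none of it belongs to s.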
 
danmath06

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

It will, but if you add a new element at the end of the structure later
on, it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?
 
Why Tea

It will, but if you add a new element at the end of the structure later
on, it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.
 
Ian Collins

Why Tea said:
It will, but if you add a new element at the end of the structure later
on, it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

There isn't much to say; it's known as the "struct hack" and is fairly
common. There's probably some decent reference on the web if you google
for it.
 
Ben Bacarisse

Why Tea said:
It will, but if you add a new element at the end of the structure later
on, it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

This is called the "struct hack". It has been formalised in C99 so if
you can use C99 then all will be well.

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

It is considered to be "a bit dodgy" (that is the technical term) but
it generally works. I am not sure there is really much more to say
about it though I get the feeling I will be proved very much wrong
about that!
 
Why Tea

Good question. This is found in some real embedded
code to make more efficient use of memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:
my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);
The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

Eric said:
You're full of questions, Why Tea, and that's a good thing.
But has it occurred to you that other people have asked some of
these same questions?  Has it occurred to you to look for a FAQ
at some likely-sounding site like, oh, <http://www.c-faq.com/>?
If you get lucky and find some useful material at such a site,
I'd suggest not limiting yourself only to reading Question 2.6
(for instance), but perusing some of the others as well.

Eric, thanks for the gentle nudge. I did google, but I didn't
know it was called "struct hack". It's hard to find the
right thing if you don't know the right term. Thanks to all
who took the time to reply.
 
jameskuyper

Why Tea said:
It will, but if you add a new element at the end of the structure later
on, it might not; you might overwrite the next element.

Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

The same my_struct_t is used throughout the code for
signal sending. If s[] is used to carry binary data, the
size is specified by an int preceding s[]. I'd be
interested to hear comments from the experts about
this approach.

In C90, this was technically illegal, but as a practical matter it
worked on most (all?) implementations. It was sufficiently useful that
a modified version of the concept was added as an extension to many
compilers, but with the difference that the array was declared with a
size of 0, rather than 1. In C99, a modified version of the concept
was finally standardized, under the name "flexible array members". For
the C99 version, you should use:

typedef struct some_struct
{
int i;
short k;
int m;
char s[];
} some_struct_t;

With all three versions of this concept, it's required that the array
be declared at the end of the structure. The best way to handle the
allocation is as follows:

my_struct = malloc(offsetof(some_struct_t, s) +
    MY_PAYLOAD_STRING_SIZE * sizeof my_struct->s[0]);

Using offsetof() rather than sizeof() makes a difference for the C90
version, because the size of the struct includes enough room for one
element of the array, whereas offsetof() does not. That means that
with sizeof you'd be reserving room for at least 1 more array element
than you need to (unless MY_PAYLOAD_STRING_SIZE does not include the
terminating null character). For C99 sizeof(some_struct_t) and
offsetof(some_struct_t, s) should give the same result.

Using sizeof my_struct->s[0] protects against the possibility that you
might change the element type of s. It also makes it easier to
verify that allocation is correct. If the length of the flexible array
is stored in the struct, as is usually the case, I'd recommend filling
in that member, and using the value of that member instead of
MY_PAYLOAD_STRING_SIZE. Again, the main advantage of this is that it
makes it easier for a reader to verify that the code is correct.
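
A minimal sketch of the C99 approach described above, assuming a made-up message_t type whose len member records the payload size. The names message_t and message_new are hypothetical and are not taken from the code under discussion.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>
#include <string.h>

/* Hypothetical message type carrying a variable-length payload. */
typedef struct message
{
    int id;
    size_t len;      /* number of bytes stored in data[] */
    char data[];     /* C99 flexible array member; must be the last member */
} message_t;

/* Allocate a message with room for len payload bytes and copy them in.
   Returns NULL if malloc() fails. The caller releases it with free(). */
message_t *message_new(int id, const void *payload, size_t len)
{
    message_t *msg;

    assert(payload != NULL || len == 0);

    /* offsetof() gives the bytes before data[]; sizeof msg->data[0]
       keeps the arithmetic right if the element type ever changes. */
    msg = malloc(offsetof(message_t, data) + len * sizeof msg->data[0]);
    if (msg != NULL) {
        msg->id = id;
        msg->len = len;
        if (len > 0)
            memcpy(msg->data, payload, len);
    }
    return msg;
}
```

For a string payload the caller would pass strlen(text) + 1 as len so that the terminating '\0' is copied too, for example message_new(7, "abc", 4).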
 
Keith Thompson

Why Tea said:
typedef struct some_struct
{
int i;
short k;
int m;
char s[1];
} some_struct_t;

Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)

some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));

BTW, this is on a PowerPC architecture.

You can assume that s itself will be allocated exactly 1 byte. Any
padding following it is part of the struct, but not part of s.

In general, compilers are allowed to insert arbitrary padding between
members or after the last member of a struct. The purpose of this
padding is generally to meet alignment requirements, but the standard
doesn't place any restrictions on how little or how much padding is
used, as long as the members can be accessed.

So no, you can't portably make any assumptions about how much padding
is added after s. But you generally shouldn't need to. If you'll let
us know what you're trying to do, we can probably help you do it
*without* making any (or too many) non-portable assumptions. If
you're trying to use the "struct hack", see question 2.6 in the
comp.lang.c FAQ, <http://c-faq.com/>.
 
Keith Thompson

Why Tea said:
Eric, thanks for the gentle nudge. I did google, but I didn't
know it was called "struct hack". It's hard to find the
right thing if you don't know the right term. Thanks to all
who took the time to reply.

Fair enough. I *know* it's called the "struct hack", and I've read
question 2.6 before, but I had trouble finding it myself, since the
answer to 2.6 doesn't use the phrase "struct hack". I happened to
remember that the URL includes the string "structhack", so I was able
to use a Google advanced search to find it.
 
Keith Thompson

Lowell Gilbert said:
Why Tea said:
On Oct 3, 12:48 pm, (e-mail address removed) wrote: [...]
Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));

+ 1. The length returned by strlen() doesn't include the terminating '\0'.
 
Keith Thompson

jameskuyper said:
In C90, this was technically illegal, but as a practical matter it
worked on most (all?) implementations. It was sufficiently useful that
a modified version of the concept was added as an extension to many
compilers, but with the difference that the array was declared with a
size of 0, rather than 1. In C99, a modified version of the concept
was finally standardized, under the name "flexible array members". For
the C99 version, you should use:

typedef struct some_struct
{
int i;
short k;
int m;
char s[];
} some_struct_t;

With all three versions of this concept, it's required that the array
be declared at the end of the structure. The best way to handle the
allocation is as follows:

my_struct = malloc(offsetof(some_struct_t, s) +
    MY_PAYLOAD_STRING_SIZE * sizeof my_struct->s[0]);

Using offsetof() rather than sizeof() makes a difference for the C90
version, because the size of the struct includes enough room for one
element of the array, whereas offsetof() does not.
[...]

I'm not convinced that's safe. If the string is very short, you might
allocate fewer than sizeof(some_struct_t) bytes. That could, in some
circumstances, result in accessing memory that's within range of a
some_struct_t object, but outside the range of what was actually
allocated.

On the other hand, if you're doing this then you're not going to be
accessing the some_struct_t object as a whole. On the other other
hand, the compiler might be allowed to generate code that does so.
It's probably not going to cause any visible problems in practice,
especially since the memory allocated by malloc() will probably be big
enough to contain a some_struct_t object anyway, but it makes me
nervous. I'd rather risk allocating a byte or so too much than too
little, unless memory space is *really* critical and I'm willing to be
unwarrantedly chummy with the compiler (paraphrasing Dennis Ritchie).
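
One way to act on that caution, sketched here under the assumption that wasting a few bytes is acceptable, is to never request fewer than sizeof(some_struct_t) bytes even for a very short payload. The helper names some_struct_alloc_size and some_struct_alloc are hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>

/* The C90-style struct from the original question. */
typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;

/* Hypothetical helper: bytes to request for a payload of the given size,
   never less than the size of the whole struct. */
size_t some_struct_alloc_size(size_t payload)
{
    size_t needed;

    /* Guard against wrap-around in the addition below. */
    assert(payload <= SIZE_MAX - offsetof(some_struct_t, s));

    needed = offsetof(some_struct_t, s) + payload;
    return needed < sizeof(some_struct_t) ? sizeof(some_struct_t) : needed;
}

/* Hypothetical helper: allocate a struct with room for payload bytes in s. */
some_struct_t *some_struct_alloc(size_t payload)
{
    return malloc(some_struct_alloc_size(payload));
}
```

With this, a zero-length or one-byte payload still yields a block large enough to hold a complete some_struct_t, while larger payloads get exactly offsetof(some_struct_t, s) plus the payload size.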
 
Why Tea

Why Tea said:
typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;
Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)
some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));
BTW, this is on a PowerPC architecture.

You can assume that s itself will be allocated exactly 1 byte.  Any
padding following it is part of the struct, but not part of s.

In general, compilers are allowed to insert arbitrary padding between
members or after the last member of a struct.  The purpose of this
padding is generally to meet alignment requirements, but the standard
doesn't place any restrictions on how little or how much padding is
used, as long as the members can be accessed.

So no, you can't portably make any assumptions about how much padding
is added after s.  But you generally shouldn't need to.  If you'll let
us know what you're trying to do, we can probably help you do it
*without* making any (or too many) non-portable assumptions.  If
you're trying to use the "struct hack", see question 2.6 in the
comp.lang.c FAQ, <http://c-faq.com/>.

Thanks Keith. I'll try the c-faq first next time. I just had
a look at 2.6 of c-faq, I'm surprised that what's written there
was exactly a discussion I had with a colleague. Taking the
code from the faq:

#include <stdlib.h>
#include <string.h>

#define MAXSIZE 100

struct name {
    int namelen;
    char namestr[MAXSIZE];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-MAXSIZE+strlen(newname)+1);
            /* +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }

    return ret;
}

The argument from my colleague was about the extra
byte (+1 for \0), why is it needed as padding is
always there? He assumed 32-bit alignment and
MAXSIZE=1 in our discussion. Is his argument always
right?
 
lawrence.jones

Antoninus Twink said:
Yes, of course, if there are two or more padding bytes at the end of the
struct then that's your memory to write to, whether it's on the stack if
my_struct is an automatic variable, or on the heap if you got the memory
from malloc().

No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.
 
CBFalconer

Why Tea said:
Fred said:
Why Tea said:
typedef struct some_struct {
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;
.... snip ...
but writing to s[1] is an error.

But strcpy(my_struct->s, "AB"); is OK if there is
padding, isn't it? Please go easy on me here.
I just want to understand how this really works...

Because you declared s to hold one byte. You have no idea what the
system is doing with any extra memory that it had to supply, or
even if it had to supply anything extra.
 
Keith Thompson

Why Tea said:
Why Tea said:
typedef struct some_struct
{
    int i;
    short k;
    int m;
    char s[1];
} some_struct_t;
Assuming 16 bit or 32-bit alignment, can I assume that
s always gets 4 or 8 bytes of allocation due to padding
in the following? (I.e. s is either 4 or 8 characters long)
some_struct_t *my_struct;
my_struct = malloc(sizeof(some_struct_t));
BTW, this is on a PowerPC architecture.

You can assume that s itself will be allocated exactly 1 byte.  Any
padding following it is part of the struct, but not part of s.

In general, compilers are allowed to insert arbitrary padding between
members or after the last member of a struct.  The purpose of this
padding is generally to meet alignment requirements, but the standard
doesn't place any restrictions on how little or how much padding is
used, as long as the members can be accessed.

So no, you can't portably make any assumptions about how much padding
is added after s.  But you generally shouldn't need to.  If you'll let
us know what you're trying to do, we can probably help you do it
*without* making any (or too many) non-portable assumptions.  If
you're trying to use the "struct hack", see question 2.6 in the
comp.lang.c FAQ, <http://c-faq.com/>.

Thanks Keith. I'll try the c-faq first next time. I just had
a look at 2.6 of c-faq, I'm surprised that what's written there
was exactly a discussion I had with a colleague. Taking the
code from the faq:

#include <stdlib.h>
#include <string.h>

#define MAXSIZE 100

struct name {
    int namelen;
    char namestr[MAXSIZE];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-MAXSIZE+strlen(newname)+1);
            /* +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }

    return ret;
}

The argument from my colleague was about the extra
byte (+1 for \0), why is it needed as padding is
always there? He assumed 32-bit alignment and
MAXSIZE=1 in our discussion. Is his argument always
right?

It may happen to be true that it will work on all implementations. A
compiler will *probably* add enough padding to the end of the struct
to make it work. Even if not, a runtime library's malloc()
implementation will *probably* add enough padding to the allocated
space. And even if not, you're likely (but by no means certain) to be
able to get away with accessing memory just one byte beyond what's
been allocated, depending on what happens to be there. Finally, most
C implementations won't go out of their way to prevent you from
accessing memory beyond the declared bounds of an array (and if yours
does, you can't use the struct hack in the first place).

But the line of reasoning that lets you assume that you don't need to
explicitly allocate enough space for the terminating '\0' is
convoluted, weak, and system-specific. If I were writing the code,
I'd just allocate the byte and be done with it. Otherwise, every time
a problem shows up, I'd have to spend time confirming that the failure
to allocate that byte isn't the cause. It doesn't cost much to do it
right.
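
For comparison, here is a hedged sketch of how the quoted makename() might look when it allocates the terminator explicitly and uses a C99 flexible array member instead of namestr[MAXSIZE]. This is a variant written for illustration, not the FAQ's code; the const qualifier and the size_t length are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdlib.h>
#include <string.h>

struct name {
    size_t namelen;   /* length of namestr, not counting the '\0' */
    char namestr[];   /* C99 flexible array member; must be last */
};

/* Allocate a struct name holding a copy of newname.
   Returns NULL if malloc() fails. The caller releases it with free(). */
struct name *makename(const char *newname)
{
    size_t len;
    struct name *ret;

    assert(newname != NULL);

    len = strlen(newname);
    /* +1 for the terminating '\0': requested explicitly rather than
       relying on whatever padding the struct or malloc() may provide. */
    ret = malloc(sizeof *ret + len + 1);
    if (ret != NULL) {
        ret->namelen = len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}
```

The allocation size no longer depends on any assumption about alignment or padding, so there is nothing to re-verify when the code moves to another compiler or target.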
 
