Calculating storage alignment

D

Darren Cubitt

Hi CLC,

For a while I've been using a pool allocator I wrote for fixed sized
objects. Basically, it allocates chunks of 32 (or however many bits in
an int) objects of a given size, and returns pointers to the individual
objects as required.

One problem I needed to deal with is, when a user calls the free
function, the allocator needs to be able to work out where the object is
with respect to the allocated chunk.

I got around this by allocating an extra int per object. This int stores
the index of this object in the chuck, from which the chuck address can
be calculated. This will give you the general idea:

int * ptr = malloc(sizeof_object + sizeof(int));
ptr[0] = index_of_object;
return ptr + 1;

Thanks to reading this group, I now know this is a Bad Idea(tm) due to
memory alignment issues - something I have been ignorant of in the past.

It seems what I need to do is instead of using sizeof(int) I need to
calculate the "universal alignment" value - for lack of a better name.
Obviously, there must be one or malloc() couldn't work.

So if I create a union of all the intrinsic types, and can then
calculate the alignment for that union, I should be set, right?

In short, would this code be correct and portable:


#include "intrinsic_types.h" /* intrinsic_types_t */

size_t a = offsetof(struct {char a; instrinsic_types_t b;}, b);

Also, is using a struct definition in offsetof() ok? It compiles fine in
MSVC, but that is hardly reassuring.
 
D

Darren Cubitt

Malcolm McLean wrote:

You don't need an integer to obtain the index. Subtract the base from
the pointer instead.

This is the problem though - I don't know what the base is.

The allocation function returns a pointer to the object, which in
reality will be a pointer to somewhere inside the chunk.

When the free function is called, it gets this pointer back - but it
doesn't know the base address. It could be the first object in the
chunk, or the third, or eighth, or anything!

I store an integer offset since the code uses a set of int flags to mark
which objects within each chunk have already been allocated and this
flag need to be cleared - this is why the number of objects per chunk is
tied to the number of bits in an int. It just makes things easier. I
could just as easily store a direct pointer to the base, but then I'd
have to calculate the offset and divde by the size to get the flag bit :p
Which means you don't need to align. A union of all types is one way to
obtain a universal alignment, however. You can take the size rather than
relying on the offsetof hack.

Hey, offsetof is great! I've written some cool code with that.

But I'm not sure if taking the size would work. Just because the largest
type is X chars, doesn't mean universal alignment is X chars.

The reason I choose offsetof() is that any struct needs to be compatible
with a malloc()ed address. malloc() guarantees alignment, so if you make
a struct with a char followed by the types-union, the compiler should
insert sufficient padding to guarantee alignment of the union, which
means offsetof(a, b) will give you the magic value.

Unless there is an error in my logic, of course. It wouldn't be the
first time.
> Because of C's array rules size also
> equals maximum alignment strictness.

I'm not sure I'm following with that bit about array size rules. Maybe
this is another part of the standard I need to farmiliarise myself with.
Could you explain further or give me an RTFM link?

Thanks.
 
D

Darren Cubitt

Darren said:
The reason I choose offsetof() is that any struct needs to be compatible
with a malloc()ed address. malloc() guarantees alignment, so if you make
a struct with a char followed by the types-union, the compiler should
insert sufficient padding to guarantee alignment of the union, which
means offsetof(a, b) will give you the magic value.

Unless there is an error in my logic, of course. It wouldn't be the
first time.

Actually, there is an error in my logic which I'll correct before anyone
else does.

I should use:

struct { int a; intrinsic_types_t b; }

instead of:

struct { char a; intrinsic_types_t b; }

Since it is an int that I want to reserve space for and there is no
guarantee that the alignment would be less than sizeof(int).
 
F

Flash Gordon

Darren said:
Malcolm McLean wrote:



This is the problem though - I don't know what the base is.

The allocation function returns a pointer to the object, which in
reality will be a pointer to somewhere inside the chunk.

When the free function is called, it gets this pointer back - but it
doesn't know the base address. It could be the first object in the
chunk, or the third, or eighth, or anything!

I'm guessing you have multiple chunks and you, of course, don't know
which chunk the object the pointer refers to is in.
I store an integer offset since the code uses a set of int flags to mark
which objects within each chunk have already been allocated and this
flag need to be cleared - this is why the number of objects per chunk is
tied to the number of bits in an int. It just makes things easier. I
could just as easily store a direct pointer to the base, but then I'd
have to calculate the offset and divde by the size to get the flag bit :p

Still possible. The best method depends on all sorts of things,
including how portable the code needs to be. There are solutions which
would work on 95% of all systems currently in production (remembering
that 97% of all statistics are made up) which might be good enough, and
other methods which use up more memory but are completely portable, and
a range in between.
Hey, offsetof is great! I've written some cool code with that.

But I'm not sure if taking the size would work. Just because the largest
type is X chars, doesn't mean universal alignment is X chars.

The reason I choose offsetof() is that any struct needs to be compatible
with a malloc()ed address. malloc() guarantees alignment, so if you make
a struct with a char followed by the types-union, the compiler should
insert sufficient padding to guarantee alignment of the union, which
means offsetof(a, b) will give you the magic value.

Unless there is an error in my logic, of course. It wouldn't be the
first time.

It will give you an alignment that will work for the types in the union,
but it might be larger than required. The same applies to the sizeof
method Malcolm suggests.
I'm not sure I'm following with that bit about array size rules. Maybe
this is another part of the standard I need to farmiliarise myself with.
Could you explain further or give me an RTFM link?

Basically there are no gaps between elements in an array. So, for
example, if as far as the processor is concerned a long double takes 10
byte but it has to be aligned on a 4 byte boundary the C implementation
would have to make sizeof(long double) 12 and have two padding bytes
which are part of the long double as far as C is concerned. Obviously
using 12 as your alignment would be very wasteful in this situation!

Another option you could use is a hash table. You could then look up the
pointer passed to your "free" function and find all the other data you
need. Obviously this has some memory overheads. There are also some
portability problems in that there could be multiple pointer
representations could actually refer to the same pointer value, and
there is no guarantee you get back the same representation your passed!

Another option with different portability problems is to compare the
pointer passed to your "free" function to the start/end of each of your
chunks. However, C leave pointer comparisons (other than for equality)
undefined when the pointers do not refer to locations within the same
object.

Another potentially good way is to have a header file with
implementation specifics such as the alignment value. Then you just have
to set all the appropriate values when porting to a new system and know
exactly where to find them!

No solution is perfect, you just have to decide on the best compromise.
 
D

Darren Cubitt

Flash said:
Basically there are no gaps between elements in an array. So, for
example, if as far as the processor is concerned a long double takes 10
byte but it has to be aligned on a 4 byte boundary the C implementation
would have to make sizeof(long double) 12 and have two padding bytes
which are part of the long double as far as C is concerned. Obviously
using 12 as your alignment would be very wasteful in this situation!

Oh, I see where I am going wrong - sizeof returns the size of the object
including alignment padding.

Otherwise, if we take you example of a 10-byte long double aligned to
4-bytes, something like this woudln't work:

long double * ptr = &a_double_array_with_at_least_2_elements;

*++ptr = 0;

So sizeof(long double) would need to return 12, not 10? I am assuming
that ++ptr is equivelant to:

ptr = (long double *)((char *)ptr + sizeof(long double));

This is why I coudln't see how Malcom's solution would work. I knew I
was missing something. Something obvious, too, now that I think about it.
Another option you could use is a hash table. You could then look up the
pointer passed to your "free" function and find all the other data you
need. Obviously this has some memory overheads. There are also some
portability problems in that there could be multiple pointer
representations could actually refer to the same pointer value, and
there is no guarantee you get back the same representation your passed!

I used to some programming way back in IMB PC real mode. The
segment/offset point system had some advantages... but not enough to
make you want to rip your hair out.
Another option with different portability problems is to compare the
pointer passed to your "free" function to the start/end of each of your
chunks. However, C leave pointer comparisons (other than for equality)
undefined when the pointers do not refer to locations within the same
object.

Another potentially good way is to have a header file with
implementation specifics such as the alignment value. Then you just have
to set all the appropriate values when porting to a new system and know
exactly where to find them!

No solution is perfect, you just have to decide on the best compromise.

All good ideas, but I'm trying to keep it simple (this coming from a
person who always over-complicates things). The whole point of the code
is to make things more efficient! I was hoping for a magic bullet
solution but, alas, it just might not exist.

Thanks for your help.
 
F

Flash Gordon

Darren said:
Oh, I see where I am going wrong - sizeof returns the size of the object
including alignment padding.

Padding can be for reasons other than alignment, but yes.
Otherwise, if we take you example of a 10-byte long double aligned to
4-bytes, something like this woudln't work:

long double * ptr = &a_double_array_with_at_least_2_elements;

*++ptr = 0;

So sizeof(long double) would need to return 12, not 10? I am assuming
that ++ptr is equivelant to:

ptr = (long double *)((char *)ptr + sizeof(long double));

Yes, you are correct.

Another reason it works like this is that array indexing is defined in
terms of pointer arithmetic.
This is why I coudln't see how Malcom's solution would work. I knew I
was missing something. Something obvious, too, now that I think about it.

A lot of things are obvious once you understand why they are done that
way :)
I used to some programming way back in IMB PC real mode. The
segment/offset point system had some advantages... but not enough to
make you want to rip your hair out.

These things come and go. The hashing could also hit problems due to
padding bits in the pointer which are not guaranteed to be maintained.
All good ideas, but I'm trying to keep it simple (this coming from a
person who always over-complicates things). The whole point of the code
is to make things more efficient! I was hoping for a magic bullet
solution but, alas, it just might not exist.

The simplest and most efficient solutions, in my opinion, is the header
file that specifies the alignment. A value of 4 is probably appropriate
for your current targets, and people targeting other environments can
easily change it.

Keeping system specifics in one (or a small number) of files to allow
easy porting is a standard technique. I've got commercial software
distributed as source where the file is called port.h to make it easy :)
Thanks for your help.

No problem.
 
G

gw7rib

One problem I needed to deal with is, when a user calls the free
function, the allocator needs to be able to work out where the object is
with respect to the allocated chunk.

I got around this by allocating an extra int per object. This int stores
the index of this object in the chuck, from which the chuck address can
be calculated. This will give you the general idea:

        int * ptr = malloc(sizeof_object + sizeof(int));
        ptr[0] = index_of_object;
        return ptr + 1;

You may be overlooking the obvious here. Firstly, is there some reason
you can't simply add the extra int to the object structure? Or, if the
object definition is fixed for some reason, why you can't create a new
structure:

struct myobject {
struct object obj;
int index; };

which includes both the desired object and your new int?

Hope that helps.
Paul.
 
C

CBFalconer

Darren said:
.... snip ...

All good ideas, but I'm trying to keep it simple (this coming from
a person who always over-complicates things). The whole point of
the code is to make things more efficient! I was hoping for a
magic bullet solution but, alas, it just might not exist.

I gather you want to make efficient ways of allocating single
integers. Lets assume those integers are 4 bytes long, and that
anything of any other size will use malloc.

To control allocation we need one bit per item. If we limit the
count of items to something line 250, and each integer has 16 bits,
we need a 16 integer array to hold the bits. If we round total
storage used up to about 1k bytes, that will allow for about 240
integers.

So we have to allocate such a package early in program operation.
Once allocated, a request for less or more than 1 integer will
divert to the original malloc, as will all requests when the
package is totally used. Now we just have to find an integer whose
allocation bit is 0, set that allocation bit, and return the ints
address. Free just has to reset the allocation bit. Somewhere a
single integer will keep the count of items allocated.

Note that the system will be highly sensitive to misuse.
 
C

Chris M. Thomasson

Darren Cubitt said:
Hi CLC,

For a while I've been using a pool allocator I wrote for fixed sized
objects. Basically, it allocates chunks of 32 (or however many bits in an
int) objects of a given size, and returns pointers to the individual
objects as required.

One problem I needed to deal with is, when a user calls the free function,
the allocator needs to be able to work out where the object is with
respect to the allocated chunk.

I got around this by allocating an extra int per object. This int stores
the index of this object in the chuck, from which the chuck address can be
calculated. This will give you the general idea:

int * ptr = malloc(sizeof_object + sizeof(int));
ptr[0] = index_of_object;
return ptr + 1;

Thanks to reading this group, I now know this is a Bad Idea(tm) due to
memory alignment issues - something I have been ignorant of in the past.

You say the pool allocator works with only a single object size. Well, why
not do something simple like... Round all object sizes up to a multiple of
`sizeof(void*)'. Then allocate a large chunk of memory, say 4096 bytes of
object space + size of the chunk header + extra for alignment purposes, and
align the object space on a 4096 byte boundary. Object space must be a
multiple of the alignment requirements of at least a void*, or the given
object, which ever one is greater. Header alignment must be at least big
enough to accommodate its strictest type. Now, create a simple bump pointer
and a free list in the header. Initial allocations use the bump pointer. If
the pointer is at the end, it uses the freelist. Frees simply link objects
as-is right into the free list. To find the chunk header of any object, you
simple round its address down to 4096 boundary and subtract size of the
header. This is a fairly common technique, and it does not require an extra
wasteful int per object. The wasted space only gets detected when the user
associates the pool with an object whose sizeof value is less than that of
sizeof(void*).

I can whip up some simple working code if your interested. Many
state-of-the-art multi-threaded allocators use this technique. Intel TBB,
StreamFlow, ect...


Another simple technique is to use a bitmap; Here is simple example of
region allocator than can actually free individual objects:

http://groups.google.com/group/comp.lang.c++/msg/e903e1726d7e5d8f



It seems what I need to do is instead of using sizeof(int) I need to
calculate the "universal alignment" value - for lack of a better name.
Obviously, there must be one or malloc() couldn't work.

So if I create a union of all the intrinsic types, and can then calculate
the alignment for that union, I should be set, right?

In short, would this code be correct and portable:


#include "intrinsic_types.h" /* intrinsic_types_t */

size_t a = offsetof(struct {char a; instrinsic_types_t b;}, b);

Also, is using a struct definition in offsetof() ok? It compiles fine in
MSVC, but that is hardly reassuring.

This "offsetof" trick works fine:

http://groups.google.com/group/comp.lang.c/browse_frm/thread/38c5c2c5b0185cc1




I used this to create a simple conservative region allocator:

http://groups.google.com/group/comp.lang.c/browse_frm/thread/97a1c978a0f20195


This can carve objects out of a given buffer, and maintain alignment without
resorting to a so called max alignment value. It works well in space
confined environments.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads

Save mp3s to local storage 1
struct alignment 14
Another question wrt storage alignment. 3
Alignment in Struct 7
POST local storage - angular 0
gcc alignment options 19
alignment shame (?) 14
Memory alignment 53

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,011
Latest member
AjaUqq1950

Latest Threads

Top