Calculating storage alignment

Discussion in 'C Programming' started by Darren Cubitt, Mar 15, 2009.

  1. Hi CLC,

    For a while I've been using a pool allocator I wrote for fixed sized
    objects. Basically, it allocates chunks of 32 (or however many bits in
    an int) objects of a given size, and returns pointers to the individual
    objects as required.

    One problem I needed to deal with is, when a user calls the free
    function, the allocator needs to be able to work out where the object is
    with respect to the allocated chunk.

    I got around this by allocating an extra int per object. This int stores
    the index of this object in the chuck, from which the chuck address can
    be calculated. This will give you the general idea:

    int * ptr = malloc(sizeof_object + sizeof(int));
    ptr[0] = index_of_object;
    return ptr + 1;

    Thanks to reading this group, I now know this is a Bad Idea(tm) due to
    memory alignment issues - something I have been ignorant of in the past.

    It seems what I need to do is instead of using sizeof(int) I need to
    calculate the "universal alignment" value - for lack of a better name.
    Obviously, there must be one or malloc() couldn't work.

    So if I create a union of all the intrinsic types, and can then
    calculate the alignment for that union, I should be set, right?

    In short, would this code be correct and portable:


    #include "intrinsic_types.h" /* intrinsic_types_t */

    size_t a = offsetof(struct {char a; instrinsic_types_t b;}, b);

    Also, is using a struct definition in offsetof() ok? It compiles fine in
    MSVC, but that is hardly reassuring.

    --
    Darren (aka: qarnos)
     
    Darren Cubitt, Mar 15, 2009
    #1
    1. Advertising

  2. Malcolm McLean wrote:


    > You don't need an integer to obtain the index. Subtract the base from
    > the pointer instead.


    This is the problem though - I don't know what the base is.

    The allocation function returns a pointer to the object, which in
    reality will be a pointer to somewhere inside the chunk.

    When the free function is called, it gets this pointer back - but it
    doesn't know the base address. It could be the first object in the
    chunk, or the third, or eighth, or anything!

    I store an integer offset since the code uses a set of int flags to mark
    which objects within each chunk have already been allocated and this
    flag need to be cleared - this is why the number of objects per chunk is
    tied to the number of bits in an int. It just makes things easier. I
    could just as easily store a direct pointer to the base, but then I'd
    have to calculate the offset and divde by the size to get the flag bit :p

    > Which means you don't need to align. A union of all types is one way to
    > obtain a universal alignment, however. You can take the size rather than
    > relying on the offsetof hack.


    Hey, offsetof is great! I've written some cool code with that.

    But I'm not sure if taking the size would work. Just because the largest
    type is X chars, doesn't mean universal alignment is X chars.

    The reason I choose offsetof() is that any struct needs to be compatible
    with a malloc()ed address. malloc() guarantees alignment, so if you make
    a struct with a char followed by the types-union, the compiler should
    insert sufficient padding to guarantee alignment of the union, which
    means offsetof(a, b) will give you the magic value.

    Unless there is an error in my logic, of course. It wouldn't be the
    first time.

    > Because of C's array rules size also
    > equals maximum alignment strictness.


    I'm not sure I'm following with that bit about array size rules. Maybe
    this is another part of the standard I need to farmiliarise myself with.
    Could you explain further or give me an RTFM link?

    Thanks.

    --
    Darren (aka: qarnos)
     
    Darren Cubitt, Mar 15, 2009
    #2
    1. Advertising

  3. Darren Cubitt wrote:

    > The reason I choose offsetof() is that any struct needs to be compatible
    > with a malloc()ed address. malloc() guarantees alignment, so if you make
    > a struct with a char followed by the types-union, the compiler should
    > insert sufficient padding to guarantee alignment of the union, which
    > means offsetof(a, b) will give you the magic value.
    >
    > Unless there is an error in my logic, of course. It wouldn't be the
    > first time.


    Actually, there is an error in my logic which I'll correct before anyone
    else does.

    I should use:

    struct { int a; intrinsic_types_t b; }

    instead of:

    struct { char a; intrinsic_types_t b; }

    Since it is an int that I want to reserve space for and there is no
    guarantee that the alignment would be less than sizeof(int).

    --
    Darren (aka: qarnos)
     
    Darren Cubitt, Mar 15, 2009
    #3
  4. Darren Cubitt

    Flash Gordon Guest

    Darren Cubitt wrote:
    > Malcolm McLean wrote:
    >
    >
    >> You don't need an integer to obtain the index. Subtract the base from
    >> the pointer instead.

    >
    > This is the problem though - I don't know what the base is.
    >
    > The allocation function returns a pointer to the object, which in
    > reality will be a pointer to somewhere inside the chunk.
    >
    > When the free function is called, it gets this pointer back - but it
    > doesn't know the base address. It could be the first object in the
    > chunk, or the third, or eighth, or anything!


    I'm guessing you have multiple chunks and you, of course, don't know
    which chunk the object the pointer refers to is in.

    > I store an integer offset since the code uses a set of int flags to mark
    > which objects within each chunk have already been allocated and this
    > flag need to be cleared - this is why the number of objects per chunk is
    > tied to the number of bits in an int. It just makes things easier. I
    > could just as easily store a direct pointer to the base, but then I'd
    > have to calculate the offset and divde by the size to get the flag bit :p


    Still possible. The best method depends on all sorts of things,
    including how portable the code needs to be. There are solutions which
    would work on 95% of all systems currently in production (remembering
    that 97% of all statistics are made up) which might be good enough, and
    other methods which use up more memory but are completely portable, and
    a range in between.

    >> Which means you don't need to align. A union of all types is one way
    >> to obtain a universal alignment, however. You can take the size rather
    >> than relying on the offsetof hack.

    >
    > Hey, offsetof is great! I've written some cool code with that.
    >
    > But I'm not sure if taking the size would work. Just because the largest
    > type is X chars, doesn't mean universal alignment is X chars.
    >
    > The reason I choose offsetof() is that any struct needs to be compatible
    > with a malloc()ed address. malloc() guarantees alignment, so if you make
    > a struct with a char followed by the types-union, the compiler should
    > insert sufficient padding to guarantee alignment of the union, which
    > means offsetof(a, b) will give you the magic value.
    >
    > Unless there is an error in my logic, of course. It wouldn't be the
    > first time.


    It will give you an alignment that will work for the types in the union,
    but it might be larger than required. The same applies to the sizeof
    method Malcolm suggests.

    > > Because of C's array rules size also
    > > equals maximum alignment strictness.

    >
    > I'm not sure I'm following with that bit about array size rules. Maybe
    > this is another part of the standard I need to farmiliarise myself with.
    > Could you explain further or give me an RTFM link?


    Basically there are no gaps between elements in an array. So, for
    example, if as far as the processor is concerned a long double takes 10
    byte but it has to be aligned on a 4 byte boundary the C implementation
    would have to make sizeof(long double) 12 and have two padding bytes
    which are part of the long double as far as C is concerned. Obviously
    using 12 as your alignment would be very wasteful in this situation!

    Another option you could use is a hash table. You could then look up the
    pointer passed to your "free" function and find all the other data you
    need. Obviously this has some memory overheads. There are also some
    portability problems in that there could be multiple pointer
    representations could actually refer to the same pointer value, and
    there is no guarantee you get back the same representation your passed!

    Another option with different portability problems is to compare the
    pointer passed to your "free" function to the start/end of each of your
    chunks. However, C leave pointer comparisons (other than for equality)
    undefined when the pointers do not refer to locations within the same
    object.

    Another potentially good way is to have a header file with
    implementation specifics such as the alignment value. Then you just have
    to set all the appropriate values when porting to a new system and know
    exactly where to find them!

    No solution is perfect, you just have to decide on the best compromise.
    --
    Flash Gordon
     
    Flash Gordon, Mar 15, 2009
    #4
  5. Flash Gordon wrote:

    > Basically there are no gaps between elements in an array. So, for
    > example, if as far as the processor is concerned a long double takes 10
    > byte but it has to be aligned on a 4 byte boundary the C implementation
    > would have to make sizeof(long double) 12 and have two padding bytes
    > which are part of the long double as far as C is concerned. Obviously
    > using 12 as your alignment would be very wasteful in this situation!


    Oh, I see where I am going wrong - sizeof returns the size of the object
    including alignment padding.

    Otherwise, if we take you example of a 10-byte long double aligned to
    4-bytes, something like this woudln't work:

    long double * ptr = &a_double_array_with_at_least_2_elements;

    *++ptr = 0;

    So sizeof(long double) would need to return 12, not 10? I am assuming
    that ++ptr is equivelant to:

    ptr = (long double *)((char *)ptr + sizeof(long double));

    This is why I coudln't see how Malcom's solution would work. I knew I
    was missing something. Something obvious, too, now that I think about it.

    > Another option you could use is a hash table. You could then look up the
    > pointer passed to your "free" function and find all the other data you
    > need. Obviously this has some memory overheads. There are also some
    > portability problems in that there could be multiple pointer
    > representations could actually refer to the same pointer value, and
    > there is no guarantee you get back the same representation your passed!


    I used to some programming way back in IMB PC real mode. The
    segment/offset point system had some advantages... but not enough to
    make you want to rip your hair out.

    > Another option with different portability problems is to compare the
    > pointer passed to your "free" function to the start/end of each of your
    > chunks. However, C leave pointer comparisons (other than for equality)
    > undefined when the pointers do not refer to locations within the same
    > object.
    >
    > Another potentially good way is to have a header file with
    > implementation specifics such as the alignment value. Then you just have
    > to set all the appropriate values when porting to a new system and know
    > exactly where to find them!
    >
    > No solution is perfect, you just have to decide on the best compromise.


    All good ideas, but I'm trying to keep it simple (this coming from a
    person who always over-complicates things). The whole point of the code
    is to make things more efficient! I was hoping for a magic bullet
    solution but, alas, it just might not exist.

    Thanks for your help.

    --
    Darren (aka: qarnos)
     
    Darren Cubitt, Mar 15, 2009
    #5
  6. Darren Cubitt

    Flash Gordon Guest

    Darren Cubitt wrote:
    > Flash Gordon wrote:
    >
    >> Basically there are no gaps between elements in an array. So, for
    >> example, if as far as the processor is concerned a long double takes
    >> 10 byte but it has to be aligned on a 4 byte boundary the C
    >> implementation would have to make sizeof(long double) 12 and have two
    >> padding bytes which are part of the long double as far as C is
    >> concerned. Obviously using 12 as your alignment would be very wasteful
    >> in this situation!

    >
    > Oh, I see where I am going wrong - sizeof returns the size of the object
    > including alignment padding.


    Padding can be for reasons other than alignment, but yes.

    > Otherwise, if we take you example of a 10-byte long double aligned to
    > 4-bytes, something like this woudln't work:
    >
    > long double * ptr = &a_double_array_with_at_least_2_elements;
    >
    > *++ptr = 0;
    >
    > So sizeof(long double) would need to return 12, not 10? I am assuming
    > that ++ptr is equivelant to:
    >
    > ptr = (long double *)((char *)ptr + sizeof(long double));


    Yes, you are correct.

    Another reason it works like this is that array indexing is defined in
    terms of pointer arithmetic.

    > This is why I coudln't see how Malcom's solution would work. I knew I
    > was missing something. Something obvious, too, now that I think about it.


    A lot of things are obvious once you understand why they are done that
    way :)

    >> Another option you could use is a hash table. You could then look up
    >> the pointer passed to your "free" function and find all the other data
    >> you need. Obviously this has some memory overheads. There are also
    >> some portability problems in that there could be multiple pointer
    >> representations could actually refer to the same pointer value, and
    >> there is no guarantee you get back the same representation your passed!

    >
    > I used to some programming way back in IMB PC real mode. The
    > segment/offset point system had some advantages... but not enough to
    > make you want to rip your hair out.


    These things come and go. The hashing could also hit problems due to
    padding bits in the pointer which are not guaranteed to be maintained.

    >> Another option with different portability problems is to compare the
    >> pointer passed to your "free" function to the start/end of each of
    >> your chunks. However, C leave pointer comparisons (other than for
    >> equality) undefined when the pointers do not refer to locations within
    >> the same object.
    >>
    >> Another potentially good way is to have a header file with
    >> implementation specifics such as the alignment value. Then you just
    >> have to set all the appropriate values when porting to a new system
    >> and know exactly where to find them!
    >>
    >> No solution is perfect, you just have to decide on the best compromise.

    >
    > All good ideas, but I'm trying to keep it simple (this coming from a
    > person who always over-complicates things). The whole point of the code
    > is to make things more efficient! I was hoping for a magic bullet
    > solution but, alas, it just might not exist.


    The simplest and most efficient solutions, in my opinion, is the header
    file that specifies the alignment. A value of 4 is probably appropriate
    for your current targets, and people targeting other environments can
    easily change it.

    Keeping system specifics in one (or a small number) of files to allow
    easy porting is a standard technique. I've got commercial software
    distributed as source where the file is called port.h to make it easy :)

    > Thanks for your help.


    No problem.
    --
    Flash Gordon
     
    Flash Gordon, Mar 15, 2009
    #6
  7. Darren Cubitt

    Guest

    On 15 Mar, 09:21, Darren Cubitt <> wrote:
    > One problem I needed to deal with is, when a user calls the free
    > function, the allocator needs to be able to work out where the object is
    > with respect to the allocated chunk.
    >
    > I got around this by allocating an extra int per object. This int stores
    > the index of this object in the chuck, from which the chuck address can
    > be calculated. This will give you the general idea:
    >
    >         int * ptr = malloc(sizeof_object + sizeof(int));
    >         ptr[0] = index_of_object;
    >         return ptr + 1;


    You may be overlooking the obvious here. Firstly, is there some reason
    you can't simply add the extra int to the object structure? Or, if the
    object definition is fixed for some reason, why you can't create a new
    structure:

    struct myobject {
    struct object obj;
    int index; };

    which includes both the desired object and your new int?

    Hope that helps.
    Paul.
     
    , Mar 15, 2009
    #7
  8. Darren Cubitt

    CBFalconer Guest

    Darren Cubitt wrote:
    >

    .... snip ...
    >
    > All good ideas, but I'm trying to keep it simple (this coming from
    > a person who always over-complicates things). The whole point of
    > the code is to make things more efficient! I was hoping for a
    > magic bullet solution but, alas, it just might not exist.


    I gather you want to make efficient ways of allocating single
    integers. Lets assume those integers are 4 bytes long, and that
    anything of any other size will use malloc.

    To control allocation we need one bit per item. If we limit the
    count of items to something line 250, and each integer has 16 bits,
    we need a 16 integer array to hold the bits. If we round total
    storage used up to about 1k bytes, that will allow for about 240
    integers.

    So we have to allocate such a package early in program operation.
    Once allocated, a request for less or more than 1 integer will
    divert to the original malloc, as will all requests when the
    package is totally used. Now we just have to find an integer whose
    allocation bit is 0, set that allocation bit, and return the ints
    address. Free just has to reset the allocation bit. Somewhere a
    single integer will keep the count of items allocated.

    Note that the system will be highly sensitive to misuse.

    --
    [mail]: Chuck F (cbfalconer at maineline dot net)
    [page]: <http://cbfalconer.home.att.net>
    Try the download section.
     
    CBFalconer, Mar 16, 2009
    #8
  9. "Darren Cubitt" <> wrote in message
    news:rM3vl.28771$...
    > Hi CLC,
    >
    > For a while I've been using a pool allocator I wrote for fixed sized
    > objects. Basically, it allocates chunks of 32 (or however many bits in an
    > int) objects of a given size, and returns pointers to the individual
    > objects as required.
    >
    > One problem I needed to deal with is, when a user calls the free function,
    > the allocator needs to be able to work out where the object is with
    > respect to the allocated chunk.
    >
    > I got around this by allocating an extra int per object. This int stores
    > the index of this object in the chuck, from which the chuck address can be
    > calculated. This will give you the general idea:
    >
    > int * ptr = malloc(sizeof_object + sizeof(int));
    > ptr[0] = index_of_object;
    > return ptr + 1;
    >
    > Thanks to reading this group, I now know this is a Bad Idea(tm) due to
    > memory alignment issues - something I have been ignorant of in the past.


    You say the pool allocator works with only a single object size. Well, why
    not do something simple like... Round all object sizes up to a multiple of
    `sizeof(void*)'. Then allocate a large chunk of memory, say 4096 bytes of
    object space + size of the chunk header + extra for alignment purposes, and
    align the object space on a 4096 byte boundary. Object space must be a
    multiple of the alignment requirements of at least a void*, or the given
    object, which ever one is greater. Header alignment must be at least big
    enough to accommodate its strictest type. Now, create a simple bump pointer
    and a free list in the header. Initial allocations use the bump pointer. If
    the pointer is at the end, it uses the freelist. Frees simply link objects
    as-is right into the free list. To find the chunk header of any object, you
    simple round its address down to 4096 boundary and subtract size of the
    header. This is a fairly common technique, and it does not require an extra
    wasteful int per object. The wasted space only gets detected when the user
    associates the pool with an object whose sizeof value is less than that of
    sizeof(void*).

    I can whip up some simple working code if your interested. Many
    state-of-the-art multi-threaded allocators use this technique. Intel TBB,
    StreamFlow, ect...


    Another simple technique is to use a bitmap; Here is simple example of
    region allocator than can actually free individual objects:

    http://groups.google.com/group/comp.lang.c /msg/e903e1726d7e5d8f




    > It seems what I need to do is instead of using sizeof(int) I need to
    > calculate the "universal alignment" value - for lack of a better name.
    > Obviously, there must be one or malloc() couldn't work.
    >
    > So if I create a union of all the intrinsic types, and can then calculate
    > the alignment for that union, I should be set, right?
    >
    > In short, would this code be correct and portable:
    >
    >
    > #include "intrinsic_types.h" /* intrinsic_types_t */
    >
    > size_t a = offsetof(struct {char a; instrinsic_types_t b;}, b);
    >
    > Also, is using a struct definition in offsetof() ok? It compiles fine in
    > MSVC, but that is hardly reassuring.


    This "offsetof" trick works fine:

    http://groups.google.com/group/comp.lang.c/browse_frm/thread/38c5c2c5b0185cc1




    I used this to create a simple conservative region allocator:

    http://groups.google.com/group/comp.lang.c/browse_frm/thread/97a1c978a0f20195


    This can carve objects out of a given buffer, and maintain alignment without
    resorting to a so called max alignment value. It works well in space
    confined environments.
     
    Chris M. Thomasson, Mar 16, 2009
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Asad
    Replies:
    2
    Views:
    455
    Matt Berther
    Apr 27, 2004
  2. Sparky Arbuckle

    Calculating a Subtotal for Shopping Cart

    Sparky Arbuckle, Mar 6, 2005, in forum: ASP .Net
    Replies:
    6
    Views:
    1,660
    Sparky Arbuckle
    Mar 6, 2005
  3. Nathan Sokalski

    Calculating the width of an element

    Nathan Sokalski, May 30, 2005, in forum: ASP .Net
    Replies:
    1
    Views:
    472
    Herfried K. Wagner [MVP]
    May 30, 2005
  4. sarathy
    Replies:
    2
    Views:
    689
    sarathy
    Jul 17, 2006
  5. qarnos

    Another question wrt storage alignment.

    qarnos, Apr 30, 2009, in forum: C Programming
    Replies:
    3
    Views:
    299
    Eric Sosman
    Apr 30, 2009
Loading...

Share This Page