alignment using malloc

digz · Sep 20, 2007

I saw the following code and comments in a source file, and I can't
work out what the author is trying to do in ALIGNED_ALLOC. Can someone
please help me understand why CACHE_LINE_SIZE * 2 is first added to the
argument to malloc, then CACHE_LINE_SIZE - 1 is added to the returned
address, and finally the result is bitwise ANDed with
~(CACHE_LINE_SIZE - 1)? How does this help alignment?

/*
 * Allow us to efficiently align and pad structures so that shared fields
 * don't cause contention on thread-local or read-only fields.
 */
#define CACHE_LINE_SIZE 64
#define CACHE_PAD(_n) char __pad ## _n [CACHE_LINE_SIZE]
#define ALIGNED_ALLOC(_s) \
    ((void *)(((unsigned long)malloc((_s)+CACHE_LINE_SIZE*2) + \
    CACHE_LINE_SIZE - 1) & ~(CACHE_LINE_SIZE-1)))

Thx
Digz

Richard · Sep 20, 2007

Because you cannot tell malloc (AFAIK) how to align.

You need to ensure that the malloc'ed area is large enough to allow a
new pointer value to be used which points somewhere inside the
malloc'ed block but is aligned on a cache-line boundary. This is
critical to keep frequently accessed structures/data from straddling a
cache line. I'm not sure how, if at all, the memory can then be freed.

Jack Klein · Sep 20, 2007

Ask the person who wrote the code, or a group supporting the compiler
and platform it was written for. It produces undefined behavior in
several ways.

On the other hand, replace the call to malloc() with a call to a
function of your own that returns different unsigned long values each
time it is called, and outputs the value with printf() before
returning it.

Then write a main() function that invokes the macro, and prints the
final result. Compare the value returned by your malloc() replacement
function and the final value after mishandling by the macro.
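
The experiment suggested above might look like the following sketch. `fake_malloc`, `ALIGNED_VALUE`, and `show_aligned` are hypothetical names introduced here, and the starting value 1000 and the step are arbitrary. Because it works on plain integers rather than pointers, it only demonstrates the arithmetic, not the behaviour of a real allocator.

```c
#include <assert.h>
#include <stdio.h>

#define CACHE_LINE_SIZE 64

/* Hypothetical stand-in for malloc(): hands out a different number on
 * each call and prints it. Not a real allocator. */
static unsigned long fake_malloc(unsigned long size)
{
    static unsigned long next = 1000;   /* arbitrary starting "address" */
    unsigned long result = next;
    next += size + 7;                   /* arbitrary, deliberately unaligned step */
    printf("fake_malloc(%lu) returned %lu\n", size, result);
    return result;
}

/* The macro's arithmetic with fake_malloc() in place of malloc(). */
#define ALIGNED_VALUE(_s) \
    ((fake_malloc((_s) + CACHE_LINE_SIZE * 2) + CACHE_LINE_SIZE - 1) \
     & ~(unsigned long)(CACHE_LINE_SIZE - 1))

/* Invoke the macro and print the final result for comparison. */
static unsigned long show_aligned(unsigned long size)
{
    unsigned long v = ALIGNED_VALUE(size);
    assert(v % CACHE_LINE_SIZE == 0);   /* always a multiple of 64 */
    printf("after the macro: %lu\n", v);
    return v;
}
```

Calling show_aligned(10) twice from main() prints each raw value followed by the rounded one. With the values chosen here, the raw values 1000 and 1145 become 1024 and 1152.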

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html

Barry Schwarz · Sep 21, 2007

As bad an idea as this is, here is one more problem. How would you
ever free the allocated memory?


christian.bau · Sep 22, 2007

What this code tries to do is create a pointer that is aligned to a
byte address that is a multiple of CACHE_LINE_SIZE, which is a power
of two.

Here CACHE_LINE_SIZE is a power of two (two to the sixth power).
Therefore CACHE_LINE_SIZE - 1 is a value with the lowest six bits set
and all other bits zero, and ~(CACHE_LINE_SIZE - 1) is its complement.
X & ~(CACHE_LINE_SIZE - 1) equals X with the lowest six bits cleared,
so the result is X rounded down to a multiple of CACHE_LINE_SIZE.
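
The masking step can be checked on plain integers. Note the complement: `x & (CACHE_LINE_SIZE - 1)` keeps the low six bits, while `x & ~(CACHE_LINE_SIZE - 1)`, as written in the macro, clears them. The helper names `line_offset` and `round_down` are made up for this sketch, which also assumes `uintptr_t` is available.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64

/* Low six bits of x: its offset within a 64-byte line. */
static uintptr_t line_offset(uintptr_t x)
{
    return x & (CACHE_LINE_SIZE - 1);
}

/* x with the low six bits cleared: x rounded down to a multiple of 64. */
static uintptr_t round_down(uintptr_t x)
{
    uintptr_t r = x & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
    assert(r + line_offset(x) == x);    /* the two masks split x exactly */
    return r;
}
```

For example, 1000 is 15 * 64 + 40, so round_down(1000) gives 960 and line_offset(1000) gives 40.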

That result would be correctly aligned, but it could lie below the
area allocated by malloc. By adding CACHE_LINE_SIZE - 1 = 63
first, we get a result that is not smaller than what malloc returned.
However, the caller wanted a pointer to _s bytes, and up to
CACHE_LINE_SIZE - 1 bytes have been skipped. To make sure that _s
bytes are actually available, CACHE_LINE_SIZE - 1 should be added to
the argument of malloc.
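
As a rough check of the "up to CACHE_LINE_SIZE - 1 bytes skipped" claim, the sketch below rounds up every possible starting offset within one line and records the largest gap. `round_up` and `worst_case_skip` are hypothetical helpers, and the base value 4096 is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64

/* Add 63, then clear the low six bits: the first multiple of 64 that is
 * not smaller than x. */
static uintptr_t round_up(uintptr_t x)
{
    return (x + CACHE_LINE_SIZE - 1) & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
}

/* Largest number of bytes skipped over all starting offsets in a line. */
static uintptr_t worst_case_skip(void)
{
    uintptr_t base = 4096, worst = 0;   /* arbitrary aligned base value */
    for (uintptr_t off = 0; off < CACHE_LINE_SIZE; off++) {
        uintptr_t skipped = round_up(base + off) - (base + off);
        assert(skipped < CACHE_LINE_SIZE);
        if (skipped > worst)
            worst = skipped;
    }
    return worst;
}
```

The worst case is a start one byte past a boundary, which skips 63 bytes. That is why CACHE_LINE_SIZE - 1 extra bytes are enough and CACHE_LINE_SIZE * 2 is more than needed.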

There are a few problems with this code: First, it wastes memory by
adding CACHE_LINE_SIZE*2 bytes instead of CACHE_LINE_SIZE - 1. That's
not a serious problem, but the next problem is serious: The author
casts a pointer to unsigned long, modifies it, then casts the result
back to a pointer. This will go horribly wrong if unsigned long is 32
bits and a pointer is 64 bits, for example. There is also no guarantee
whatsoever that casting to unsigned long, doing arithmetic, and
casting back will give the result wanted. It is much safer to
calculate how many bytes to add, then cast the pointer to char* and
add the number of bytes wanted. The third problem is that after using
the macro, we don't know the value that malloc returned and there is
no way to recover it. Therefore, it is impossible to ever free the
memory that was allocated.

One harmless and two rather serious problems in a very short bit of
code. Oh well.
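
One common way around the free() problem is to stash the pointer that malloc returned just below the aligned block, so that a matching release function can find it again. The sketch below is only one possible arrangement: `cache_aligned_alloc` and `cache_aligned_free` are hypothetical names, and it assumes `uintptr_t` exists and that converting a pointer to it yields the byte address, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE_SIZE 64

/* Hypothetical helper: returns a CACHE_LINE_SIZE-aligned block of `size`
 * bytes, or NULL on failure. Release it with cache_aligned_free(). */
static void *cache_aligned_alloc(size_t size)
{
    /* Room for the saved malloc() pointer plus worst-case padding. */
    size_t extra = sizeof(void *) + CACHE_LINE_SIZE - 1;
    if (size > SIZE_MAX - extra)
        return NULL;                    /* request would overflow */

    char *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    /* Leave space for the saved pointer, then count the bytes to skip. */
    char *start = raw + sizeof(void *);
    size_t misalign = (size_t)((uintptr_t)start % CACHE_LINE_SIZE);
    size_t skip = misalign ? CACHE_LINE_SIZE - misalign : 0;
    char *aligned = start + skip;       /* arithmetic done on char * */
    assert((uintptr_t)aligned % CACHE_LINE_SIZE == 0);

    /* Remember what malloc() returned, just below the aligned block. */
    ((void **)aligned)[-1] = raw;
    return aligned;
}

/* Frees a block obtained from cache_aligned_alloc(); NULL is ignored. */
static void cache_aligned_free(void *aligned)
{
    if (aligned != NULL)
        free(((void **)aligned)[-1]);
}
```

A pointer from cache_aligned_alloc must go to cache_aligned_free, never directly to free(). Where the platform provides them, C11 aligned_alloc or POSIX posix_memalign avoid the bookkeeping, since their results can be passed to free().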

digz · Sep 24, 2007

When you say
"It is much safer to calculate how many bytes to add, then cast the
pointer to char* and add the number of bytes wanted,"

do you mean something like:
size_t aligned_size = _s + (CACHE_LINE_SIZE - (_s % CACHE_LINE_SIZE));
aligned_ptr = (char *)malloc(aligned_size);

Thx
Digz
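
For comparison, here is a sketch of the two readings. The snippet in the post above rounds the requested size up, which says nothing about where malloc places the block. One plausible reading of the quoted advice is instead to work out how far the returned address is from the next boundary and step forward by that many bytes through a char *. The names `padded_size`, `bytes_to_add`, and `align_forward` are invented for illustration, and the sketch assumes `uintptr_t` exists and reflects the byte address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64

/* The size formula from the post above: rounds the SIZE up, not the
 * address, and adds a whole extra line when s is already a multiple. */
static size_t padded_size(size_t s)
{
    return s + (CACHE_LINE_SIZE - (s % CACHE_LINE_SIZE));
}

/* Bytes to add to p to reach the next 64-byte boundary (0 if on one). */
static size_t bytes_to_add(const void *p)
{
    size_t misalign = (size_t)((uintptr_t)p % CACHE_LINE_SIZE);
    return misalign ? CACHE_LINE_SIZE - misalign : 0;
}

/* Step forward through a char * instead of casting an integer back. */
static char *align_forward(char *p)
{
    char *q = p + bytes_to_add(p);
    assert((uintptr_t)q % CACHE_LINE_SIZE == 0);
    return q;
}
```

The caller still has to allocate at least CACHE_LINE_SIZE - 1 bytes more than it needs, and keep the original pointer around so the block can be freed.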
