malloc() and alignment

Yevgen Muntyan

Hey,

Consider the following code:

#include <stdlib.h>
#define MAGIC_NUMBER 64

void *my_malloc (size_t n)
{
    char *result = malloc (n + MAGIC_NUMBER);
    return result ? result + MAGIC_NUMBER : NULL;
}

void my_free (void *ptr)
{
    if (ptr)
        free ((char*) ptr - MAGIC_NUMBER);
}

It substitutes for the library malloc() and free() so that
calling free() on a pointer returned by my_malloc()
is invalid (and results in a nice abort here so I can detect
certain bugs, but that's not important).

The question is: does the standard say that there is a value for MAGIC_NUMBER
such that the code above is valid? E.g. 64 works fine for me here,
17 doesn't.

Standard says that "The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any type of
object and then used to access such an object or an array of such
objects in the space allocated", but it doesn't say if it's possible to
get another such a pointer from the malloc() result.

Best regards,
Yevgen
 
Jean-Marc Bourguet

Yevgen Muntyan said:
Consider the following code:

#include <stdlib.h>
#define MAGIC_NUMBER 64

void *my_malloc (size_t n)
{
char *result = malloc (n + MAGIC_NUMBER);
return result ? result + MAGIC_NUMBER : NULL;
}

void my_free (void *ptr)
{
if (ptr)
free ((char*) ptr - MAGIC_NUMBER);
} [...]
Question is: does standard say that there is a value for MAGIC_NUMBER
such that the code above is valid?

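#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>
#endif

union align {
    void *vptrvar;
    long double ldvar;
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    intmax_t intvar;   /* widest integer type, C99 and later */
#else
    long intvar;       /* widest standard integer type in C90 */
#endif
};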

sizeof(union align) should be a pretty safe bet. Adding nonstandard types
(long long while in C90 mode, for example) could be useful. Adding other
types to it would make it safer in theory but probably not in practice.

Yours,
 
Flash Gordon

Yevgen said:
Hey,

Consider the following code:

#include <stdlib.h>
#define MAGIC_NUMBER 64

void *my_malloc (size_t n)
{
char *result = malloc (n + MAGIC_NUMBER);
return result ? result + MAGIC_NUMBER : NULL;
}

void my_free (void *ptr)
{
if (ptr)
free ((char*) ptr - MAGIC_NUMBER);
}

It substitutes library malloc() and free() so that
calling free() on pointer returned by my_malloc()
is invalid (and results in nice abort here so I can detect
certain bugs, but that's not important).

There is no guarantee that it will cause an abort, even if for you it
does most of the time.

Question is: does standard say that there is a value for MAGIC_NUMBER
such that the code above is valid? E.g. 64 works fine for me here,
17 doesn't.

Standard says that "The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any type of
object and then used to access such an object or an array of such
objects in the space allocated", but it doesn't say if it's possible to
get another such a pointer from the malloc() result.

The standard provides no other way. An implementation could do the following:
- All types apart from structures have an alignment of 1.
- The Nth structure type you define has an alignment of 2^(N+1),
  where ^ means "raise to the power".

Of course, no implementation is really going to be that awkward.
 
Scorpio

Yevgen said:
Hey,

Consider the following code:

#include <stdlib.h>
#define MAGIC_NUMBER 64

void *my_malloc (size_t n)
{
char *result = malloc (n + MAGIC_NUMBER);
return result ? result + MAGIC_NUMBER : NULL;
}

void my_free (void *ptr)
{
if (ptr)
free ((char*) ptr - MAGIC_NUMBER);
}

It substitutes library malloc() and free() so that
calling free() on pointer returned by my_malloc()
is invalid (and results in nice abort here so I can detect
certain bugs, but that's not important).

Question is: does standard say that there is a value for MAGIC_NUMBER
such that the code above is valid? E.g. 64 works fine for me here,
17 doesn't.

The standard says nothing as such, and you're lucky that nothing bad
happened with your code; what you have just seen is undefined
behaviour.

The standard says:
"If the argument does not match a pointer earlier returned by the
calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behavior is undefined."

Standard says that "The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any type of
object and then used to access such an object or an array of such
objects in the space allocated", but it doesn't say if it's possible to
get another such a pointer from the malloc() result.

No.
 
Yevgen Muntyan

Scorpio said:
The standard says nothing as such, and you're lucky that nothing bad
happened with your code, what you have just seen is an undefined
behaviour.

The standard says:
"If the argument does not match a pointer earlier returned by the
calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behavior is undefined."

I guess you are talking about free(my_malloc(n)) - yes, that's why I
do this thing, it results in abort here. But my_free(my_malloc(n))
is good.

Regards,
Yevgen
 
Keith Thompson

Scorpio said:
The standard says nothing as such, and you're lucky that nothing bad
happened with your code, what you have just seen is an undefined
behaviour.

The standard says:
"If the argument does not match a pointer earlier returned by the
calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behavior is undefined."

Right, but assuming that my_malloc() calls are matched by my_free()
calls, that's not a problem; the value he's passing to free() is the
same value he obtained from malloc().

There's no guarantee that the result of my_malloc() will be properly
aligned for any particular type, and no standard way to make such a
guarantee reliably. (But I don't know of any system where 64 bytes
isn't a strict enough alignment for all types.)
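
For readers with a C11 or later compiler (which postdates this discussion), the language did eventually gain a standard name for the strictest fundamental alignment: max_align_t in <stddef.h>, queried with _Alignof. The following is only a sketch under that assumption; HEADER_SIZE is a made-up name, and it covers fundamental alignments only, not over-aligned types requested with extensions.

```c
#include <stddef.h>   /* max_align_t, size_t (C11) */
#include <stdint.h>   /* SIZE_MAX */
#include <stdlib.h>   /* malloc, free */

/* Hypothetical name. Assumes C11: the offset equals the strictest
   fundamental alignment, so the shifted pointer stays aligned for
   every standard type. */
#define HEADER_SIZE (_Alignof(max_align_t))

void *my_malloc(size_t n)
{
    char *result;

    if (n > SIZE_MAX - HEADER_SIZE)   /* n + HEADER_SIZE would overflow */
        return NULL;
    result = malloc(n + HEADER_SIZE);
    return result ? result + HEADER_SIZE : NULL;
}

void my_free(void *ptr)
{
    if (ptr)
        free((char *)ptr - HEADER_SIZE);
}
```

As in the original code, passing a my_malloc() result straight to free() is still undefined behaviour; only my_free() hands the original malloc() pointer back.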
 
Yevgen Muntyan

Flash said:
There is no guarantee that it will cause an abort even if for you it
does most of the time.

Yep, note the word "here".

Standard provides no other way. An implementation could do the following:
All types apart from structures have an alignment of 1
The Nth structure type you define has an alignment of 2^(N+1)
Where ^ is raise to the power.

Of course, no implementation is really going to be that awkward.

Speaking of awkward implementations, does alignment really mean anything?
Say, an implementation could have a bunch of boxes with paper inside as
memory, and pointers could be numbers of those boxes, and alignment as
in "multiple of byte address" wouldn't exist there at all.
Standard *seems* to mention alignment here and there to emphasize that
you can't store one thing in memory allocated for another thing, but
it's usually already stated elsewhere. Or not stated, but information
about alignment doesn't do any good anyway. I looked for "align" in
standard, and found like ten places, and all of them do not seem to
add any information that can be actually used.
For example, it's impossible to compute alignment for given type, isn't
it? So the code above can't be portable even for one fixed type, i.e.
for allocate_one_object_of_this_type(). Or can it?

Regards,
Yevgen
 
Tak-Shing Chan

Yep, note word "here".


Speaking of awkward implementations, does alignment really mean anything?
Say, an implementation could have a bunch of boxes with paper inside as
memory, and pointers could be numbers of those boxes, and alignment as
in "multiple of byte address" wouldn't exist there at all.
Standard *seems* to mention alignment here and there to emphasize that
you can't store one thing in memory allocated for another thing, but
it's usually already stated elsewhere. Or not stated, but information
about alignment doesn't do any good anyway. I looked for "align" in
standard, and found like ten places, and all of them do not seem to
add any information that can be actually used.
For example, it's impossible to compute alignment for given type, isn't
it? So the code above can't be portable even for one fixed type, i.e.
for allocate_one_object_of_this_type(). Or can it?

It is possible to calculate a standard-compliant MAGIC_NUMBER
if and only if you are willing to artificially restrict the
number of types that you use my_malloc() on. This can be done by
putting a *useful subset* of all possible types in a union, say
my_union, then defining MAGIC_NUMBER as sizeof(union my_union):

/* A list of my_malloc()able types */
union my_union {
    char c;
    int i;
    long l;
    float f;
    double d;
    void *vp;
};

/* The union tag needs the "union" keyword in C */
#define MAGIC_NUMBER sizeof(union my_union)
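
Plugged back into the original wrapper, the union approach might look like the sketch below. The reasoning is that malloc(...) + sizeof(union my_union) is the address of element [1] of an array of such unions, and each array element must be suitably aligned for every member. The member list is only an illustration; any type stored through my_malloc() has to be added to it.

```c
#include <stdlib.h>

/* Types we promise to store in my_malloc()ed memory; extend as needed. */
union my_union {
    char c;
    int i;
    long l;
    float f;
    double d;
    void *vp;
};

#define MAGIC_NUMBER sizeof(union my_union)

void *my_malloc(size_t n)
{
    /* result + MAGIC_NUMBER is where element [1] of a union array would sit */
    char *result = malloc(n + MAGIC_NUMBER);
    return result ? result + MAGIC_NUMBER : NULL;
}

void my_free(void *ptr)
{
    if (ptr)
        free((char *)ptr - MAGIC_NUMBER);
}
```

A type missing from the union (long double, say, or a function pointer) gets no guarantee from this scheme, which is the restriction described above.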

Tak-Shing
 
SM Ryan

# Question is: does standard say that there is a value for MAGIC_NUMBER
# such that the code above is valid? E.g. 64 works fine for me here,
# 17 doesn't.

MAGIC_NUMBER = n * a
where a = alignment constraint

On most machines today, a=4 bytes or 32 bits, but that is
implementation dependent. If you want to do this portably,
you need to adapt the code for each implementation. Often
that's when you see things like

#if defined(__INTEL__)
enum {minalign=1};
#elif defined(__PPCL__)
enum {minalign=4};
#elif defined(__KREMVAX__)
enum {minalign=9};
#else
/* #warning is a common extension (standard only since C23) */
#warning minalign is not known - guessing 4
enum {minalign=4};
#endif
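
The "MAGIC_NUMBER = n * a" rule can be sketched as a small helper that rounds a desired header size up to the next multiple of the alignment constraint. round_up is a hypothetical name, not something from the post, and the alignment value still has to come from somewhere (a per-platform constant like minalign above, for instance).

```c
#include <stddef.h>

/* Smallest multiple of align that is >= size.
   align must be nonzero; size + align - 1 must not overflow. */
size_t round_up(size_t size, size_t align)
{
    return (size + align - 1) / align * align;
}
```

For instance, round_up(17, 4) gives 20, which explains why an offset of 17 fails on a machine that wants 4-byte alignment while 20 or 64 would not.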
 
Laurent Deniau

Yevgen said:
Hey,

Consider the following code:

#include <stdlib.h>
#define MAGIC_NUMBER 64

void *my_malloc (size_t n)
{
char *result = malloc (n + MAGIC_NUMBER);
return result ? result + MAGIC_NUMBER : NULL;
}

void my_free (void *ptr)
{
if (ptr)
free ((char*) ptr - MAGIC_NUMBER);
}

It substitutes library malloc() and free() so that
calling free() on pointer returned by my_malloc()
is invalid (and results in nice abort here so I can detect
certain bugs, but that's not important).

Question is: does standard say that there is a value for MAGIC_NUMBER
such that the code above is valid? E.g. 64 works fine for me here,
17 doesn't.

The following should fit the requirement:

#include <stddef.h>   /* offsetof */

struct align_struct {
    char c;
    union {
        _Bool b;            /* C99 */
        char c;
        short s;
        int i;
        long l;
        long long ll;       /* C99 */
        float f;
        double d;
        long double ld;
        void *vp;
        void (*fp)(void);
        struct _ *sp;
    } u;
};

enum { align_value = offsetof(struct align_struct, u) };

On x86-32 it shows that alignment constraint is 4 bytes while the union
size is 12 bytes.
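
To compare the two numbers on a given platform, something like the following sketch can be compiled and run. It uses a shortened member list, and the helper names union_alignment, union_size and report are made up here; the values printed depend entirely on the implementation.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdio.h>   /* printf */

struct align_struct {
    char c;
    union {
        long long ll;
        long double ld;
        void *vp;
        void (*fp)(void);
    } u;
};

/* The padding inserted before u reflects the union's alignment requirement
   (possibly an overestimate, never an underestimate). */
static size_t union_alignment(void)
{
    return offsetof(struct align_struct, u);
}

/* sizeof does not evaluate its operand, so the null pointer is never used. */
static size_t union_size(void)
{
    return sizeof(((struct align_struct *)0)->u);
}

void report(void)
{
    printf("alignment estimate: %zu, union size: %zu\n",
           union_alignment(), union_size());
}
```

Either number can serve as MAGIC_NUMBER for the listed types; the offsetof value is just the smaller of the two on implementations where size and alignment differ.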

a+, ld.
 
Eric Sosman

Yevgen said:
[...]
Speaking of awkward implementations, does alignment really mean anything?
Say, an implementation could have a bunch of boxes with paper inside as
memory, and pointers could be numbers of those boxes, and alignment as
in "multiple of byte address" wouldn't exist there at all.
Standard *seems* to mention alignment here and there to emphasize that
you can't store one thing in memory allocated for another thing, but
it's usually already stated elsewhere. Or not stated, but information
about alignment doesn't do any good anyway. I looked for "align" in
standard, and found like ten places, and all of them do not seem to
add any information that can be actually used.

When you ask "does alignment really mean anything?" I assume
you are not asking about the hardware platforms C runs on: some
machines require special alignment for some of their data types
in order to access them efficiently, correctly, or at all. If C
is to run (or run well) on such machines, it must respect the
hardware's conventions.

So I guess you're asking whether the Standard actually needs
to talk about "alignment," whether all that talk adds anything to
the language definition. You mention that it's already forbidden
to "store one thing in memory allocated for another thing," and you
may be wondering whether that restriction by itself is strong
enough that the Standard does not need to mention "alignment."

The answer is "No," and the code you posted shows why. The
memory obtained from malloc() is not "for" any identifiable data
type; you just ask for N bytes without saying how you intend to
use them. malloc() returns a pointer to "typeless" memory, and it
is up to you to fill it appropriately. You can store anything at
all in that memory (if it fits), so the language about not storing
a double in a short's memory doesn't stop you.

The "alignment" language is what tells you that

char *p = malloc(1 + sizeof(double)); // assume success
*p++ = 'X';
*(double*)p = 2.71828;

is not guaranteed to work, even though the allocated memory is
large enough to hold both the char and the double. There might
be other, more abstract ways of expressing equivalent restrictions,
but "alignment" is as good a term as any and better than some.
As a thought exercise, read the relevant sections of the Standard,
mentally replacing "alignment" with "color;" you'll get the same
restrictions but with less explanatory power.
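
For contrast with the fragment above, here is a sketch of what remains allowed at a misaligned position: copying the bytes of a double with memcpy, which accesses the memory as characters and so has no alignment requirement. Only the access through a double* is the problem. packed_roundtrip is a hypothetical name used for illustration.

```c
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* memcpy */

/* Stores a char followed immediately by the bytes of a double, then reads
   the double back. Returns 1 on success, 0 if malloc fails. */
int packed_roundtrip(double in, double *out)
{
    char *p = malloc(1 + sizeof(double));

    if (p == NULL)
        return 0;
    p[0] = 'X';
    memcpy(p + 1, &in, sizeof in);     /* byte-wise copy: any address is fine */
    memcpy(out, p + 1, sizeof *out);   /* copy back into an aligned double */
    free(p);
    return 1;
}
```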

For example, it's impossible to compute alignment for given type, isn't
it?

You can come close. From the properties of arrays you can
deduce that a type's alignment must be a divisor of its size, and
using sizeof(T) as an estimate for alignof(T) will always work --
although it might be wasteful:

typedef struct { char a; char b[9999]; } T;

You can get a possibly tighter estimate this way:

struct fake { char junk; T t; };
size_t align_of_T = offsetof(struct fake, t);

This is not guaranteed to give a minimal answer, but on "sane"
implementations it probably will -- and it will never produce a
result that's too small.
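
Both estimates can be tried side by side. This sketch wraps the offsetof trick in functions for the 10000-byte struct above and for double; the struct and function names are made up for the example, and the actual values are implementation-specific.

```c
#include <stddef.h>  /* offsetof, size_t */

typedef struct { char a; char b[9999]; } T;   /* the wasteful case */

/* One probe struct per type: a leading char forces the compiler to insert
   whatever padding the second member's alignment requires. */
struct fake_T      { char junk; T t; };
struct fake_double { char junk; double d; };

size_t align_estimate_T(void)
{
    return offsetof(struct fake_T, t);
}

size_t align_estimate_double(void)
{
    return offsetof(struct fake_double, d);
}
```

On typical implementations align_estimate_T() is far smaller than sizeof(T), since a struct of chars needs no particular alignment, while for double the two estimates often coincide.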

So the code above can't be portable even for one fixed type, i.e.
for allocate_one_object_of_this_type(). Or can it?

Given the type, you can use one of the dodges above. But
given the type, there's no reason for so much obfuscation:

/* needs <stddef.h> and <stdlib.h>; struct my_header is your bookkeeping */
struct my_atom { struct my_header h; T t; };

T *allocate_one(void) {
    struct my_atom *ptr = malloc(sizeof *ptr);
    return (ptr == NULL) ? NULL : &ptr->t;
}

void free_one(T *ptr) {
    if (ptr != NULL)
        free((char*)ptr - offsetof(struct my_atom, t));
}

The problem for malloc() work-alikes is that they are not
given the type: they must work correctly for all types from a
potentially very large set the programmer can define.
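
A filled-in version of the fixed-type approach, as a sketch: T is taken to be double, and struct my_header holds a single tag field, both chosen only for illustration. atom_of, is_atom and ATOM_TAG are hypothetical names that are not part of the code above.

```c
#include <stddef.h>  /* offsetof */
#include <stdlib.h>  /* malloc, free */

typedef double T;                          /* stand-in for the caller's type */
struct my_header { unsigned long tag; };   /* hypothetical bookkeeping data */
struct my_atom { struct my_header h; T t; };

#define ATOM_TAG 0xA70DUL                  /* arbitrary marker value */

T *allocate_one(void)
{
    struct my_atom *ptr = malloc(sizeof *ptr);

    if (ptr == NULL)
        return NULL;
    ptr->h.tag = ATOM_TAG;
    return &ptr->t;      /* the compiler has already aligned t correctly */
}

/* Recover the enclosing struct from a pointer to its t member. */
static struct my_atom *atom_of(T *ptr)
{
    return (struct my_atom *)((char *)ptr - offsetof(struct my_atom, t));
}

/* Only meaningful for pointers that really came from allocate_one(). */
int is_atom(T *ptr)
{
    return atom_of(ptr)->h.tag == ATOM_TAG;
}

void free_one(T *ptr)
{
    if (ptr != NULL)
        free(atom_of(ptr));
}
```

Because the struct layout does the alignment work, no magic number appears anywhere; the cost is that each T needs its own struct and its own pair of functions.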
 
Yevgen Muntyan

Eric said:
Yevgen said:
[...]
Speaking of awkward implementations, does alignment really mean anything?
Say, an implementation could have a bunch of boxes with paper inside as
memory, and pointers could be numbers of those boxes, and alignment as
in "multiple of byte address" wouldn't exist there at all.
Standard *seems* to mention alignment here and there to emphasize that
you can't store one thing in memory allocated for another thing, but
it's usually already stated elsewhere. Or not stated, but information
about alignment doesn't do any good anyway. I looked for "align" in
standard, and found like ten places, and all of them do not seem to
add any information that can be actually used.


When you ask "does alignment really mean anything?" I assume
you are not asking about the hardware platforms C runs on: some
machines require special alignment for some of their data types
in order to access them efficiently, correctly, or at all. If C
is to run (or run well) on such machines, it must respect the
hardware's conventions.

So I guess you're asking whether the Standard actually needs
to talk about "alignment," whether all that talk adds anything to
the language definition. You mention that it's already forbidden
to "store one thing in memory allocated for another thing," and you
may be wondering whether that restriction by itself is strong
enough that the Standard does not need to mention "alignment."

The answer is "No," and the code you posted shows why. The
memory obtained from malloc() is not "for" any identifiable data
type; you just ask for N bytes without saying how you intend to
use them. malloc() returns a pointer to "typeless" memory, and it
is up to you to fill it appropriately. You can store anything at
all in that memory (if it fits), so the language about not storing
a double in a short's memory doesn't stop you.

The "alignment" language is what tells you that

char *p = malloc(1 + sizeof(double)); // assume success
*p++ = 'X';
*(double*)p = 2.71828;

is not guaranteed to work, even though the allocated memory is
large enough to hold both the char and the double. There might
be other, more abstract ways of expressing equivalent restrictions,
but "alignment" is as good a term as any and better than some.
As a thought exercise, read the relevant sections of the Standard,
mentally replacing "alignment" with "color;" you'll get the same
restrictions but with less explanatory power.

I guess I had too low an opinion of the standard; while it allows
weird implementations, an allocated block of memory must work as
if it were a sequence of bytes, and pointer arithmetic means arithmetic
on byte indices inside a block. And alignment indeed makes perfect
sense, at least it seems so now :)

For example, it's impossible to compute alignment for given type, isn't
it?


You can come close. From the properties of arrays you can
deduce that a type's alignment must be a divisor of its size, and
using sizeof(T) as an estimate for alignof(T) will always work --
although it might be wasteful:

typedef struct { char a; char b[9999]; } T;

You can get a possibly tighter estimate this way:

struct fake { char junk; T t; };
size_t align_of_T = offsetof(struct fake, t);

This is not guaranteed to give a minimal answer, but on "sane"
implementations it probably will -- and it will never produce a
result that's too small.

Is it guaranteed that you can use such a number for that malloc()
mangling? I.e. if you know that offset of a member of type T in
some structure is N, can you put a value of type T at address
malloc(N + sizeof(T)) + N?
The answer should be "yes", naturally, but is it? I got from
standard only that if N is *not* multiple of "alignment for type T"
(or how it's called), then you can *not* do that; not in other
direction.

Thanks,
Yevgen
 
Eric Sosman

Yevgen said:
Eric said:
Yevgen said:
For example, it's impossible to compute alignment for given type, isn't
it?

You can come close. From the properties of arrays you can
deduce that a type's alignment must be a divisor of its size, and
using sizeof(T) as an estimate for alignof(T) will always work --
although it might be wasteful:

typedef struct { char a; char b[9999]; } T;

You can get a possibly tighter estimate this way:

struct fake { char junk; T t; };
size_t align_of_T = offsetof(struct fake, t);

This is not guaranteed to give a minimal answer, but on "sane"
implementations it probably will -- and it will never produce a
result that's too small.

Is it guaranteed that you can use such a number for that malloc()
mangling? I.e. if you know that offset of a member of type T in
some structure is N, can you put a value of type T at address
malloc(N + sizeof(T)) + N?
The answer should be "yes", naturally, but is it? I got from
standard only that if N is *not* multiple of "alignment for type T"
(or how it's called), then you can *not* do that; not in other
direction.

Yes, you can store and retrieve a T object at the position
N bytes into an allocated area. Consider these two fragments:

struct s { char junk; T t; };

/* fragment #1 */
struct s *p = malloc(sizeof *p); // assume success
p->junk = 'X';
p->t = some_T_value;

/* fragment #2 */
char *q = malloc(sizeof(struct s)); // assume success
*(char*)(q + offsetof(struct s, junk)) = 'X';
*(T*)(q + offsetof(struct s, t)) = some_T_value;

Fragment #1 must work, because you've allocated memory that
is big enough for and properly arranged for a struct s object.
Since the memory is right for a struct s, you can manipulate the
elements of that struct s instance in the allocated memory.

But Fragment #2 must also work, because it does exactly the
same thing! The spelling is obfuscated a bit, but the offsets
within the allocated block are exactly the same as those of
Fragment #1. Since they are both aligned properly in #1, they
are also aligned properly in #2.
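
The two fragments can be turned into a small runnable comparison. In this sketch T is assumed to be double, and the function names are invented for the example; each function stores a value, reads it back, and reports whether the allocation succeeded.

```c
#include <stddef.h>  /* offsetof */
#include <stdlib.h>  /* malloc, free */

typedef double T;    /* assumption: any object type would do */
struct s { char junk; T t; };

/* Fragment #1: access the members through a struct pointer. */
int store_via_struct(T value, T *out)
{
    struct s *p = malloc(sizeof *p);

    if (p == NULL)
        return 0;
    p->junk = 'X';
    p->t = value;
    *out = p->t;
    free(p);
    return 1;
}

/* Fragment #2: same offsets, spelled out with char arithmetic. */
int store_via_offsets(T value, T *out)
{
    char *q = malloc(sizeof(struct s));

    if (q == NULL)
        return 0;
    *(char *)(q + offsetof(struct s, junk)) = 'X';
    *(T *)(q + offsetof(struct s, t)) = value;
    *out = *(T *)(q + offsetof(struct s, t));
    free(q);
    return 1;
}
```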
 
