pointer alignment property

aegis

The following was mentioned by Eric Sosman
from
http://groups.google.com/group/comp.lang.c/msg/b696b28f59b9dac4?dmode=source

"The alignment requirement for any type T must be a
divisor of sizeof(T). (Proof: In `T array[2];' both
`array[0]' and `array[1]' must be correctly aligned.)
Since `sizeof(unsigned char)' is divisible only by one
and since one is a divisor of every type's size, every
type is suitably aligned for `unsigned char'."
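
On a compiler that supports C11, the divisibility claim in the quote can be spot-checked with `_Alignof`. This is only a sketch: `_Alignof` did not exist in C89 or C99, the macro and function names are made up for illustration, and a check over a few sampled types proves nothing about types not listed.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if sizeof(T) is a multiple of T's alignment. */
#define SIZE_IS_MULTIPLE_OF_ALIGN(T) (sizeof(T) % _Alignof(T) == 0)

/* Returns 1 if the property holds for every type sampled here. */
int check_sampled_types(void)
{
    return SIZE_IS_MULTIPLE_OF_ALIGN(unsigned char)
        && SIZE_IS_MULTIPLE_OF_ALIGN(int)
        && SIZE_IS_MULTIPLE_OF_ALIGN(long)
        && SIZE_IS_MULTIPLE_OF_ALIGN(double)
        && SIZE_IS_MULTIPLE_OF_ALIGN(long *);
}
```

Since sizeof(unsigned char) is 1 by definition, its alignment requirement can only be 1, which is the second half of the quoted argument.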


Now, on a system where sizeof(long *) == sizeof(int *),
the implication is that a pointer to int can hold any value
that a pointer to long can, and vice versa. However, this
does not necessarily mean that dereferencing a pointer to
int, or a pointer to long, will yield a valid value. Example:

long *p;
int *p1;
long value = LONG_MAX;

p = &value;
p1 = p;

*p1; /* possibly a trap representation -- correct? */

Either dereference could yield a trap representation, given
different sizes for long and int, or even through modification
of the padding bits of an object of type int
(if sizeof(long) >= sizeof(int) and int has padding bits,
that is, not all of its bits are used to represent values).

And the special case for pointer to object types is
pointer to void, pointer to char, pointer to unsigned char,
pointer to signed char.

These pointer types are correctly aligned to hold any pointer
value that represents the location of an object.
However, you may not necessarily expect valid results
when dereferencing any of those pointer types (pointer to void
excluded), because under c89/90 a pointer to char or
pointer to signed char can, through dereferencing,
invoke undefined behavior? I am not too sure here.
I know under c99 trap representations were introduced
along with the raising of an implementation defined signal.
So in the case of c99, it is clear that the latter two can
happen if you were to, say, do the following:

sizeof(long *) == sizeof(char *)
sizeof(char) < sizeof(long)

long *p;
char *p1;
long a = 10;

p1 = p = &a;

*p1; /*
* undefined behavior due to:
* a) trap representation
* b) implementation defined signal raised
*/


Is my explanation apt? What is the case under c89
being that concepts such as trap representation
do not exist under c89?
 
kuyper

aegis said:
The following was mentioned by Eric Sosman
from
http://groups.google.com/group/comp.lang.c/msg/b696b28f59b9dac4?dmode=source

"The alignment requirement for any type T must be a
divisor of sizeof(T). (Proof: In `T array[2];' both
`array[0]' and `array[1]' must be correctly aligned.)
Since `sizeof(unsigned char)' is divisible only by one
and since one is a divisor of every type's size, every
type is suitably aligned for `unsigned char'."


Now if on a system where sizeof(long *) == sizeof(int *)
the implication is that a pointer to int can hold any values
that a pointer to long can and vice versa, however this
does not necessarily mean dereferencing a pointer to
int will yield a valid value or dereferencing a pointer to long
will yield a valid value. Example:

long *p;
int *p1;
long value = LONG_MAX;

p = &value;

It "can hold any values" only in the following sense:
memcpy(&p1, &p, sizeof p);

Knowing that sizeof(long*)==sizeof(int*) is NOT sufficient to guarantee
that every value of type 'long*' is correctly aligned to allow
conversion to 'int*'. It's perfectly legal (though unlikely) for an
implementation to have stricter alignment requirements for 'int' than
for 'long', so there's no guarantee that the following statement has defined
behavior:

p1 = p;

Even if 'p' is correctly aligned, the ONLY thing that the standard
guarantees for you about the value of 'p1' is that if this value is
converted back to 'long*', it will point at 'value'. The standard says
nothing about which piece of memory 'p1' points at. Therefore, the
following statement:
*p1; /* possibly a trap representation -- correct? */

has undefined behavior. Of course, in practice, by far the most likely
thing is that it points at the first sizeof(int) bytes of 'value'.
However, even if it does point at that location, you have no guarantees
about which bits from 'value' are stored in those bytes, nor in what
order those bits are stored. You also have no guarantees from the
standard which order the bits of an 'int' are stored in. As signed
types, both 'int' and 'long' are allowed to have padding bits, and the
standard provides no guarantees about the location of those padding
bits, either. In short, nothing useful is guaranteed by the standard
about the results of evaluating *p1.

And the special case for pointer to object types is
pointer to void, pointer to char, pointer to unsigned char,
pointer to signed char.

These pointer types are correctly aligned to hold any pointer
value that represents the location of an object.
However, you may not necessarily expect valid results
when dereferencing any of those pointer types(pointer to void
excluded), because under c89/90 a pointer to char or
pointer to signed char can, through dereferencing,
invoke undefined behavior?

You're guaranteed to be able to access the bytes of an object as an
array of unsigned char (6.2.6.1p4) - the standard gives memcpy() as an
example of one way of copying it, but doesn't specify that this is the
only way. Since memcpy() is defined as accessing the memory it copies
as an array of unsigned char, directly accessing the object as an array
of char must also work.

6.2.6.1p5 allows undefined behavior due to the object having a trap
representation only when the lvalue being used does not have character
type.
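
A minimal sketch of the byte-wise access described above. The function names are hypothetical; the only point is that reading an object through an unsigned char lvalue yields the same bytes that memcpy() copies.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copies the object representation of a long into out[], one byte at a
 * time, through an unsigned char lvalue. out must have room for
 * sizeof(long) bytes.
 */
void bytes_of_long(const long *src, unsigned char *out)
{
    const unsigned char *bytes = (const unsigned char *)src;
    size_t i;

    for (i = 0; i < sizeof *src; i++)
        out[i] = bytes[i];
}

/* Returns 1 if the byte-wise copy matches what memcpy() produces. */
int matches_memcpy(long value)
{
    unsigned char a[sizeof(long)];
    unsigned char b[sizeof(long)];

    bytes_of_long(&value, a);
    memcpy(b, &value, sizeof value);
    return memcmp(a, b, sizeof a) == 0;
}
```

Which values those bytes hold is still implementation-specific; the sketch only shows that the access itself is allowed.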
 
aegis

aegis said:
The following was mentioned by Eric Sosman
from
http://groups.google.com/group/comp.lang.c/msg/b696b28f59b9dac4?dmode=source

"The alignment requirement for any type T must be a
divisor of sizeof(T). (Proof: In `T array[2];' both
`array[0]' and `array[1]' must be correctly aligned.)
Since `sizeof(unsigned char)' is divisible only by one
and since one is a divisor of every type's size, every
type is suitably aligned for `unsigned char'."


Now if on a system where sizeof(long *) == sizeof(int *)
the implication is that a pointer to int can hold any values
that a pointer to long can and vice versa, however this
does not necessarily mean dereferencing a pointer to
int will yield a valid value or dereferencing a pointer to long
will yield a valid value. Example:

long *p;
int *p1;
long value = LONG_MAX;

p = &value;

It "can hold any values" only in the following sense:
memcpy(&p1, &p, sizeof p);

Knowing that sizeof(long*)==sizeof(int*) is NOT sufficient to guarantee
that every value of type 'long*' is correctly aligned to allow
conversion to 'int*'. It's perfectly legal (though unlikely) for an
implementation to have stricter alignment requirements for 'int' than
for 'long', so there's no guarantee that the following statement has defined
behavior:

p1 = p;

Even if 'p' is correctly aligned, the ONLY thing that the standard
guarantees for you about the value of 'p1' is that if this value is
converted back to 'long*', it will point at 'value'. The standard says
nothing about which piece of memory 'p1' points at. Therefore, the
following statement:

If you were to convert the value of p1 back to p and produce
the original value, then this implies that pointer objects must
be wide enough to hold pointer values of any pointer object type, yes?

Then the only issue with respect to alignment, is whether or
not the object types pointed at share the same alignment
requirements. So if int and long did share the same alignment
requirements, then the above dereference of p1, would be legal?

has undefined behavior. Of course, in practice, by far the most likely
thing is that it points at the first sizeof(int) bytes of 'value'.
However, even if it does point at that location, you have no guarantees
about which bits from 'value' are stored in those bytes, nor in what
order those bits are stored. You also have no guarantees from the
standard which order the bits of an 'int' are stored in. As signed
types, both 'int' and 'long' are allowed to have padding bits, and the
standard provides no guarantees about the location of those padding
bits, either. In short, nothing useful is guaranteed by the standard
about the results of evaluating *p1.


You're guaranteed to be able to access the bytes of an object as an
array of unsigned char (6.2.6.1p4) - the standard gives memcpy() as an
example of one way of copying it, but doesn't specify that this is the
only way. Since memcpy() is defined as accessing the memory it copies
as an array of unsigned char, directly accessing the object as an array
of char must also work.

6.2.6.1p5 allows undefined behavior due to the object having a trap
representation only when the lvalue being used does not have character
type.

But I was curious about what c89/90 does. I thought trap representation
was introduced in c99?
 
kuyper

aegis said:
....
If you were to convert the value of p1 back to p and produce
the original value, then this implies that pointer objects must
be wide enough to hold pointer values of any pointer object type, yes?

No. The conversion has defined behavior only for pointers that are
correctly aligned for the target type. You can only draw useful
conclusions from the cases which have defined behavior. The fact that
the conversion is reversible implies that the pointer type has enough
width to distinguish every memory location that is correctly aligned
for that type; it need not be wide enough to distinguish pointer values
that aren't correctly aligned; they needn't be distinguishable from
each other, and they needn't be distinguishable from correctly aligned
pointers.

My other point is that, even if a pointer type is wide enough that it
could distinguish those pointer values, since the behavior is
undefined, the implementation is under no obligation to actually make
the conversion of those values to that pointer type work.

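
A sketch of a round trip that does have defined behavior: the intermediate type is `unsigned char *`, whose alignment requirement is 1, so the alignment precondition always holds. The function name is hypothetical.

```c
/* Returns 1 if a long * survives a round trip through unsigned char *. */
int round_trips(long *p)
{
    unsigned char *bytes = (unsigned char *)p; /* always suitably aligned */
    long *back = (long *)bytes;                /* converted back to the original type */

    return back == p;
}
```

The same round trip through `int *` would only be defined if the address happened to be suitably aligned for int, which is the case the discussion above is about.
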
Then the only issue with respect to alignment, is whether or
not the object types pointed at share the same alignment
requirements. So if int and long did share the same alignment
requirements, then the above dereference of p1, would be legal?

Um - the dereference of p1 is below, not above.

No, even if the alignment requirements are the same, the standard says
absolutely nothing about what location p1 points at. Of course, on
most, and probably all, real implementations, p1 points at the same
location in memory as p. But even using that assumption isn't enough
to make that code safe. Since 'int' is allowed to have trap
representations, and the standard provides no guarantees about which
bits of 'value' are in the locations that would be accessed by
dereferencing p1, it's possible that those bits define a trap
representation.
....
But I was curious about what c89/90 does. I thought trap representation
was introduced in c99?

I'm sorry - while you mentioned C89/90, I thought that you were asking
for a comparison of the current rules with the C89/90 rules. I don't
have a copy of those standards, so I can't comment on them, which is
why I ignored that part of your question.

I believe that my most important point is just as true in C89 as in
C99: the standard says absolutely nothing about the location pointed at
by p1; it says only that a pointer at that location, when converted
back into a pointer to long, will point at 'value'.
 
aegis

No. The conversion has defined behavior only for pointers that are
correctly aligned for the target type. You can only draw useful
conclusions from the cases which have defined behavior. The fact that
the conversion is reversible implies that the pointer type has enough
width to distinguish every memory location that is correctly aligned
for that type; it need not be wide enough to distinguish pointer values
that aren't correctly aligned; they needn't be distinguishable from
each other, and they needn't be distinguishable from correctly aligned
pointers.

What are suitable criteria, then, for determining if a pointer to object
type is properly aligned for some other pointer to object type?
 
Christian Bau

aegis said:
What are suitable criteria, then, for determining if a pointer to object
type is properly aligned for some other pointer to object type?

You can use a method that you know works on your implementation, or on
several implementations. Just don't expect it to be portable to all
implementations.

A method that works in cases where a pointer p2 is known to be derived
from a pointer p1 that was allocated by malloc: Test whether ((char *)
p2 - (char *) p1) % sizeof (T) == 0 or not.
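
The test above could be wrapped in a small helper, sketched here. The function name and parameters are hypothetical. It assumes `base` came from malloc (and is therefore suitably aligned for any object type) and that `p` points into the same allocation at or after `base`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * base: pointer obtained from malloc
 * p:    pointer into the same allocation, at or after base
 * size: sizeof(T) for the target type T, must be nonzero
 * Returns 1 if the byte offset of p from base is a multiple of size.
 */
int offset_is_multiple_of(const void *base, const void *p, size_t size)
{
    ptrdiff_t diff = (const char *)p - (const char *)base;

    assert(size != 0);
    return diff >= 0 && (size_t)diff % size == 0;
}
```

Because sizeof(T) is a multiple of T's alignment requirement, a result of 1 means the pointer is suitably aligned for T; a result of 0 does not prove that it is misaligned.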
 
kuyper

aegis wrote:
....
What are suitable criteria, then, for determining if a pointer to object
type is properly aligned for some other pointer to object type?

There is no portable way to check an arbitrary pointer to determine
whether it's suitably aligned for conversion to an arbitrary pointer
type. That's partly because there's seldom a legitimate portable need
to perform such a conversion. If you think that you do need it, either
you're trying to do something which is intrinsically non-portable, or
there's a good chance that your code can be redesigned to
remove that need.

However, there are some special cases where it is possible to determine
that a pointer is suitably aligned:

1) If the alignment requirement for type T is an integer multiple of
the alignment requirement for type U, then every pointer of type T is
guaranteed to be suitably aligned for type U. The only way to be
absolutely sure what the alignment requirement for a given type is, is
to check the documentation for the implementation you are using.
However, sizeof(U) must be a multiple of the alignment requirement for
U, and can be used instead of the alignment requirement if you're
willing to accept that you may fail to identify some suitably aligned
pointers.
Note, in particular, that any array of T, and any struct or union
containing a member of type T, must have an alignment requirement that
is a multiple of the alignment requirement for T.

2) If you have a pointer to an object of type U, which is a member of
an aggregate object, and a pointer to type T which points into that
same object, then you can convert those two pointers to a character
type, and calculate their difference. If that difference is a multiple
of sizeof(U), the T* pointer value can be safely converted to a U*
type.

3) If sizeof(U)==1, then every pointer value is guaranteed to be
suitably aligned for conversion to a U*.
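
Case 1 can be written down directly on a C11 compiler, where `_Alignof` reports the alignment requirement that otherwise has to be looked up in the implementation's documentation. This is a sketch under that assumption; `struct record` and the macro `ALIGNED_FOR` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical example type containing a long member. */
struct record {
    char tag;
    long count;
};

/*
 * Nonzero if every pointer suitably aligned for T is also suitably
 * aligned for U, judged by their alignment requirements (C11).
 */
#define ALIGNED_FOR(T, U) (_Alignof(T) % _Alignof(U) == 0)

/* A struct containing a long is aligned at least as strictly as long. */
int record_is_aligned_for_long(void)
{
    return ALIGNED_FOR(struct record, long);
}

/* Case 3: a type of size 1 has alignment 1, so everything qualifies. */
int anything_is_aligned_for_char(void)
{
    return ALIGNED_FOR(long, unsigned char)
        && ALIGNED_FOR(struct record, unsigned char);
}
```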
 
aegis

aegis wrote:
...

There is no portable way to check an arbitrary pointer to determine
whether it's suitably aligned for conversion to an arbitrary pointer
type. That's partly because there's seldom a legitimate portable need
to perform such a conversion. If you think that you do need it, either
you're trying to do something which is intrinsically non-portable, or
there's a good chance that your code can be redesigned to
remove that need.

However, there are some special cases where it is possible to determine
that a pointer is suitably aligned:

1) If the alignment requirement for type T is an integer multiple of
the alignment requirement for type U, then every pointer of type T is
guaranteed to be suitably aligned for type U. The only way to be
absolutely sure what the alignment requirement for a given type is, is
to check the documentation for the implementation you are using.
However, sizeof(U) must be a multiple of the alignment requirement for
U, and can be used instead of the alignment requirement if you're
willing to accept that you may fail to identify some suitably aligned
pointers.
Note, in particular, that any array of T, and any struct or union
containing a member of type T, must have an alignment requirement that
is a multiple of the alignment requirement for T.

Where does the standard say that? I was under the impression
that a struct is aligned according to the alignment of its strictest
member.

2) If you have a pointer to an object of type U, which is a member of
an aggregate object, and a pointer to type T which points into that
same object, then you can convert those two pointers to a character
type, and calculate their difference. If that difference is a multiple
of sizeof(U), the T* pointer value can be safely converted to a U*
type.

You are saying this would work?

int foo[10];
int *p = &foo[8];
char *p1 = (char *)&foo[0];

(((char *)p - p1) % sizeof foo[0]) == 0

3) If sizeof(U)==1, then every pointer value is guaranteed to be
suitably aligned for conversion to a U*.

Do you have section numbers in either c89/90 or c99 standard
that either explicitly or implicitly allows the above three?

Thank you for your very informing responses!
 
Clark S. Cox III

Where does the standard say that? I was under the impression
that a struct is aligned according to the alignment of its strictest
member.

Well, if the alignments of the types are all multiples of each other, then
you are essentially correct; but remember that they are not required to
be. Imagine this situation:

- sizeof(short) == 3 and short must be aligned to 3-byte boundaries
- sizeof(int) == 4 and int must be aligned to 4 byte boundaries

struct Foo
{
short s;
int i;
};

struct Foo must be aligned to 12-byte (or some multiple thereof)
boundaries. Otherwise, it would be impossible to make an array of
(struct Foo) without violating the alignment requirements of its
members.
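
The figure of 12 is the least common multiple of the two hypothetical member alignments. A small sketch of that arithmetic, with made-up function names:

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned long align_gcd(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least common multiple of two nonzero alignment requirements. */
unsigned long align_lcm(unsigned long a, unsigned long b)
{
    return a / align_gcd(a, b) * b;
}
```

For the example above, align_lcm(3, 4) is 12. When both alignments are powers of two, the result is simply the larger of the two, for instance align_lcm(4, 8) is 8.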
 
Flash Gordon

Clark said:
Well, if the alignments of the types are all multiples of each other, then
you are essentially correct;

Not always. An implementation might decide to align all structs the same so:
struct foo {
char c1;
char c2;
};
might be aligned on an 8-byte boundary for simplicity because some other
structs require it. I can certainly see an argument for this being
done on a machine with 64-bit words that simulates 8-bit bytes by storing
the offset within the word in the high bits of the pointer.

but remember that they are not required to
be. Imagine this situation:

- sizeof(short) == 3 and short must be aligned to 3-byte boundaries
- sizeof(int) == 4 and int must be aligned to 4 byte boundaries

struct Foo
{
short s;
int i;
};

struct Foo must be aligned to 12-byte (or some multiple thereof)
boundaries. Otherwise, it would be impossible to make an array of
(struct Foo) without violating the alignment requirements of its members.

Yes, that is another reason it might be done.
 
kuyper

aegis said:
Where does the standard say that? I was under the impression
that a struct is aligned according to the alignment of its strictest
member.

a) The offset between the start of a structure and any particular
member of that structure depends only upon the structure type and the
member, it is the same for all objects of that structure type. This is
implied by the existence of the offsetof() function-like macro, which
would be useless if that offset weren't constant.

b) An array of any type T consists of a contiguous series of objects
of type T, each of which must be correctly aligned for type T.

c) Therefore, if you have an array of a given structure type, for each
member of the structure type,

(char*)&array[i+1].member_name - (char*)&array[i].member_name ==
sizeof(array[i]).

Since both instances of member_name must be correctly aligned for their
type, sizeof(array[i]) must be a multiple of the alignment requirement
of that type. Therefore, the alignment requirement for a structure must
be, simultaneously, a multiple of the alignment requirements of every
member of that structure. However, there's nothing in the standard
which requires it to be the smallest such multiple. Therefore, on a
system where type T has an alignment requirement of 3, and type U has
an alignment requirement of 5, then a struct containing members of both
of those types would have to have an alignment requirement that is a
multiple of 15.
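
Step c) can be checked on a concrete type. The sketch below uses a hypothetical `struct pair` and measures the distance between the same member in two adjacent array elements.

```c
#include <stddef.h>

/* Hypothetical structure used only for this illustration. */
struct pair {
    char c;
    long n;
};

/*
 * Returns 1 if the member n in two adjacent array elements lies exactly
 * sizeof(struct pair) bytes apart.
 */
int stride_equals_size(void)
{
    struct pair array[2];
    ptrdiff_t stride = (char *)&array[1].n - (char *)&array[0].n;

    return stride == (ptrdiff_t)sizeof array[0];
}
```

Both `array[0].n` and `array[1].n` must be suitably aligned for long, so the stride, and therefore sizeof(struct pair), has to be a multiple of long's alignment requirement.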

Now, on most real systems, all of the alignment requirements are powers
of two. In that situation, the alignment requirement of the member
with the strictest alignment requirement is a multiple of the alignment
requirement of every other member in the struct. Therefore, on such a
system, the alignment requirement for the struct itself is required to
be a multiple of the alignment requirement of its most strictly
aligned member, and is usually the same as that member's alignment
requirement.

....
You are saying this would work?

int foo[10];
int *p = &foo[8];
char *p1 = (char *)&foo[0];

(((char *)p - p1) % sizeof foo[0]) == 0

Yes.

3) If sizeof(U)==1, then every pointer value is guaranteed to be
suitably aligned for conversion to a U*.

Do you have section numbers in either c89/90 or c99 standard
that either explicitly or implicitly allows the above three?

If sizeof(U)==1, then since arrays of U are required to be contiguous,
the alignment requirement of U must also be 1. Therefore, it's not
possible to have a pointer that is misaligned for conversion to U*.

I can give section numbers for each of the assertions above; however,
it's simpler if you identify which assertions strike you as
controversial, and why.
 
aegis

aegis said:
Where does the standard say that? I was under the impression
that a struct is aligned according to the alignment of its strictest
member.

a) The offset between the start of a structure and any particular
member of that structure depends only upon the structure type and the
member, it is the same for all objects of that structure type. This is
implied by the existence of the offsetof() function-like macro, which
would be useless if that offset weren't constant.

b) An array of any type T consists of a contiguous series of objects
of type T, each of which must be correctly aligned for type T.

c) Therefore, if you have an array of a given structure type, for each
member of the structure type,

(char*)&array[i+1].member_name - (char*)&array[i].member_name ==
sizeof(array[i]).

Since both instances of member_name must be correctly aligned for their
type, sizeof(array[i]) must be a multiple of the alignment requirement
of that type. Therefore, the alignment requirement for a structure must
be, simultaneously, a multiple of the alignment requirements of every
member of that structure. However, there's nothing in the standard
which requires it to be the smallest such multiple. Therefore, on a
system where type T has an alignment requirement of 3, and type U has
an alignment requirement of 5, then a struct containing members of both
of those types would have to have an alignment requirement that is a
multiple of 15.

Now, on most real systems, all of the alignment requirements are powers
of two. In that situation, the alignment requirement of the member
with the strictest alignment requirement is a multiple of the alignment
requirement of every other member in the struct. Therefore, on such a
system, the alignment requirement for the struct itself is required to
be a multiple of the alignment requirement of its most strictly
aligned member, and is usually the same as that member's alignment
requirement.

...
You are saying this would work?

int foo[10];
int *p = &foo[8];
char *p1 = (char *)&foo[0];

(((char *)p - p1) % sizeof foo[0]) == 0

Yes.

3) If sizeof(U)==1, then every pointer value is guaranteed to be
suitably aligned for conversion to a U*.

Do you have section numbers in either c89/90 or c99 standard
that either explicitly or implicitly allows the above three?

If sizeof(U)==1, then since arrays of U are required to be contiguous,
the alignment requirement of U must also be 1. Therefore, it's not
possible to have a pointer that is misaligned for conversion to U*.

I can give section numbers for each of the assertions above; however,
it's simpler if you identify which assertions strike you as
controversial, and why.


It isn't that I find any of what you listed controversial but rather
to read as a supplement to what I have read here.
 
kuyper

aegis said:

It isn't that I find any of what you listed controversial but rather
to read as a supplement to what I have read here.

OK - here's a list of relevant sections:

Definition of alignment: 3.2p1

Requirement of contiguity of arrays: 6.2.5p20

Size of an object is an implementation-defined constant depending only
upon the type of the object: 6.2.6.1p4. Also implied by description of
sizeof operator: 6.5.3.4p2

Alignment of structure members: 6.7.2.1p12

Description of offsetof() implies that it is an implementation-specific
constant depending only upon the struct type and member identifier:
7.17p3. There really should be a more direct statement of that fact in
6.7.2.1, but I couldn't find it.
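
A short sketch of what the offsetof() point means in practice: for any object of the struct type, the member sits at the same byte offset from the start of the object. `struct sample` and the function name are hypothetical.

```c
#include <stddef.h>

/* Hypothetical structure used only for this illustration. */
struct sample {
    char c;
    int i;
};

/*
 * Returns 1 if member i of *s sits offsetof(struct sample, i) bytes
 * past the start of *s.
 */
int member_at_offset(const struct sample *s)
{
    const char *start = (const char *)s;
    const char *member = (const char *)&s->i;

    return (size_t)(member - start) == offsetof(struct sample, i);
}
```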
 
