Reading Struct not Located at Four-boundary

thomas

Hi,

I have a struct A(undefined, can be any form) located at memory
pointed by "char *ptr".
I want to read it with "(A*)ptr".

Now I wonder if the position of ptr may affect the behavior when
accessing the struct.

Consider the following condition:
-----------------------------------------------
_ _ _ _ | _ _ _ _ |
0 1 2 3 4 5 6 7
position of pointer "ptr" = 2.

If the first element of struct A is an int-type one, it will span
positions 2~5.
The CPU will load 0~3, and then 4~7 to get the data of the int-type
member.
But we C++ programmers don't need to care about the position of ptr, right?

(I remember some accesses to unbounded memory positions may get system
crash, but I cannot remember in which case. A little confused.)
 
thomas

Accessing unaligned data is undefined behavior (UB); depending on the
platform, it may crash or produce wrong results. On some platforms
(notably x86) it may work, but with a performance penalty.

For portable code, one should memcpy the data into a suitably aligned
buffer before casting it to int*.
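
To make the memcpy suggestion concrete, here is a minimal sketch. The names Record, read_unaligned and round_trip_at_offset_2 are made up for illustration (they are not from the thread), and the approach assumes the struct is trivially copyable:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical record type, standing in for the "struct A" of the question.
struct Record {
    int   id;
    short flags;
};

// Copy sizeof(T) bytes from a possibly misaligned address into a properly
// aligned local object. Only valid for trivially copyable types.
template <typename T>
T read_unaligned(const char* ptr) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only safe for trivially copyable types");
    assert(ptr != nullptr);
    T value;
    std::memcpy(&value, ptr, sizeof(T));
    return value;
}

// Store a Record at byte offset 2 of a buffer (the situation described in
// the question), then read it back without ever dereferencing a Record*
// that points at the misaligned address.
Record round_trip_at_offset_2() {
    alignas(Record) char buffer[2 + sizeof(Record)] = {};
    const Record original{42, 7};
    std::memcpy(buffer + 2, &original, sizeof original);
    return read_unaligned<Record>(buffer + 2);
}
```

Calling round_trip_at_offset_2() should give back a Record with id 42 and flags 7. Compilers typically turn such a small fixed-size memcpy into ordinary loads where the hardware allows it, so the copy is usually cheap.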


We should care that our programs work (not only at the moment, but also
on other systems and 20 years from now).

Regards
Paavo

OK.. One thing I want to make sure is whether "new/malloc" will
guarantee that the address of the allocated memory will be aligned
properly?
 
Francesco S. Carta

OK.. One thing I want to make sure is whether "new/malloc" will
guarantee that the address of the allocated memory will be aligned
properly?

I don't know about "malloc", but for "new" the answer is yes, it will be
properly allocated for the type with which "new" is called.

You can't obviously allocate via "new" for a type then cast the result
to another type and finally expect it to be correctly aligned for the
latter type: some cases will work, some others won't, but no guarantee
there - if I recall correctly.
 
Francesco S. Carta

I don't know about "malloc", but for "new" the answer is yes, it will be
properly allocated for the type with which "new" is called.

(the above should read "properly _aligned_ for" etcetera)
 
Öö Tiib

OK.. One thing I want to make sure is whether "new/malloc" will
guarantee that the address of the allocated memory will be aligned
properly?

Allocation functions return pointers to storage that is appropriately
aligned for objects of any type. You can provide your own allocation
functions, then these are also assumed to do the same.
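
A small sketch of how one might check this on a given implementation. The helper names is_aligned and operator_new_is_max_aligned are hypothetical, and the request is deliberately made as large as std::max_align_t so that the strictest fundamental alignment is the one that applies:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// True if the address p is a multiple of the given alignment.
bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Ask the global operator new for raw storage big enough to hold a
// std::max_align_t and report whether the returned address satisfies the
// strictest fundamental alignment.
bool operator_new_is_max_aligned() {
    void* raw = ::operator new(sizeof(std::max_align_t));
    const bool ok = is_aligned(raw, alignof(std::max_align_t));
    ::operator delete(raw);
    return ok;
}
```

Note that the allocation function is never told which type will live in the storage; it only receives a byte count.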
 
Francesco S. Carta

Allocation functions return pointers to storage that is appropriately
aligned for objects of any type.

I'm not sure I understand this correctly, and if I do, I was not aware
of this: does that mean that "new" always allocates using the stricter
alignment requirements for any possible type _regardless_ of the type it
is called with?

I strongly suspect I misunderstood your statement.
 
thomas

Allocation functions return pointers to storage that is appropriately
aligned for objects of any type. You can provide your own allocation
functions, then these are also assumed to do the same.

But if I use placement new, I can specify the address of a struct (may
be not properly aligned).
It's programmers' own risk to do this right?
 
Ian Collins

But if I use placement new, I can specify the address of a struct (may
be not properly aligned).
It's programmers' own risk to do this right?

Very much so.
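
One way to take on that responsibility is to let std::align pick a suitable address inside the raw buffer before using placement new. This is only a sketch: Sample and construct_in_raw_buffer are hypothetical names, and the buffer is oversized on purpose so that an aligned slot always fits.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical type with a non-trivial alignment requirement.
struct Sample {
    double value;
    int    count;
};

// Start from an address with no alignment guarantee, let std::align advance
// it to the next address suitable for Sample, and only then construct the
// object with placement new.
double construct_in_raw_buffer() {
    unsigned char raw[2 * sizeof(Sample) + alignof(Sample)];
    void* cursor = raw + 1;                 // may or may not be aligned
    std::size_t space = sizeof raw - 1;     // bytes available from cursor

    void* aligned = std::align(alignof(Sample), sizeof(Sample), cursor, space);
    assert(aligned != nullptr);             // buffer was sized to make this hold

    Sample* s = new (aligned) Sample{1.5, 3};
    const double result = s->value * s->count;
    s->~Sample();                           // placement new needs a manual destructor call
    return result;
}
```

Declaring the buffer as alignas(Sample) unsigned char raw[sizeof(Sample)] and constructing at its start would be the simpler choice when you control the buffer's declaration.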
 
Ian Collins

I'm not sure I understand this correctly, and if I do, I was not aware
of this: does that mean that "new" always allocates using the stricter
alignment requirements for any possible type _regardless_ of the type it
is called with?

operator new only knows the size of an object, not its type.
 
Francesco S. Carta

operator new only knows the size of an object, not its type.

OK, for the matter of correct alignment knowing the size is enough.

My question, somewhat, still stands, but I'm pretty sure I completely
misunderstood the original statement so never mind, sorry for the
additional noise.
 
Ian Collins

OK, for the matter of correct alignment knowing the size is enough.

My question, somewhat, still stands, but I'm pretty sure I completely
misunderstood the original statement so never mind, sorry for the
additional noise.

No, for alignment, the size is irrelevant. As Öö Tiib said, allocation
functions return pointers to storage that is appropriately aligned for
objects of any type.

See 5.3.4 New, paragraphs 10 and 12.
 
Francesco S. Carta

No, for alignment, the size is irrelevant. As Öö Tiib said, allocation
functions return pointers to storage that is appropriately aligned for
objects of any type.

See 5.3.4 New, paragraphs 10 and 12.

Thanks for the reference, after reading it I think I understand how it
works, but now your first sentence above seems to clash with that
understanding.

For example, if I write "new char", the alignment requirements are
different from, say, "new long", how come that the size does not play
into this?

As I understand it now, the size of the requested type plays a
fundamental role in order to decide the minimal alignment requirements
to respect - exception made for char arrays and unsigned char arrays,
where the minimal requirement amounts to the largest possible type
fitting into that array.

What am I missing now?
 
Goran Pusic

Hi,

I have a struct A(undefined, can be any form) located at memory
pointed by "char *ptr".

Take a good look at your design: I bet you that type is not
"undefined", you have a finite set of types, don't you? and so, your
char* is actually

union meh
{
    TYPE1* p1;
    TYPE2* p2;
    TYPE3* p3;
    // etc.
};

(and void* is better than char*).

(But that is tangential).
I want to read it with "(A*)ptr".

Now I wonder if the position of ptr may affect the behavior when
accessing the struct.

Consider the following condition:

Is it possible that your actual code is e.g.:

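void parseBinary(void* in)
{
    // uint16, TYPE1, TYPE2, T1, T2 and use() are placeholders defined elsewhere.
    // The payload starts right after the 16-bit type tag.
    char* payload = static_cast<char*>(in) + sizeof(uint16);
    switch (*reinterpret_cast<uint16*>(in))
    {
    case T1:
    {
        TYPE1* p = reinterpret_cast<TYPE1*>(payload);
        use(*p);
        break;
    }
    case T2:
    {
        TYPE2* p = reinterpret_cast<TYPE2*>(payload);
        use(*p);
        break;
    }
    }
}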

If it is, say so! ;-)

As noted here, if this data is e.g. coming over the wire, you must
know whether your hardware platform supports alignment of raw data as
it is on the wire, and whether you can create appropriate packing with
the compiler (normally yes, but that's compiler-specific). If hardware
supports unaligned data, you can simply cast and avoid some copying.
If not, you must write appropriate conversion routines to convert raw
primitive types hidden in the "wire format" and copy that to your
actual data types.
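
A minimal sketch of such conversion routines, assuming a made-up wire layout of a big-endian 16-bit tag followed by a big-endian 32-bit value with no padding. Message, kWireSize, load_be16, load_be32 and decode are hypothetical names, not part of any real protocol:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory form of the message. The compiler is free to pad
// this struct; the wire format below is not affected by that.
struct Message {
    std::uint16_t tag;
    std::uint32_t value;
};

// Assumed wire size: 2 bytes of tag + 4 bytes of value, no padding.
constexpr std::size_t kWireSize = 6;

// Assemble a big-endian 16-bit value one byte at a time.
std::uint16_t load_be16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Assemble a big-endian 32-bit value one byte at a time.
std::uint32_t load_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Convert raw wire bytes into a Message. Only single bytes are read, so the
// alignment of the input buffer does not matter.
Message decode(const unsigned char* wire, std::size_t length) {
    assert(wire != nullptr && length >= kWireSize);
    Message m;
    m.tag   = load_be16(wire);
    m.value = load_be32(wire + 2);
    return m;
}
```

Because the routines never form a pointer to a multi-byte type inside the buffer, they sidestep both alignment and byte-order differences between the sender and the receiver.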

Goran.
 
Francesco S. Carta

Thanks for the reference, after reading it I think I understand how it
works, but now your first sentence above seems to clash with that
understanding.

For example, if I write "new char", the alignment requirements are
different from, say, "new long", how come that the size does not play
into this?

As I understand it now, the size of the requested type plays a
fundamental role in order to decide the minimal alignment requirements
to respect - exception made for char arrays and unsigned char arrays,
where the minimal requirement amounts to the largest possible type
fitting into that array.

Keep in mind that "new char" and operator new(1) are two different
things. "new char" calls operator new to get memory, then constructs the
char object (which in the case of char does nothing). operator new
allocates the requested amount of memory and returns a void* that points
to it. It doesn't know the type that you're allocating.

The requested size isn't a reliable guide to a type's alignment. For
example:

struct a
{
    double d;
};

struct b
{
    char data[sizeof(double)];
};

Both structs are typically the same size, but the first might have
alignment requirements that are different from those of the second. Char
arrays usually have no alignment restrictions, while some architectures
require doubles to be aligned on 8-byte boundaries.

I believe you're sincerely trying to help me wrap my head around this,
but you're confusing me even more.

The standard says, in [expr.new] 5.3.4p10, that char arrays do have such
restrictions, but it says that related to char arrays created using
"new", that would mean that "new char[sizeof(double)]" should return an
address properly aligned for any object of the same size of a double.

Are you saying that the requirements are different for automatic and
static char arrays as opposed to dynamically created ones?
 
Francesco S. Carta

On 2010-08-20 08:04:43 -0400, Francesco S. Carta said:


Thanks for the reference, after reading it I think I understand how it
works, but now your first sentence above seems to clash with that
understanding.

For example, if I write "new char", the alignment requirements are
different from, say, "new long", how come that the size does not play
into this?

As I understand it now, the size of the requested type plays a
fundamental role in order to decide the minimal alignment requirements
to respect - exception made for char arrays and unsigned char arrays,
where the minimal requirement amounts to the largest possible type
fitting into that array.


Keep in mind that "new char" and operator new(1) are two different
things. "new char" calls operator new to get memory, then constructs the
char object (which in the case of char does nothing). operator new
allocates the requested amount of memory and returns a void* that points
to it. It doesn't know the type that you're allocating.

The requested size isn't a reliable guide to a type's alignment. For
example:

struct a
{
    double d;
};

struct b
{
    char data[sizeof(double)];
};

Both structs are typically the same size, but the first might have
alignment requirements that are different from those of the second. Char
arrays usually have no alignment restrictions, while some architectures
require doubles to be aligned on 8-byte boundaries.

I believe you're sincerely trying to help me wrap my head around this,
but you're confusing me even more.

The standard says, in [expr.new] 5.3.4p10, that char arrays do have
such restrictions, but it says that related to char arrays created
using "new", that would mean that "new char[sizeof(double)]" should
return an address properly aligned for any object of the same size of
a double.

Are you saying that the requirements are different for automatic and
static char arrays as opposed to dynamically created ones?

Sorry, I made a bit of a muddle here. The key point is that alignment is
defined by the hardware; whatever the standard says about alignment is
pretty much handwaving, to avoid putting unnecessary restrictions on
implementations.

No problem, the important thing is that the pieces are lining up neatly
in my mind, now.
Those words in [expr.new] try to put some logic into what typically
hasn't been explicitly laid out in the past. The underlying principle is
that the size of the requested allocation might eliminate some types,
because they're too large to fit in the requested space; in that case,
the alignment of the returned pointer doesn't have to be appropriate for
those types, because the memory block is too small to hold them. For
example, the result of calling operator new(4) doesn't have to be
aligned appropriately for an 8-byte double, only for types that occupy
four bytes or less.

Because of the special role of char, I shouldn't have used it in my
example. Change the second struct to:

struct b
{
unsigned short data[sizeof(double) / sizeof(unsigned short)];
};

(and assume that the size of double is 8 bytes, and that the size of
unsigned short is 4 or 2, which are typical).

So both structs are the same size, and from a hardware perspective, both
structs can have different alignment requirements. But since operator
new has been asked for 8 bytes and doesn't have any more information, it
has to return a pointer that will work for either of those types (as
well as all other 8-byte types, but that's just a difference in degree,
not in kind).
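
The difference between size and alignment can be inspected directly with sizeof and alignof. In this sketch the two structs are renamed WithDouble and WithShorts (hypothetical names) to keep them apart, and the helper functions exist only for illustration; the actual numbers are implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

// Same shapes as the two structs discussed above.
struct WithDouble {
    double d;
};

struct WithShorts {
    unsigned short data[sizeof(double) / sizeof(unsigned short)];
};

// On typical implementations both structs occupy the same number of bytes.
bool same_size() {
    return sizeof(WithDouble) == sizeof(WithShorts);
}

// How many times stricter the double-based struct's alignment is.
// Typically greater than 1, e.g. 8 / 2 on common 64-bit platforms.
std::size_t alignment_ratio() {
    assert(alignof(WithShorts) != 0);
    return alignof(WithDouble) / alignof(WithShorts);
}
```

If same_size() is true while alignment_ratio() is greater than 1, the byte count alone cannot tell an allocator which of the two alignments is needed.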

OK, let's see if I got this straight. Since the size of those two
structs is the same, they will both fit correctly aligned in a space
allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
that matter).

But "new short[sizeof(double) / sizeof(short)]", alone and by itself,
has to satisfy the alignment requirements of "short", regardless of the
size of the array itself.

So in some sense, both the type and the size of that type do play an
important role in the alignment of the address returned by dynamic
allocators - although that role is implementation defined within the
limits mandated by the standard.

Thanks a lot for your explanations Pete.
 
James Kanze

I have a struct A(undefined, can be any form) located at memory
pointed by "char *ptr".

Where did you get the char* from?
I want to read it with "(A*)ptr".

If the pointer was originally an A*, then it should work.
Otherwise, you're looking for problems.

Note that you should be using reinterpret_cast here, so that
future readers of your code know that you're playing with fire.
Now I wonder if the position of ptr may affect the behavior when
accessing the struct.
Consider the following condition:
If the first element of struct A is an int-type one, it will span
positions 2~5.
The CPU will load 0~3, and then 4~7 to get the data of the int-type
member.
But we C++ programmers don't need to care about the position of ptr, right?

If the pointer initially came from an A*, it won't have position
2.

How did the data get into this array? If the data is from some
external source, alignment isn't the only problem you have to
worry about.
(I remember some accesses to unbounded memory positions may
get system crash, but I cannot remember in which case.
A little confused.)

Out of bounds or misaligned are undefined behavior, and can
cause your program to crash.
 
James Kanze

I don't know about "malloc", but for "new" the answer is yes, it will be
properly allocated for the type with which "new" is called.

If he was comparing to malloc, he probably meant the operator
new function. Both malloc and the operator new function are
guaranteed to return memory suitably aligned for any object.
You can't obviously allocate via "new" for a type then cast
the result to another type and finally expect it to be
correctly aligned for the latter type: some cases will work,
some others won't, but no guarantee there - if I recall
correctly.

If you use a new expression to allocate, then go casting to some
unrelated type, you don't have any guarantees, except that you
can reinterpret_cast to char* and get a byte dump of the object
(and you can reinterpret_cast the char* back to the original
type, and use that pointer).
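
A short sketch of the two things that are allowed: viewing an object as raw bytes, and casting the byte pointer back to the original type. Point, hex_dump and round_trip_x are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical type used for the illustration.
struct Point {
    int x;
    int y;
};

// Produce a hex dump of an object's bytes by reading it through an
// unsigned char pointer, which is permitted for any object.
std::string hex_dump(const void* object, std::size_t size) {
    assert(object != nullptr);
    static const char digits[] = "0123456789abcdef";
    const unsigned char* bytes = static_cast<const unsigned char*>(object);
    std::string out;
    for (std::size_t i = 0; i < size; ++i) {
        out += digits[bytes[i] >> 4];
        out += digits[bytes[i] & 0x0f];
    }
    return out;
}

// Cast a Point* to char* and back to Point*. The round-tripped pointer
// refers to the original object and can be used normally.
int round_trip_x() {
    Point* original = new Point{3, 4};
    char*  as_bytes = reinterpret_cast<char*>(original);
    Point* back     = reinterpret_cast<Point*>(as_bytes);
    const int x = back->x;
    delete original;
    return x;
}
```

Casting as_bytes to some unrelated type instead of back to Point* is where the guarantees end.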
 
Francesco S. Carta

If he was comparing to malloc, he probably meant the operator
new function. Both malloc and the operator new function are
guaranteed to return memory suitably aligned for any object.

Suitably aligned for any object that fits into the allocated space, as I
seem to have understood, so far, from the other branches of this thread.

In any case, I was surely overlooking (once more) the difference between
using plain "new" and calling "the operator new function". How were they
commonly referred to? "new expression" and "operator new"? I never
recall this correctly and I have to resort to some weird periphrases
that don't really fit well.
If you use a new expression to allocate, then go casting to some
unrelated type, you don't have any guarantees, except that you
can reinterpret_cast to char* and get a byte dump of the object
(and you can reinterpret_cast the char* back to the original
type, and use that pointer).

Yep, I recalled correctly, "no guarantee there".
 
Ian Collins

OK, let's see if I got this straight. Since the size of those two
structs is the same, they will both fit correctly aligned in a space
allocated via "new char[sizeof(a)]" (or via "new char[sizeof(b)]", for
that matter).

I think you are still missing the point that size and alignment are
orthogonal.
But "new short[sizeof(double) / sizeof(short)]", alone and by itself,
has to satisfy the alignment requirements of "short", regardless of the
size of the array itself.

So in some sense, both the type and the size of that type do play an
important role in the alignment of the address returned by dynamic
allocators - although that role is implementation defined within the
limits mandated by the standard.

No, the size is still irrelevant to the guarantee. Whether you are
allocating a double or an array of sizeof(double) char or a single char,
the alignment of the pointer will be the same. The pointer returned
will always be appropriately aligned for objects of any type.
 
