char data[0]

Andrey Tarasevich

jmcgill said:
Take deep-copy semantics into account early in your design.

Frivolous and trigger-happy introduction of absolutely unnecessary and
unnatural levels of indirection in data structures leads to design
errors, especially if done early in the design.
 
Guest

Keith said:
No. The standard does say that

All pointers to structure types shall have the same representation
and alignment requirements as each other.

but that refers to the alignment of pointer objects, not to the
alignment of structure objects.

That was actually in part what I was thinking of, but for a different
reason. How can pointers to structure types with different alignment
requirements have the same representation? The standard doesn't say the
corresponding signed and unsigned integer types have the same
representation, so I find it hard to believe it's meant to say that
only the representations for values valid for both types are the same.
It does say the representation of all integer types in #if expressions
are the same as that of intmax_t or uintmax_t, and for that, it's clear
that it refers to which values can be held by the types.
For example, this type:

struct foo {
char c;
};

can reasonably have a size and alignment of 1 byte.

Let's say that it does, and that the size and alignment of int, or a
structure containing only an int, is 2 bytes.

#include <stdio.h>
#include <string.h>

struct c {
    char c;
};
struct i {
    int i;
};

int main(void) {
    struct c c;
    struct c *pc1 = &c;
    struct c *pc2 = &c + 1;
    /* if struct i is allowed to have, and has, stricter alignment
       requirements than struct c, either pc1 is not properly aligned
       for it, or pc2 is not */

    struct i *pi1, *pi2;

    memcpy(&pi1, &pc1, sizeof pc1);
    memcpy(&pi2, &pc2, sizeof pc2);
    printf("%p %p\n", (void *) pi1, (void *) pi2);
    /* 1 */

    pi1 = (struct i *) pc1;
    pi2 = (struct i *) pc2;
    /* 2 */
    printf("%p %p\n", (void *) pi1, (void *) pi2);
}

At /*1*/, has there been undefined behaviour? If not, how about at
/*2*/? I believe the answers with the given alignment requirements are
no for /*1*/, and yes for /*2*/, but if that's the case, what's the
benefit of making /*2*/ undefined in the first place?
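
For anyone who wants to avoid the conversion at /*2*/ when the target alignment is not met, here is a minimal sketch. It assumes C11 (_Alignof) and that converting a pointer to uintptr_t yields the numeric address, which is implementation-defined though common. The helper names is_aligned and to_struct_i are hypothetical, not something from the posts above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct c { char c; };
struct i { int i; };

/* Returns 1 if the address held in p is a multiple of align.
   Assumption: the uintptr_t conversion gives the numeric address. */
int is_aligned(const void *p, size_t align)
{
    assert(align != 0);
    return (uintptr_t)p % align == 0;
}

/* Converts only when the alignment of struct i is satisfied;
   otherwise returns NULL instead of performing a conversion
   whose behaviour would be undefined. */
struct i *to_struct_i(struct c *pc)
{
    if (!is_aligned(pc, _Alignof(struct i)))
        return NULL;
    return (struct i *)pc;
}
```

With the 1-byte and 2-byte alignments assumed above, one of to_struct_i(pc1) and to_struct_i(pc2) would come back NULL, since the two addresses differ by exactly one byte.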
 
CBFalconer

Harald said:
.... snip ...

All structs are required to have the same alignment, right? Since
the initial member of any struct can be of any type, isn't it then
required that any struct is aligned properly for any type? And if
that is, isn't one past the end of any struct required to be
aligned properly for any type as well, in order for arrays of
structs to work?

(I know I'm probably overlooking something, but I don't know what.)

Using the struct hack:

struct foo {
T bar;
char foobar[1];
};

"sizeof (struct foo)" will reflect that alignment and will have to
make anything assigned after the struct suitably aligned for a T.
However offsetof(struct foo, foobar) will not. Thus you can't use
the foobar component for things that may require anything other
than char alignment. This doesn't apply to the C99 use of "char
foobar[]".
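
A rough sketch of the difference between the two figures, assuming C11 for _Alignof. Here int stands in for T, and offset_suits is a hypothetical helper; neither comes from the post above.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
    int bar;          /* stands in for T; int is an assumption */
    char foobar[1];   /* struct hack: start of a longer buffer */
};

/* Returns 1 if an object with the given alignment could start at
   the given byte offset from a suitably aligned base address,
   such as the result of malloc(). */
int offset_suits(size_t offset, size_t align)
{
    assert(align != 0);
    return offset % align == 0;
}
```

offset_suits(sizeof(struct foo), _Alignof(int)) always holds, because sizeof includes any trailing padding. offset_suits(offsetof(struct foo, foobar), _Alignof(double)) is likely to fail on an implementation with 4-byte int and 8-byte-aligned double, which is the case the post warns about.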

--
Some informative links:
<http://www.geocities.com/nnqweb/>
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/>
 
Bill Reid

Keith Thompson said:
Bill Reid said:
Keith Thompson said:
Of course, you can allocate a "data" pointer contiguously in a struct
in any event, right?

typedef struct {
unsigned data_type;
unsigned data_size;
unsigned *data;
} contiguous_data_struct;

contiguous_data_struct *my_contiguous_data_struct;

void create_struct(unsigned data_type,unsigned data_size) {
my_contiguous_data_struct=
malloc(sizeof(contiguous_data_struct)+(data_size*sizeof(unsigned));
my_contiguous_data_struct->data_type=data_type;
my_contiguous_data_struct->data_size=data_size;
my_contiguous_data_struct->data=
my_contiguous_data_struct+sizeof(contiguous_data_struct);
}

And now you may write the data to my_contiguous_data_struct->data(++).
[...]

I see two problems with this.
Actually, as it turns out, there were at least a couple more, but
who's counting?
First, there's no guarantee that
my_contiguous_data_struct+sizeof(contiguous_data_struct)

s/be (void *)my_contiguous_data_struct+1
is properly aligned.

It took me a while to realize that "s/be" meant "should be". It's
worth the effort to use whole words.
THAT'S a pretty common usage...
And it should really be (void*)(my_contiguous_data_struct+1);
otherwise you're adding 1 to a value of type void*, which is illegal
(but gcc will accept it with no warning by default -- another reason
why that extension is a bad idea).
Ah yes, in another post my actual working code was listed as:

(*csv_efileb)->buffer=(void *)(*csv_efileb+1);

when dealing with a pointer to a pointer passed as an argument...
It could really happen. I'll construct an example similar to what you
wrote above:
Well, in the interim I answered my own question, and more. The
issue for most systems today is "self-alignment" for all types, and the
bottom line as I take it is that you can't have a larger size for your
"data" than the largest size in the struct. (If you had a couple of
unsigned chars in the struct, and a "data buffer" of doubles, you'd
be hosed.) Soooo, I'm quite sure the following would create a
big mess on my machine (though all the examples I've given would
work on most all machines):
#include <stdlib.h>

struct contiguous_data {
unsigned data_type;
unsigned data_size;
double *data;
};
Yeah, your data is bigger than the biggest size in the struct itself.
NG (that means "No(t) Good" or "No Go").
struct contiguous_data *create_struct(unsigned data_type, unsigned data_size)
{
struct contiguous_data *result
= malloc(sizeof(struct contiguous_data) +
data_size * sizeof(double));
if (result == NULL) {
return NULL;
}
result->data_type = data_type;
result->data_size = data_size;
result->data = (double*)(result + 1);
return result;
}

Suppose types unsigned and (double*) are 4 bytes, requiring 4-byte
alignment, and double is 8 bytes, requiring 8-byte alignment.
Assuming struct contiguous_data has no gaps, its size is 12 bytes;
let's say it requires 4-byte alignment. And suppose we call
create_struct() with data_size == 2.

Then create_struct() will malloc() 12 + 2*8 bytes, or 28 bytes. The
base address of the malloc()ed block is guaranteed to be properly
aligned for any type. We treat the first 12 bytes as a struct
contiguous_data object, which is fine. We then treat the last 16
bytes, starting at offset 12, as an array of 2 doubles -- but since 12
is not a multiple of 8, it's not properly aligned to hold doubles.
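
If a single allocation is still wanted, one possible workaround is to pad the header size up to the alignment of double before placing the data. This is only a sketch, assuming C11 (_Alignof); round_up and create_struct_aligned are hypothetical names, not part of the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct contiguous_data {
    unsigned data_type;
    unsigned data_size;
    double *data;
};

/* Rounds n up to the next multiple of align (a power of two). */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

struct contiguous_data *create_struct_aligned(unsigned data_type,
                                              unsigned data_size)
{
    /* Header size padded so the data starts on a double boundary. */
    size_t header = round_up(sizeof(struct contiguous_data),
                             _Alignof(double));
    unsigned char *block =
        malloc(header + (size_t)data_size * sizeof(double));
    struct contiguous_data *result;

    if (block == NULL) {
        return NULL;
    }
    result = (struct contiguous_data *)block;
    result->data_type = data_type;
    result->data_size = data_size;
    result->data = (double *)(block + header);
    return result;
}
```

With the sizes supposed above, round_up(12, 8) gives 16, so the two doubles would start at offset 16 and the block would be 32 bytes instead of 28.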

Misalignment might be less likely in your original example, but
it's still possible.
As I take it, if you are dealing with chars as "data", you're pretty
much OK in like 99% of the cases...
If you used the "struct hack", you'd declare:

struct contiguous_data {
unsigned data_type;
unsigned data_size;
double data[1];
};

(or "double data[];" if you use a C99 flexible array member).

Well, as it turns out, I learned many things today, and one of them
is that my compiler, although ostensibly NOT "C99" compliant, offers
the "flexible array member" (SHOULD BE: "indeterminate array as the
last member of a struct") feature "as a special extension to the ANSI
standard". I should have known, because they have all kinds of goofy
stuff like that in there, but I first experimented by changing some working
code as follows:

typedef struct {
unsigned type;
unsigned long size;
unsigned cols;
unsigned rows;
char *buffer;
} CSV_EFILEB;

to

typedef struct {
unsigned type;
unsigned long size;
unsigned cols;
unsigned rows;
char buffer[];
} CSV_EFILEB;

And VOILA (that's like Italian or something, sorry), the whole thing
worked slicker than a banana slug trail!
The
compiler knows the required alignment of type double, so it inserts
whatever padding is necessary. (We dropped the pointer, so it happens
to be aligned anyway, but we could easily have an example where
padding is necessary.)
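
A minimal sketch of the flexible array member version, assuming a C99 compiler. The set_item helper is hypothetical and only there to show a bounds-checked store.

```c
#include <assert.h>
#include <stdlib.h>

struct contiguous_data {
    unsigned data_type;
    unsigned data_size;
    double data[];   /* C99 flexible array member; the compiler
                        inserts any padding needed before it */
};

struct contiguous_data *create_struct(unsigned data_type,
                                      unsigned data_size)
{
    /* One block: the header plus data_size trailing doubles. */
    struct contiguous_data *result =
        malloc(sizeof *result
               + (size_t)data_size * sizeof result->data[0]);
    if (result == NULL) {
        return NULL;
    }
    result->data_type = data_type;
    result->data_size = data_size;
    return result;
}

/* Hypothetical helper: store v at index i, checking the bound. */
void set_item(struct contiguous_data *s, unsigned i, double v)
{
    assert(i < s->data_size);
    s->data[i] = v;
}
```

A single free() on the returned pointer releases both the header and the data.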
Yeah, but for the sake of pure expediency and correctness, I guess
I should just use "indeterminate arrays as a final member of struct",
aside from any potential "backwards-compatibility" issues...
In your example, you placed the follow-on data manually without
allowing for alignment issues. The compiler didn't have a chance to
align it properly.


Sure, but the struct hack is more convenient, even if it's of somewhat
questionable validity.
Yes, it does...and I'll be experimenting a little more to see just how
"flexible" it really is...stuff like arrays of void pointers to be cast into
different struct types, you know...
Maybe. Probably.

Question 2.6 in the FAQ says:

Despite its popularity, the technique is also somewhat notorious:
Dennis Ritchie has called it ``unwarranted chumminess with the C
implementation,'' and an official interpretation has deemed that
it is not strictly conforming with the C Standard, although it
does seem to work under all known implementations. (Compilers
which check array bounds carefully might issue warnings.)
I think they also allowed possible "pathological" POTENTIAL
alignment issues, though again, not with type char as "data"...
If you don't trust the struct hack, you can always just allocate the
data separately:

#include <stdlib.h>

struct contiguous_data {
unsigned data_type;
unsigned data_size;
double *data;
};

struct contiguous_data *create_struct(unsigned data_type, unsigned data_size)
{
struct contiguous_data *result = malloc(sizeof *result);
if (result == NULL) {
return NULL;
}
result->data_type = data_type;
result->data_size = data_size;
result->data = malloc(data_size * sizeof(double));
if (result->data == NULL) {
free(result);
return NULL;
}
return result;
}

This will require two calls to free() to deallocate the allocated
memory.
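
The matching cleanup could be wrapped in one function so callers do not have to remember both calls. A small sketch; destroy_struct is a hypothetical name that does not appear in the code above.

```c
#include <stdlib.h>

struct contiguous_data {
    unsigned data_type;
    unsigned data_size;
    double *data;
};

/* Releases the separately allocated data first, then the struct.
   Passing NULL is allowed and does nothing. */
void destroy_struct(struct contiguous_data *p)
{
    if (p == NULL) {
        return;
    }
    free(p->data);   /* free(NULL) is a no-op, so a NULL data is fine */
    free(p);
}
```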
That's SOOOOOOO much work...
 
Bill Reid

Andrey Tarasevich said:
No. Why? The only thing that is required is that each given struct is
properly aligned for its own and only its own members. The alignment
requirement of the struct itself is indeed determined by its first
member, since other members can be aligned independently by introducing
internal padding.
As I learned the hard way (reading the FAQ) last night, structs are
actually "self-aligned" (translation: "start at") "by their most restrictive
member", meaning the largest type in the struct (or this is how I
understand it).

Of course, the way you say it actually makes more sense, so maybe
I'm just confused AGAIN...on the other hand, the compiler COULD
easily adjust all the internal struct padding to conform to the above
"self-alignment" rule for any given struct, so maybe THAT makes
more sense...
 
Chris Dollin

Bill said:
As I learned the hard way (reading the FAQ) last night, structs are
actually "self-aligned" (translation: "start at") "by their most restrictive
member", meaning the largest type in the struct (or this is how I
understand it).

Which bit of the FAQ reads like that?
 
Bill Reid

Chris Dollin said:
Which bit of the FAQ reads like that?
2.12, "Additional Links"->Eric Raymond post:

On modern 32-bit machines like the SPARC or the Intel [34]86, or
any Motorola chip from the 68020 up, each data item must usually be
``self-aligned'', beginning on an address that is a multiple of its type
size.
Thus, 32-bit types must begin on a 32-bit boundary, 16-bit types on a
16-bit boundary, 8-bit types may begin anywhere, struct/array/union
types have the alignment of their most restrictive member.

---end of excerpt

So as I take it, compilers on most 32-bit machines today set up
struct alignment based on the alignment of "their most restrictive member".
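
A quick way to see what a given compiler does, as a sketch assuming C11 (_Alignof). The struct mixed and both function names are hypothetical. The checks only assert what has to hold for the members to be addressable at fixed offsets. Whether the struct's alignment is exactly that of its strictest member is up to the implementation.

```c
#include <assert.h>
#include <stddef.h>

struct mixed {        /* hypothetical example struct */
    char tag;
    double value;
    short count;
};

/* Largest alignment among the member types of struct mixed. */
size_t strictest_member_alignment(void)
{
    size_t a = _Alignof(char);
    if (_Alignof(double) > a) a = _Alignof(double);
    if (_Alignof(short) > a) a = _Alignof(short);
    return a;
}

void check_mixed_layout(void)
{
    /* The struct is at least as strictly aligned as any member. */
    assert(_Alignof(struct mixed) >= strictest_member_alignment());
    /* Its size is a multiple of its alignment, so arrays work. */
    assert(sizeof(struct mixed) % _Alignof(struct mixed) == 0);
    /* Internal padding puts value on a double boundary. */
    assert(offsetof(struct mixed, value) % _Alignof(double) == 0);
}
```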

Discuss among yourselves...currently, I'm still wondering how to
set up a "contiguous data struct" with the idea of casting the "data"
to any number of different types, something I've wondered about
for a long time...
 
pete

Bill said:
Chris Dollin said:
Which bit of the FAQ reads like that?
2.12, "Additional Links"->Eric Raymond post:

On modern 32-bit machines like the SPARC or the Intel [34]86, or
any Motorola chip from the 68020 up, each data item must usually be
``self-aligned'', beginning on an address that is a multiple of its type
size.
Thus, 32-bit types must begin on a 32-bit boundary, 16-bit types on a
16-bit boundary, 8-bit types may begin anywhere, struct/array/union
types have the alignment of their most restrictive member.

---end of excerpt

So as I take it, compilers on most 32-bit machines today set up
struct alignment based on the alignment of "their most restrictive member".

Padding bytes enable member alignment
to be independent of struct alignment,
except for the first member.
 
Guest

pete said:
Padding bytes enable member alignment
to be independent of struct alignment,
except for the first member.

Well, the member must be aligned, it must be at a constant offset from
the start of the structure, and the result of malloc() must both be
properly aligned for the member type as well as for the whole struct.
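
The constant-offset point can be checked with a little arithmetic. The sketch below assumes C11 (_Alignof); struct xy and valid_base_residues are hypothetical names. It counts, for each possible value of (base address % alignment of int), whether a member at a fixed offset would land on an int boundary.

```c
#include <assert.h>
#include <stddef.h>

struct xy {
    char X;
    int Y;
};

/* Counts the residues r = base % _Alignof(int) for which Y,
   sitting at the fixed offsetof(struct xy, Y), is int-aligned. */
unsigned valid_base_residues(void)
{
    size_t align = _Alignof(int);
    size_t off = offsetof(struct xy, Y);
    unsigned count = 0;
    size_t r;

    /* The compiler already placed Y on an int boundary relative
       to the start of the struct. */
    assert(off % align == 0);

    for (r = 0; r < align; r++) {
        if ((r + off) % align == 0) {
            count++;
        }
    }
    return count;
}
```

Only one residue works, so with a single fixed layout the struct as a whole ends up needing at least the alignment of int.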
 
pete

Harald van Dĳk said:
Well, the member must be aligned, it must be at a constant offset from
the start of the structure, and the result of malloc() must both be
properly aligned for the member type as well as for the whole struct.

With suitable padding bytes,
a struct with a first member of type char,
could be aligned at any address,
no matter what type the other members were.

If you had 4 byte ints and a struct type
{
char X;
int Y;
}

you could have
X followed by 3 padding bytes
followed by Y
for a struct alignment where the struct was aligned for type int.

If the struct was aligned on the next byte then
X followed by 2 padding bytes
followed by Y
followed by 1 padding byte would work.

If the struct was aligned on the byte after that then
X followed by 1 padding byte
followed by Y
followed by 2 padding bytes would work.

And if the struct was aligned on the next byte after that then
X contiguous with Y
followed by 3 padding bytes would work.
 
Keith Thompson

pete said:
With suitable padding bytes,
a struct with a first member of type char,
could be aligned at any address,
no matter what type the other members were.

If you had 4 byte ints and a struct type
{
char X;
int Y;
}

you could have
X followed by 3 padding bytes
followed by Y
for a struct alignment where the struct was aligned for type int.

If the struct was aligned on the next byte then
X followed by 2 padding bytes
followed by Y
followed by 1 padding byte would work.

If the struct was aligned on the byte after that then
X followed by 1 padding byte
followed by Y
followed by 2 padding bytes would work.

And if the struct was aligned on the next byte after that then
X contiguous with Y
followed by 3 padding bytes would work.

Interesting idea. Normally, 4-byte alignment is a requirement that
address % 4 == 0. (There isn't really a "%" operator for pointers,
but you get the idea; the standard's definition of alignment,
"requirement that objects of a particular type be located on storage
boundaries with addresses that are particular multiples of a byte
address", implies something like this.)

What you're suggesting is that a type's alignment could require
address % 4 == 3.

Unfortunately, I think it would make malloc() impossible to implement
correctly.
 
Dave Thompson

typedef struct mall_li_header_ { <snip rest>
char data[0];
} mall_li_header_t;
Provided the structure is declared as above but with the 'data' array of any
non-zero size, the total amount of memory needed for a structure with a trailing
'data' array of size 'n' can be calculated as follows

size_t size;
mall_li_header_t* p;
...
size = offsetof(mall_li_header_t, data) + sizeof(*p->data) * n;
Right. (Although for char the sizeof multiplication could be omitted.)
Unless the compiler aligns differently for different array sizes,
which I think formally is permitted but would break so much code it
would be unacceptable -- and is explicitly prohibited for the
new-in-C99 FAM version which you described but I snipped.
or as

size = sizeof(*p) - sizeof(p->data) + sizeof(*p->data) * n;
But this may waste space. sizeof the whole struct can also include
trailing padding -- although in this particular example, snipped, it
_likely_ does not -- and if so, subtracting only sizeof the last
element gives you a too-high figure. Although that is still safe; the
accesses to both fixed and variable elements within the malloc'ed
space will be correctly positioned and work and the extra space
ignored, unless you do something like memcmp'ing for size to compare
the entire structs, and that isn't reliable in general anyway.
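
To compare the two formulas side by side, here is a small sketch. The header_t type is a hypothetical stand-in for the snipped struct, with a one-element trailing array, and the three function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct header {   /* hypothetical stand-in for the snipped struct */
    long id;
    char data[1];
} header_t;

/* First formula: bytes up to data, plus n elements. */
size_t size_by_offsetof(size_t n)
{
    return offsetof(header_t, data) + sizeof(char) * n;
}

/* Second formula: whole struct minus the declared array, plus n. */
size_t size_by_sizeof(size_t n)
{
    header_t *p = NULL;   /* only used inside sizeof, never dereferenced */
    return sizeof(*p) - sizeof(p->data) + sizeof(*p->data) * n;
}

/* Extra bytes the second formula requests: the trailing padding. */
size_t wasted_bytes(size_t n)
{
    assert(size_by_sizeof(n) >= size_by_offsetof(n));
    return size_by_sizeof(n) - size_by_offsetof(n);
}
```

The difference does not depend on n; it is whatever trailing padding the compiler added after the declared one-element array.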


- David.Thompson1 at worldnet.att.net
 
