padding bytes in struct hack

  • Thread starter Hallvard B Furuseth

Hallvard B Furuseth

I know the struct hack is dubious because it exceeds the array bounds of
the named string in the struct:

    struct hack { <whatever>; char string[1]; };
    struct hack *x = malloc(sizeof(*x) + strlen(foo));
    strcpy(x->string, foo);

However, looking at C99 6.2.6.1p6 (Representations of types - General),
it also seems dubious because there may be padding bytes after string[]
in the struct, and their contents become unspecified no matter what the
program actually stores there. Is that right? I don't remember that at
all from the struct hack discussions I've seen here and in comp.std.c.

C99 6.2.6.1p6 (Representations of types, General):
  When a value is stored in an object of structure or union type,
  including in a member object, the bytes of the object representation
  that correspond to any padding bytes take unspecified values.[42]

  Footnote 42) Thus, for example, structure assignment may be
  implemented element-at-a-time or via memcpy.
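
To see whether the concern applies on a given implementation, one can measure how many bytes of the struct lie beyond the last member. The following is only a sketch: the `long len` member is a hypothetical stand-in for `<whatever>`, and `trailing_padding` is a helper name invented here. The result is implementation-specific, so no particular value should be expected.

```c
#include <stddef.h>

/* Hypothetical stand-in for "struct hack"; `len` replaces <whatever>. */
struct hack {
    long len;
    char string[1];
};

/* Number of bytes in the struct that lie past the end of the last
   member. Anything stored there through the struct hack occupies
   what the implementation regards as padding. */
static size_t trailing_padding(void)
{
    size_t end_of_string = offsetof(struct hack, string)
                         + sizeof(((struct hack *)0)->string);
    return sizeof(struct hack) - end_of_string;
}
```

If `trailing_padding()` returns a nonzero value, then the first few characters copied past `string[0]` land in bytes that 6.2.6.1p6 describes as padding of the declared struct type.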

Maybe there's nothing similar in C89, at least I can't find it, though
there is footnote 122 to memcmp (at least in the C89 draft I have here):

  122. The contents of ``holes'' used as padding for purposes of
  alignment within structure objects are indeterminate, unless the
  contents of the entire object have been set explicitly, as by the
  calloc or memset function. Strings shorter than their allocated space
  and unions may also cause problems in comparison.
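
The footnote's point about comparison can be illustrated with a small sketch. Under the C99 wording quoted earlier, even an object cleared with memset may have unspecified padding again after a member store, so a comparison that never looks at padding is the cautious choice. The type `struct rec` and the helpers `rec_make` and `rec_equal` are hypothetical names used only for this illustration.

```c
#include <string.h>

/* Hypothetical example type; most ABIs insert padding after `tag`. */
struct rec {
    char tag;
    int  value;
};

/* Build a record from an all-zero object. Under C99 6.2.6.1p6 the
   member stores below may leave the padding bytes unspecified again,
   so the memset alone does not make memcmp a reliable comparison. */
static struct rec rec_make(char tag, int value)
{
    struct rec r;
    memset(&r, 0, sizeof r);
    r.tag = tag;
    r.value = value;
    return r;
}

/* Compare member by member; padding bytes are never inspected. */
static int rec_equal(const struct rec *a, const struct rec *b)
{
    return a->tag == b->tag && a->value == b->value;
}
```

Calling `rec_equal(&a, &b)` gives the same answer regardless of what the padding holds, whereas `memcmp(&a, &b, sizeof a)` depends on bytes the program does not control.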
 
Hallvard B Furuseth

I said:
Maybe there's nothing similar in C89, at least I can't find it, though
there is footnote 122 to memcmp (at least in the C89 draft I have here):
(...)

I meant I can't find anything normative. Footnotes are not normative,
IIRC.
 
Tim Rentsch

Based on the memcmp footnote, it seems likely that C89/C90 and C99
have different models for how padding bytes are treated. Namely,
member assignment doesn't change padding bytes in C89/C90, but may
(to unspecified values) in C99. In practice I expect this allowed
unspecified behavior (i.e., changing end-of-struct padding bytes)
just won't be a problem, but if someone were worried about it, how
about just using a larger array so there is no padding at the end?
Something like:

    #include <stddef.h>   /* for offsetof */

    typedef struct { ...stuff... char x[1]; } Blah;
    struct uses_struct_hack {
        ...stuff...
        char x[ sizeof (Blah) - offsetof( Blah, x ) ];
    };

Of course this sequence isn't guaranteed to produce
a struct with no padding bytes, but that can be
checked with an assertion and adjusted appropriately.
(Two or three iterations like the above should almost
certainly converge on a stable answer.)
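
A concrete version of this idea, with the check done at compile time, might look like the sketch below. It assumes a single `int len` member in place of `...stuff...`; the macro `TRAILING_PAD` and the typedef `assert_no_trailing_padding` are names made up for this example. The negative-array-size trick is used so that the check also works before C11's `_Static_assert`.

```c
#include <stddef.h>

/* Probe type: same leading members, one-element array. */
typedef struct { int len; char x[1]; } Blah;

/* Real type: the array is widened to absorb the probe's tail padding. */
struct uses_struct_hack {
    int len;
    char x[sizeof(Blah) - offsetof(Blah, x)];
};

/* Bytes of type T that lie beyond the end of its member m. */
#define TRAILING_PAD(T, m) \
    (sizeof(T) - (offsetof(T, m) + sizeof(((T *)0)->m)))

/* Compilation fails (array of size -1) if any trailing padding is
   left, which signals that another iteration is needed. */
typedef char assert_no_trailing_padding[
    TRAILING_PAD(struct uses_struct_hack, x) == 0 ? 1 : -1];
```

If the typedef fails to compile on some implementation, the same step can be repeated with `struct uses_struct_hack` as the new probe type.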
 
user923005

Related to this C-FAQ:

2.6: I came across some code that declared a structure like this:

    struct name {
        int namelen;
        char namestr[1];
    };

and then did some tricky allocation to make the namestr array
act like it had several elements. Is this legal or portable?

A: This technique is popular, although Dennis Ritchie has called it
"unwarranted chumminess with the C implementation." An official
interpretation has deemed that it is not strictly conforming
with the C Standard, although it does seem to work under all
known implementations. (Compilers which check array bounds
carefully might issue warnings.)

Another possibility is to declare the variable-size element very
large, rather than very small; in the case of the above example:

        ...
        char namestr[MAXSIZE];

where MAXSIZE is larger than any name which will be stored.
However, it looks like this technique is disallowed by a strict
interpretation of the Standard as well. Furthermore, either of
these "chummy" structures must be used with care, since the
programmer knows more about their size than the compiler does.

C99 introduces the concept of a "flexible array member", which
allows the size of an array to be omitted if it is the last
member in a structure, thus providing a well-defined solution.

References: Rationale Sec. 3.5.4.2; C9X Sec. 6.5.2.1.
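
For comparison, here is a minimal sketch of the FAQ's `struct name` rewritten with a C99 flexible array member. The helper `name_new` is a hypothetical name, and `namelen` is declared as `size_t` here rather than `int` so that it matches the return type of strlen.

```c
#include <stdlib.h>
#include <string.h>

struct name {
    size_t namelen;
    char namestr[];   /* flexible array member: must be the last member */
};

/* Allocate a struct name holding a copy of s, or return NULL on
   allocation failure. sizeof *n does not include the array, so room
   for the characters and the terminating '\0' is added explicitly. */
static struct name *name_new(const char *s)
{
    size_t len = strlen(s);
    struct name *n = malloc(sizeof *n + len + 1);
    if (n == NULL)
        return NULL;
    n->namelen = len;
    memcpy(n->namestr, s, len + 1);
    return n;
}
```

A caller would write something like `struct name *n = name_new("example");` and release it with `free(n)`. Unlike the `[1]` version, indexing `namestr` up to the allocated size is defined by the Standard.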
 
