Why Tea said:
Thanks Richard. I understood what you said. I'd like to apologize for
asking more questions, but I really would like to get to the
bottom of this.
If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough? If so, then we can be sure that the
corruption will be noticeable.
Maybe, maybe not. There are no guarantees, one way or the other.
For example, if by "corrupting memory" you clobber the value of some
other variable, maybe it's a variable that isn't used again, so it has
no visible effect on the program. Or maybe you clobber memory that's
outside the object you're trying to access, but also outside any other
object. Or maybe you set some variable to a value that happens to be
correct.
There are any number of ways you can corrupt memory with no visible
effect. The risk is that the effect could become visible at the least
convenient possible time -- say, when your software has been deployed
to customers, or when you're demonstrating it to somebody important,
or years later when all the people who are familiar with the code have
left the company. Such is the nature of undefined behavior.
I went back to c-faq to read 2.6 many times again. I'll paste
the code here for easy reference.
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];
};

struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-1 + strlen(newname)+1);
                /* -1 for initial [1]; +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }
    return ret;
}
Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes. The faq says "... has deemed that
it is not strictly conforming with the C Standard,
although it does seem to work under all known
implementations...".
Proper use of the struct hack does *not* depend on padding bytes. It
writes outside the bounds of the array, and of the struct that
contains it, but *within* the bounds of the chunk of memory allocated
by malloc. For this to work, you need an implementation that doesn't
do bounds checking; almost all existing implementations qualify. (In
fact, since the struct hack is a common trick, a compiler that broke
it would probably fail in the marketplace.)
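To make the "within the bounds of the chunk" point concrete, here is a minimal sketch of the size arithmetic for the FAQ's struct. The helper names (faq_alloc_size, end_of_copy, copy_fits) are hypothetical and exist only for illustration; they are not part of the FAQ code.

```c
#include <stddef.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];
};

/* Bytes that the FAQ's makename() requests from malloc for string s. */
size_t faq_alloc_size(const char *s)
{
    return sizeof(struct name) - 1 + strlen(s) + 1;
}

/* Offset, measured from the start of the chunk, one past the last byte
   that strcpy(ret->namestr, s) writes (the terminating '\0'). */
size_t end_of_copy(const char *s)
{
    return offsetof(struct name, namestr) + strlen(s) + 1;
}

/* Nonzero if the copy stays inside the allocated chunk. */
int copy_fits(const char *s)
{
    return end_of_copy(s) <= faq_alloc_size(s);
}
```

copy_fits() is nonzero for every string, because namestr[0] always lies inside the struct, so offsetof(struct name, namestr) + 1 can never exceed sizeof(struct name). That holds whether or not the struct has any padding: the copy overruns the declared one-element array, but not the memory that malloc handed back.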
On the other hand, there's some risk that an optimizing compiler could
cause problems. Since violating array bounds invokes undefined
behavior, an optimizing compiler is allowed to *assume* that you
haven't done so, even if it doesn't generate code for explicit
run-time bounds checks. But again, the struct hack is common enough
that you should be ok.
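If the optimizer risk worries you and a C99 compiler is available, the flexible array member form avoids indexing past a declared bound altogether. A minimal sketch follows; the names name_fam and makename_fam are hypothetical, and I've used size_t for namelen instead of the FAQ's int to avoid a conversion.

```c
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the array has no declared bound. */
struct name_fam {
    size_t namelen;
    char namestr[];
};

/* Hypothetical counterpart to the FAQ's makename().
   Returns NULL on allocation failure; release the result with free(). */
struct name_fam *makename_fam(const char *newname)
{
    size_t len = strlen(newname);
    /* sizeof *ret does not count namestr, so add room for text + '\0'. */
    struct name_fam *ret = malloc(sizeof *ret + len + 1);

    if (ret != NULL) {
        ret->namelen = len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}
```

Usage is the same as with the FAQ version: call makename_fam("some name"), check the result for NULL, read ret->namestr like any other string, and free(ret) when done.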
I know it's bad and it shouldn't be done. But when
I look at tens of thousands of lines of code written
by someone else, many of which make use of this
hack, what can we conclude? Perhaps it does work,
just like the faq says.
The struct hack itself *probably* violates the rules of the language,
but it's generally supported -- and C99 explicitly supports it in a
different form. Code that assumes the presence of padding bytes, on
the other hand, is more dangerous. For example, if your declaration
changes from this:
struct name {
    int namelen;
    char namestr[1];
};
to this:
struct name {
    int namelen;
    short something;
    unsigned char something_else;
    char namestr[1];
};
then it's likely (given 4-byte int, 2-byte short, and, of course,
1-byte char) that the structure will be 8 bytes with *no* padding.
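You don't have to guess about the padding; sizeof and offsetof will tell you what your implementation actually does. Here is a small sketch. The tags name_v1 and name_v2 and the two helper functions are hypothetical names, introduced only so that both declarations can live in one file.

```c
#include <stddef.h>

/* The original declaration. */
struct name_v1 {
    int namelen;
    char namestr[1];
};

/* The modified declaration with two extra members. */
struct name_v2 {
    int namelen;
    short something;
    unsigned char something_else;
    char namestr[1];
};

/* Bytes of trailing padding after namestr[0]; 0 means there are none. */
size_t tail_padding_v1(void)
{
    return sizeof(struct name_v1) - offsetof(struct name_v1, namestr) - 1;
}

size_t tail_padding_v2(void)
{
    return sizeof(struct name_v2) - offsetof(struct name_v2, namestr) - 1;
}
```

On an implementation with 4-byte int and 2-byte short, I'd expect tail_padding_v1() to be nonzero and tail_padding_v2() to be 0, but the exact figures are up to the compiler. Code that relies on them is relying on something the language doesn't promise.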
There might be some confusion here. I haven't gone back to the
original article, and I'm not certain that the code you originally
posted actually assumed the existence of padding bytes rather than
just making ordinary use of the struct hack.
[...]