Memory alignment

Antoninus Twink

only if "pragmatic" means "wrong". The problem is his program is now
non-portable. This non-portability *could* involve different compilers
on the same platform. Or different versions of the same compiler. Or
changes to flag settings of the compiler (particularly optimisation
flags).

The OP knows that perfectly well, and it isn't what he was asking:

The answer is NO, we can't be 100% sure that struct-hack-like code
caused the problem. In fact, we can be 99% sure that it didn't.
 
James Kuyper

Why Tea wrote:
....
If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough? If so, then we can be sure that the
corruption will be noticeable.

No. Instead of crashing, the system could also get stuck in an infinite
loop. However, the truly insidious possibility is that your program will
continue apparently normally and exit without showing any obvious signs.
That doesn't mean that it worked correctly; it might produce output that
is subtly wrong in some way, or it might produce a catastrophic error,
but with only a 1% chance of triggering the catastrophe during any
particular run of the program. It could run a long time, generating lots
of subtly erroneous data that will require a lot of work to fix, before
you even notice the problem.

....
I know it's bad and it shouldn't be done. But when
I look at tens of thousands of lines of code written
by someone else and many of them make use of this
hack. What can we conclude? Perhaps it does work,
just like the faq says.

The struct hack does work, on most C90 systems, and I would not
recommend worrying too much about the possibility of it failing unless
and until you find that it fails on a particular system that you need to
port it to. However, don't make any important decisions based upon the
assumption that it can't fail - in principle it can, and you need to
remember that.

C99 flexible arrays will work on any fully conforming implementation of
C99, and on many implementations that fall short of full conformance.
I'd recommend using flexible arrays rather than the struct hack, if
you're able to restrict the portability of your program to those
implementations that support it, and to complain about the ones that
don't support it. As a practical matter, any implementation that lets
you declare a flexible array member without generating a diagnostic will
almost certainly support correct use of that member. Thus, you'll learn
at compile time, rather than run time, whether or not you'll be able to
use it safely.

However, flexible array members are still not universally supported. The
difference is, if the struct hack fails, the implementor can point to
sections of the standard that allow it to fail. If flexible arrays are
not supported, you can point to sections of C99 standard that require
them to be. That won't necessarily make it any easier to convince the
implementor to change their implementation, but it is a stronger argument.
 
James Kuyper

Nick said:
um. I thought the position of the struct hack was the same
on C99 as C90. What *did* change was the addition of
VLAs that were intended to remove the need for TSH

No, the relevant change was not VLAs, but flexible array members. They
work almost exactly like the struct hack, except that instead of
declaring a specific length for the array, the length is left
unspecified. The key point is that the behavior of flexible array
members is defined by the standard; the behavior when using the struct
hack is not.
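
For comparison with the FAQ's makename(), here is a minimal sketch of the same idea using a C99 flexible array member. The names fam_name and fam_makename are made up for illustration and do not come from the FAQ or the standard; the sketch assumes a C99 (or later) compiler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the last member is declared with no length. */
struct fam_name {
    size_t namelen;
    char namestr[];   /* contributes no elements to sizeof(struct fam_name) */
};

/* Hypothetical helper, modelled on the FAQ's makename(). */
struct fam_name *fam_makename(const char *newname)
{
    assert(newname != NULL);
    size_t len = strlen(newname);

    /* sizeof excludes the array, so add len bytes of text plus 1 for '\0'. */
    struct fam_name *ret = malloc(sizeof *ret + len + 1);
    if (ret != NULL) {
        ret->namelen = len;
        memcpy(ret->namestr, newname, len + 1);
    }
    return ret;
}
```

A caller would use it like makename(): call fam_makename("Richard"), check the result against NULL, and free() it when done. There is no "-1 for the initial [1]" adjustment because the array has no declared elements.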
 
lawrence.jones

Keith Thompson said:
Fair enough. I *know* it's called the "struct hack", and I've read
question 2.6 before, but I had trouble finding it myself, since the
answer to 2.6 doesn't use the phrase "struct hack".

That's the kind of thing that makes creating a good index difficult.
It's also why the index in the C standard contains terms that don't
appear anywhere else in the document (including "struct hack", I'm
glad to say).
 
Why Tea

Why Tea said:

If the memory is corrupted, wouldn't the system eventually crash
if you run it long enough?

The C Standard does not guarantee this (either way).


If so, then we can be sure that the
corruption will be noticeable.
I went back to c-faq to read 2.6 many times again. I'll paste
the code here for easy reference.
#include <stdlib.h>
#include <string.h>
struct name {
  int namelen;
  char namestr[1];
};
struct name *makename(char *newname)
{
  struct name *ret =
      malloc(sizeof(struct name)-1 + strlen(newname)+1);
                     /* -1 for initial [1]; +1 for \0 */
  if(ret != NULL) {
    ret->namelen = strlen(newname);
    strcpy(ret->namestr, newname);
  }
  return ret;
}
Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes.

Well, actually strcpy has written into memory that you allocated via
malloc.

Does "strcpy(ret->namestr, newname);" really copy data
into the memory malloc'ed? I have a mental picture of
this for namestr:

Byte #1 is declared, 2-4 could be padding assuming
32 bit alignment.
1 2 3 4 5 6 7 8 9 10 11 ...
|<->|<---- malloc'ed ...--->

Doesn't 'strcpy(ret->namestr, "Richard");' become
this?
1 2 3 4 5 6 7 8 9
R i c h a r d \0
As far as I'm aware, nobody has ever found an implementation on which the
struct hack (i.e. the hack by which you allocate more storage than the
structure actually needs) doesn't work. That is not the same as saying
that it's okay to write into memory you don't own. It isn't and you
shouldn't. Whether you do or not is your concern, but you can't blame the
implementation if it all goes wrong.

Understood.
 
Ben Bacarisse

Why Tea said:
Why Tea said:
#include <stdlib.h>
#include <string.h>
struct name {
  int namelen;
  char namestr[1];
};
struct name *makename(char *newname)
{
  struct name *ret =
      malloc(sizeof(struct name)-1 + strlen(newname)+1);
                     /* -1 for initial [1]; +1 for \0 */
  if(ret != NULL) {
    ret->namelen = strlen(newname);
    strcpy(ret->namestr, newname);
  }
  return ret;
}
Although not specifically stated, padding is likely
to occur for namestr. So strcpy must have written into
the padding bytes.

Well, actually strcpy has written into memory that you allocated via
malloc.

Does "strcpy(ret->namestr, newname);" really copy data
into the memory malloc'ed?
Yes.

I have a mental picture of
this for namestr:

Byte #1 is declared, 2-4 could be padding assuming
32 bit alignment.
1 2 3 4 5 6 7 8 9 10 11 ...
|<->|<---- malloc'ed ...--->

It is all malloced, including some other bytes that are used for
namelen.
Doesn't 'strcpy(ret->namestr, "Richard");' become
this?
1 2 3 4 5 6 7 8 9
R i c h a r d \0

Yes, if by this you mean that the letters get put in the first (and
only) byte of namestr and then in consecutive following bytes. The
fact that some of these bytes might be due to sizeof *ret being >
sizeof ret->namelen + 1 does not mean they are not malloced or that you
can't write to them.

The main problems with the struct hack come from maintaining the
code. Everyone using the structure has to know that the size is "fake"
and that it can't be zeroed and copied like other structures.

BTW, I'd number these from 0 rather than 1 just to be consistent with
offsets and C's array indexes.
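
To illustrate the maintenance point about copying, here is a rough sketch of what a copy routine has to do. It assumes the object was allocated with the FAQ's size formula and that namelen holds the text length; name_clone is a hypothetical helper, not something from the FAQ. Like the struct hack itself, it relies on behavior the standard does not define.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */
#include <stdlib.h>
#include <string.h>

struct name {
    int namelen;
    char namestr[1];
};

/* Hypothetical helper: duplicate a struct-hack object together with its tail. */
struct name *name_clone(const struct name *src)
{
    assert(src != NULL && src->namelen >= 0);

    /* Same size as the FAQ's sizeof(struct name)-1 + strlen(newname)+1. */
    size_t total = sizeof *src + (size_t)src->namelen;

    struct name *dst = malloc(total);
    if (dst != NULL) {
        /* Plain assignment (*dst = *src) would copy only sizeof(struct name)
           bytes and silently drop the rest of the string. */
        memcpy(dst, src, total);
    }
    return dst;
}
```

The same caution applies to memset(p, 0, sizeof *p) and to passing or returning the structure by value: each one sees only the declared size, not the allocated one.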
 
Tim Rentsch

No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.

I'm sure someone with Larry Jones's credentials must be saying
something by this, but I'll be darned if I know what it is. Surely
padding bytes must be available for writing to, at least as unsigned
char, so functions with a qsort-like interface can be written.
 
Tim Rentsch

Keith Thompson said:
Tim Rentsch said:
(e-mail address removed) writes: [...]
No, the padding bytes are most definitely *not* yours to write to.
Although most implementations will let you get away with it, there are
implementations that do careful memory bounds checking and won't.

I'm sure someone with Larry Jones's credentials must be saying
something by this, but I'll be darned if I know what it is. Surely
padding bytes must be available for writing to, at least as unsigned
char, so functions with a qsort-like interface can be written.

The context was a discussion of the struct hack. For example, given:

struct h {
    int i;
    char arr[1];
};
struct h obj;

[...]

Ahhh, okay. Accessing obj.arr[1] is always undefined behavior,
whether (struct h) has padding bytes or not.
 
Why Tea

Treating the structure as an array of unsigned char, and accessing
those bytes, including any padding bytes, is ok.

Thanks Keith. I wonder why it took 60 messages for someone
to make a statement as concise as this :)
Assuming that there are one or more padding bytes is non-portable
(and, in my opinion, unnecessary and poor style).

Sure.
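
A small sketch of what "treating the structure as an array of unsigned char" looks like in practice, in the spirit of the qsort-like interface mentioned earlier in the thread. swap_bytes is a made-up name for illustration; struct h is the example structure quoted above.

```c
#include <assert.h>
#include <stddef.h>

struct h {
    int i;
    char arr[1];
};

/* Hypothetical helper: swap two objects of the same size byte by byte.
   A qsort-like routine must work this way, since it sees only void
   pointers and an element size. Any padding bytes are swapped too. */
void swap_bytes(void *a, void *b, size_t size)
{
    assert(a != NULL && b != NULL);
    unsigned char *pa = a;
    unsigned char *pb = b;

    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

Calling swap_bytes(&x, &y, sizeof x) on two struct h objects touches every byte of each object, padding included, without ever indexing arr beyond its declared bound.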
 
vippstar

[ Keith Thompson wrote this ]
Treating the structure as an array of unsigned char, and accessing
those bytes, including any padding bytes, is ok.

Thanks Keith. I wonder why it took 60 messages for someone
to make a statement as concise as this :)
Assuming that there are one or more padding bytes is non-portable
(and, in my opinion, unnecessary and poor style).

Sure.


Please don't snip attribution lines, i.e. the part that says "Keith
Thompson wrote:" or similar.
I restored it.
 
Antoninus Twink

Thanks Keith. I wonder why it took 60 messages for someone to make a
statement as concise as this :)

Because rather than giving a simple answer to a simple question, the
"regulars" would rather turn every thread into a drama by pretending to
misunderstand what everyone else means, and going off into word games
and angels-on-the-head-of-a-pin arguments for post after post.
 
Kenny McCormack

Because rather than giving a simple answer to a simple question, the
"regulars" would rather turn every thread into a drama by pretending to
misunderstand what everyone else means, and going off into word games
and angels-on-the-head-of-a-pin arguments for post after post.

Yes, and for most of them, it's the only thing resembling a life that
they will ever know.

Just once, I'd like a statement from the regs as to why they waste their
lives like this.
 
David Thompson

Lowell Gilbert said:
Why Tea said:
On Oct 3, 12:48 pm, (e-mail address removed) wrote: [...]
Why would you want to declare a 1 char array to store 2 anyway?

Good question. This is found in some real embedded
code to make more efficient use of the memory. As I understood
it, the last s[1] is just a placeholder as you can
allocate more memory when needed. For example:

my_struct = malloc(sizeof(my_struct_t) + MY_PAYLOAD_STRING_SIZE);

or even more likely, something more like
my_struct = malloc(sizeof(my_struct_t) + strlen(my->struct->s));
Not my->struct->s; that's a syntax error.

Guessing you (PP) meant my_struct->s, no. You need to compute the
length BEFORE allocating; and THEN set my_struct, and fill in
my_struct->whatever. Something much more like:
my_struct = malloc (sizeof(my_struct_t) + strlen(source_str) );
or the equivalent but clc-preferred
my_struct = malloc (sizeof *my_struct + strlen(source_str) );
+ 1. The length returned by strlen() doesn't include the terminating '\0'.

But if you use the C89-struct-hack version, with s[1], the sizeof the
struct already includes room for at least one byte, maybe more. I have
been known to write, for clarity(?!):
... malloc (sizeof(struct_t) -1 +strlen(source_str) +1 )
and sometimes to get stupid compilers to optimize even:
... malloc (sizeof(struct_t) -1 +1 +strlen(source_str) )

Or as already noted elsethread you use the offsetof variant:
... malloc (offsetof(struct_t,s) +strlen(source_str) +1 )

And that's not even considering the case where you don't need .s to be
null-terminated, typically because its length is stored elsewhere.
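
Put together, the offsetof variant might look something like the sketch below. The layout of struct_t, the len member and the name make_struct are assumptions made for illustration; only the member s and the size expression come from the post. The clamp to sizeof(struct_t) is an extra precaution, not part of the original expression.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout; the thread only shows that the last member is s[1]. */
typedef struct {
    int len;
    char s[1];
} struct_t;

/* Hypothetical helper: compute the length first, then allocate, then fill. */
struct_t *make_struct(const char *source_str)
{
    assert(source_str != NULL);
    size_t len = strlen(source_str);

    /* Header up to s, plus the text, plus 1 for the terminating '\0'. */
    size_t need = offsetof(struct_t, s) + len + 1;

    /* Precaution (an addition here): never allocate less than the
       declared size of the structure. */
    if (need < sizeof(struct_t))
        need = sizeof(struct_t);

    struct_t *p = malloc(need);
    if (p != NULL) {
        p->len = (int)len;
        memcpy((char *)p + offsetof(struct_t, s), source_str, len + 1);
    }
    return p;
}
```

If .s does not need to be null-terminated, the "+ 1" can be dropped from both the size and the memcpy length, with the length carried in len instead.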

- formerly david.thompson1 || achar(64) || worldnet.att.net
 
Richard

[ Keith Thompson wrote this ]
Treating the structure as an array of unsigned char, and accessing
those bytes, including any padding bytes, is ok.

Thanks Keith. I wonder why it took 60 messages for someone
to make a statement as concise as this :)
Assuming that there are one or more padding bytes is non-portable
(and, in my opinion, unnecessary and poor style).

Sure.


Please don't snip attribution lines, i.e. the part that says "Keith
Thompson wrote:" or similar.
I restored it.

*chuckle*

Vippstar probably doesn't even realise the irony and humour in his
reply. Good to see not much has changed the past few weeks!
 
