Strict aliasing and Q2.6 in the FAQ

Conor F · Sep 19, 2011

(Trying this again as the velocityreviews site doesn't seem to forward
to NNTP - hope this doesn't appear twice!)

Back to this old topic again. Sorry about this but I'm just not sure
if aliasing applies in this case. Question 2.6 in the FAQ describes
the case of using one malloc and piggy backing a char * onto it. It's
a pretty common idiom I would have thought, but I'm now having my
doubts:

struct name {
int namelen;
char *namep;
};

struct name *makename(char *newname)
{
char *buf = malloc(sizeof(struct name) + strlen(newname) + 1);

struct name *ret = (struct name *)buf;
ret->namelen = strlen(newname);
ret->namep = buf + sizeof(struct name);
strcpy(ret->namep, newname);

return ret;
}

I don't really believe there are aliasing issues here due to a char *
being reassigned to a struct name *.

But. If you did this instead:

struct name *ret = malloc(sizeof(struct name) + strlen(newname) +
1);
char *buf = (char *)(&ret[1]);

which also seems a perfectly reasonable way of going about it, and
avoids the sizeof(struct name) addition which can be a little tricky
in the cases where you have several leading structs. Ok, it's not
terrible but I always thought the above was clearer.

Anyway - you've now taken an object of type struct name and converted
to a different type pointing to the same memory.

Isn't that an aliasing issue?

And if char * is a special case then what if I had used a wchar_t
instead? Or another type?

It's all a bit subtle for me.

Conor.

(Hum. Google won't let me post with my hotmail address any more. How
annoying...)

Harald van Dĳk · Sep 19, 2011

struct name {
int namelen;
char *namep;

};

struct name *makename(char *newname)
{
char *buf = malloc(sizeof(struct name) + strlen(newname) + 1);

struct name *ret = (struct name *)buf;
ret->namelen = strlen(newname);
ret->namep = buf + sizeof(struct name);
strcpy(ret->namep, newname);

return ret;

}

I don't really believe there are aliasing issues here due to a char *
being reassigned to a struct name *.

That's fine, but for a different reason. You're accessing
sizeof(struct name) bytes as struct name, and you're accessing the
following strlen(newname)+1 bytes as char. You never access any data
as a type it isn't.

Because of that, your suggested alternative has no aliasing issues
either.

My preference would be to use neither, and instead use the C99
alternative

struct name {
int namelen;
char name[];
};
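
For reference, a minimal sketch of how makename() might look with the flexible array member. The const qualifier, the NULL check and the use of memcpy are additions for illustration and are not from the original post:

```c
#include <stdlib.h>
#include <string.h>

/* Flexible array member (C99): the string is stored directly after namelen. */
struct name {
    int namelen;
    char name[];
};

/* Sketch of makename() using one allocation; returns NULL if malloc fails. */
struct name *makename(const char *newname)
{
    size_t len = strlen(newname);
    struct name *ret = malloc(sizeof *ret + len + 1);

    if (ret == NULL)
        return NULL;
    ret->namelen = (int)len;
    memcpy(ret->name, newname, len + 1);   /* len + 1 copies the '\0' too */
    return ret;
}
```

The caller releases everything with a single free(ret), and there is no separate namep pointer to keep in sync with the buffer.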

Eric Sosman · Sep 20, 2011

(Trying this again as the velocityreviews site doesn't seem to forward
to NNTP - hope this doesn't appear twice!

Back to this old topic again. Sorry about this but I'm just not sure
if aliasing applies in this case. Question 2.6 in the FAQ describes
the case of using one malloc and piggy backing a char * onto it. It's
a pretty common idiom I would have thought, but I'm now having my
doubts:

struct name {
int namelen;
char *namep;
};

struct name *makename(char *newname)
{
char *buf = malloc(sizeof(struct name) + strlen(newname) + 1);

struct name *ret = (struct name *)buf;
ret->namelen = strlen(newname);
ret->namep = buf + sizeof(struct name);
strcpy(ret->namep, newname);

return ret;
}

I don't really believe there are aliasing issues here due to a char *
being reassigned to a struct name *.

Nor do I.

But. If you did this instead:

struct name *ret = malloc(sizeof(struct name) + strlen(newname) +
1);
char *buf = (char *)(&ret[1]);

which also seems a perfectly reasonable way of going about it, and
avoids the sizeof(struct name) addition which can be a little tricky
in the cases where you have several leading structs. Ok, it's not
terrible but I always thought the above was clearer.

Anyway - you've now taken an object of type struct name and converted
to a different type pointing to the same memory.

Isn't that an aliasing issue?

I don't see why. Personally, I prefer the latter form (although
I usually write `ret + 1' for `&ret[1]'). Both examples convert a
pointer value from one type to another.

And if char * is a special case then what if I had used a wchar_t
instead? Or another type?

Alignment problems could arise. They can be put to bed again,
but the code gets uglier.
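
A rough sketch of what "putting the alignment problem to bed" could look like for the wchar_t case. The names struct wname, round_up and make_wname are made up for illustration, and _Alignof assumes a C11 compiler:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

struct wname {
    int namelen;
    wchar_t *namep;
};

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* One allocation: the struct, padding if needed, then the wide string. */
struct wname *make_wname(const wchar_t *newname)
{
    size_t len = wcslen(newname);
    size_t offset = round_up(sizeof(struct wname), _Alignof(wchar_t));
    unsigned char *raw = malloc(offset + (len + 1) * sizeof(wchar_t));
    struct wname *ret;

    if (raw == NULL)
        return NULL;
    ret = (struct wname *)raw;
    ret->namelen = (int)len;
    ret->namep = (wchar_t *)(raw + offset);   /* suitably aligned for wchar_t */
    memcpy(ret->namep, newname, (len + 1) * sizeof(wchar_t));
    return ret;
}
```

For this particular struct the rounding is most likely a no-op, since the struct already contains a pointer and its size is normally a multiple of wchar_t's alignment. The rounding starts to matter when the trailing type is more strictly aligned than the leading struct.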

Conor F · Sep 20, 2011

I don't really believe there are aliasing issues here due to a char *
being reassigned to a struct name *.

That's fine, but for a different reason. You're accessing
sizeof(struct name) bytes as struct name, and you're accessing the
following strlen(newname)+1 bytes as char. You never access any data
as a type it isn't.

Because of that, your suggested alternative has no aliasing issues
either.

Ah! But I thought that it wouldn't matter where the data was. In other
words, gcc could decide that all that memory over there is now of type
"struct name" and if you pretend part of it isn't, gcc will play games
with you, like optimise all the references to buf out because you
didn't return it

I can't always tell what gcc might mess with. I've seen posts by Linus
Torvalds where he advocates using the no-strict-aliasing flag to avoid
any subtle issues, like this thread from a few years back:
https://lkml.org/lkml/2003/2/25/270

In that case, reordering the code made a difference. But if properly
assigned pointers point to different blocks of memory, I can't see how
anything would fail. Which is why I asked here.

My preference would be to use neither, and instead use the C99
alternative

struct name {
int namelen;
char name[];
};

Oh absolutely. The case I mentioned was a simple one where the above
notation would suit much better. The Windows header files have those
notations all over the place (except using the pre-C99 form of char
name[1]).

char *buf = (char *)(ret + 1);

As Eric says, I also prefer the above form, especially when things get
a little hairier, like the classic array of pointers to char followed
by the char data:

char **strarray -> [ptrc0][ptrc1][ptrc2][NULL][string0][string1]
[string2]

dataptr = (char *)(strarray + nstrings + 1)

And then if it's an array of pointers to structures, then we hit
alignment issues. But at least those are easy to deal with (just round
up to the next even multiple of the structure size). And then compile
on a Sparc just to see if you are right!
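
The pointer-array-plus-data layout above could be built along these lines. This is only a sketch: pack_strings and its parameters are hypothetical names, and since the trailing data is plain char there is no alignment to worry about:

```c
#include <stdlib.h>
#include <string.h>

/* Copy nstrings strings into one malloc'd block laid out as
   [ptr0]...[ptrN-1][NULL][string0]...[stringN-1]. */
char **pack_strings(const char *const *src, size_t nstrings)
{
    size_t total = 0;
    size_t i;
    char **strarray;
    char *dataptr;

    for (i = 0; i < nstrings; i++)
        total += strlen(src[i]) + 1;

    strarray = malloc((nstrings + 1) * sizeof *strarray + total);
    if (strarray == NULL)
        return NULL;

    /* The character data starts right after the terminating NULL slot. */
    dataptr = (char *)(strarray + nstrings + 1);
    for (i = 0; i < nstrings; i++) {
        size_t n = strlen(src[i]) + 1;
        memcpy(dataptr, src[i], n);
        strarray[i] = dataptr;
        dataptr += n;
    }
    strarray[nstrings] = NULL;
    return strarray;
}
```

The pointer slots are only ever accessed as char *, and the bytes after them only as char, so no part of the block is accessed as a type it isn't.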

Conor.

Harald van Dĳk · Sep 20, 2011

Ah! But I thought that it wouldn't matter where the data was. In other
words, gcc could decide that all that memory over there is now of type
"struct name" and if you pretend part of it isn't, gcc will play games
with you, like optimise all the references to buf out because you
didn't return it

I can't always tell what gcc might mess with. I've seen posts by Linus
Torvalds where he advocates using the no-strict-aliasing flag to avoid
any subtle issues, like this thread from a few years back:
https://lkml.org/lkml/2003/2/25/270

In that case, reordering the code made a difference. But if properly
assigned pointers point to different blocks of memory, I can't see how
anything would fail. Which is why I asked here.

Looking further in that thread, the problem comes from a subtle bug/
misfeature in the implementation of the kernel's own memcpy macro/
function. Generally speaking, when you're not writing a kernel, you
can assume memcpy behaves as required by the standard.
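
As an aside, a standard-conforming memcpy is also the usual way to reinterpret the bytes of one type as another without tripping the aliasing rules. A small sketch follows. The function name float_bits is made up, and it assumes float and uint32_t have the same size:

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Return the object representation of f as a 32-bit integer.
   The bytes are copied, so no float object is accessed as a uint32_t. */
uint32_t float_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Compilers typically turn a small fixed-size memcpy like this into a plain register move, so it usually costs nothing compared with a pointer cast.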

Conor F · Sep 20, 2011

So, to summarise:

Strict aliasing would only apply if a type punned pointer pointed to
the same place in memory - which in my opinion is wild west code
anyway...

So, to be awkward and use a wchar_t instead simply to avoid the char *
case:

struct name { int namelen; wchar_t *namep; };

struct name *ret = malloc(sizeof(struct name) +
wcslen(newname) + 1);

wchar_t *buf = (wchar_t *)(ret + 1);

... copy to buf here ...

Would be fine simply because the type punning is to a different memory
location; but:

wchar_t *buf = (wchar_t *)(ret + 0);

Isn't fine. Ok, other than the fact that I made a mess of the example
I mean. Um, a better example would be the one in the wikipedia article
on type punning:

struct sockaddr_in sa = {0};
....
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

which is obviously bad. But if bind took a char * and they did this:

bind(sockfd, (char *)&sa, sizeof sa);

that would be ok I guess.

Plus any other type of inheritance creation like COM - where two
structures share initial sequences (IUnknown and all that) but are not
unioned are also out. But then I'd be aware that I'm messing around in
those circumstances and I'd use -fno-strict-aliasing...

Thanks,

Conor.

Harald van Dĳk · Sep 20, 2011

So, to summarise :

Strict aliasing would only apply if a type punned pointer pointed to
the same place in memory - which in my opinion is wild west code
anyway...

Pretty much, yes.

So, to be awkward and use a wchar_t instead simply to avoid the char *
case:

struct name { int namelen; wchar_t *namep; };

struct name *ret = malloc(sizeof(struct name) +
wcslen(newname) + 1);

(wcslen(newname) + 1) * sizeof(wchar_t)

wchar_t *buf = (wchar_t *)(ret + 1);

... copy to buf here ...

Would be fine simply because the type punning is to a different memory
location; but:

Right.

wchar_t *buf = (wchar_t *)(ret + 0);

Isn't fine. Ok, other than the fact that I made a mess of the example
I mean.

Yes, that example doesn't work. Accessing the data as

struct name *ret = malloc(sizeof(wchar_t));
wchar_t *buf = (wchar_t *) (ret + 0);
*buf = L'x';

is no violation of the aliasing rules, because you're still only
accessing the data as wchar_t, even though you have a suspicious cast
now.

Um, a better example would be the one in the wikipedia article
on type punning:

struct sockaddr_in sa = {0};
....
bind(sockfd, (struct sockaddr *)&sa, sizeof sa);

which is obviously bad.

By C's aliasing rules, yes, you're right. Remember, though, that bind
is a non-standard function, and POSIX makes additional guarantees
about what compilers must permit, beyond what standard C does. It may
say that the above use must be given the "obvious" interpretation by a
conforming POSIX compiler. I don't know if it does so.

But if bind took a char * and they did this:

bind(sockfd, (char *)&sa, sizeof sa);

that would be ok I guess.

If bind is declared as taking a struct sockaddr *, and if bind
dereferences its parameter to get a struct sockaddr, then C's aliasing
rules don't allow you to pass a pointer to what is really a struct
sockaddr_in, not even via an intermediate char * cast. If bind takes a
char *, and accesses the memory byte by byte, then yes, that is the
special exception in the aliasing rules.
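
A sketch of the byte-by-byte exception, using a made-up helper byte_sum. Reading any object through unsigned char * is allowed regardless of the object's own type:

```c
#include <stddef.h>

/* Hypothetical helper: add up the bytes of any object. The object is only
   ever read as unsigned char, which the aliasing rules always permit. */
unsigned long byte_sum(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < size; i++)
        sum += p[i];
    return sum;
}
```

A call such as byte_sum(&sa, sizeof sa) is fine for any sa. What the exception does not cover is converting that byte pointer back to some unrelated struct type and dereferencing it.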

Plus any other type of inheritance creation like COM - where two
structures share initial sequences (IUnknown and all that) but are not
unioned are also out. But then I'd be aware that I'm messing around in
those circumstances and I'd use -fno-strict-aliasing...

Another case where COM pretty much ignores the aliasing rules is in
IUnknown's QueryInterface method, where its last argument's type is
void **, but will almost never really be a pointer to void *. Which is
okay if MS decides that COM compilers must allow this, even if C's
aliasing rules don't.

Conor F · Sep 20, 2011

struct name *ret = malloc(sizeof(struct name) +

(wcslen(newname) + 1) * sizeof(wchar_t)

Ooops. Erm, sorry. That's what I get for coding in a rush and then
changing my mind. That example was a total mess.

Right.

Grand. That clarifies a lot. I used to do QA so I'm a tad pedantic
about these things (except the example I typed in). I just wanted to
be sure on that point.

Yes, that example doesn't work. Accessing the data as

struct name *ret = malloc(sizeof(wchar_t));
wchar_t *buf = (wchar_t *) (ret + 0);
*buf = L'x';

is no violation of the aliasing rules, because you're still only
accessing the data as wchar_t, even though you have a suspicious cast
now.

That's somewhat of a surprise. I guess you might get hosed as soon as
you access "ret", because the compiler would decide to optimise given
the assumption that ret and buf couldn't possibly point to the same
location.
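
The kind of assumption being described can be sketched with two unrelated types. The function name store_both is made up for illustration:

```c
/* With strict aliasing in force, the compiler may assume that *ip and *fp
   are different objects, so it is allowed to return 1 without reloading
   *ip after the store through fp. */
int store_both(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;
    return *ip;
}
```

Called with pointers to two distinct objects, this is well defined and returns 1. The trouble only starts if both pointers are made to refer to the same storage through a cast, at which point the behaviour is undefined and the result may differ between optimisation levels.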

By C's aliasing rules, yes, you're right. Remember, though, that bind
is a non-standard function, and POSIX makes additional guarantees
about what compilers must permit, beyond what standard C does. It may
say that the above use must be given the "obvious" interpretation by a
conforming POSIX compiler. I don't know if it does so.

Hmmm. I see - I believe I've seen that before with threading - POSIX
makes assurances above what ISO C makes so that calls like
pthread_mutex_lock() don't get messed with. I'd guess the gcc
documentation might shed some light.

If bind is declared as taking a struct sockaddr *, and if bind
dereferences its parameter to get a struct sockaddr, then C's aliasing
rules don't allow you to pass a pointer to what is really a struct
sockaddr_in, not even via an intermediate char * cast. If bind takes a
char *, and accesses the memory byte by byte, then yes, that is the
special exception in the aliasing rules.

Yes, thank you. I had that feeling when I typed that bit that maybe it
would be bad to recast it back once cast to a char *. Doing some
googling shows that some coders (e.g. PuTTY) have made changes to put
these in unions to avoid the problem.
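
A rough sketch of the union approach, using made-up stand-ins for the socket address types rather than the real struct sockaddr and struct sockaddr_in. C99 permits inspecting the common initial sequence of structures that are members of the same union, wherever the complete union type is visible:

```c
/* Hypothetical address types; both begin with the same member. */
struct generic_addr {
    unsigned short family;
    char data[14];
};

struct ipv4_addr {
    unsigned short family;
    unsigned short port;
    unsigned char host[4];
};

/* Putting both in one union tells the compiler they may share storage. */
union any_addr {
    struct generic_addr generic;
    struct ipv4_addr ipv4;
};

/* Reads the shared leading member through the generic view. */
unsigned short addr_family(const union any_addr *a)
{
    return a->generic.family;
}
```

Code that fills in a.ipv4 and then passes &a around can read the family through either member, without a cast between unrelated struct pointer types.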

Another case where COM pretty much ignores the aliasing rules is in
IUnknown's QueryInterface method, where its last argument's type is
void **, but will almost never really be a pointer to void *. Which is
okay if MS decides that COM compilers must allow this, even if C's
aliasing rules don't.

Although if I was using a compiler like mingw it would probably be
good to be aware of the possible issues and use the appropriate flags
if necessary. The Windows compilers would do the right thing
automagically of course.

Conor.
