warning of breaking strict-aliasing rules

N

Noob

Hello,

I'm supposed to "clean up" code that does things which the
standard frowns upon, such as

typedef long int LONG;
typedef unsigned long ULONG;
{
unsigned char csw[ 80 ] = { 0 };
fill_array(csw);
LONG sign = *(LONG*)&csw[0];
ULONG tag = *(ULONG *)&csw[4];
LONG residue = *(LONG*)&csw[8];
}

AFAIU, there are several problems with this code.

1. Since csw is a byte array, it might not be correctly
aligned for long accesses.

2. I think the compiler is allowed to assume that a given
object cannot be accessed through two incompatible pointers
(Is this what aliasing refers to?)

Strangely, my compiler (gcc) complains only about the
first dereference:

warning: dereferencing type-punned pointer will break strict-aliasing rules

AFAICT, all three lines have the same issue, right?

It seems the compiler should complain "equally" about the
three dereferences. Do you agree?

As for the fix, I think all is needed is, e.g.
LONG sign;
memcpy(&sign, csw+0, sizeof sign);
/* etc */
Do you agree?

Regards.
 
J

James Kuyper

Hello,

I'm supposed to "clean up" code that does things which the
standard frowns upon, such as

typedef long int LONG;
typedef unsigned long ULONG;
{
unsigned char csw[ 80 ] = { 0 };
fill_array(csw);
LONG sign = *(LONG*)&csw[0];
ULONG tag = *(ULONG *)&csw[4];
LONG residue = *(LONG*)&csw[8];
}

AFAIU, there are several problems with this code.

1. Since csw is a byte array, it might not be correctly
aligned for long accesses.
Correct.

2. I think the compiler is allowed to assume that a given
object cannot be accessed through two incompatible pointers
(Is this what aliasing refers to?)
Correct.

Strangely, my compiler (gcc) complains only about the
first dereference:

warning: dereferencing type-punned pointer will break strict-aliasing rules

AFAICT, all three lines have the same issue, right?

It seems the compiler should complain "equally" about the
three dereferences. Do you agree?

That seems reasonable, but the C standard only requires one diagnostic
for a program, no matter how many separate reasons there might be why a
diagnostic is required. The diagnostic doesn't have to contain any
useful information; it doesn't have to be in a language you (or anyone
else) know how to read. The requirement could be met by causing a light
to blink red. Providing anything more than that is a matter of "Quality
of Implementation" (QoI) which is outside the scope of the standard.

Only the gcc developers can actually change this.
As for the fix, I think all is needed is, e.g.
LONG sign;
memcpy(&sign, csw+0, sizeof sign);
/* etc */
Do you agree?

Yes.
 
N

Noob

James said:
Noob said:
typedef long int LONG;
typedef unsigned long ULONG;
{
unsigned char csw[ 80 ] = { 0 };
fill_array(csw);
LONG sign = *(LONG*)&csw[0];
ULONG tag = *(ULONG *)&csw[4];
LONG residue = *(LONG*)&csw[8];
}
[snip]
warning: dereferencing type-punned pointer will break strict-aliasing rules

It seems the compiler should complain "equally" about the
three dereferences. Do you agree?

That seems reasonable, but the C standard only requires one diagnostic
for a program, no matter how many separate reasons there might be why a
diagnostic is required.

When you say "one diagnostic for a program" does that mean if I fix
the first line, the compiler should complain about the second?
The diagnostic doesn't have to contain any
useful information; it doesn't have to be in a language you (or anyone
else) know how to read. The requirement could be met by causing a light
to blink red. Providing anything more than that is a matter of "Quality
of Implementation" (QoI) which is outside the scope of the standard.

Roger that.
 
J

James Kuyper

James said:
Noob said:
typedef long int LONG;
typedef unsigned long ULONG;
{
unsigned char csw[ 80 ] = { 0 };
fill_array(csw);
LONG sign = *(LONG*)&csw[0];
ULONG tag = *(ULONG *)&csw[4];
LONG residue = *(LONG*)&csw[8];
}
[snip]
warning: dereferencing type-punned pointer will break strict-aliasing rules

It seems the compiler should complain "equally" about the
three dereferences. Do you agree?

That seems reasonable, but the C standard only requires one diagnostic
for a program, no matter how many separate reasons there might be why a
diagnostic is required.

When you say "one diagnostic for a program" does that mean if I fix
the first line, the compiler should complain about the second?

Actually, I wasn't paying enough attention. This code contains serious
defects, and it's a good thing that gcc provides the option of giving
you warnings about them, and it would be more reasonable for it to warn
in all three places. However, what's defective about the code is that
the behavior is undefined. It's not a syntax error or a constraint
violation, so no diagnostic is required. Therefore my comment, while
true in isolation, doesn't apply in this context.
 
T

tom st denis

Only the gcc developers can actually change this.


Yes.

What? That's not how to load an integer type from a char array. He
will NEED to do something like

sign = csw[0] | (csw[1] << 8) | ... ;

Or whatever endianess the data is meant to be in.

Tom
 
J

James Kuyper

Only the gcc developers can actually change this.


Yes.

What? That's not how to load an integer type from a char array. He
will NEED to do something like

sign = csw[0] | (csw[1] << 8) | ... ;

Or whatever endianess the data is meant to be in.

I was assuming that he was reading in an integer object in the same
format used by the implementation that compiled the code, possibly even
by another part of the same program. Keep in mind that the code he's
updating is currently working; it just needs updating. If the data
source was using a different format, that wouldn't have been the case.

We've been given no information about what restrictions there are on the
input data format; they might be quite strict, or non-existent. Unless
they're very strict, the function needs additional information to
determine which conversions (such as the ones you describe) are needed.
 
K

Kaz Kylheku

Hello,

I'm supposed to "clean up" code that does things which the
standard frowns upon, such as

This is generally a waste of time unless there is an economic justification,
such as: a paying customer wants to use the program on platform X and it does
not work.

If the program works now, it's not going to translate to any tangible benefit
to the end user, and with any change, there is the risk of screwing it up
more.
warning: dereferencing type-punned pointer will break strict-aliasing rules

GCC supports type punning code which breaks strict aliasing rules.
You just have to give it the right option.

GCC implicitly enables -fstrict-aliasing at optimization levels -O2 and above,
but you can override that with -fno-strict-aliasing later on the command line.

This band-aid fix is way less effort than mucking around with the code.
 
N

Noob

Kaz said:
This is generally a waste of time unless there is an economic justification,
such as: a paying customer wants to use the program on platform X and it does
not work.

If the program works now, it's not going to translate to any tangible benefit
to the end user, and with any change, there is the risk of screwing it up
more.


GCC supports type punning code which breaks strict aliasing rules.
You just have to give it the right option.

GCC implicitly enables -fstrict-aliasing at optimization levels -O2 and above,
but you can override that with -fno-strict-aliasing later on the command line.

This band-aid fix is way less effort than mucking around with the code.

I think I agree with your advice. I'll just tweak the command-line.
 
N

Noob

Tom said:
Noob said:
As for the fix, I think all is needed is, e.g.
LONG sign;
memcpy(&sign, csw+0, sizeof sign);
/* etc */

What? That's not how to load an integer type from a char array.
He will NEED to do something like

sign = csw[0] | (csw[1] << 8) | ... ;

Or whatever endianess the data is meant to be in.

I will NEED no such thing.
It's a little-endian protocol on a little-endian CPU.
No byte-shuffling is required.
 
T

Tim Rentsch

Noob said:
Hello,

I'm supposed to "clean up" code that does things which the
standard frowns upon, such as

typedef long int LONG;
typedef unsigned long ULONG;
{
unsigned char csw[ 80 ] = { 0 };
fill_array(csw);
LONG sign = *(LONG*)&csw[0];
ULONG tag = *(ULONG *)&csw[4];
LONG residue = *(LONG*)&csw[8];
}

AFAIU, there are several problems with this code.

1. Since csw is a byte array, it might not be correctly
aligned for long accesses.

2. I think the compiler is allowed to assume that a given
object cannot be accessed through two incompatible pointers
(Is this what aliasing refers to?)

Strangely, my compiler (gcc) complains only about the
first dereference:

warning: dereferencing type-punned pointer will break strict-aliasing rules

AFAICT, all three lines have the same issue, right?

In fact, there are several issues, and the gcc complaint is
misleading in this regard. That said, the three lines all have
all the issues.

Point of information: the phrase 'strict-aliasing rules' is
specific to gcc, and is different from what the C Standard
requires (the 'effective type' rules). So gcc's warning
is, in a very important sense, disguising the real problem.

It seems the compiler should complain "equally" about the
three dereferences. Do you agree?

Yes, and it should complain in a way that is more
indicative of the problem as the C Standard views it.
As for the fix, I think all is needed is, e.g.
LONG sign;
memcpy(&sign, csw+0, sizeof sign);
/* etc */
Do you agree?

I have to disagree with Kaz's advice in his response. Choosing a
gcc option to disallow "strict aliasing" (whatever that may be,
since gcc defines it however they please, and there is no clear
statement of what the definition is) is NOT the right way to
solve this problem. Using memcpy(), as you show, is one
alternative approach. Here is another:

union {
unsigned char csw[80];
LONG longs[ 80 / sizeof (LONG) ];
ULONG ulongs[ 80 / sizeof (ULONG) ];
} stuff;

fill_array( stuff.csw );
LONG sign = stuff.longs[0];
ULONG tag = stuff.ulongs[1];
LONG residue = stuff.longs[2];

This makes the behavior well-defined (assuming of course the data
read into stuff.csw is represented appropriately). In particular
it takes care of alignment problems, and satisfies the effect
type rules (aka "aliasing rules" or "anti-aliasing rules"). Then
you can use -fstrict-aliasing, and if gcc complains, you will
know that it is gcc, and not the C Standard, that is causing the
problem. A benefit of this approach is that casts and using
'void *' are avoided. A downside of this approach is that
offsets like 4, 8, etc, will need to be converted to their
"in-type" offsets, but that should be able to be mostly automated
and not too bad. And if you ever want to move the code to
a platform where 'sizeof (long) != 4' the transition will be
much easier.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,535
Members
45,007
Latest member
obedient dusk

Latest Threads

Top