Union and strict aliasing

M

Maxim Fomin

Problem: there is need to allocate space for some object of unknown quantity and I know, that in "99% cases" it would be less than N items. In such cases it is better to use fixed-size array, however I don't want to impose limitations and cover all possible cases (when quantity is bigger that N). Asa trade-off I use union of fixed-size array with single element:

union u
{
type *arr[N]; /* type is opaque structure */
type *ptr;
};

If quantity is bigger than N, I assign allocated memory to ptr.

However, consider following example program (uhack.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct data
{
int val;
char *align;
} d0 = {0, "hello"},
d1 = {1, "world"},
d2 = {2, "from"},
d3 = {3, "a"},
d4 = {4, "simple"},
d5 = {5, "c"},
d6 = {6, "program"};

union
{
struct data *arr[7];
struct data *ptr;
} u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};

int main(void)
{
int i;
struct data *new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr.align);
}
new_ptr = malloc(9 * sizeof *new_ptr);
memcpy(new_ptr, u.ptr, 7 * sizeof *new_ptr);
u.ptr = new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr.align);
}
return 0;
}

Compiling with gcc (gcc -Wall -Wextra -pedantic -std=c99 -O0 -o gccO0 -Wstrict-aliasing -fstrict-aliasing uhack.c) gives expected result, but raising optimization level (gcc -Wall -Wextra -pedantic -std=c99 -O1 -o gccO1 -Wstrict-aliasing -fstrict-aliasing uhack.c) gives binary which segfaults onmy machine.

I don't understand where "strict aliasing rules" are broken (assuming that raising optimization level for given program reveals such errors):
- union trick unlikely is a source of problem, since it overlaps compatibletypes (even same): an object and first element of fixed-size array of suchobjects
- union u is likely to be initialized right;
- padding and trailing bits are also unlikely to cause problem, since members are accessed through . and ->

In addition, clang (clang -Wall -Wextra -pedantic -std=c99 -O3 -o clangO3-Wstrict-aliasing -fstrict-aliasing uhack.c) for O3 produces runnable binary, so I suppose there is somewhere tricky UB which I cannot find.
 
I

Ike Naar

Problem: there is need to allocate space for some object of unknown
quantity and I know, that in "99% cases" it would be less than N
items. In such cases it is better to use fixed-size array, however I
don't want to impose limitations and cover all possible cases (when
quantity is bigger that N). As a trade-off I use union of fixed-size
array with single element:

union u
{
type *arr[N]; /* type is opaque structure */
type *ptr;
};

If quantity is bigger than N, I assign allocated memory to ptr.

However, consider following example program (uhack.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct data
{
int val;
char *align;
} d0 = {0, "hello"},
d1 = {1, "world"},
d2 = {2, "from"},
d3 = {3, "a"},
d4 = {4, "simple"},
d5 = {5, "c"},
d6 = {6, "program"};

union
{
struct data *arr[7];
struct data *ptr;
} u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};

int main(void)
{
int i;
struct data *new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr.align);
}
new_ptr = malloc(9 * sizeof *new_ptr);
memcpy(new_ptr, u.ptr, 7 * sizeof *new_ptr);
u.ptr = new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr.align);
}
return 0;
}


Strange things happening here.
This program relies on the assumption that the variables
d0,d1,d2,d3,d4,d5,d6 are allocated contiguously and in that order.
The compiler is not obliged to honour this assumption.
See what happens when you swap, say, the lines

d1 = {1, "world"},
d2 = {2, "from"},

to get

d2 = {2, "from"},
d1 = {1, "world"},

In theory this should yield an equivalent program, but
you might find it changes the output printed by the program.

If, for whatever reason, the compiler decides to insert
padding between any of d0,d1,d2,d3,d4,d5,d6, the program
is likely to crash.

Note that the initialization

} u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};

is bogus (except for the initialization of the first element),
if the initialization were changed to, say,

} u = {{&d0, 0, 0, 0, 0, 0, 0}};

it would not change the working of the program, so why are
you taking the trouble to initialize u.arr[1..6] at all?
 
B

Ben Bacarisse

Maxim Fomin said:
Problem: there is need to allocate space for some object of unknown
quantity and I know, that in "99% cases" it would be less than N
items. In such cases it is better to use fixed-size array, however I
don't want to impose limitations and cover all possible cases (when
quantity is bigger that N). As a trade-off I use union of fixed-size
array with single element:

union u
{
type *arr[N]; /* type is opaque structure */
type *ptr;
};

If quantity is bigger than N, I assign allocated memory to ptr.

However, consider following example program (uhack.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct data
{
int val;
char *align;
} d0 = {0, "hello"},
d1 = {1, "world"},
d2 = {2, "from"},
d3 = {3, "a"},
d4 = {4, "simple"},
d5 = {5, "c"},
d6 = {6, "program"};

union
{
struct data *arr[7];
struct data *ptr;
} u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};

int main(void)
{
int i;
struct data *new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr.align);
}


<screeching brakes>
Whoa!

Forget all the problems you might have later, you're off the rails
already. If I could easily draw a diagram, it would be simpler to
explain, but let me try in words.

u.ptr occupies the same space as u.arr[0] -- they overlap -- and since
they have the same type, you can use them in similar ways. This means
that writing u.ptr is the same as writing u.arr[0] which is not at
all what you thought (I image). As Ike has said, if d1 happens to
follow d0 in memory, u.ptr[1], u.arr[0][1], u.arr[1] and &d2 will all
have the same value and things will, deceptively, look OK.

I don't think you can do what you are trying to do starting from here.
Given the limited example, it's hard to say what's best. I'd re-write
you example just using a simple pointer that either points to a static
array or to an allocated one, but the fact that you are not already
doing that suggests that your real usage is more complicated than your
posted example. I feel more details are needed.
new_ptr = malloc(9 * sizeof *new_ptr);
memcpy(new_ptr, u.ptr, 7 * sizeof *new_ptr);
u.ptr = new_ptr;
for (i = 0; i < 7; i++) {
printf("%s\n", u.ptr.align);
}
return 0;
}


<snip>
 
P

Phil Carmody

Maxim Fomin said:
Problem: there is need to allocate space for some object of unknown quantity and I know, that in "99% cases" it would be less than N items. In such cases it is better to use fixed-size array, however I don't want to impose limitations and cover all possible cases (when quantity is bigger that N). As a trade-off I use union of fixed-size array with single element:

union u
{
type *arr[N]; /* type is opaque structure */
type *ptr;
};

What's wrong with just coding the possibility of needing to
look elsewhere for more?

struct block_of_N_typeptr {
type *array[N];
struct block_of_N_typeptr *next;
};

But even that's higher-tech than you really need - what's wrong with
just realloc()?

Phil
--
I'd argue that there is much evidence for the existence of a God.
Pics or it didn't happen.
-- Tom (/. uid 822)
 
M

Maxim Fomin

воÑкреÑенье, 29 Ð¸ÑŽÐ»Ñ 2012 г., 3:13:29 UTC+4 пользователь Ike Naar напиÑал:
Strange things happening here.

This program relies on the assumption that the variables

d0,d1,d2,d3,d4,d5,d6 are allocated contiguously and in that order.

The compiler is not obliged to honour this assumption.

<skip>

Agree, but I found another problem - using u.ptr.align
instead of u.arr->align. Because this expression was
probably considered as intentional, the code may be
reconsidered as conforming.
Note that the initialization



} u = {{&d0, &d1, &d2, &d3, &d4, &d5, &d6}};



is bogus (except for the initialization of the first element),

if the initialization were changed to, say,



} u = {{&d0, 0, 0, 0, 0, 0, 0}};



it would not change the working of the program, so why are

you taking the trouble to initialize u.arr[1..6] at all?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,576
Members
45,054
Latest member
LucyCarper

Latest Threads

Top