Arrays of Unions

john

Hi

I read that "Unions may occur within arrays."

But a union is a variable that may hold (at different times)
objects of different types and sizes, whereas in an array each
element should have the same data type. So how can an array hold
a union that has objects of different types in different scenarios?
And what must the stride be when incrementing the array pointer?

Thanks in advance.
 
Seebs

I read that "
Unions may occur within arrays.
"
But, a Union is a variable that may hold (at different times)
objects of different types and sizes.
Yes.

whereas in an array, each element should have the same
data type . So, how can an array hold a union that has different
objects of different types during different scenarios? And what must the
stride be when incrementing the array pointer ?

The union object itself has a size, which is never smaller than its
largest member (but may be larger). That is the size of the "union"
object, and so the size of each element of an array of unions.

-s
 
Keith Thompson

john said:
I read that "
Unions may occur within arrays.
"

But, a Union is a variable that may hold (at different times)
objects of different types and sizes.
whereas in an array, each element should have the same
data type . So, how can an array hold a union that has different
objects of different types during different scenarios? And what must the
stride be when incrementing the array pointer ?

In an array of unions, every element of the array does have the same
type. That type just happens to be a union type.

The size of a union object is (at least) the size of its largest member.

For example:

union u {
char tiny; /* 1 byte */
double medium; /* typically 8 bytes */
char huge[1024]; /* 1024 bytes */
};
union u arr[100];

The array object "arr" contains 100 elements, each of which is an
object of type "union u". Each element may hold either a char, a
double, or an array of 1024 chars. To allow this, each element is at
least 1024 bytes in size. Yes, this means that if you store a char
value in "tiny", you're wasting 1023 bytes. If that's a problem,
you should use something other than a union to hold your data.

The stride of the array is (at least) 1024 bytes. (It could be
even larger if the implementation adds padding at the end, which
it's permitted to do, but it's unlikely in this case.)
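
The stride can be checked directly. The sketch below is only an illustration: `stride_in_bytes` is a hypothetical helper, not part of the example above, and it measures the distance between two adjacent elements through `char` pointers. Whatever size the implementation chooses, the result equals `sizeof(union u)`.

```c
#include <stddef.h>

union u {
    char   tiny;        /* 1 byte */
    double medium;      /* typically 8 bytes */
    char   huge[1024];  /* 1024 bytes */
};

/* Hypothetical helper: distance in bytes between two adjacent
   elements of an array of union u. */
size_t stride_in_bytes(void)
{
    union u arr[100];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}
```

On a typical implementation this returns 1024, but only "at least 1024" is guaranteed.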

Note that the implementation doesn't keep track for you of which
member of the union is currently active. It's up to you to do that
yourself, or to deal with the consequences if you read one member
after writing a different one.
 
Gene

Hi

I read that "
Unions may occur within arrays.
"

But, a Union is a variable that may hold (at different times)
objects of different types and sizes.
whereas in an array, each element should have the same
data type . So, how can an array hold a union that has different
objects of different types during different scenarios? And what must the
stride be when incrementing the array pointer ?

Thanks in advance.

The union is itself a type, and every array element of that type
has the same fixed size, i.e. the value that sizeof yields for it.
In practice, the union size will be the size of its largest
member, possibly with some padding bytes added.
 
Peter Nilsson

john said:
Hi

I read that "
Unions may occur within arrays.
"

But, a Union is a variable

No. A union is a type not an object.
that may hold (at different times) objects of different
types and sizes.

An object of union type may have a value of one of the
member types.
whereas in an array, each element should have the same
data type.

And an array of (specific) union type, is such an array.
So, how can an array hold a union that has different
objects of different types during different scenarios?

Because a union is itself an object type.
And what must the stride be when incrementing the array
pointer ?

The next element always has an increment of 1, no matter
what the element type.

The size of a union type will always be large enough to
hold the largest member type. It may even be larger if
bizarre alignment restrictions require.
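
A small sketch of the "increment of 1" point, with made-up names (`union value`, `advance`): stepping a pointer with `++` always moves it one element forward, and subtracting two such pointers counts elements, not bytes.

```c
#include <stddef.h>

union value {
    char c;
    long n;
};

/* Hypothetical helper: step a pointer forward `steps` times and
   report how many elements it moved. */
ptrdiff_t advance(const union value *base, int steps)
{
    const union value *p = base;
    for (int i = 0; i < steps; ++i)
        ++p;            /* next element: sizeof *p bytes further on */
    return p - base;    /* difference is counted in elements */
}
```

The caller must make sure `steps` does not go beyond one past the end of the array.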
 
Nick

john said:
Hi

I read that "
Unions may occur within arrays.
"

But, a Union is a variable that may hold (at different times)
objects of different types and sizes.
whereas in an array, each element should have the same
data type . So, how can an array hold a union that has different
objects of different types during different scenarios? And what must the
stride be when incrementing the array pointer ?

The size of any one union is the maximum size of all the members of the
union. So with
union u {
    char c;
    int a;
    double x[10];
};

any "union u" will be big enough to hold ten doubles.

The stride in bytes will be that maximum size plus any padding
necessary. But you don't need to worry about that for incrementing
array pointers - C takes care of it: add 1 to the pointer and it will do
the hard work for you.
 
Peter Nilsson

Nick said:
The size of any one union is the maximum size of all the
members of the union.

A union may be larger than its largest element, just as a
struct may be larger than the sum of its element sizes.

For instance, an implementation is free to make the
following union 4 bytes in size...

union x { char c; };
 
Phil Carmody

john said:
Hi

I read that "
Unions may occur within arrays.
"

But, a Union is a variable that may hold (at different times)
objects of different types and sizes.
whereas in an array, each element should have the same
data type . So, how can an array hold a union that has different
objects of different types during different scenarios? And what must the
stride be when incrementing the array pointer ?

Freight ships contain equally-sized shipping containers.
Shipping containers may contain all kinds of different things.

The array contains equally-sized unions that may contain different-
sized objects. The array doesn't directly contain different-sized
objects.

Phil
 
Nick

Peter Nilsson said:
A union may be larger than its largest element, just as a
struct may be larger than the sum of its element sizes.

Yes, that's why in the bit you snipped I said "plus any padding". Maybe
I was using "padding" slightly incorrectly, meaning also "space between
items for alignment purposes".
For instance, an implementation is free to make the
following union 4 bytes in size...

union x { char c; };

In fact, in many cases it will have to.
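
A case where trailing padding is commonly needed, sketched on the assumption that `int` is more strictly aligned than `char`: a union whose largest member is 5 bytes. The names `union padded`, `padded_size` and `largest_member_size` are invented for this illustration. The exact size is up to the implementation, but it can never be smaller than the largest member, and it has to be a multiple of the union's alignment so that every element of an array stays aligned.

```c
#include <stddef.h>

union padded {
    char text[5];   /* largest member: 5 bytes */
    int  number;    /* often 4 bytes with 4-byte alignment */
};

/* Size the implementation actually gives the union. */
size_t padded_size(void)
{
    return sizeof(union padded);
}

/* Size of the largest member, for comparison. */
size_t largest_member_size(void)
{
    size_t a = sizeof(char[5]);
    size_t b = sizeof(int);
    return a > b ? a : b;
}
```

Where `int` is 4 bytes and 4-byte aligned, `padded_size()` is typically 8 while `largest_member_size()` is 5.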
 
john

Keith said:
john said:
I read that "
Unions may occur within arrays.
"

But, a Union is a variable that may hold (at different times) objects
of different types and sizes. whereas in an array, each element should
have the same data type . So, how can an array hold a union that has
different objects of different types during different scenarios? And
what must the stride be when incrementing the array pointer ?

In an array of unions, every element of the array does have the same
type. That type just happens to be a union type.

The size of a union object is (at least) the size of its largest member.

For example:

union u {
char tiny; /* 1 byte */
double medium; /* typically 8 bytes */
char huge[1024]; /* 1024 bytes */
};
union u arr[100];

The array object "arr" contains 100 elements, each of which is an object
of type "union u". Each element may hold either a char, a double, or an
array of 1024 chars. To allow this, each element is at least 1024 bytes
in size. Yes, this means that if you store a char value in "tiny",
you're wasting 1023 bytes. If that's a problem, you should use
something other than a union to hold your data.

The stride of the array is (at least) 1024 bytes. (It could be even
larger if the implementation adds padding at the end, which it's
permitted to do, but it's unlikely in this case.)

Note that the implementation doesn't keep track for you of which member
of the union is currently active. It's up to you to do that yourself,
or to deal with the consequences if you read one member after writing a
different one.

I believe you are wrong about this - consider this example...

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", u);
}

On gcc, this yields
./prog
axxxxxxx

The problem is that it just doesn't make sense to have an array of
inconsistent types.
 
Keith Thompson

Peter Nilsson said:
No. A union is a type not an object.

A union type is a type. A union object is an object. A union value
is a value. A "union", without qualification, could conceivably
refer to any of those (or to the Teamsters).

The standard does sometimes use the word "union" without
qualification; I think it usually means "union type", but in some
cases either that or "union object" would make sense.
 
Eric Sosman

Keith said:
[...]
Note that the implementation doesn't keep track for you of which member
of the union is currently active. It's up to you to do that yourself,
or to deal with the consequences if you read one member after writing a
different one.

I believe you are wrong about this - consider this example...

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);

Aside: Why not just `sizeof u'?
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", u);
}

On gcc, this yields
./prog
axxxxxxx

This proves nothing, because it invokes undefined behavior (for
at least two different reasons).
The problem is that it just doesn't make sense to have an array of
inconsistent types.

You don't: You have an array whose elements are all of one
type, just like all other arrays in C. The type in question is
a union type, and all the elements have the same sizeof. However,
there is no requirement that the union at [0] and the union at [1]
are currently storing values of the same element type. In your
example above, we could have

for (i = 0; i < 10; ++i) {
    if (i % 2 == 0)
        u[i].a = '0' + i;
    else
        u[i].b = i * i;
}

... so the even-numbered unions hold `char' values and the odds
hold `long's. Knowing that, you could then write

for (i = 10; --i >= 0; ) {
    if (i % 2 == 0)
        printf ("u[%d].a = %c\n", i, u[i].a);
    else
        printf ("u[%d].b = %ld\n", i, u[i].b);
}

... since for each array element you're accessing the union member
that was most recently stored. If you wrote `if (i % 2 != 0)' or
`if (i % 3 == 0)' or some such, the behavior would be undefined.
As Keith said, it's up to you to remember which of a union's elements
is "current."
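
For anyone who wants to compile it, here is one possible complete version of the two loops, with each element indexed as `u[i]`. The tag `union cl` and the function names are assumptions added for the sketch; the original example used a tagless union declared inside main.

```c
#include <stdio.h>

union cl { char a; long b; };

/* Even-numbered elements get a char, odd-numbered ones a long. */
void fill_alternating(union cl *u, int n)
{
    for (int i = 0; i < n; ++i) {
        if (i % 2 == 0)
            u[i].a = (char)('0' + i);
        else
            u[i].b = (long)i * i;
    }
}

/* Reads back only the member that fill_alternating() stored. */
void print_alternating(const union cl *u, int n)
{
    for (int i = n; --i >= 0; ) {
        if (i % 2 == 0)
            printf("u[%d].a = %c\n", i, u[i].a);
        else
            printf("u[%d].b = %ld\n", i, u[i].b);
    }
}
```

Both functions rely on the same even/odd rule; nothing in the array itself records which member is current.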
 
Eric Sosman

[...] A "union", without qualification, could conceivably
refer to any of those (or to the Teamsters).

Hoffa League, Hoffa League,
Hoffa League onward ...
 
Keith Thompson

john said:
Keith Thompson wrote: [...]
Note that the implementation doesn't keep track for you of which member
of the union is currently active. It's up to you to do that yourself,
or to deal with the consequences if you read one member after writing a
different one.

I believe you are wrong about this

I don't believe so.
consider this example...

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", u);
}

On gcc, this yields
./prog
axxxxxxx

You appear to have ignored a number of compile-time warnings.

So let's consider this example:

#include <stdio.h>
#include <string.h>

int main(void)
{
union { char a; long int b; } u[10];
memset(u, 'x', sizeof u);
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", (char*)u);
return 0;
}

which yields the same output without quite so much undefined
behavior. But it still treats an array of unions as if it were
a string. What is this intended to demonstrate? Just what do
you think is happening here, and how does it contradict what I
wrote above?
The problem is that it just doesn't make sense to have an array of
inconsistent types.

Right, which is why it's not possible in C.

(Actually, heterogeneous arrays are quite possible and sensible in
dynamically typed languages, but C isn't such a language. A Perl
array, for example, might contain a mixture of numbers, strings,
references, and undefined values. C unions can be a good tool for
emulating something like that.)

Every element of the array ``u'' is of the same type, which happens
to be a union type. Each such element may contain, at various times,
either a char value or a long int value.

By itself, the code above doesn't appear to do anything useful.
But if you keep track somehow of the current value of each
element, it might make perfect sense. For example, you might
have a separate array each of whose elements specifies whether
the corresponding element of ``u'' currently contains a char or
a long int. Or you might have some complex encoding where the
first element is assumed to contain a value of a certain type,
and the types of the current values of the following elements are
determined by the previous values. Or (and this is more common),
the union could be a member of a struct, and another struct member
could specify the currently active member (some languages provide
this directly as "variant records").
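
A minimal sketch of that last arrangement, a union inside a struct with a second member recording which union member is active. All the names here (`enum kind`, `struct variant`, `make_char`, `make_long`, `as_long`) are invented for illustration.

```c
enum kind { KIND_CHAR, KIND_LONG };

struct variant {
    enum kind kind;             /* which member of `as` is active */
    union { char c; long n; } as;
};

struct variant make_char(char c)
{
    struct variant v;
    v.kind = KIND_CHAR;
    v.as.c = c;
    return v;
}

struct variant make_long(long n)
{
    struct variant v;
    v.kind = KIND_LONG;
    v.as.n = n;
    return v;
}

/* Reads whichever member the tag says was stored last. */
long as_long(const struct variant *v)
{
    return v->kind == KIND_CHAR ? (long)v->as.c : v->as.n;
}
```

An array such as `struct variant arr[10];` then has elements of one type and one size, and each element carries its own record of what it currently holds.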
 
john

Keith said:
john said:
Keith Thompson wrote: [...]
Note that the implementation doesn't keep track for you of which
member of the union is currently active. It's up to you to do that
yourself, or to deal with the consequences if you read one member
after writing a different one.

I believe you are wrong about this

I don't believe so.
consider this example...

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", u);
}

On gcc, this yields
./prog
axxxxxxx

You appear to have ignored a number of compile-time warnings.

So let's consider this example:

#include <stdio.h>
#include <string.h>

int main(void)
{
union { char a; long int b; } u[10];
memset(u, 'x', sizeof u);
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", (char*)u);
return 0;
}

which yields the same output without quite so much undefined behavior.
But it still treats an array of unions as if it were a string. What is
this intended to demonstrate? Just what do you think is happening here,
and how does it contradict what I wrote above?
The problem is that it just doesn't make sense to have an array of
inconsistent types.

Right, which is why it's not possible in C.

Possibly the code above was not the best example. I believe the code
below shows the problem more clearly...

print_chars(const char *p)
{
auto int i;
for(i=0;i<4;i++) printf("%c",*p++);
}

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].a = 'b';
u[2].a = 'c';
u[3].a = 'd';
print_chars(u);
putchar('\n');
}

With gcc I get:
./prog
axxx

instead of abcd. Arrays of unions are not sensible even when the types
are consistent!
 
Ben Pfaff

john said:
print_chars(const char *p)
{
auto int i;
for(i=0;i<4;i++) printf("%c",*p++);
}

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].a = 'b';
u[2].a = 'c';
u[3].a = 'd';
print_chars(u);
putchar('\n');
}

Are you confusing the following two declarations?
union { char a; long int b; } u[10];
union { char a[10]; long int b[10]; } u;
There are important differences. The latter is what you appear
to expect.
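
The difference shows up in how far apart consecutive chars are. This is a sketch with made-up tags and helper names (`union element`, `union whole` and the two `char_gap_*` functions); it measures the byte distance between the first and second char in each layout.

```c
#include <stddef.h>

union element { char a; long int b; };          /* used as: union element u[10]; */
union whole   { char a[10]; long int b[10]; };  /* used as: union whole u;       */

/* Array of unions: the second char lives in the next union object. */
size_t char_gap_in_array_of_unions(void)
{
    union element u[10];
    return (size_t)(&u[1].a - &u[0].a);
}

/* Union of arrays: the chars are adjacent inside one array member. */
size_t char_gap_in_union_of_arrays(void)
{
    union whole u;
    return (size_t)(&u.a[1] - &u.a[0]);
}
```

The first gap is `sizeof(union element)`, which is at least `sizeof(long int)`; the second is always 1.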
 
Eric Sosman

Keith said:
john said:
Keith Thompson wrote: [...]
Note that the implementation doesn't keep track for you of which
member of the union is currently active. It's up to you to do that
yourself, or to deal with the consequences if you read one member
after writing a different one.

I believe you are wrong about this

I don't believe so.
consider this example...

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", u);
}

On gcc, this yields
./prog
axxxxxxx

You appear to have ignored a number of compile-time warnings.

So let's consider this example:

#include<stdio.h>
#include<string.h>

int main(void)
{
union { char a; long int b; } u[10];
memset(u, 'x', sizeof u);
u[0].a = 'a';
u[1].b = 0;
printf("%s\n", (char*)u);
return 0;
}

which yields the same output without quite so much undefined behavior.
But it still treats an array of unions as if it were a string. What is
this intended to demonstrate? Just what do you think is happening here,
and how does it contradict what I wrote above?
The problem is that it just doesn't make sense to have an array of
inconsistent types.

Right, which is why it's not possible in C.

Possibly the code above was not the best example. I believe the code
below shows the problem more clearly...

print_chars(const char *p)
{
auto int i;

This may be the first time I've seen the `auto' keyword used
except as an obfuscatory device.
for(i=0;i<4;i++) printf("%c",*p++);
}

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].a = 'b';
u[2].a = 'c';
u[3].a = 'd';
print_chars(u);

The compiler should have given you a diagnostic here: The
function parameter is a `const char *' but the argument you provide
is a `union <tagless> *'. The latter does not convert to the former
automatically, so the compiler should have complained. If it didn't,
I suspect you're operating the compiler in a C-ish but not-quite-C
mode, such as the default mode of gcc. (Even so, I'm surprised
that you didn't get at least a warning.)

Had you received a warning (or had you not ignored it if it did
in fact appear), that would/should have alerted you to the fact that
you're doing something suspicious. In particular, you're using a
`char*' to look at the "unspecified values" of the bytes that come
after `u[0].a'. This is legal (more or less) because in C any object
of any type can be viewed as an array of bytes. On the other hand,
there's no guarantee about what sort of output you might get, beyond
the first 'a' -- the values of the other three bytes are unspecified
(they are not even guaranteed to retain their original 'x' values).
putchar('\n');
}

With gcc I get:
./prog
axxx

instead of abcd. Arrays of unions are not sensible even when the types
are consistent!

An array of big-fat-unions-that-happen-to-hold-chars is not
an array of chars (except in the special sense mentioned earlier).
An array of big-fat-unions-that-happen-to-hold-shorts is not an
array of shorts.

Here's an analogy, perhaps not a good one but maybe it will
help. I show you a row of ten shipping cartons, each a one-foot
cube. You put a packet of breath mints in each carton, and the
tiny little packet fits into a carton with lots of room left over
so you also put in some bubble wrap to keep things from rattling
around.

All ten cartons are lined up on the floor, each abutting its
neighbors with no space in between. I point at the breath mints
in carton [0], and ask "How far is it from this packet of mints
to the next?" Here are two possible answers:

1) "The mint packet is three inches long, so the next mint
packet must be three inches away."

2) "The box holding the mint packet is a foot long and the
next packet is in the next box, so it's a foot away."

Right now, you seem to be leaning toward answer (1) -- which means
your career as a shipping clerk will be short and inglorious. ;-)
 
Kenny McCormack

john said:
Keith Thompson wrote: [...]
Note that the implementation doesn't keep track for you of which member
of the union is currently active. It's up to you to do that yourself,
or to deal with the consequences if you read one member after writing a
different one.

I believe you are wrong about this

I don't believe so.

Holy Moly!!!!

Kiki doesn't think he is wrong!

Stop the presses! Batten down the hatches! Duck and Cover!

--
(This discussion group is about C, ...)

Wrong. It is only OCCASIONALLY a discussion group
about C; mostly, like most "discussion" groups, it is
off-topic Rorsharch [sic] revelations of the childhood
traumas of the participants...
 
Willem

john wrote:
) Possibly the code above was not the best example. I believe the code
) below shows the problem more clearly...
)
) print_chars(const char *p)
) {
) auto int i;
) for(i=0;i<4;i++) printf("%c",*p++);
) }
)
) main()
) {
) union { char a; long int b; } u[10];
) memset(u, 'x', 10 * sizeof *u);
) u[0].a = 'a';
) u[1].a = 'b';
) u[2].a = 'c';
) u[3].a = 'd';
) print_chars(u);
) putchar('\n');
) }
)
) With gcc I get:
) ./prog
) axxx
)
) instead of abcd. Arrays of unions are not sensible even when the types
) are consistent!

Are you being intentionally silly, or do you really have no idea of how
C works ?

The code above gives warnings about converting types; that is a good
indicator that what you're doing is completely wrong.

Try this print_chars:

print_chars(const union { char a; long int b; } *p)
{
    int i;
    for(i=0;i<4;i++) printf("%c",p[i].a);
}
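
A variant of the same idea that should compile without conversion warnings, offered as a sketch. It gives the union a tag (`union cl`, an invented name) so the function and the array share one type, since every tagless union declaration introduces a distinct type. The hypothetical `collect_chars` reaches each char by indexing the pointer as `p[i].a` and copies it into a caller-supplied buffer.

```c
#include <stddef.h>

union cl { char a; long int b; };

/* Copies the `a` member of the first n elements into out and
   terminates it; out must have room for n + 1 chars. */
void collect_chars(const union cl *p, size_t n, char *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = p[i].a;    /* step by whole union elements */
    out[n] = '\0';
}
```

With `u[0].a` to `u[3].a` set to 'a', 'b', 'c' and 'd', collecting four elements gives the string "abcd".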


Also, how many 'x'es do you think that memset() writes ?


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Keith Thompson

john said:
Keith said:
john said:
Keith Thompson wrote: [...]
Note that the implementation doesn't keep track for you of which
member of the union is currently active. It's up to you to do that
yourself, or to deal with the consequences if you read one member
after writing a different one.

I believe you are wrong about this

I don't believe so.
consider this example...
[snip]

You appear to have ignored a number of compile-time warnings.

So let's consider this example:
[snip]

which yields the same output without quite so much undefined behavior.
But it still treats an array of unions as if it were a string. What is
this intended to demonstrate? Just what do you think is happening here,
and how does it contradict what I wrote above?
The problem is that it just doesn't make sense to have an array of
inconsistent types.

Right, which is why it's not possible in C.

Possibly the code above was not the best example. I believe the code
below shows the problem more clearly...

print_chars(const char *p)
{
auto int i;
for(i=0;i<4;i++) printf("%c",*p++);
}

main()
{
union { char a; long int b; } u[10];
memset(u, 'x', 10 * sizeof *u);
u[0].a = 'a';
u[1].a = 'b';
u[2].a = 'c';
u[3].a = 'd';
print_chars(u);
putchar('\n');
}

With gcc I get:
./prog
axxx

instead of abcd. Arrays of unions are not sensible even when the types
are consistent!

Again, you have ignored a number of warnings from gcc, and not even
bothered to tell us about them. (I don't know of a way to make gcc
compile the above without warnings -- not that you should even try.)
Either that, or the code you're posting isn't the code you compiled.

Passing a pointer-to-union to a function that expects a
pointer-to-char (in this case, passing ``u'' to your print_chars()
function) is invalid, and for reasons having nothing to do with
unions.

I thought I had addressed whatever point you were trying to make
in the portion of my followup that you snipped. I'll try again.

It's true that an array of unions isn't likely to be useful if you
don't keep track somehow of which member of each union was most
recently stored. (Sometimes unions are used for type-punning,
but we'll assume that you only want to access the most recently
stored member.)

*However*, an array of unions can be useful if you *do* keep track
of the most recently stored member for each element.
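
One way to do that bookkeeping, sketched with invented names (`struct table`, `set_char`, `set_long`, `get_long`): keep a second array alongside the array of unions, recording for each element which member was stored last.

```c
enum active { ACTIVE_CHAR, ACTIVE_LONG };

union cl { char a; long int b; };

struct table {
    union cl    u[10];      /* the array of unions */
    enum active which[10];  /* which[i] describes u[i] */
};

void set_char(struct table *t, int i, char c)
{
    t->u[i].a = c;
    t->which[i] = ACTIVE_CHAR;
}

void set_long(struct table *t, int i, long int n)
{
    t->u[i].b = n;
    t->which[i] = ACTIVE_LONG;
}

/* Returns 1 and stores the value if u[i] holds a long, else 0. */
int get_long(const struct table *t, int i, long int *out)
{
    if (t->which[i] != ACTIVE_LONG)
        return 0;
    *out = t->u[i].b;
    return 1;
}
```

The setters keep the two arrays in step, so a reader never has to guess which member is current. Elements that were never set are not tracked by this sketch.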

If you disagree with anything I've said, please explain why.
If possible, please explain in English, not by posting some code
and saying that it shows the problem. Thanks.
 
