C programmers! How do you use your 'union's?

Razmig K

Dear mates,

This is another survey of the common (and uncommon) nontrivial
uses of the aforementioned C construct.
It's posted by the same average C programmer who's made a similar
survey about the uses of 'enum's.

I know that space optimization is the key idea behind 'union's, and in
most cases a type-field is necessary to verify the type of the entity
in the 'union' variable.

I'm just curious about how my elder colleagues utilize this feature of
C in their implementations based on diverse programming styles in
diverse application domains.

Thank you for your interest.

Best regards,
//rk
 
Malcolm

Razmig K said:
This is another survey of the common (and uncommon) nontrivial
uses of the aforementioned C construct.

My programs are pretty much a union-free zone.

Razmig K said:
I'm just curious about how my elder colleagues utilize this feature of
C in their implementations based on diverse programming styles in
diverse application domains.

One place unions are used is in the X Window system. The idea is that the
client and the X server can reside on different machines, and packets can go
over a network. An obvious optimisation is to make each packet of fixed
size, with a type field telling you how to interpret the data - a mouse
click, a request to redraw, etc. The messages are therefore implemented as
unions.
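
The layout described above might be sketched roughly as follows. This is not the real Xlib definition: the names msg_type, click_msg, redraw_msg, message, MSG_SIZE and describe are all invented for illustration, and the 32-byte size is simply an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds; not the real X protocol codes. */
enum msg_type { MSG_CLICK = 1, MSG_REDRAW = 2 };

/* Every message struct begins with the same int type field. */
struct click_msg  { int type; int x, y, button; };
struct redraw_msg { int type; int x, y, width, height; };

#define MSG_SIZE 32   /* assumed fixed packet size for this sketch */

union message {
    int type;                      /* overlays the first field of each struct */
    struct click_msg click;
    struct redraw_msg redraw;
    unsigned char pad[MSG_SIZE];   /* keeps the union at least MSG_SIZE bytes */
};

/* Decide how to interpret a message by looking at its type field. */
const char *describe(const union message *m)
{
    assert(m != NULL);
    switch (m->type) {
    case MSG_CLICK:  return "click";
    case MSG_REDRAW: return "redraw";
    default:         return "unknown";
    }
}
```

A receiver would read MSG_SIZE bytes into a union message, check the type field, and only then touch the matching member.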
 
Severian

In my largest project, 10 years old and going strong, I found exactly
*two* uses of 'union' in ~500K lines of C code.

And were I refactoring either area of the code base today, I would get
rid of them.

- Sev
 
Peter Ammon

I use unions in my "ooze" code. An ooze is an abstract input stream
that can represent either a string or a file. Mr. Heathfield's
thesaurus deserves credit for the name.
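
For readers who cannot reach the linked files, here is a rough sketch of what a stream like this could look like. It is not the ooze code itself: the names source, source_kind, source_from_string, source_from_file and source_getc are hypothetical, and the design is only a guess at the general idea of one handle that reads from either a string or a FILE.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names; this is not the code behind the links in this post. */
enum source_kind { SRC_STRING, SRC_FILE };

struct source {
    enum source_kind kind;      /* says which union member is live */
    union {
        const char *str;        /* next unread character of a string */
        FILE *fp;               /* an open stdio stream */
    } u;
};

struct source source_from_string(const char *s)
{
    struct source src;
    assert(s != NULL);
    src.kind = SRC_STRING;
    src.u.str = s;
    return src;
}

struct source source_from_file(FILE *fp)
{
    struct source src;
    assert(fp != NULL);
    src.kind = SRC_FILE;
    src.u.fp = fp;
    return src;
}

/* Return the next character, or EOF at end of input, like getc(). */
int source_getc(struct source *src)
{
    switch (src->kind) {
    case SRC_STRING:
        if (*src->u.str == '\0')
            return EOF;
        return (unsigned char)*src->u.str++;
    case SRC_FILE:
        return getc(src->u.fp);
    }
    return EOF;
}
```

Code that consumes characters only ever calls source_getc, so it does not need to know which kind of input it was given.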

For your viewing pleasure:
http://homepage.mac.com/gershwin/temp/ooze.h
http://homepage.mac.com/gershwin/temp/ooze.c

Oh, and the license for this stuff (just in case)
http://www.people.cornell.edu/pages/pa44/princess/license.txt

-Peter
 
istartedi

I use unions in a stack-based VM. Several different types can be pushed
onto the same stack. I'm hard pressed to think of something more elegant
than unions for that.
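
A minimal sketch of that kind of stack, assuming a toy VM with three value types. The names value, val_type, vm_stack, vm_push, vm_pop, make_int and make_double are made up here and are not taken from the poster's VM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value and stack types for a toy VM. */
enum val_type { VAL_INT, VAL_DOUBLE, VAL_STR };

struct value {
    enum val_type type;         /* tag: which member of 'as' is valid */
    union {
        long i;
        double d;
        const char *s;
    } as;
};

#define STACK_MAX 64

struct vm_stack {
    struct value slots[STACK_MAX];
    size_t top;                 /* number of values currently held */
};

void vm_push(struct vm_stack *st, struct value v)
{
    assert(st->top < STACK_MAX);    /* overflow is a caller bug here */
    st->slots[st->top++] = v;
}

struct value vm_pop(struct vm_stack *st)
{
    assert(st->top > 0);            /* so is underflow */
    return st->slots[--st->top];
}

struct value make_int(long i)
{
    struct value v;
    v.type = VAL_INT;
    v.as.i = i;
    return v;
}

struct value make_double(double d)
{
    struct value v;
    v.type = VAL_DOUBLE;
    v.as.d = d;
    return v;
}
```

Every slot has the same size whatever it holds, which is what lets a plain array serve as the stack.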

I missed the poll on enums. I never use them. I always #define stuff.

Now next you'll probably ask about gotos. I was taught to be a goto snob by
more than one professor. Recently though, I've loosened up and used some
gotos for functions that require extensive "clean up" before returning.
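
The clean-up style mentioned here usually looks something like the sketch below. The function concat_checked is a contrived, hypothetical example, chosen only because it acquires two resources and has several failure points.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: join a and b into out.
   Returns 0 on success, -1 on any failure. Every exit path
   goes through the single cleanup label. */
int concat_checked(const char *a, const char *b, char *out, size_t outsz)
{
    int rc = -1;                 /* pessimistic default */
    char *copy_a = NULL;
    char *copy_b = NULL;

    assert(a != NULL && b != NULL && out != NULL);

    copy_a = malloc(strlen(a) + 1);
    if (copy_a == NULL)
        goto cleanup;
    strcpy(copy_a, a);

    copy_b = malloc(strlen(b) + 1);
    if (copy_b == NULL)
        goto cleanup;
    strcpy(copy_b, b);

    if (strlen(copy_a) + strlen(copy_b) + 1 > outsz)
        goto cleanup;            /* result would not fit */

    strcpy(out, copy_a);
    strcat(out, copy_b);
    rc = 0;

cleanup:
    free(copy_b);                /* free(NULL) is a no-op */
    free(copy_a);
    return rc;
}
```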

--$teve
 
Joe Wright

istartedi said:
[snip]
Oh, and the license for this stuff (just in case)
http://www.people.cornell.edu/pages/pa44/princess/license.txt

Holy horizontals, Batman!

There are no line terminators. Quick, the Bat-cariage!
return that is... (groan).
First, it's carriage. Or maybe carrier. Think typewriters.
The IBM Model B had a carriage. That was the mechanism which held the
platen and therefore the paper and moved it (through escapes) past a
fixed print mechanism. The Teletype printers and the IBM Selectric have
fixed platen (paper) and moving print mechanisms called carriers. So
what does CR mean after all? Sorry. :)
 
Kevin Easton

One use for unions is if you have a data structure - say, a binary tree
- where the data carried by a node depends on its type:

#include <time.h>   /* for struct tm */

struct node {
    struct node *child_left, *child_right;
    /* Tag: says which member of data is currently valid. */
    enum { NODE_TYPE_A, NODE_TYPE_B, NODE_TYPE_C } type;
    union {
        struct {
            int x;
            char *y;
            char *z;
        } type_a;
        struct {
            int x;
            void *y;
        } type_b;
        struct {
            struct tm x;
            long int y;
        } type_c;
    } data;
};

One place this example in particular appears is building expression
parse trees.
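
As a small illustration of the parse-tree case, here is a sketch in the spirit of the struct above. The expr type, its kinds, and eval are hypothetical names chosen for this example, not part of the post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression node: the payload depends on the kind. */
enum expr_kind { EXPR_NUM, EXPR_ADD, EXPR_MUL, EXPR_NEG };

struct expr {
    enum expr_kind kind;
    union {
        double num;                                  /* EXPR_NUM */
        struct { struct expr *lhs, *rhs; } binary;   /* EXPR_ADD, EXPR_MUL */
        struct expr *operand;                        /* EXPR_NEG */
    } data;
};

/* Walk the tree, reading only the union member that matches the kind. */
double eval(const struct expr *e)
{
    assert(e != NULL);
    switch (e->kind) {
    case EXPR_NUM:
        return e->data.num;
    case EXPR_ADD:
        return eval(e->data.binary.lhs) + eval(e->data.binary.rhs);
    case EXPR_MUL:
        return eval(e->data.binary.lhs) * eval(e->data.binary.rhs);
    case EXPR_NEG:
        return -eval(e->data.operand);
    }
    return 0.0;   /* not reached for valid kinds */
}
```

The switch on kind is the only place that needs to know which member is live; everything else just passes struct expr pointers around.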

- Kevin.
 
CBFalconer

Jens said:
Me too...

I used them once in a parser written with bison, and in an
implementation of a network protocol, but usually I avoid them
since I dislike accessing the same data as two different
types. And even if I am forced to do so (hexdumps, for example), I
usually use pointers of different types.

This is a very short-sighted view. Unions can be very effective
in adapting data records to their function. They can be used to
build the logical (but verbose) equivalent of a Pascal variant
record.

For example, if you have a symbol table, and want to record the
characteristics of that symbol, e.g. a constant, a define, a
function, a variable, all of which have widely different things to
record, a union is the appropriate mechanism. One field of the
structure will hold the type of the identifier, another may point
to the spelling, and a union can hold such things as values,
constantness, etc.

Even with today's monstrous memories, conservation is useful. If
nothing else, it reduces cache misses and thrashing of virtual
memory.

If you examine Linux kernel source or filesystems, I believe you
will find many examples of unions.
 
Severian

This is similar to one of my uses of union that I would not now do the
same way. I maintain an array of "steps" to perform, and each step
contains a step type and the data for the step. The data is different
for each type, so I used a union in my original design (about 8 or 9
years ago).

However, since the data in my case is variable-length, it would
actually be more space-efficient to separately allocate each step's
data and point to it, and I would recode it that way.

Were the data all the same length (within a small %), or all very
small, a union would be appropriate.

- Sev
 
CBFalconer

This is where Pascal variant records are superior. new(thing,
kind) can be called with a parameter specifying the variant, and
only enough memory for that variant will be allocated. With C you
will have to nest malloc calls and set up a chain of void*
pointers. This expands the record (struct) with the space to hold
those pointers.

C99 has provisions for implementing the equivalent of Pascal
variant records.
 
Dave Thompson

CBFalconer said:
This is where Pascal variant records are superior. new(thing,
kind) can be called with a parameter specifying the variant, and
only enough memory for that variant will be allocated. With C you
will have to nest malloc calls and set up a chain of void*
pointers. This expands the record (struct) with the space to hold
those pointers.

First, I don't see what void* pointers, or any kind of pointers, have
to do with it. Second, you can fairly easily compute the size of any
"variant" (in C, case of trailing union) with offsetof and sizeof.

I think it is technically illegal to access the resulting memory as
the whole type, but in practice it should work as long as you only
access (parts of) the variant that is allocated, as is required for
Pascal also. The only thing that's likely a problem is assigning, or
passing as an argument or returning by value, the whole type, and
(only) the first can be fixed by instead memcpy'ing the right size.
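
The offsetof-plus-sizeof computation might look like the sketch below. The shape type and the shape_size and shape_new helpers are hypothetical. As noted above, the standard does not clearly bless using an undersized allocation through the full struct type, so treat this as something that tends to work in practice rather than a guarantee, and never copy such an object by plain assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum shape_kind { SHAPE_POINT, SHAPE_BIG };

struct shape {
    enum shape_kind kind;
    union {
        struct { int x, y; } point;           /* small variant */
        struct { double samples[64]; } big;   /* large variant */
    } u;                                      /* trailing union */
};

/* Bytes needed for a struct shape that holds only the given variant. */
size_t shape_size(enum shape_kind kind)
{
    size_t head = offsetof(struct shape, u);
    switch (kind) {
    case SHAPE_POINT: return head + sizeof(((struct shape *)0)->u.point);
    case SHAPE_BIG:   return head + sizeof(((struct shape *)0)->u.big);
    }
    return sizeof(struct shape);   /* unknown kind: allocate everything */
}

/* Allocate just enough for one variant; only that variant may be used. */
struct shape *shape_new(enum shape_kind kind)
{
    struct shape *s;
    assert(kind == SHAPE_POINT || kind == SHAPE_BIG);
    s = malloc(shape_size(kind));
    if (s != NULL)
        s->kind = kind;
    return s;
}
```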

You do need to name the union and the choice on every access, which is
clutter unless you form and use a local pointer or in C++ reference,
or yuckily-global macros, but doesn't change/limit the semantics.

CBFalconer said:
C99 has provisions for implementing the equivalent of Pascal
variant records.

Huh? The only thing new in C99 in this area is Flexible Array Member
in struct, which is nothing like Pascal variant record. OTOH, C++
derived or "subclass" types are functionally similar, though
notationally and conceptually reversed (specials "contain" common
instead of common "contains" specials).

- David.Thompson1 at worldnet.att.net
 
Ben Bacarisse

Datesfat Chicks said:
Consider:

UINT32 x;

x >>= 24;

A lot of compilers will implement this statement using a shift
subroutine (typically very inefficient), so it leads to constructs
like:

UNION32 x;

x.uc[3] = x.uc[0];
x.uc[0] = x.uc[1] = x.uc[2] = 0;

which will in some cases compile more efficiently.

Putting aside the efficiency claims, it's probably worth noting that
these two bits of code don't mean the same thing. There may be
situations where the first is correct (and even portable) and others in
which the second is correct (and even portable). There will be systems
(C implementation + hardware) where they do the same thing but they
don't mean the same thing.
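
A small sketch of the difference, assuming UNION32 is a union of a 32-bit unsigned integer and four unsigned chars (the quoted post does not show its definition). The function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition of the UNION32 type from the quoted post. */
typedef union {
    uint32_t u32;
    unsigned char uc[4];
} UNION32;

/* Value-based: the same result on every implementation. */
uint32_t top_byte_by_shift(uint32_t x)
{
    return x >> 24;
}

/* Representation-based: the quoted byte shuffle. What it computes
   depends on how the implementation orders bytes in memory. */
uint32_t top_byte_by_union(uint32_t x)
{
    UNION32 v;
    assert(sizeof v.u32 == sizeof v.uc);   /* assumes 8-bit bytes */
    v.u32 = x;
    v.uc[3] = v.uc[0];
    v.uc[0] = v.uc[1] = v.uc[2] = 0;
    return v.u32;
}

/* Returns 1 if the least significant byte is stored first. */
int is_little_endian(void)
{
    UNION32 v;
    v.u32 = 1;
    return v.uc[0] == 1;
}
```

On a little-endian machine the union version moves the low byte into the high position, which is not what the shift does. On a big-endian machine the two happen to agree.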
 
BGB

Datesfat Chicks said:
There are only two uses of unions that I'm aware of:

a) Variant records of one sort or another:

http://en.wikipedia.org/wiki/Tagged_union

interestingly, I don't do tagged unions...

I do however do "variants" in a very different way ("magic pointers").
these could almost be called "tagged pointers", except that this is not
strictly correct (only some types are present directly in the pointer,
and generally use reserved regions of the address space rather than
value alignment to encode types).


I have sometimes used them for untagged unions though, namely:
put a value in;
get the same value back out later.

this can also be done by using raw/untyped memory and pointer
operations, but the use of a union is a little nicer as then one can
type, say:
"v->pb" rather than, say, "*(unsigned char **)v", which can have a few
uses...

also, unlike raw memory, they can partly abstract over data-size issues,
as the size of a union will be that of its largest member.
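
A minimal sketch of the untagged "put a value in, get the same value back out" use. The slot union and its two helpers are hypothetical names, and the caller is assumed to remember which member it stored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical untagged slot; nothing records which member is live. */
union slot {
    int i;
    double d;
    unsigned char *pb;
    void *p;
};

/* Store a byte pointer: v->pb instead of *(unsigned char **)v. */
void slot_put_bytes(union slot *v, unsigned char *pb)
{
    assert(v != NULL);
    v->pb = pb;
}

/* Read back the member that was stored. */
unsigned char *slot_get_bytes(const union slot *v)
{
    assert(v != NULL);
    return v->pb;
}
```

Strictly speaking the union is at least as large as its largest member, since the compiler may add padding for alignment.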

Datesfat Chicks said:
b) Converting or picking apart machine data types.

(b) occurs every day of the week in small embedded systems.

Consider:

UINT32 x;

x >>= 24;

A lot of compilers will implement this statement using a shift
subroutine (typically very inefficient), so it leads to constructs
like:

UNION32 x;

x.uc[3] = x.uc[0];
x.uc[0] = x.uc[1] = x.uc[2] = 0;

which will in some cases compile more efficiently.

errm...

are you sure about all this?...
it seems a little pulled out of one's ass...
 
