I've streamlined the process to construct arbitrary C type
declarations from English, to avoid extra parentheses:
* Start with any declaration that has a name, such as int x;
Probably better to start with the phrase "declare x as" and write:
x
(without the "int" or ";").
Note: I will address parentheses in a moment.
* To create an array N of that, replace the name x by x[N] or (x[N])
Rather than "replace", we now have "add (append) [N]". To omit
the size of the array, you simply omit the size of the array:
x[N]
or
x[]
(while remembering that without a size, you get an "incomplete
type", or -- if x is eventually going to be a formal parameter,
*and* these are the innermost brackets -- the type gets rewritten
by the compiler, replacing "array of" with "pointer to").
* To create a pointer to that, replace the name x by (*x), although
when x already has * on its left, the parentheses are not needed.
Better yet, simply say "prefix with *". Again, handling parentheses
is something I will get to.
* To create a function returning that, replace the name x by x() or
(x()). Any parameter types go inside the ().
Instead of "replace", again you should go with "add".
So, we have the rules:
array N of: add "[N]"
pointer to: prefix with "*"
function (args) returning: add (args)
and ultimately you have a C keyword that gives the type-name (or,
if you use those evil typedefs, a typedef-ed name that is not a
keyword but acts as if it were one, except when it does not).
To put in the C keyword that gives the type-name, just insert it
at the front. (Note: I have deliberately avoided const, volatile,
and restrict qualifiers throughout.)
And now we get to parentheses. Any time you "add or prefix", you
must add parentheses if what you have so far will bind less tightly
than what you are about to add. This is the same rule one uses
for expressions, where, e.g., if you want to multiply by 5, you
will need parentheses if you just added x and y: (x+y)*5 needs the
parentheses. (As we will see in a moment, this is simpler for
declarations than for expressions in general.)
The phrase "bind less tightly" is the key here. Doing it right
every time *is* difficult, at least until you have memorized the
rules and practiced them for quite a while; and you simply have to
memorize and practice: there is no short-cut. Luckily, most people
did a lot of memorizing and practicing when they were children
taking math classes. Thus, you should be familiar with operator
binding, although you might have called it "precedence" (which I
think is not the best word for it). In many languages, including
C, expressions with infix operators have a way of binding some
operators more tightly than others, so that:
result = x1 * x2 + y1 * y2;
"means":
result = (x1 * x2) + (y1 * y2);
(there are some languages, including Lisp and APL, that do away
with "precedence" entirely, but if you are a programmer you should
know enough languages to be familiar with those that have it, and
thus bind some operators more tightly than others).
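For anyone who wants to see the binding rather than take it on faith,
here is a small sketch. The function names are invented for this
illustration; the first two functions compute the same value, and the
third shows what a different grouping would give.

```c
#include <assert.h>

/* Written without parentheses: "*" binds more tightly than "+". */
int as_written(int x1, int x2, int y1, int y2)
{
    return x1 * x2 + y1 * y2;
}

/* The same expression with the binding spelled out. */
int as_bound(int x1, int x2, int y1, int y2)
{
    return (x1 * x2) + (y1 * y2);
}

/* A different grouping, which needs explicit parentheses. */
int other_grouping(int x1, int x2, int y1, int y2)
{
    return x1 * (x2 + y1) * y2;
}

/* Self-check: the first two agree, the third does not. */
void check_binding(void)
{
    assert(as_written(2, 3, 4, 5) == as_bound(2, 3, 4, 5));
    assert(as_written(2, 3, 4, 5) != other_grouping(2, 3, 4, 5));
}
```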
C is a bit odd in having both a prefix "*" operator (for
pointer-following) and postfix-"[]" and "()" operators (for arrays
and function calls), and even more odd in having the "declaration
mirrors use" rule for declarations. C's set of binding levels (or
"operator precedence") is fairly well-chosen though (albeit with
a few notable exceptions), so that needing extra parentheses for
these is relatively rare. But sometimes they are required.
In particular, array subscripting and function-calling both
bind more tightly than pointer-following, so that -- if v1 and
v2 are variables with appropriate types -- an expression like:
*v1();
binds the same as:
*((v1)());
and:
*v2();
binds the same as:
*((v2()));
Again, () and [] (postfix function-call and array-index) bind more
tightly than unary "*" (pointer-following).
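A short sketch of the same point in compilable form. All of the names
here (values, table, v2, and the three functions) are made up for the
illustration; the comments say which operator is applied first.

```c
#include <assert.h>

static int values[3] = { 10, 20, 30 };
static int *table[3] = { &values[0], &values[1], &values[2] };

/* v2: function (void) returning pointer to int */
int *v2(void)
{
    return &values[1];
}

int follow_call(void)
{
    return *v2();       /* same as *(v2()): call, then follow */
}

int follow_index(int i)
{
    int **v1 = table;
    return *v1[i];      /* same as *(v1[i]): index, then follow */
}

int follow_then_index(int i)
{
    int (*p)[3] = &values;
    return (*p)[i];     /* parentheses needed to follow first */
}

void check_postfix_binds_first(void)
{
    assert(follow_call() == *(v2()));
    assert(follow_index(2) == *(table[2]));
}
```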
Because "declaration mirrors use", these rules apply in declarations
as well. Adding "array N of" or "function (args) returning" to a
declaration means adding [N] or (args) to whatever you have so far.
These will bind more tightly than whatever else you have so far;
so if "whatever else you have so far" has a pointer "*" on the
left, you need to enclose what-you-have-so-far in parentheses.
Since inserting a prefix "*" never binds more tightly (than array-of
or function-returning), you never have to add parentheses at this
point.
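The rules so far are mechanical enough to write down as a few string
operations. What follows is only a rough sketch under my own
assumptions: the function names and the fixed DECL_MAX buffer size are
invented here, the caller must supply a buffer of at least DECL_MAX
chars, and qualifiers are ignored entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { DECL_MAX = 256 };   /* every decl buffer must be this big */

/* pointer to: prefix with "*"; never needs parentheses. */
void add_pointer(char *decl)
{
    char tmp[DECL_MAX];
    snprintf(tmp, sizeof tmp, "*%s", decl);
    strcpy(decl, tmp);
}

/* array N of, or function (args) returning: append the suffix,
   parenthesizing first if what we have so far begins with "*". */
void add_suffix(char *decl, const char *suffix)
{
    char tmp[DECL_MAX];
    if (decl[0] == '*')
        snprintf(tmp, sizeof tmp, "(%s)%s", decl, suffix);
    else
        snprintf(tmp, sizeof tmp, "%s%s", decl, suffix);
    strcpy(decl, tmp);
}

/* Last step: insert the type keyword at the front. */
void add_type(char *decl, const char *type)
{
    char tmp[DECL_MAX];
    snprintf(tmp, sizeof tmp, "%s %s", type, decl);
    strcpy(decl, tmp);
}

/* declare x as pointer to array 3 of int */
void check_rules(void)
{
    char decl[DECL_MAX] = "x";
    add_pointer(decl);
    add_suffix(decl, "[3]");
    add_type(decl, "int");
    assert(strcmp(decl, "int (*x)[3]") == 0);
}
```

The only test in add_suffix is "does it start with a star", which is
the whole parenthesization rule for unqualified declarators.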
Let us run through several examples, then:
- declare x as array 3 of pointer to int.
x -- takes care of "declare x as"
x[3] -- add brackets and size for "array 3 of"
*x[3] -- prefix for "pointer to"
int *x[3] -- insert final type keyword
- declare x as pointer to array 3 of int.
x -- "declare x as"
*x -- "pointer to"
(*x)[3] -- "array 3 of": parentheses before appending
int (*x)[3] -- insert final type keyword.
- declare x as
pointer to
array 4 of
pointer to
function (argsA) returning
pointer to
array 5 of
pointer to
function (argsB) returning
void
x -- declare as x
*x -- pointer to
(*x)[4] -- array 4 of: had to pre-parenthesize
*(*x)[4] -- pointer to
(*(*x)[4])(argsA) -- function (argsA) returning: pre-parenthesize
*(*(*x)[4])(argsA) -- pointer to [etc]
(*(*(*x)[4])(argsA))[5]
*(*(*(*x)[4])(argsA))[5]
(*(*(*(*x)[4])(argsA))[5])(argsB)
void (*(*(*(*x)[4])(argsA))[5])(argsB)
and one last "trick" one:
- declare x as pointer to array 2 of function (void) returning int
x
*x
(*x)[2]
(*x)[2](void) -- ("looks funny"; did you catch the trick yet?)
int (*x)[2](void)
The trick here is that this is an invalid declaration: *x has type
"array 2 of function ..." and "array of function" is a constraint
violation. So although it is possible to write such a declaration,
it need not compile (and must draw a diagnostic).
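Here is a sketch that puts the first two examples into a file, along
with a legal relative of the trick one: adding one more "pointer to"
gives "pointer to array 2 of pointer to function (void) returning
int", which is fine. The names a, p, q and the helper functions are
mine, chosen only for the illustration.

```c
#include <assert.h>
#include <stddef.h>

int *a[3];              /* a: array 3 of pointer to int */
int (*p)[3];            /* p: pointer to array 3 of int */

/* int (*bad)[2](void);    array of function: not allowed */
int (*(*q)[2])(void);   /* q: pointer to array 2 of pointer to
                           function (void) returning int */

static int one(void) { return 1; }
static int two(void) { return 2; }

static int row[3] = { 10, 20, 30 };
static int (*fns[2])(void) = { one, two };

size_t size_of_a(void)       { return sizeof a; }
size_t size_of_pointee(void) { return sizeof *p; }

int use_all(void)
{
    a[1] = &row[1];
    p = &row;
    q = &fns;
    assert(sizeof a == 3 * sizeof(int *));
    /* Use mirrors declaration: same operators, same parentheses. */
    return *a[1] + (*p)[2] + (*(*q)[1])();
}
```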
To turn any of the above into a stand-alone declaration, we need
only add a semicolon. I think it is best to work these without
the semicolons, since we might instead use "= {some initializer};"
for instance.
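As an illustration of finishing with an initializer, here is the long
example again, completed as "= &getters;". This is a sketch under
assumptions of my own: argsA is taken to be (int), argsB to be
(double), and every name other than x is invented. The typedefs build
the same type one layer at a time, so assigning y to x is a cheap way
to have the compiler confirm that the hand-built declaration says what
the English says.

```c
#include <assert.h>

/* The same type, one layer at a time (evil typedefs and all). */
typedef void fnB(double);   /* function (argsB) returning void */
typedef fnB *arr5[5];       /* array 5 of pointer to that */
typedef arr5 *fnA(int);     /* function (argsA) returning
                               pointer to that */
typedef fnA *arr4[4];       /* array 4 of pointer to that */

static double last_seen;
static void record(double d) { last_seen = d; }

static fnB *handlers[5] = { record, record, record, record, record };

static arr5 *get_handlers(int unused)
{
    (void)unused;
    return &handlers;
}

static fnA *getters[4] = {
    get_handlers, get_handlers, get_handlers, get_handlers
};

/* The declaration from the examples, ended with an initializer. */
void (*(*(*(*x)[4])(int))[5])(double) = &getters;

/* The typedef spelling of "pointer to array 4 of ...". */
arr4 *y = &getters;

double call_through(double d)
{
    x = y;   /* accepted quietly only if the types are compatible */
    /* Use mirrors declaration: same operators, same parentheses. */
    (*(*(*(*x)[1])(0))[2])(d);
    assert(last_seen == d);
    return last_seen;
}
```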
If you want to handle qualifiers, the rule is to insert them at
the same time as the "*" pointer-to part, just after the "*" (because
const, volatile, and restrict go with pointers), and then again to
insert const or volatile (but not restrict) at the "top level" when
you get to the C keyword (or typedef-name-acting-like-a-keyword)
that defines the type. The top-level const-or-volatile can come either
before or after the C keyword, while "deeper" qualifiers must come
after the "*".
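A small sketch of where const lands, using names I made up (n, m, p,
p2, cp, pp). The two commented-out assignments are the ones a compiler
must complain about.

```c
#include <assert.h>

static int n = 1, m = 2;

const int *p = &n;      /* p: pointer to const int */
int const *p2 = &n;     /* same type; top-level const after "int" */
int *const cp = &n;     /* cp: const pointer to int */
int *const *pp = &cp;   /* pp: pointer to const pointer to int */

int demo_qualifiers(void)
{
    p = &m;             /* fine: p itself is not const */
    *cp = 5;            /* fine: the int it points to is not const */
    /* *p = 3;             not allowed: the int is const via p */
    /* cp = &m;            not allowed: cp itself is const */
    assert(*p == 2);
    return *p + **pp;
}
```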
I'm having trouble, however, building cast types using this method.
To build a cast, start with "declare x as", build a declaration
for x, and then remove the identifier "x" and enclose the entire
thing in parentheses.
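Two casts built this way, as a sketch. The function and typedef names
are invented for the illustration, and the first cast is not strictly
needed in C (void * converts without one); it is there only to show
the form.

```c
#include <assert.h>

static int grid[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };

int third_of_second_row(void *raw)
{
    /* declaration:  int (*x)[3]
       remove x:     int (*)[3]
       parenthesize: (int (*)[3]) */
    int (*rows)[3] = (int (*)[3])raw;
    return rows[1][2];
}

typedef void generic_fn(void);

static int forty_two(void) { return 42; }

int call_as_int_fn(generic_fn *f)
{
    /* declaration:  int (*x)(void)
       cast:         (int (*)(void)) */
    return ((int (*)(void))f)();
}

int demo_casts(void)
{
    int sum = third_of_second_row(grid)
            + call_as_int_fn((generic_fn *)forty_two);
    assert(sum == 6 + 42);
    return sum;
}
```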