**Important note to OP and other clc readers: my responses here are
more of an exercise for myself to see if I have a good grasp of linkage
and scope in C, so if there are any mistakes, please do not hesitate
to correct me.
OK, you are missing a couple of things, some small and one fairly large:
Note: I changed 'extern b' to 'extern int b'
(This is good, since C99 requires at least one type specifier in a
declaration. In C89, aka C90, it is OK to leave it out and get
"implicit int", but I would say "not good style".)
I believe the compiler. If you just look at the linkage side of things,
then the later declaration of b should assume the linkage of the most
previous declaration of b (internal linkage), but that IMHO isn't the
problem.
Actually, the most recent declaration+definition of "b" (int b = -2)
has *no* linkage.
The problem is the definition of b (the line "int b = -2").
Because it appears inside a function and does not explicitly define a
storage class, it is given storage class auto by default.
This is correct -- but "storage-class" (more precisely, "storage
duration", as specified by a storage-class specifier) is mostly
separate from linkage. There are three possible storage durations
for objects: static, automatic, and allocated. (The lattermost
was missing from C89 but one might as well just assume it was
included, since otherwise malloc and free cannot work. Also, there
are 5 storage class specifier keywords: auto, register, static,
extern, and -- purely for syntactic reasons -- typedef.)
There are also three possible linkages for identifiers: "external",
"internal", and "none". Note the distinction here between "objects",
which have storage duration, and identifiers, which have linkage.
When we deal with ordinary variable names ("b", "foo", whatever)
we generally have both an object and an identifier, though. (I
use weasel-words here to sidestep issues with identifiers for
functions, typedef'ed names, goto labels, etc.)
The later declaration of b as "extern" is therefore a contradiction
- "extern" and "auto" are both storage class specifiers, but an
object is only allowed to have *one* storage class.
This is true; but we can have more than one object with the same
name:
    double xyz = 3.141592653589793238462643383279502884;
    void f(void) {
        int xyz = 42;
        ...
    }
Here we have two different "xyz"s. The outer one, at file scope,
declares and defines an object (of type "double") with static
duration and external linkage. The inner one, at block scope,
declares and defines an object (of type "int") with automatic
duration and no linkage.
One hard and fast requirement is that identifiers for objects
that have automatic duration always have no linkage, and are always
at block scope. However, this does not work in reverse: identifiers
that are at block scope can have linkage, and can have other than
automatic duration:
    void g(void) {
        static int zorg;     /* block scope, static duration, no linkage */
        extern double evil;  /* static duration, external linkage */
        ...
    }
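A minimal sketch of the block-scope "static" case, using a made-up function next_id() that is not part of the original thread: the identifier "count" has no linkage and cannot be named outside the function, yet the object has static duration, so its value survives from one call to the next.

```c
/* next_id() and count are hypothetical names, used only for illustration. */
int next_id(void)
{
    static int count;   /* block scope, static duration, no linkage;
                           zero-initialized once, before program startup */
    count++;
    return count;
}
```

Successive calls return 1, 2, 3, and so on. With a plain "int count = 0;" (automatic duration) every call would return 1.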
The "extern" keyword is particularly squirrely, because it means
"external linkage unless already visible with internal linkage".
It also has the side effect of suppressing definition-ness when the
declaration would otherwise have been a "tentative definition":
    int abc;            /* "tentative definition" of abc (an int) */
    int def = 4;        /* "definite definition" of def */
    extern int ghi;     /* declaration (and nothing else) of ghi */
    extern int jkl = 5; /* "definite definition" of jkl */
At the end of a "translation unit", any tentative definitions become
actual definitions, initialized as if with "= {0}".
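As a small sketch of that rule (the names table, read_abc and sum_table are invented for this illustration): a translation unit may contain several tentative definitions of the same object, and with no initializer anywhere the object ends up defined and zero-filled.

```c
int abc;        /* tentative definition */
int abc;        /* a second tentative definition of the same object: legal */
int table[3];   /* tentative definition of an array */

/* No initializer appears anywhere in this translation unit, so at its
   end both objects are defined as if initialized with "= {0}". */
int read_abc(void)
{
    return abc;
}

int sum_table(void)
{
    return table[0] + table[1] + table[2];
}
```

Both functions return 0 when called from anywhere in the program.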
What this means is that, for identifiers that name objects, there
are *five* properties we have to be concerned about:
- type (int, double, etc.; these are "obvious" and are independent
of the rest of these items so I will now ignore "type")
- scope (file or block)
- linkage (external, internal, or none)
- storage duration (static or automatic)
- definition-ness (yes, "tentative", or no)
The "interesting" keywords here -- auto, extern, register, and
static, which are the storage-class specifier keywords excluding
"typedef" -- have various effects on scope, linkage, storage
duration, and definition-ness:
    "auto":     allowed only in block scope, where it has no effect
    "register": allowed only in block scope, where it has no effect
                on these properties (it does forbid taking the
                object's address)
    "static":   allowed in both block and file scope; effect
                depends on which scope: can change duration or linkage
    "extern":   allowed in both block and file scope; effect
                depends on visible previous declaration(s), if any,
                and on whether the definition-ness is already absolute:
                can change linkage, duration, and definition-ness
For an identifier naming an object in block scope, "static" gives
it static duration; it continues to have "no linkage". For the
same kind of identifier in file scope, "static" gives it internal
linkage; it already had static duration.
For an identifier naming an object in block scope, "extern" gives
it the same linkage as any previous visible linkage; but if the
only visible name has no linkage, extern gives it external linkage.
If the identifier is in file scope, any previous visible linkage
is by definition either internal or external, so extern just re-uses
that linkage.
If no previous declaration is visible, extern gives the identifier
external linkage (at any scope). The object will have static
duration, and "tentative definition-ness" is suppressed so that a
declaration that does not provide an initial value is just a
declaration, not a declaration-and-tentative-definition.
Thus, we can make up a table, giving the four "interesting" properties
for identifiers that name objects, and the way the four keywords affect
them, depending on whether they appear in block or file scope. The
last entry is for "if no keyword is used".
+----------------------+-----------+----------+--------------------+
| keyword              | duration  | linkage  | definition?        |
+----------------------+-----------+----------+--------------------+
| auto    (file scope) | (illegal) | ----     | ---                |
|        (block scope) | automatic | none     | yes                |
+----------------------+-----------+----------+--------------------+
| register(file scope) | (illegal) | ----     | ---                |
|        (block scope) | automatic | none     | yes                |
+----------------------+-----------+----------+--------------------+
| static  (file scope) | static    | internal | yes                |
|        (block scope) | static    | none     | yes                |
+----------------------+-----------+----------+--------------------+
| extern  (file scope) | static    | varies   | suppress tentative |
|        (block scope) | static    | varies   | no (cannot init.)  |
+----------------------+-----------+----------+--------------------+
| (none)  (file scope) | static    | external | tentative or yes   |
| (none) (block scope) | automatic | none     | yes                |
+----------------------+-----------+----------+--------------------+
"Varies" means "external unless currently shown to be internal".
The "currently shown" part is tricky at block scope, since it
depends on whether a previous file-scope declaration has been
obscured by a block-scope declaration for the same identifier.
Note that the "auto" keyword is redundant: it is illegal at file
scope, and at block scope, it does the same thing you would get
if you did not use a storage-class specifier keyword.
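To make the redundancy concrete, here is a short sketch with two invented functions, with_auto() and without_auto(). As far as duration and linkage go, the two declarations of "n" mean the same thing.

```c
/* Both function names are hypothetical, chosen for this illustration. */
int with_auto(void)
{
    auto int n = 5;   /* explicit "auto": automatic duration, no linkage */
    return n * 2;
}

int without_auto(void)
{
    int n = 5;        /* no storage-class specifier: exactly the same */
    return n * 2;
}
```

Both return 10.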
The compiler ultimately assumes that your "extern int b" must be
referring to a different "b" than the one you declare and define in
main(), i.e. one that has storage class extern and external linkage;
the compiler can't find such a variable and complains accordingly.
This part is correct.
Now on to the second example...
Note: Changed "extern b" to "extern int b"
Not quite. First off, in line 1 your "int b" has external linkage: A
variable defined in file-scope and which isn't "static" has external
linkage by default.
Right. Look in the table above: no keyword, file scope: we get static
duration, external linkage, and "tentative or yes" definition. The
declaration includes an initial value so the "definition" answer is
"yes".
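On line 5, we have extern at block scope, so (per the table) we get
static duration, "varies" linkage, and "no" for definition. There
is no initializer (one would be illegal here anyway) and "tentative
definition-ness" is suppressed, so this is purely a declaration.
The visible previous declaration of "b" has external linkage, so
"varies" resolves to external and this names the same "b" as line 1.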
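The second example is not reproduced in this reply, so what follows is only a guess at its general shape, assuming a file-scope "int b = -2;" followed by later "extern int b;" declarations. The function name read_b() is invented. In this sketch every "extern int b" finds a visible "b" with external linkage and so refers to the one object defined on the first line.

```c
int b = -2;        /* file scope, no keyword: static duration,
                      external linkage, definition */

extern int b;      /* file-scope extern: re-uses the external linkage;
                      a pure declaration, nothing is defined here */

/* read_b() is a hypothetical helper for this illustration. */
int read_b(void)
{
    extern int b;  /* block-scope extern: the visible b has external
                      linkage, so this is the same b again */
    return b;
}
```

Calling read_b() yields -2, the value given in the file-scope definition.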
For a third and fourth example, consider:
    static int x = 42;      /* declaration 1 (also definition) of x */
    int main(void) {
        int a = 0;
        {
            extern int x;   /* decl 2: iffy, but legal */
            x++;
        }
    }
The "extern int x" line again refers to this entry:
+----------------------+-----------+----------+--------------------+
| keyword              | duration  | linkage  | definition?        |
+----------------------+-----------+----------+--------------------+
| extern  (file scope) | static    | varies   | suppress tentative |
|        (block scope) | static    | varies   | no (cannot init.)  |
+----------------------+-----------+----------+--------------------+
It is at block scope so the duration is "static" and definition-ness
is "no", but as before, the linkage is "varies". We must play
compiler and ask "is there a previous x in some visible scope that
has some linkage?" The answer is yes: the previous file-scope
"static int x" is visible, and has internal linkage. So here "x"
gets internal linkage, and this refers to the same "x" as the
"static int x = 42".
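A way to check this claim, sketched with two invented helper functions, bump_x() and read_x(), in place of main(): if the block-scope "extern int x" really names the file-scope static, an increment made through it must be visible through the file-scope name.

```c
static int x = 42;      /* file scope: static duration, internal linkage */

/* bump_x() and read_x() are hypothetical helpers for this illustration. */
int bump_x(void)
{
    int a = 0;          /* unrelated local; it does not hide x */
    {
        extern int x;   /* the visible x has internal linkage, so this
                           declaration names that same object */
        x++;
    }
    return x + a;
}

int read_x(void)
{
    return x;           /* reads the file-scope x directly */
}
```

read_x() gives 42 before any call to bump_x(), and 43 after one call.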
If we change "int a = 0;" to "int x = 0;" in main(), however, we get:
    static int x = 42;      /* decl 1 (also definition) */
    int main(void) {
        int x = 0;          /* decl 2 (also definition) */
        {
            extern int x;   /* decl 3: ERROR */
            x++;
        }
    }
Here the only visible "x" at the "extern int x" line is the
block-scope "x", which has no linkage. Thus, "extern int x" gives
the third declaration of "x" external linkage, in spite of the
first declaration giving it internal linkage. This triggers
undefined behavior (paragraph 7 of section 6.1.2.2 of the C99 draft
I keep handy; in the published C99 standard the same rule is
section 6.2.2, paragraph 7).