Storage specifiers

aneesh

Hi all,
I have a program that works fine, but if we put the static declaration
below "int i", it reports a different storage class specifier. What is
the reason?

#include <stdlib.h>

static int i ;
int i;
int main()
{
    return 0;
}
---------------------------------------
#include <stdlib.h>

int i;
static int i ;

int main()
{
    return 0;
}

thanks
Aneesh
 
Chris Torek

static int i;
int i;

is valid at file scope, yet:

int i;
static int i;

is not.

The reason is surprisingly complicated.

Before we can say why the first is OK and the second is not, we
will need a good grasp of identifiers and their properties.
Identifiers themselves are fairly straightforward. For the most
part, they are just names that start with an alphabetic character,
and go on to include alphanumerics. Hence "i", "main", "a2b", and
so on are all (once you remove the double quotes) valid identifiers.
Identifiers name things, like variables, functions, and even macros.
The trick is deciding which things they name, and in turn, what
those things are.

In this case, "i" names an ordinary variable -- but all identifiers
have a number of special properties. Two important ones are "scope"
and "name space". A third, and one which becomes crucial soon, is
"linkage", but we should consider scopes and name spaces first.

The scope of an identifier essentially controls its visibility
within a given "translation unit" (roughly, "source file after
expanding #include directives"). If you have done much C programming
at all, you will be familiar with both file-scope and narrower-scope
identifiers:

int shared;

void f(void) {
    int i;
    ...
}

void g(void) {
    int i;
    ...
}

Here the two "i"s each have "block scope", so that the "i" in f()
is completely different from the "i" in g(). Neither "i" can be
named outside the corresponding function -- trying to use f()'s "i"
in another function h() simply will not work; it cannot be named
there. On the other hand, the variable called "shared" is visible
inside both f() and g(), and would be visible in h() if we added
it below these, unless we declared a block-scope variable also named
"shared" (which would then hide or "shadow" the outer one).

The C Standard states quite clearly that two identifiers have the
same scope if and only if their scopes terminate at the same point.
(Many of the things the Standard says are not very clear, but this
one is.) Block scopes end at the ending of their enclosing block,
while file scope does not end until the end of the translation
unit. All of this means that the name "i" in f() refers to f()'s
"i", because that i's scope ends at f()'s final close brace, and
so on. Re-using the name "shared" -- including any attempt to
re-declare it -- refers to the file-scope variable, unless the
re-declaration is in an inner scope, such as:

void h() {
    int shared; /* UGH, shadows file-scope variable */
    /* Do not do this if you can possibly avoid it. */
    /* It confuses people who have to maintain the code later. */
}

or unless "shared" is in a different "name space".
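As a minimal sketch of the scoping rules above (the function names here are made up for illustration), the following shows that a block-scope "shared" hides the file-scope one only inside its own block and never changes it:

```c
#include <assert.h>

int shared = 1;              /* file scope: visible to every function below */

/* No local "shared" here, so the name refers to the file-scope variable. */
int read_outer(void) {
    return shared;
}

/* The block-scope "shared" hides the file-scope one until the closing brace. */
int read_shadow(void) {
    int shared = 42;         /* shadows the outer variable; avoid in real code */
    return shared;
}

/* Sanity check: shadowing never modifies the outer variable. */
void check_scopes(void) {
    assert(read_shadow() == 42);
    assert(read_outer() == 1);
}
```

Calling check_scopes() from main() is enough to exercise both functions.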

C's name spaces are relatively primitive compared to other languages.
Probably the most straightforward example is struct and union tags.
These tags are in the "tag name space" and never collide with
ordinary variables. This means you may write, e.g.:

struct glorp {
    int a;
    char *b;
} glorp;

The ordinary variable "glorp" is in the ordinary-variable namespace,
while the tag "glorp" in "struct glorp" is in the structure tag
name space, so these are two different "glorp"s. (I tend to think
of them as "the regular glorp" and "the struct glorp", myself, with
the word "struct" as part of the name. Of course, I also think of
the word "struct" as standing for "STRangle spelling for User-defined
abstraCt Type", and hence read it as if it were the word "type" in
some other language. :) )

Here, in the example that kicks off the whole problem, the identifier
"i" is in the ordinary namespace, so we can pretty much ignore the
whole issue. Still, it is something to keep in mind: you can re-use
the same names for different things, even in the same scopes, as
long as they are in different name-spaces. (In general, it is
probably wise to limit such re-use, for the same reason that it is
generally bad to "shadow" outer-scope variables -- it gets confusing.)
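Here is a small sketch of name spaces at work. The "struct wrapper" type and the function names are invented for the example; it also relies on the fact that members of each struct live in a name space of their own, which the text above does not cover:

```c
#include <assert.h>

struct glorp {               /* "glorp" in the tag name space */
    int a;
    const char *b;
};

struct glorp glorp = { 3, "three" };   /* "glorp" as an ordinary identifier */

/* Members have their own name space per struct or union, so a member
 * may also be called "glorp" without any conflict. */
struct wrapper {
    int glorp;
};

int use_glorps(void) {
    struct wrapper w = { 4 };
    /* "struct glorp" names the type; plain "glorp" names the object. */
    struct glorp *p = &glorp;
    return p->a + w.glorp;   /* 3 + 4 */
}

void check_glorps(void) {
    assert(use_glorps() == 7);
}
```

All three uses of the name compile side by side, though as noted above this is easier to write than to read.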

Finally, we get to "linkage" of identifiers. Linkage is the means
by which separate translation units can "meet up with each other",
as it were. An identifier that has "external linkage" is visible
outside its translation unit. An identifier with "internal linkage"
or "no linkage" -- these are the only other possibilities -- is
not. This means that "internal linkage" variables do not "escape"
outside the translation unit. When you are working on large
projects, knowing that some variable cannot be seen outside some
given source module can be very helpful, so making your file-scope
variables "static" whenever possible is often a good idea. (It
is helpful because it means you can avoid searching for other uses
of that variable, and avoiding work is good. :) )

We now have almost everything we need to answer the question -- which,
in case you forgot, is:

"
static int i;
int i;

is valid at file scope, yet:

int i;
static int i;

is not. Why?"

I say "almost everything" because we now get into the places where
C really gets confusing.

A file-scope variable invariably has some -- i.e., not "no" --
linkage, either "internal" or "external". It also always has
"static duration" -- something I have not mentioned above -- and
(obviously) "file scope".

Duration is not a property of identifiers, but rather of variables,
or more precisely, of "objects". There are three durations:
"static", "automatic", and "allocated". This "static" is only
partly connected to the "static" keyword. In this case, we are
dealing with file-scope objects (variables), which *always* have
static duration, so the "static" keyword cannot possibly change
the duration. Instead, the "static" keyword, in this particular
case, affects the linkage.

As nonsensical as this may seem, the "static" keyword has two
meanings. For block-scope variables, it changes their duration,
but for file-scope variables, it changes their linkage. (In C99,
the "static" keyword has yet another new meaning; fortunately, we
can ignore that entirely here.)
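The two older meanings can be seen side by side in a short sketch (all names here are hypothetical). At file scope the keyword only affects linkage; at block scope it only affects duration:

```c
#include <assert.h>

/* File scope: "static" gives call_count internal linkage. Its duration
 * would be static with or without the keyword. */
static int call_count;

/* Block scope: "static" changes the duration of "next", so it keeps its
 * value between calls. It has no linkage either way. */
int next_id(void) {
    static int next = 0;
    call_count++;
    return ++next;
}

int calls_so_far(void) {
    return call_count;
}

/* Two consecutive calls must return consecutive values. */
void check_next_id(void) {
    int first = next_id();
    assert(next_id() == first + 1);
}
```

Without the block-scope "static", next would be re-created (and reset to 0) on every call, and next_id() would always return 1.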

Without the "static" keyword, a file-scope variable usually winds
up with external linkage. Adding "static" tells the compiler:
"this identifier is to have internal linkage."

There is another keyword you can use here, the "extern" keyword.
This keyword is just as misleading as "static". You might think:
"extern, ah, that must mean external linkage"; but if you did, you
would be wrong. :) As usual, C's keywords are twisted around.
The default for file-scope variables is external linkage, so C does
not *need* a keyword to specify that. Instead, putting "extern"
in front of a file-scope variable declaration has the effect of
removing "tentative definition-ness" (which I have not explained,
and will not here). Moreover, leaving out both "extern" *and*
"static" has the same effect on linkage as using an explicit "extern"
(you just get that "tentative definition" thing). And now things
get *really* bizarre, because...

Aside from suppressing tentative definitions, the "extern" keyword
gives these identifiers the *same* linkage as any file-scope
declaration already visible. If no such declaration is visible,
"extern" gives the identifier external linkage. In particular,
this means that, in:

static int internal_linkage;
extern int internal_linkage; /* surprise! */

the second line *means* "static" even though it *says* "extern"!
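A single translation unit cannot observe linkage at run time, so the sketch below (function names are made up) only shows that a compiler accepts the static-then-extern sequence and that every declaration refers to the same object. To see the linkage itself you would have to inspect the object file, for example with a symbol-table tool such as nm on Unix-like systems.

```c
#include <assert.h>

static int internal_linkage = 5;   /* internal linkage, and a definition */
extern int internal_linkage;       /* accepted: takes on the internal linkage */

int read_it(void) {
    extern int internal_linkage;   /* block scope: still the same object */
    return internal_linkage;
}

/* A write through the file-scope name is seen through the block-scope one. */
void check_same_object(void) {
    internal_linkage = 9;
    assert(read_it() == 9);
}
```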

*Now* we finally have all the pieces, and can answer the original
question. Consider the first code fragment:

static int i;
int i;

Here we have two file-scope declarations for the identifier "i".
The first one gives "i" internal linkage. The second one does not
use the "extern" keyword (and is thus a tentative definition --
not that it matters since the first declaration is already a
tentative definition as well), but has the same linkage effect
as if it did. Since the first declaration is already in scope,
this "extern" refers back to it and gives this second "i" internal
linkage. So this is really two "static int i;" lines, even
though the second one uses the implied "extern" keyword to mean
"static-style linkage".

On the other hand, the second fragment reads:

int i;
static int i;

Here the first line has an implied "extern" as far as linkage goes,
but this time there is no "i" in scope. The "extern" cannot find
a previous declaration to refer back to, and thus gives this "i"
external linkage. The second line then declares "i" again, but
this time says it should have internal linkage. The C standard
says that the effect here is undefined, but a good C compiler will
produce some kind of message ("warning", or "error", or electric
shock to the programmer, or some such) to alert you and give you
a chance to fix the code.
 
Chris Dollin

Chris said:
.......................................Of course, I also think of
the word "struct" as standing for "STRangle spelling for User-defined
abstraCt Type", and hence read it as if it were the word "type" in
some other language. :) )

I think you meant "STRange spelling ...".
.................................................. The C standard
says that the effect here is undefined, but a good C compiler will
produce some kind of message ("warning", or "error", or electric
shock to the programmer, or some such) .............

Perhaps you *did* mean "strangle"?
 
Chris Torek

[I wrote, in part, with typo fixed]
A struct really isn't an abstract type though, right?

Well, I will let you define "abstract type", and then we can see
whether "struct S" fits that definition. :)

C++'s "classes", which are types that also have operations literally
embedded within them, fit almost every definition, while C's structs
require a looser definition for "abstract type". But if anything
in C meets your definition, "struct" will (assuming your definition
is not radically odd, at least). It is as close as you get. When
"struct foo" defines a new type, it is incompatible with all other
types, and any operations you define on "struct foo" will work only
on that type.

(The "typedef" keyword, by contrast, *never* defines a new type.
The "typedef struct { ... }" construct relies on the "struct"
keyword to define the type, then gives that new type a set of
aliases. You can toss out the typedef keyword, give the struct a
tag, and get the same effect; and my personal preference is to do
just that, almost every time. In order to be self-referential or
mutually-referential, structs *must* have tags; so why not just
give *every* struct a tag, and use "struct foo" as the type-name?)

It isn't really visible, is it? Since you have to provide a local
declaration to use it, e.g.:

main.c: int i;
foo.c: extern int i; /* local declaration to see i */

If you give an identifier internal linkage, a later local declaration
will not be able to match it. For instance, if we change main.c's
declaration to "static int", foo.c's "extern int" will likely
produce a link-time error of the form "undefined symbol: i".
I would call that "invisible" -- main.c's "static int i" is invisible
to foo.c -- and hence, by contrast, an "i" with external linkage
is the opposite, or "visible".

You do have to re-declare it, but not re-define it (and indeed, if
you re-define it, some systems will even refuse to link the code,
though traditional Unix-like systems are more lax about that).

Could you point out what the new [C99] meaning [of static] is, please? :)

C99 has a number of new features for array arguments, including
"variable length arrays":

void f(size_t m, size_t n, double a[m][n]) {
    size_t i, j;

    for (i = 0; i < m; i++)
        for (j = 0; j < n; j++)
            operate(a[i][j]);
}

which are defined in such a way as to make old Fortran programmers
happy. :) (Except, of course, that C arrays are row-major instead
of column-major.) Along with these come some features designed for
Fortran-like optimizations (for "vectorizing" compilers, and what
are called "strip-mining" techniques, and so on). These include
the "restrict" keyword:

/*
* Functions that act like memcpy and memmove, but with
* different types (really just for illustration).
* Note that like_memcpy does not allow overlapping objects,
* while like_memmove does, hence:
*/
void like_memcpy(char *restrict dst, const char *restrict src, size_t len);
void like_memmove(char *dst, const char *src, size_t len);

A "restricted pointer" puts a burden on the programmer, and thus
relieves the compiler of a corresponding counter-burden. In this
case, it tells the compiler that it is safe to assume that no dst[i]
ever names the same object as any src[j], and vice versa. This
means it is OK for the compiler to load up a whole bunch of src[i]
values and blat them into a whole bunch of dst[j] values without
looking to see whether changing dst[j], for any valid j, changes
src[i], for any valid i.

Since a memmove()-like function deliberately allows overlapping
copies, however, we must not tell the compiler that dst[j] is never
the same object as src[i], because in some cases, dst[j] *is* the
same -- for instance, to "move a buffer forward one byte", we will
have dst[0] being the same as src[1]:

like_memmove(buf + 1, buf, n);

Here it is *not* safe to overwrite (buf+1)[0] before reading buf[1],
because (buf+1)[0] *is* buf[1].
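To make the difference concrete, here is one possible pair of implementations. This is only a sketch: the bodies are my own guess at the simplest code that honors each contract, and the pointer comparison in like_memmove() is strictly defined only when both pointers point into the same array (which is the case whenever the regions overlap).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Caller promises that dst and src do not overlap, so a plain forward
 * copy is always correct and the compiler may batch loads and stores. */
void like_memcpy(char *restrict dst, const char *restrict src, size_t len) {
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
}

/* Overlap is allowed, so pick a direction that never overwrites a byte
 * before it has been read. */
void like_memmove(char *dst, const char *src, size_t len) {
    if (dst < src) {
        for (size_t i = 0; i < len; i++)      /* copy front to back */
            dst[i] = src[i];
    } else {
        for (size_t i = len; i-- > 0; )       /* copy back to front */
            dst[i] = src[i];
    }
}

/* "Move a buffer forward one byte": dst[0] is the same object as src[1]. */
void check_overlap(void) {
    char buf[] = "abcdef";
    like_memmove(buf + 1, buf, 5);
    assert(strcmp(buf, "aabcde") == 0);
}
```

Passing the same overlapping arguments to like_memcpy() would break the promise made by "restrict", and the behavior would be undefined.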

Finally, we get to the new meaning for "static":

void g(double a[static 10]) {
    ...
}

The definition of g() here tells the compiler that the array (or
really, "pointer to first element of array") "a" will *always* have
*at least* ten elements. It might have more, but it will never
have fewer. (Exactly how this is supposed to help a compiler, I
am not really sure. It does allow loading the first ten elements
into registers, or a vector register, but alias analysis, perhaps
brute-forced via the "restrict" keyword, is far better for this.)
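A runnable sketch of the same idea follows; the function name and test data are invented for illustration. The caller's side of the bargain is to pass a non-null pointer to at least ten doubles; passing fewer makes the behavior undefined, whether or not the compiler happens to warn.

```c
#include <assert.h>
#include <stddef.h>

/* "static 10": callers must pass a pointer to at least 10 doubles. */
double sum_first_ten(const double a[static 10]) {
    double total = 0.0;
    for (size_t i = 0; i < 10; i++)
        total += a[i];
    return total;
}

/* An array longer than 10 is fine; only the first ten are summed. */
void check_sum(void) {
    double values[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 100 };
    assert(sum_first_ten(values) == 55.0);
}
```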

I thought a variable declared with extern could either be a definition
or declaration?

Yes; this is where "tentative definitions" and actual definitions
come in.

Note that a definition is always also a declaration, but a declaration
is not necessarily a definition. How can one tell them apart?

The Standard's answer is that a definition is a declaration that
also reserves storage for an object. (More precisely, this is how
one defines an object; there are different rules for other entities.
The actual text says:

A declaration specifies the interpretation and
attributes of a set of identifiers. A definition of an
identifier is a declaration for that identifier that:

- for an object, causes storage to be reserved for that
object;

- for a function, includes the function body;

- for an enumeration constant or typedef name, is the
(only) declaration of the identifier.

Fortunately we can limit ourselves to objects here.)

The problem with this answer is that the rules for "reserving
storage" are not at all clear, and if you poke around inside the
standard's text, as I recall, it winds up being somewhat circular
(the ones that reserve storage are the definitions, while the
definitions are the ones that reserve storage; great :) ). To
break the loop, you have to go deeper -- or, simpler, just use
a complete set of examples.

In the case of file-scope identifiers, we only need three examples
total, to cover all the cases. Here they are:

/* file.c */
int i;
extern int j;
int k = 7;

If this is the complete translation unit, then the final answers
are:

i is defined, and has initial value zero
j is declared, but never defined
k is defined, and has initial value 7

The definition of "i" is one of those so-called "tentative
definitions". The definition of k has an initializer -- it says
"= 7" -- so it is not "tentative" at all. The declaration of
j not only has no initializer, but also has an "extern" keyword
in front. This inhibits the tentative definition.
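The three cases can be checked with a short sketch (the helper functions are hypothetical). Note that j is declared but deliberately never used: using it would require some other translation unit to define it, or the link would fail.

```c
#include <assert.h>

int i;            /* tentative definition: acts as "int i = 0;" at end of file */
extern int j;     /* declaration only: no storage reserved in this file */
int k = 7;        /* definition with an initializer */

int sum_defined(void) {
    /* Only the two objects this file actually defines are touched. */
    return i + k;
}

void check_tentative(void) {
    assert(i == 0);
    assert(k == 7);
}
```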

If we stick an "extern" keyword in front of the last line:

extern int k = 7;

it has no effect at all. The variable k is still initialized, so
this is an explicit definition. The "extern" keyword inhibits the
tentative definition, but there is none to inhibit; and it has that
weird side effect of meaning "static if already static, but not if
not" -- that is, internal linkage if already internal linkage,
external linkage otherwise -- but k is not already marked "static",
so again there is no meaning for "extern" to add.

If we stick an "extern" in front of the line defining "i", however,
it takes away the tentative definition. So here "extern" *would*
have an effect, making the "i" line act like the "j" line. But what
is this "tentative definition" stuff anyway? In particular, what
is it good for?

The answer is, it lets you declare an identifier that you intend
to define later in the same translation unit, so that you can
refer back to it before you define it. This sounds a little odd,
and again an example is quite helpful.

Consider the linked-up data structure called a "circular queue",
in which each element has forward and backward pointers to the next
and previous elements, along with the data for the current element.
The "last" element in the queue points to the first, and the first
to the last, so that the whole thing is a big loop. Such a structure
might look like this:

/* a queue of <key,value> pairs */
struct queue {
    struct queue *next;
    struct queue *prev;
    char *key;
    char *value;
};

Now suppose we want to construct a queue of, say:

<"bob", "dad">; <"jane", "mom">; <"sean", "boy">

where the last entry "points forward" to the first, and the first
"points back" to the last. Simply declaring each one is easy enough:

struct queue bob;
struct queue jane;
struct queue sean;

but we would rather define them, filling them in with appropriate
initial values. But now we have a problem. While jane.prev should
be &bob, jane.next should be &sean. If we make the middle line
read:

struct queue jane = { &sean, &bob, "jane", "mom" };

the reference to bob, "&bob", will be OK, but "&sean" will not work,
because we have not yet declared "sean". No matter what order we
use, shuffling the definitions around will not work.

The solution is to declare all three objects first, *then* define
them. (Actually, we can get away with two declarations followed
by all three definitions. For instance, we can declare jane and
sean and then define bob, which also declares bob; then we can
define jane and sean, using the declaration provided by the
definition.)

Of course, we hardly need tentative definitions for *this*, because
we can just write:

extern struct queue bob, jane, sean;
struct queue bob = { &jane, &sean, "bob", "dad" };
...

But what if we want the identifiers to have internal linkage?

Since "extern" means "static (internal linkage) if already static,
external linkage otherwise", "extern" by itself is not going to
work. We have to use "static", and static does not inhibit
definitions the way "extern" can.

Fortunately, we have those "tentative definitions", so we can write:

static struct queue bob, jane, sean;
static struct queue bob = { &jane, &sean, "bob", "dad" };
static struct queue jane = { &sean, &bob, "jane", "mom" };
static struct queue sean = { &bob, &jane, "sean", "boy" };

The first line has no initializers, so it "tentatively defines"
all three variables, declaring them in the process. Then we
define them for real, including initializers. If we never
defined them, the tentative definitions would become actual
definitions at the end of the translation unit, initialized as
if we had written "= { 0 }", or equivalently in this case,
"= { NULL, NULL, NULL, NULL }".

The keywords themselves, and their rather odd behavior -- we could
use "extern" on each of the three definitions, for instance -- are
bizarre in name and meaning only for historical reasons. In K&R
C, "tentative definitions" did not exist, and it was in fact
impossible to define a queue like the above with internal linkage.
(You could use the "extern" trick to do it with external linkage.)
In the 1980s, the ANSI C committee came up with a clumsy but workable
way to use all the existing stuff, and add this new ability, using
these "tentative definitions".
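Putting the internal-linkage queue together as one translation unit, a sketch might look like the following. The queue_length() and check_queue() helpers are made up for illustration, and the string members are declared "const char *" here (the struct shown earlier uses plain "char *") because they point at string literals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct queue {
    struct queue *next;
    struct queue *prev;
    const char *key;
    const char *value;
};

/* Tentative definitions: they declare all three names with internal linkage. */
static struct queue bob, jane, sean;

/* Real definitions, each free to take the address of the others. */
static struct queue bob  = { &jane, &sean, "bob",  "dad" };
static struct queue jane = { &sean, &bob,  "jane", "mom" };
static struct queue sean = { &bob,  &jane, "sean", "boy" };

/* Count the elements by walking "next" until we are back at the start. */
size_t queue_length(const struct queue *start) {
    size_t n = 0;
    const struct queue *p = start;
    do {
        n++;
        p = p->next;
    } while (p != start);
    return n;
}

void check_queue(void) {
    assert(queue_length(&bob) == 3);
    assert(bob.prev == &sean && sean.next == &bob);
    assert(strcmp(jane.next->key, "sean") == 0);
}
```

Because the loop is circular, queue_length() gives the same answer no matter which element it starts from.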

Most of C's oddities really are a result of this "organic growth".
Back in the Dim Time, Dennis Ritchie's early C compilers did not
have "typedef" at all. (These were also the days when the C compiler
-- there was just the one, plus of course any backup versions of
it -- only ran the code through the preprocessor if the first
character of the file was '#'.) When typedef got added, it *had*
to act a lot like "#define", because it was used to replace existing
"#define"s. The things the ANSI C committee added in the 1980s
had to fit into (and, to a large extent, not break) existing code.
If something was painted into a corner, why, it was just left
standing there in the corner. So C is full of oddities.
 
j

Chris Torek said:
[I wrote, in part, with typo fixed]
A struct really isn't an abstract type though, right?

Well, I will let you define "abstract type", and then we can see
whether "struct S" fits that definition. :)

Well, ``one which has no concrete implementation''

That is, it specifies the relationships between the operations upon
it, rather than an implementation of those operations.

C++'s "classes", which are types that also have operations literally
embedded within them, fit almost every definition, while C's structs
require a looser definition for "abstract type". But if anything
in C meets your definition, "struct" will (assuming your definition
is not radically odd, at least). It is as close as you get. When
"struct foo" defines a new type, it is incompatible with all other
types, and any operations you define on "struct foo" will work only
on that type.

I was told that the embedment of operators is irrelevant to abstract
data type-ness, it is only relevant to polymorphism which is separate.

In the case of file-scope identifiers, we only need three examples
total, to cover all the cases. Here they are:

/* file.c */
int i;
extern int j;
int k = 7;

If this is the complete translation unit, then the final answers
are:

i is defined, and has initial value zero
j is declared, but never defined
k is defined, and has initial value 7

The definition of "i" is one of those so-called "tentative
definitions". The definition of k has an initializer -- it says
"= 7" -- so it is not "tentative" at all. The declaration of
j not only has no initializer, but also has an "extern" keyword
in front. This inhibits the tentative definition.

If we stick an "extern" keyword in front of the last line:

extern int k = 7;

it has no effect at all. The variable k is still initialized, so
this is an explicit definition. The "extern" keyword inhibits the
tentative definition, but there is none to inhibit; and it has that
weird side effect of meaning "static if already static, but not if
not" -- that is, internal linkage if already internal linkage,
external linkage otherwise -- but k is not already marked "static",
so again there is no meaning for "extern" to add.

If we stick an "extern" in front of the line defining "i", however,
it takes away the tentative definition. So here "extern" *would*
have an effect, making the "i" line act like the "j" line. But what
is this "tentative definition" stuff anyway? In particular, what
is it good for?

The answer is, it lets you declare an identifier that you intend
to define later in the same translation unit, so that you can
refer back to it before you define it. This sounds a little odd,
and again an example is quite helpful.

I got confused here. If a tentative definition is something which lets
you declare an identifier that you intend to define later, then why
was ``i'' in the above example considered to be a "so-called"
tentative definition? Since ``i'' was already defined, you would be
unable to define it later (unless we shadow it with a definition with
block scope). If you could elaborate on that, I would much appreciate
it :)
 
j

I've read 6.9.2 of C99 (it cleared up what I was confused about), so I
understand the ``tentative definition'' thing now. Thanks, though, for
the explanation. :)
 
