Initialized and uninitialized global variable allocation

Cristea Bogdan

Hi

I have started reading an old book on C programming (Peter van der Linden, Expert C Programming, Prentice Hall, 1994) and I have found the following statement: "initialized variables are allocated on the data segment of the program and uninitialized variables on the BSS segment".
Experimenting with a short program suggests that this statement is no longer true:
It seems that only global variables go into BSS (regardless of whether they are initialized), while static variables seem to be allocated at run time.
Below is the program used for tests:

#define SIZE (1<<20)
static char arrs[SIZE];
char arrg[SIZE] = {0};
int main()
{
    return 0;
}

I compiled with gcc using the default flags and used size to check the segment sizes.

Could someone shed some light on this issue (how global and static variables are allocated)?

thanks
Bogdan
 
James Kuyper

Hi

I have started reading an old book on C programming (Peter van der Linden, Expert C Programming, Prentice Hall, 1994) and I have found the following statement: "initialized variables are allocated on the data segment of the program and uninitialized variables on the BSS segment".

Do you know which compiler he was describing? Different compilers are
free to make different choices in that regard. What you've described
could not possibly be true for platforms where the very concept of a
segment is meaningless, for instance. Of course, many different
compilers might make the same choice when targeting the same platform.

Experimenting with a short program suggests that this statement is no longer true:
It seems that only global variables go into BSS (regardless of whether they are initialized), while static variables seem to be allocated at run time.

The standard doesn't define what "global variable" means; the best
definition I can think of that applies in a C context would be an object
with static storage duration named by an identifier with external
linkage. Some people use the term "file global" for an object with
static storage duration named by an identifier with file scope, even if
it doesn't have external linkage.

Below is the program used for tests:

#define SIZE (1<<20)
static char arrs[SIZE];
char arrg[SIZE] = {0};

The 'static' keyword has several different meanings; in this context, it
gives arrs[] internal linkage, and has nothing to do with the storage
duration. Both arrs[] and arrg[] have static storage duration; therefore
every element of arrs[] and arrg[] that is not explicitly initialized is
implicitly zero-initialized. arrg[0] is explicitly zero-initialized,
while all of the other elements of arrg[] are implicitly
zero-initialized, as are all elements of arrs[].
int main()
{
return 0;
}

I compiled with gcc using the default flags and used size to check the segment sizes.

The best place for answers to detailed questions about the behavior of
gcc would be a forum dedicated to gcc. This is not such a forum.
 
Barry Schwarz

Why do you care? Seriously. What difference does it make to the code
you intend to write? Can you think of a construct in the language you
would change (++i; could be changed to i+=1;) based on this knowledge?
And would knowing what the answer is for one particular system provide
any insight to a different system?

Global and static variables are allocated however and wherever the
compiler writer decided they should be allocated. With rare
exceptions, only people writing or debugging compilers and linkers
need to be concerned with it.

Did the introductory material at the start of the book identify which
compiler running under which operating system on which hardware the
author was using as a basis for his discussion? If not, you now have
an example of someone's belief that his limited experience allows him
to pontificate universal truths.

If your goal is to learn how to program in C, concentrate on that and
not the peculiarities of a 20-year-old system which may not even be in
existence anymore.
 
Barry Schwarz

Your title and your discussion are unrelated. You discuss global and
static variables. None of these variables can ever be uninitialized.
Global variables (variables defined, not just declared, at file
scope), whether static or not, are always initialized, either
explicitly in the definition or by default. Static variables defined
in a function also are always initialized, either explicitly in the
definition or by default.

The default initialization is 0, 0.0, or NULL depending on whether the
variable is an integer, real, or pointer. Aggregates are initialized
by applying this rule recursively.
 
Alan Curry

Hi

I have started reading an old book on C programming (Peter van der
Linden, Expert C Programming, Prentice Hall, 1994) and I have found the
following statement: "initialized variables are allocated on the data
segment of the program and uninitialized variables on the BSS segment".
Experimenting with a short program shows that this statement is no
longer true: [...]
char arrg[SIZE] = {0};

You're right. Something has changed since 1994. Specifically, gcc is now
smart enough to optimize an all-zero initializer by putting the variable
into bss. Initialize it to {1} instead of {0} and you'll see the old
behavior come back.
 
Ben Bacarisse

Cristea Bogdan said:
I have started reading an old book on C programming (Peter van
der Linden, Expert C Programming, Prentice Hall, 1994) and I have
found the following statement: "initialized variables are allocated on
the data segment of the program and uninitialized variables on the BSS
segment". Experimenting with a short program suggests that this
statement is no longer true: It seems that only global variables go
into BSS (regardless of whether they are initialized), while static
variables seem to be allocated at run time. Below is the program used
for tests:

#define SIZE (1<<20)
static char arrs[SIZE];
char arrg[SIZE] = {0};

There are three things that *might* matter here:

(a) Is the array "static"? I.e. does it have internal or external
linkage?

(b) Is the array explicitly initialised? As others have said, arrays
like this are always initialised, but some have an explicit
initialiser.

(c) Are any of the elements initialised to something other than 0?

There are only six combinations to try (not eight because (c) only
matters if (b) is true). Try them all to see what the pattern is.

You could use something like this:

#define S (1<<20)
#if A
static
#endif
char array[S]
#if B
= {
#if C
1
#else
0
#endif
}
#endif
;

and call gcc using -D to set A, B and C to 0/1 from the command line.

<snip>
 
Roberto Waltman

Barry said:
Global and static variables are allocated however and wherever the
compiler writer decided they should be allocated. With rare
exceptions, only people writing or debugging compilers and linkers
need to be concerned with it.

Embedded systems development is (generally speaking) one of those
exceptions, and I do not believe it is a rare one.
 
Richard Damon

Embedded systems development is (generally speaking) one of those
exceptions, and I do not believe it is a rare one.

If the distinction is between "BSS" and "Initialized", even embedded
systems developers don't normally care. If the location in memory
matters then they are going to use some form of extension to place it
exactly where they want it.

The only difference between the "BSS" and "Initialized" segments is that
the BSS segment is automatically zero filled, while the Initialized
segment is filled by copying from a (hopefully compressed) copy of the
segment at start up. The BSS is possibly slightly faster to initialize,
but normally not significantly so.
 
Cristea Bogdan

The author used a gcc compiler for SunOS. Why do I care about this?
In embedded systems it is important to know how and when each variable
is allocated.
 
Barry Schwarz

snip

snip

The author used a gcc compiler for SunOS. Why do I care about this?
In embedded systems it is important to know how and when each variable
is allocated.

Let's see now:

You have already established that the assertion from the author
is no longer true.

You confirm that the author was basing the assertion on an old
compiler for an operating system no longer marketed.

You care because in embedded systems this is important.

Since the one embedded system I worked on didn't have BSS and data
segments, I'd love to be educated:

Do the development environments (compiler, linker, library) for
many other embedded systems have BSS and data segments?

After determining where and when a variable would be allocated,
do you actually change some property of the variable (such as auto to
static) to alter that allocation? I always thought variable
properties should be driven by the requirements associated with the
variable and not the vagaries of some particular compiler. What
happens if at some point in the project it becomes necessary to change
compilers? Does someone have to go back and check that any
"artificial" properties are still relevant (or even correct)?

Since this has nothing to do with the C language, wouldn't a better
newsgroup be one devoted to embedded systems or perhaps to the
compiler system used for your embedded system?
 
