Initial values of file-scope and block-level variables

Madhav

Hi all,

I do not understand why global variables are
initialized to NULL whereas block-level variables have random
values. I know that the C standard requires this, as was mentioned in a
recent thread.

I want to know why this discrimination is in place. Can't all
variables be initialised to NULL automatically by the compiler? This
would make programming a little easier.

Regards,
Madhav.
 
usenet

Madhav said:
I do not understand why global variables are
initialized to NULL whereas block-level variables have random
values. I know that the C standard requires this, as was mentioned in a
recent thread.

I want to know why this discrimination is in place. Can't all
variables be initialised to NULL automatically by the compiler? This
would make programming a little easier.

No, the data for automatic variables can not be initialized to NULL by
the compiler.

Before main() is called, a system-specific piece of code runs, which
prepares the environment for your C program to run in. One of its tasks
is to clear the .bss section, i.e. to set all global and static
variables to zero.

Automatic variables are allocated on the stack, and do not have a fixed
address. The same memory locations will often be re-used when entering
and leaving functions. To clean all automatic variables every time a
function or block is entered, your compiler would have to insert a piece
of code at the beginning of *every function* to do that.
 
Richard Heathfield

Madhav said:
Hi all,

I do not understand why global variables are
initialized to NULL whereas block-level variables have random
values. I know that the C standard requires this, as was mentioned in a
recent thread.

The Standard requires that all static and file scope objects that are not
explicitly initialised by the program are given default static initialiser
values, viz. 0, 0.0, NULL (recursively for aggregates).

The Standard makes no such demands for automatic objects, except in the case
of partial initialisation of aggregate objects (where the partial
initialisation is honoured, and the rest of the object initialised as
mentioned earlier).
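
A minimal sketch of the two rules just described, assuming a C99 or later compiler. The object and function names are invented for illustration; nothing here is specific to any particular implementation.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope objects with no initialiser: the Standard requires zero values. */
int file_counter;     /* starts as 0 */
double file_ratio;    /* starts as 0.0 */
char *file_name;      /* starts as a null pointer */

/* Returns 1 if every default-initialised object holds its zero value. */
int statics_are_zeroed(void)
{
    static long calls;    /* static storage duration: also starts at 0 */

    return file_counter == 0 && file_ratio == 0.0 &&
           file_name == NULL && calls == 0;
}

/* Partial initialisation of an automatic aggregate: a[0] and a[1] are
   given, and the remaining elements are initialised as statics would be. */
int partial_init_sum(void)
{
    int a[5] = { 3, 4 };
    int sum = 0;

    assert(a[2] == 0 && a[3] == 0 && a[4] == 0);
    for (size_t i = 0; i < sizeof a / sizeof a[0]; i++)
        sum += a[i];
    return sum;
}
```

Here statics_are_zeroed() returns 1 and partial_init_sum() returns 7. An automatic `int a[5];` with no initialiser at all would carry no such guarantee.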
I want to know why this discrimination is in place.

I don't know why statics and file scope objects get a default
initialisation. I know why automatics don't. It's a programmer choice. If
the programmer wishes the program to spend time assigning 0-values to
automatic objects, the programmer can choose that behaviour by writing:

int i = 0;

and if he doesn't wish that, he can simply write:

int i;

which indicates that, at this stage, he isn't fussed about i's value.
Can't all the
variables be initialised to NULL automatically by the compiler?

Yes, a compiler is free to do that if it wishes (NULL for pointers, 0 for
integers, 0.0 for floating point numbers, and the obvious for structs,
unions, and arrays), but it is not /required/ to do that.
This would make programming a little easier.

I'm tempted to agree, but it would diminish programmer choice. I like my
objects to start off with known values, in the interests of determinism, so
your proposal would suit my style of programming. But other programmers
prefer it the way it is now, and whilst it is possible for me to get the
effect I want by taking the current strategy and adding code (typically as
simple as = {0} in my case), the reverse would not be true.

I am not (yet) so monumentally egomaniacal as to insist that every C
compiler writer in the world should change their product just to suit my
coding style. But I'm working on it.
 
Richard Heathfield

(e-mail address removed) said:
No, the data for automatic variables can not be initialized to NULL by
the compiler.

Not so. They could be. They just aren't.
Automatic variables are allocated on the stack, and do not have a fixed
address. The same memory locations will often be re-used when entering
and leaving functions. To clean all automatic variables every time a
function or block is entered, your compiler would have to insert a piece
of code at the beginning of *every function* to do that.

See? I told you it was possible!
 
Christopher Benson-Manica

[snip discussion about initialization of automatic variables]
I'm tempted to agree, but it would diminish programmer choice.

I don't want to be too broad, but I think programmer choice is one of
the central qualities that endears C to those who use it. I imagine
that programmers for embedded platforms are in any case perfectly happy
to go on exercising their freedom of choice in this area, given that
not every environment has the luxury of CPU cycles to burn.
 
Richard Heathfield

Christopher Benson-Manica said:
[snip discussion about initialization of automatic variables]
I'm tempted to agree, but it would diminish programmer choice.

I don't want to be too broad, but I think programmer choice is one of
the central qualities that endears C to those who use it.

Hence the "but".
 
tmp123

Madhav said:
Hi all,

I do not understand why global variables are
initialized to NULL whereas block-level variables have random
values. I know that the C standard requires this, as was mentioned in a
recent thread.

I want to know why this discrimination is in place. Can't all
variables be initialised to NULL automatically by the compiler? This
would make programming a little easier.

Regards,
Madhav.

Probably the original reason was simply speed. Initialising the globals at
program start is not very expensive: it is just a fill with 0, and lots of
other things are done at start-up anyway.
However, initialising the locals on every function call means significant
extra work, which is unnecessary most of the time.

Kind regards.

PS: In some places, a trick is used:
- in the first testing stages, the globals, stack, etc. are filled not with
zero but with a very strange value, like H'C5. Reason: if something has
not been initialised, it is better to crash as soon as possible, so that
the problem can be detected and fixed.
- in the final version, the fill is done with zero, for the opposite reason.
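
The fill trick can be imitated in portable C for memory the program manages itself. This is only a sketch: standard C offers no way to pre-fill the real stack or the globals, so the example applies the idea to an ordinary buffer. The names prefill, all_bytes_equal and the two FILL_ macros are hypothetical; 0xC5 is taken from the H'C5 value mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fill values: a loud pattern for testing, zero for release. */
#define FILL_TESTING 0xC5
#define FILL_RELEASE 0x00

/* Fill a buffer before it is handed out. In testing builds the loud
   pattern makes code that relies on implicit zeroes misbehave early. */
void prefill(void *buf, size_t n, int testing)
{
    assert(buf != NULL);
    memset(buf, testing ? FILL_TESTING : FILL_RELEASE, n);
}

/* Returns 1 if every byte of buf equals value, 0 otherwise. */
int all_bytes_equal(const void *buf, size_t n, unsigned char value)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < n; i++)
        if (p[i] != value)
            return 0;
    return 1;
}
```

A caller would pass testing = 1 during early test runs and 0 in the shipped build, so that the same code path is exercised in both.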
 
Keith Thompson

No, the data for automatic variables can not be initialized to NULL by
the compiler.

It certainly could be. In fact, a compiler that did so would be
perfectly legal (but it would encourage lazy programmers to depend on
its behavior).
Before main() is called, a system-specific piece of code runs, which
prepares the environment for your C program to run in. One of its tasks
is to clear the .bss section, i.e. to set all global and static
variables to zero.

Automatic variables are allocated on the stack, and do not have a fixed
address. The same memory locations will often be re-used when entering
and leaving functions. To clean all automatic variables every time a
function or block is entered, your compiler would have to insert a piece
of code at the beginning of *every function* to do that.

As far as the C language is concerned, there's no such thing as a
".bss section", or even a "stack"; they're just particular ways to
implement the semantics required for the language. (Automatic
variables are allocated in a stack-like last-in/first-out fashion, but
there are ways to implement that other than a contiguous linear stack
in memory.)

But I believe the motivation for implicitly initializing global
variables but not automatic variables is based on the kind of
implementation details you're talking about. Global variables are
initialized to 0 converted to the appropriate type; that could be 0,
0L, '\0', 0.0, NULL, etc. In many (most?) implementations, all those
values are represented as all-bits-zero, making it easy to initialize
all the globals by zeroing a contiguous block of memory when the
program is loaded. (If a NULL pointer or a floating-point 0.0 isn't
represented as all-bits-zero, the implementation has to go to some
extra effort, but it already has to handle explicit initializations to
non-zero values anyway.)

Initializing local variables, as you point out, would require some
extra work. The burden on the compiler would be fairly trivial, but
the cost at run-time could be significant -- and it would typically
have been even more significant back when the language was being
designed.
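
As a rough illustration of the all-bits-zero point, the sketch below compares a byte-wise clear with the zero values the language guarantees. It is an assumption-laden check, not a portable guarantee: on an implementation where a null pointer or 0.0 is not all-bits-zero, reading the cleared members is exactly the case that would differ. The struct and function names are invented for the example.

```c
#include <assert.h>
#include <string.h>

struct sample {
    int    i;
    long   l;
    double d;
    char  *p;
};

/* Returns 1 if, on this implementation, clearing every byte of the struct
   gives the same member values as zero-initialisation does. */
int all_bits_zero_matches(void)
{
    struct sample by_bytes;
    struct sample by_rule = { 0 };          /* what the language guarantees */

    memset(&by_bytes, 0, sizeof by_bytes);  /* what a block clear at load time does */

    return by_bytes.i == by_rule.i &&
           by_bytes.l == by_rule.l &&
           by_bytes.d == by_rule.d &&
           by_bytes.p == by_rule.p;
}

/* Example caller: stop early on a platform where the shortcut is invalid. */
void require_all_bits_zero(void)
{
    assert(all_bits_zero_matches());
}
```

On mainstream desktop and server platforms the check is expected to pass, which is why zeroing one contiguous block is such a cheap way to implement the rule there.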
 
jacob navia

Richard Heathfield wrote:
(e-mail address removed) said:

Not so. They could be. They just aren't.

See? I told you it was possible!

The lcc-win32 compiler will init all locals to zero when
called with:
lcc -stackinit 0

By default, lcc-win32 initializes the whole stack to
0xFFFA5A5A
This causes many programs that use uninitialized variables, and that run
with other compilers, to break down with lcc-win32, which has led
to MANY false bug reports.

Still I think this is worth doing.

jacob
 
jacob navia

tmp123 wrote:
Probably the original reason was simply speed. Initialising the globals at
program start is not very expensive: it is just a fill with 0, and lots of
other things are done at start-up anyway.
However, initialising the locals on every function call means significant
extra work, which is unnecessary most of the time.

Kind regards.

PS: In some places, a trick is used:
- in the first testing stages, the globals, stack, etc. are filled not with
zero but with a very strange value, like H'C5. Reason: if something has
not been initialised, it is better to crash as soon as possible, so that
the problem can be detected and fixed.
- in the final version, the fill is done with zero, for the opposite reason.

lcc-win32 will initialize the stack to
0xFFFA5A5A
This is a NaN, and provokes crashes when used as a pointer.
If you want another value (including zero),
call it with
lcc -stackinit 0

jacob
 
Default User

Keith said:
Can't all the variables be initialised to NULL automatically by the
compiler? This would make programming a little easier.

It certainly could be. In fact, a compiler that did so would be
perfectly legal (but it would encourage lazy programmers to depend on
its behavior).



Some compilers do so, in some modes, notably debug mode. This leads to
the complaint seen sometimes, "my program runs fine in debug mode, but
crashes when I compile in release mode!!!!"




Brian
 
John Devereux

Default User said:
Some compilers do so, in some modes, notably debug mode. This leads to
the complaint seen sometimes, "my program runs fine in debug mode, but
crashes when I compile in release mode!!!!"

Or they can initialise to a deliberately large non-zero value so as to
more likely provoke an obvious failure.
 
Joseph Dionne

Madhav said:
Hi all,

I do not understand why global variables are
initialized to NULL whereas block-level variables have random
values. I know that the C standard requires this, as was mentioned in a
recent thread.

I want to know why this discrimination is in place. Can't all
variables be initialised to NULL automatically by the compiler? This
would make programming a little easier.

Regards,
Madhav.

Technically, global variables are not "initialized" by the compiler. Program
globals (data-segment variables) are initialized either by the programmer or
by the program loader, as the program is copied into RAM and its code/data
space is allocated.

There are two exceptions, partially initialized structures and arrays.

For example the compiler will add the initialized data to the program machine
code image emitted to disk padding with NULL all elements you do NOT
initialize in source code.

struct {
    int ii;
    long ll;
    char a[32];
} gvar = { 1 };

In the above, ii will be set to (int)1, with ll and a[] set to NULL.

char a[32] = "hi";

In the above, a[0] = 'h' a[1] = 'i' and a[2-31] == 0

For uninitialized global variables, the loader(LD) allocates data segment
memory using calloc() which does the "initialization" to NULL for all
elements, and only a reference to type of memory and how many elements are
needed are emitted to the machine code image emitted to disk by the compiler.

So ...

struct {
    int ii;
    long ll;
    char a[32];
} gvar;

creates an entry in the disk program image that the compiler emits to disk
that informs the loader to allocate 1 element of sizeof(gvar) memory.

Memory "allocated" between braces "{}" is actually stack memory which retains
the data from the last function to use that stack memory. This memory
can/will change from call to call being altered by the "other" functions that
are called in between.

For the "compiler" to accomplish what you perceive as a failing, it will need
to insert machine code that runs BEFORE your code does to initialize the
variables declared. This extra code will consume time and may not be needed
if your code simply assigns a value prior to use, such as the initialization
in a for(;;) loop.

Joseph
 
Keith Thompson

Joseph Dionne said:
Technically, global variables are not "initialized" by the compiler.
Program globals (data-segment variables) are initialized either by the
programmer or by the program loader, as the program is copied into RAM and
its code/data space is allocated.

Technically, global variables are implicitly initialized; the
implementation (of which the compiler is a part) can accomplish this
in any convenient manner. Data segments and program loaders are
implementation details that may be used to implement the semantics
required by the language; they're outside the scope of the language
and of this newsgroup.
There are two exceptions, partially initialized structures and arrays.

For example the compiler will add the initialized data to the program
machine code image emitted to disk padding with NULL all elements you
do NOT initialize in source code.

struct {
int ii;
long ll;
char a[32];
} gvar = { 1 };

NULL is a macro that expands to a null pointer constant; it doesn't
apply to non-pointer types.
In the above, ii will be set to (int)1, with ll and a[] set to NULL.

No, ll will be set to 0L, and each element of a will be set to '\0'
(more generally, to zero converted to the appropriate type).
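
The correction can be checked directly. The sketch below reuses the gvar declaration quoted above and adds a hypothetical second object, greeting, for the string case; the two checking functions are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct {
    int  ii;
    long ll;
    char a[32];
} gvar = { 1 };            /* only ii is given; the rest is zero-initialised */

char greeting[32] = "hi";  /* elements after the 'i' are '\0' */

/* Returns 1 if gvar holds exactly what the initialiser rules require. */
int gvar_is_as_expected(void)
{
    if (gvar.ii != 1 || gvar.ll != 0L)
        return 0;
    for (size_t i = 0; i < sizeof gvar.a; i++)
        if (gvar.a[i] != '\0')
            return 0;
    return 1;
}

/* Returns 1 if greeting[2] through greeting[31] are all '\0'. */
int greeting_tail_is_zero(void)
{
    assert(greeting[0] == 'h' && greeting[1] == 'i');
    for (size_t i = 2; i < sizeof greeting; i++)
        if (greeting[i] != '\0')
            return 0;
    return 1;
}
```

Each member is compared against zero written in its own type (0L, '\0'), which is the distinction being drawn: the values are zeroes, not NULL.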
char a[32] = "hi";

In the above, a[0] = 'h' a[1] = 'i' and a[2-31] == 0

For uninitialized global variables, the loader(LD) allocates data
segment memory using calloc() which does the "initialization" to NULL
for all elements, and only a reference to type of memory and how many
elements are needed are emitted to the machine code image emitted to
disk by the compiler.

An implementation could use calloc() to initialize globals that aren't
initialized explicitly, but only for types where zero (or 0.0, or a
null pointer value) is represented as all-bits-zero. Also, it's
unlikely that an implementation would use either calloc() or malloc()
to allocate the space used for global variables, though I suppose it's
possible. (The "heap" is typically distinct from the memory space
used for global and static variables.)
So ...

struct {
int ii;
long ll;
char a[32];
} gvar;

creates an entry in the disk program image that the compiler emits to
disk that informs the loader to allocate 1 element of sizeof(gvar)
memory.

Maybe. There may not be a disk at all.
Memory "allocated" between braces "{}" is actually stack memory which
retains the data from the last function to use that stack memory.
This memory can/will change from call to call being altered by the
"other" functions that are called in between.

A stack (in the sense of a contiguously allocated region of memory
that grows and shrinks linearly, typically with a CPU register
dedicated as a top-of-stack pointer) is one way to implement automatic
variables. It's not the only way. I don't believe the word "stack"
even appears in the standard.

[snip]
 
Joseph Dionne

Keith said:
Technically, global variables are implicitly initialized; the
implementation (of which the compiler is a part) can accomplish this
in any convenient manner. Data segments and program loaders are
implementation details that may be used to implement the semantics
required by the language; they're outside the scope of the language
and of this newsgroup.

Now you are just being flippant. This group is for discussing C language
programming, and what a compiler does to create the executable object is
pertinent to the OP's post. As I read it, what he "thought" should happen would
not be a function of compilation, but a feature added in 4th generation languages.

C is considered a 3rd generation, however IMHO it is more like a 2.5
generation language because reserved words of c translate one to one to
assembler instructions, with few exceptions.

The "advantages" of c standard libraries make it easier for one to write
software, but one could write their own function libraries in the c lexicon
without third party libraries, providing they had an expert understanding of
the target hardware.
There are two exceptions, partially initialized structures and arrays.

For example the compiler will add the initialized data to the program
machine code image emitted to disk padding with NULL all elements you
do NOT initialize in source code.

struct {
int ii;
long ll;
char a[32];
} gvar = { 1 };


NULL is a macro that expands to a null pointer constant; it doesn't
apply to non-pointer types.

NULL is "#define NULL 0" in every c compiler I have used, and I code regularly
on ten different *nix operating systems. NULL, the macro, is neither a
"pointer," int, long, or char, but can be assigned to all.

In the above, ii will be set to (int)1, with ll and a[] set to NULL.


No, ll will be set to 0L, and each element of a will be set to '\0'
(more generally, to zero converted to the appropriate type).

You say potato, I say NULL -- most c developers think zero when reading NULL.
char a[32] = "hi";

In the above, a[0] = 'h' a[1] = 'i' and a[2-31] == 0

For uninitialized global variables, the loader(LD) allocates data
segment memory using calloc() which does the "initialization" to NULL
for all elements, and only a reference to type of memory and how many
elements are needed are emitted to the machine code image emitted to
disk by the compiler.


An implementation could use calloc() to initialize globals that aren't
initialized explicitly, but only for types where zero (or 0.0, or a
null pointer value) is represented as all-bits-zero. Also, it's
unlikely that an implementation would use either calloc() or malloc()
to allocate the space used for global variables, though I suppose it's
possible. (The "heap" is typically distinct from the memory space
used for global and static variables.)

So ...

struct {
int ii;
long ll;
char a[32];
} gvar;

creates an entry in the disk program image that the compiler emits to
disk that informs the loader to allocate 1 element of sizeof(gvar)
memory.


Maybe. There may not be a disk at all.

Memory "allocated" between braces "{}" is actually stack memory which
retains the data from the last function to use that stack memory.
This memory can/will change from call to call being altered by the
"other" functions that are called in between.


A stack (in the sense of a contiguously allocated region of memory
that grows and shrinks linearly, typically with a CPU register
dedicated as a top-of-stack pointer) is one way to implement automatic
variables. It's not the only way. I don't believe the word "stack"
even appears in the standard.

[snip]
 
Jordan Abel

Now you are just being flippant. This group is for discussing c
language programming, and what a compiler does to create the
executable object is pertinent to the OP's post. As I read what he
"thought" should happen would not be a function of compilation, but a
feature added in 4th generation languages.

C is considered a 3rd generation, however IMHO it is more like a 2.5
generation language because reserved words of c translate one to one
to assembler instructions, with few exceptions.

The "advantages" of c standard libraries make it easier for one to
write software, but one could write their own function libraries in
the c lexicon without third party libraries, providing they had an
expert understanding of the target hardware.

More than that... Other than basic things like fopen(), fputc(), etc, or
more primitive C-callable functions used by them (nearly all of which
are in stdio.h - and a _lot_ of stdio.h can still be written in C) which
require knowledge of how to access operating system services directly,
the C library can be written entirely in "standalone" C - that is, in
the subset of C which is permitted in programs that are strictly
conforming for a freestanding implementation, except without the
restrictions on the identifiers used. More precisely, with said
restrictions turned upside down, since if you're writing the C library,
you're part of the implementation, and thus should be using identifiers
reserved to the implementation rather than those belonging to the user.
NULL is "#define NULL 0" in every c compiler I have used, and I code regularly
on ten different *nix operating systems. NULL, the macro, is neither a
"pointer," int, long, or char, but can be assigned to all.

However, you _should_ not assign it to non-pointer types, since it can
be ((void*)0)

Here's part of <sys/_null.h> (#included on my system, FreeBSD 6.0, by
any header which is expected to provide a definition for NULL)

#if defined(_KERNEL) || !defined(__cplusplus)
#define NULL ((void *)0)
#else /* !defined(_KERNEL) && defined(__cplusplus) */
....

glibc appears to do something similar.
You say potato, I say NULL -- most c developers think zero when
reading NULL.

Yes, but good ones follow "zero" with ", possibly cast to pointer to
void".
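
A small sketch of the habit being recommended, with invented names: write zero in each type's own spelling and keep NULL for pointers only. With a definition such as ((void *)0), a line like `int n = NULL;` would at best draw a diagnostic, so it is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Zero written in each type's own spelling; NULL is used for pointers only. */
int   count  = 0;
long  total  = 0L;
char  letter = '\0';
char *name   = NULL;

/* Returns 1 if a pointer initialised from the constant 0 is a null pointer. */
int zero_constant_makes_null(void)
{
    char *p = 0;    /* in a pointer context, 0 is a null pointer constant */

    assert(name == NULL);
    return p == NULL && !p;
}
```

The function returns 1 whether the implementation defines NULL as 0 or as ((void *)0), because both are null pointer constants when used with a pointer.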
 
Joseph Dionne

Jordan said:
More than that... Other than basic things like fopen(), fputc(), etc, or
more primitive C-callable functions used by them (nearly all of which
are in stdio.h - and a _lot_ of stdio.h can still be written in C) which
require knowledge of how to access operating system services directly,
the C library can be written entirely in "standalone" C - that is, in
the subset of C which is permitted in programs that are strictly
conforming for a freestanding implementation, except without the
restrictions on the identifiers used. More precisely, with said
restrictions turned upside down, since if you're writing the C library,
you're part of the implementation, and thus should be using identifiers
reserved to the implementation rather than those belonging to the user.




However, you _should_ not assign it to non-pointer types, since it can
be ((void*)0)

Here's part of <sys/_null.h> (#included on my system, FreeBSD 6.0, by
any header which is expected to provide a definition for NULL)

#if defined(_KERNEL) || !defined(__cplusplus)
#define NULL ((void *)0)
#else /* !defined(_KERNEL) && defined(__cplusplus) */
...

glibc appears to do something similar.




Yes, but good ones follow "zero" with ", possibly cast to pointer to
void".

I surrender! Since I never use NULL in my code, choosing to use a (cast)0
where cast is "int *", "char *", etc, you win.

Furthermore, since "int ii = NULL" started to generate a warning about type
violations in C99 compilers and above (at least), you hold the higher ground
in the argument.
 
Keith Thompson

Joseph Dionne said:
Now you are just being flippant.

Not at all.
This group is for discussing c
language programming, and what a compiler does to create the
executable object is pertinent to the OP's post. As I read what he
"thought" should happen would not be a function of compilation, but a
feature added in 4th generation languages.

The specific methods that particular compilers use to implement the
required semantics of the C language are mostly off-topic, though a
general discussion of how those semantics are implemented could be
topical. It's appropriate (IMHO) to mention that stacks, heaps, data
segments, ".bss" segments, program loaders and so forth are common
techniques, but asserting that they are *the* way the language is
implemented is both inappropriate and inaccurate.
C is considered a 3rd generation, however IMHO it is more like a 2.5
generation language because reserved words of c translate one to one
to assembler instructions, with few exceptions.

I don't think that's particularly accurate, but it's beside the point.

[...]
NULL is "#define NULL 0" in every c compiler I have used, and I code
regularly on ten different *nix operating systems. NULL, the macro,
is neither a "pointer," int, long, or char, but can be assigned to all.

There's a big difference between "every C compiler you've used" and
"every C compiler". NULL is defined to expand to a null pointer
constant. I commonly use implementations where NULL expands to
((void*)0). Using NULL for anything other than a pointer is wrong,
though it may happen to "work" in some implementations.

You really need to understand that pointers and integers are two
different things.
You say potato, I say NULL -- most c developers think zero when reading NULL.

No.
 
Mark McIntyre

On Sat, 07 Jan 2006 23:32:53 GMT, in comp.lang.c, Joseph Dionne wrote
(and didn't snip any of 101 lines):
I surrender!

Can you also please learn to trim posts a little? You can remove
anything not relevant to your reply.

Mark McIntyre
 
