Initialising Variables

santosh

DaveC said:
I always used to initialise variables at declaration, then a couple of
colleagues started telling me it was bad practice and that the compiler
should be left to spot the use of uninitialised variables and hence
possible bugs.

Your thoughts on the above would be welcome (as an aside), but my main
problem follows.

The standard doesn't require a compiler to diagnose use of
uninitialised objects, merely that the subsequent behaviour is
undefined. So you can't rely on all compilers to catch this for you,
though almost all mainstream ones do.

Initialising variables to a safe or default value is mainly a feel-good
thing. For small functions with a few variables, which you're going to
use straightaway, dummy initialisations are not really worth it. They
may also not be worth it for functions with a large number of variables
encapsulated into structures.

If your code is correct, then dummy initialisations rarely have a
useful effect. If your code is wrong, then the default values (often
zero) are as likely to lead to wrong results as an arbitrary value.

This is something that one should analyse on a case-by-case basis. Some
variables are really not worth initialising to safe values, while some
might benefit from them, in certain border conditions.

IMHO, initialising pointers to NULL is more relevant than initialising
non-pointer objects. An arbitrary value in a pointer will give you no
idea of whether it is pointing to a valid object or not, while zero is
always indicative of an unusable pointer. The same is not often true
for scalar variables: zero may often be a valid value. Correct code is
more important than relying on dummy initialisations and compilers.
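
For illustration, a minimal sketch of what the NULL convention buys you
(the names here are invented for the example):

#include <stddef.h>

void example(int want_buffer)
{
    char local[16];
    char *p = NULL;     /* NULL reliably means "no object yet" */

    if (want_buffer)
        p = local;

    if (p != NULL)      /* a meaningful, well-defined test */
        p[0] = '\0';

    /* Without the "= NULL", evaluating p when want_buffer is 0 would
       read an indeterminate value: undefined behaviour, and no test
       could tell a garbage pointer from a valid one. */
}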
 
Flash Gordon

santosh wrote, On 23/01/07 17:01:
The standard doesn't require a compiler to diagnose use of
uninitialised objects, merely that the subsequent behaviour is
undefined. So you can't rely on all compilers to catch this for you,
though almost all mainstream ones do.

They do not always get it right. I've seen gcc versions produce warnings
where it was easily provable (to me, not to the compiler) that the
variable would always be initialised before use.
Initialising variables to a safe or default value is mainly a feel-good
thing. For small functions with a few variables, which you're going to
use straightaway, dummy initialisations are not really worth it. They
may also not be worth it for functions with a large number of variables
encapsulated into structures.

If your code is correct, then dummy initialisations rarely have a
useful effect. If your code is wrong, then the default values (often
zero) are as likely to lead to wrong results as an arbitrary value.

You do not necessarily use an initial value that might be considered
"safe". You might pick a value that will force an obviously wrong
result, or make use of knowledge of the implementation to select an
initial value that will cause a crash (e.g. pointer values that give a
SIGSEGV if dereferenced, etc.).
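
As a sketch of that idea (deliberately non-portable, and the chosen
values are assumptions about the platform, not recommendations):

#include <limits.h>

void example(void)
{
    int count = INT_MIN;            /* absurd as a count: obvious in output
                                       or in a debugger if never reassigned */
    double *samples = (double *)-1; /* not NULL, and on many machines any
                                       dereference traps immediately */

    /* ... code that is supposed to assign the real values ... */

    (void)count;                    /* silence unused-variable warnings */
    (void)samples;
}
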
This is something that one should analyse on a case-by-case basis. Some
variables are really not worth initialising to safe values, while some
might benefit from them, in certain border conditions.

IMHO, initialising pointers to NULL is more relevant than initialising
non-pointer objects. An arbitrary value in a pointer will give you no
idea of whether it is pointing to a valid object or not, while zero is
always indicative of an unusable pointer. The same is not often true
for scalar variables: zero may often be a valid value. Correct code is
more important than relying on dummy initialisations and compilers.

Correct code is very important. However, code that produces repeatable
results if there is an error is very useful as well since it makes it
easier to identify and fix the bug.

In other words, do your damnedest to make the code correct and *also* do
all you can to make it as easy as possible to identify and fix any bugs.
After all, none of us are perfect, so in projects of a significant size
there *will* be bugs.
 
CBFalconer

Flash said:
santosh wrote, On 23/01/07 17:01:
.... snip ...

They do not always get it right. I've seen gcc versions produce
warnings where it was easily provable (to me, not to the compiler)
that the variable would always be initialised before use.

IIRC the gcc warning is "XXX may be used uninitialized". The rest
is up to you.
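
A sketch of the kind of false positive Flash describes (an invented
example; some gcc versions warn here under -O -Wall, though the exact
behaviour varies from version to version):

int f(int flag)
{
    int x;

    if (flag)
        x = 42;

    /* ... unrelated work ... */

    if (flag)
        return x;   /* only reached when the branch above assigned x,
                       yet "'x' may be used uninitialized" is reported */
    return 0;
}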

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 
Richard Harter

[snip]
If your code is correct, then dummy initialisations rarely have a
useful effect. If your code is wrong, then the default values (often
zero) are as likely to lead to wrong results as an arbitrary value.

An unmentioned advantage of dummy initializations is that they simplify
debugging by making the execution more likely to be repeatable.
 
Ian Collins

santosh said:
The standard doesn't require a compiler to diagnose use of
uninitialised objects, merely that the subsequent behaviour is
undefined. So you can't rely on all compilers to catch this for you,
though almost all mainstream ones do.

Initialising variables to a safe or default value is mainly a feel-good
thing. For small functions with a few variables, which you're going to
use straightaway, dummy initialisations are not really worth it. They
may also not be worth it for functions with a large number of variables
encapsulated into structures.
One advantage of C99 is the ability to mix variable declarations and
code, so this debate becomes moot.
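
For example (C99):

#include <stdio.h>

int main(void)
{
    /* Declared at the point of first use: there is no window in which
       the variable exists but holds an indeterminate value. */
    for (int i = 0; i < 3; i++) {
        int square = i * i;
        printf("%d\n", square);
    }
    return 0;
}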
 
CBFalconer

Ian said:
.... snip ...

One advantage of C99 is the ability to mix variable declarations
and code, so this debate becomes moot.

One disadvantage of C99 is the ability to mix variable declarations
and code. Fortunately, one can ignore this mis-feature.

--
Some informative links:
<http://www.catb.org/~esr/faqs/smart-questions.html>
<http://www.caliburn.nl/topposting.html>
<http://www.netmeister.org/news/learn2quote.html>
<http://cfaj.freeshell.org/google/> (taming google)
<http://members.fortunecity.com/nnqweb/> (newusers)
 
Ian Collins

CBFalconer said:
Ian Collins wrote:

.... snip ...



One disadvantage of C99 is the ability to mix variable declarations
and code. Fortunately, one can ignore this mis-feature.
Humbug!
 
Richard Heathfield

santosh said:

IMHO, initialising pointers to NULL is more relevant than initialising
non-pointer objects.

How do you feel about non-pointer objects that could be used as indices into
an array?
 
Guest

CBFalconer said:
One disadvantage of C99 is the ability to mix variable declarations
and code. Fortunately, one can ignore this mis-feature.

Given that it doesn't break any existing code, and, as you mention, you
can simply not use it if you don't like it, how can it be a
disadvantage?
 
Richard Heathfield

Harald van Dijk said:
Given that it doesn't break any existing code, and, as you mention, you
can simply not use it if you don't like it, how can it be a
disadvantage?

It becomes a disadvantage when *other people* use it, even though I don't
like it, if I have to read their stuff later on.
 
Ian Collins

Richard said:
Harald van Dijk said:




It becomes a disadvantage when *other people* use it, even though I don't
like it, if I have to read their stuff later on.
So you make a strong case for initialising variables to help detect the
use of uninitialised variables, but object to the language feature that
eliminates the problem?
 
santosh

Richard said:
santosh said:



How do you feel about non-pointer objects that could be used as indices into
an array?

It _might_ be useful. As I said, in my experience so far, the choice of
whether to initialise or not really has to be made on a case-by-case
basis.

As other posters have noted, using a default value will likely produce
a reproducible bug if the object concerned is not reinitialised
properly, instead of leading to varying behaviour. The value will also
be easier to spot in a debugging session.

Interesting points on both sides.
 
Richard Heathfield

Ian Collins said:

So you make a strong case for initialising variables to help detect the
use of uninitialised variables, but object to the language feature that
eliminates the problem?

I thank you for acknowledging that my case for defensive initialisation is
strong. :)

I don't agree that the language feature (defining objects at arbitrary[1]
points) eliminates the problem, though. The simplest example I can think of
is when the appropriate value for an object depends on conditions which
cannot be (or at least have not been and are unlikely to be) reduced to a
single expression.

int z;

if(a) { z = 6; } else { z = 42; }

can trivially be reduced to:

int z = a ? 6 : 42;

but:

int z;

if(a)
{
    if(b || c) z = 6 * foo() / y;
    else if(d) z = (x & 17) * y + w;
}
else
{
    if(e && f || g)
    {
        if(h || j) z = 42;
    }
    else
    {
        z = foo() * bar();
    }
}

use(z);

is much harder to reduce to a single expression. Yes, you could encapsulate
the problem in a function (although the problem is then simply shoved down
into that function rather than eliminated), but very often it is /not/ so
encapsulated.

And you will note that the code fragment above /still/ leaves z
uninitialised under certain conditions!

Furthermore, consider the following code:

struct foo f;
bar(&f, otherinfo);

Is the effect of this code deterministic? You can't tell, not from just this
information. Okay, so we can't see the bar() code, and we might argue that
the code *might* be deterministic (perhaps, for example, its task is to
give f its proper starting values, in line with your "just in time
definition" strategy). But it might not be. We can, however, reduce the
probability that it behaves non-deterministically by removing a possible
point of indeterminacy, and we do that by giving f a known value:

struct foo f = {0};

[1] For sufficiently loose meanings of "arbitrary".
 
Richard Heathfield

santosh said:

As other posters have noted, using a default value will likely produce
a reproducible bug if the object concerned is not reinitialised
properly, instead of leading to varying behaviour. The value will also
be easier to spot in a debugging session.

Yes. In fact, I've been noting that point in clc on and off over a period of
several years (for what seems to me to be a fairly large value of
"several").
 
Ian Collins

Richard said:
Ian Collins said:




I thank you for acknowledging that my case for defensive initialisation is
strong. :)
As in the resistance put up by a hooked marlin is strong :)
I don't agree that the language feature (defining objects at arbitrary[1]
points) eliminates the problem, though. The simplest example I can think of
is when the appropriate value for an object depends on conditions which
cannot be (or at least have not been and are unlikely to be) reduced to a
single expression.
Maybe not all of them, but probably the majority. There will always be
the exceptions that prove the rule.
Furthermore, consider the following code:

struct foo f;
bar(&f, otherinfo);

Is the effect of this code deterministic? You can't tell, not from just this
information. Okay, so we can't see the bar() code, and we might argue that
the code *might* be deterministic (perhaps, for example, its task is to
give f its proper starting values, in line with your "just in time
definition" strategy). But it might not be. We can, however, reduce the
probability that it behaves non-deterministically by removing a possible
point of indeterminacy, and we do that by giving f a known value:

struct foo f = {0};
Agreed.

But if C had constructors... oops, sorry, I thought I was replying to
Mr. Navia for a moment! Time for bed.
 
santosh

Richard said:
santosh said:



Yes. In fact, I've been noting that point in clc on and off over a period of
several years (for what seems to me to be a fairly large value of
"several").

Is it also correct to conclude that you have a strong preference for
calloc over malloc?
 
Mark F. Haigh

santosh said:
Is it also correct to conclude that you have a strong preference for
calloc over malloc?

I hate unnecessary uses of calloc(). On many of the systems that can
least afford it, it's generally implemented as an allocation followed by
essentially a memset.

Even on the "good" systems, it can evict your working set from your
cache hierarchy in a hurry, especially if the calloc code does not take
advantage of architecture-specific instructions like "prepare for store"
and friends.

On medium to large SMP systems (especially the ones with some level of
the caching hierarchy shared between CPU cores, or systems with large
coalescing write buffers decoupling caches) it immediately violates the
general rule of "strive for locality".
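
A hypothetical sketch of the implementation style described above (real
libraries also obtain pre-zeroed pages from the OS for large blocks,
which avoids some of this write traffic):

#include <stdlib.h>
#include <string.h>

void *toy_calloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > (size_t)-1 / size)
        return NULL;          /* nmemb * size would overflow */

    size_t total = nmemb * size;
    void *p = malloc(total);
    if (p != NULL)
        memset(p, 0, total);  /* touches every byte: this is the write
                                 stream that evicts the working set */
    return p;
}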


Mark F. Haigh
(e-mail address removed)
 
Richard Tobin

IMHO, initialising pointers to NULL is more relevant than initialising
non-pointer objects.
How do you feel about non-pointer objects that could be used as indices into
an array?

What would you initialise them to? INT_MIN?
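
One possible shape of the idea, purely as an illustration (the trap
value and the helper here are invented):

#include <assert.h>
#include <limits.h>

#define BAD_INDEX INT_MIN   /* no valid index can ever take this value */

int find(const int *table, int n, int want)
{
    int idx = BAD_INDEX;

    for (int i = 0; i < n; i++)
        if (table[i] == want)
            idx = i;

    /* If no assignment happened, this fires loudly and repeatably
       instead of indexing the array with garbage. */
    assert(idx >= 0 && idx < n);
    return idx;
}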

-- Richard
 
Mark F. Haigh

Richard said:
Richard Tobin said:




I disagree. I consider it a sensible safety precaution.

Actually, Mr. Tobin and I consider the opposite approach from yours
to be a reasonable safety precaution.

Not in the sense that it makes debugging easier, or that it is
potentially more deterministic at runtime, both of which are highly
debatable and situation-dependent.

Rather, leaving a variable uninitialized (rather than initializing it to
a dummy value) helps both the compiler and the many available static
analysis tools to determine if there is a code path that uses an
uninitialized value. The idea is to leverage automated data flow analysis
wherever it is possible.
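
A small illustration of the trade-off (an invented example):

int lookup(int key)
{
    int result;                /* deliberately not dummy-initialized */

    if (key >= 0)
        result = key * 2;
    /* bug: the key < 0 path never assigns result */

    return result;             /* data flow analysis can prove this is
                                  reachable with result uninitialized */
}

Had result been written as "int result = 0;", the diagnostic would
vanish and the bad path would silently return a plausible-looking 0.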

At the worst, a false positive signifies that the code is too complex,
both for human readers and for compilers / analysis tools. This is
still quite a valuable thing to know.

"Lying to the compiler" is just a synonym for "garbage in, garbage out".
Why prevent the compiler or other analysis tools from helping you by
feeding them garbage?
> Obviously your mileage varies, and I respect your reasons, but other
> reasonable views exist.
>

Is that so? Not being a reasonable person myself, I suppose I wouldn't
know.


Mark F. Haigh
(e-mail address removed)
 
Richard Tobin

santosh said:
As other posters have noted, using a default value will likely produce
a reproducible bug if the object concerned is not reinitialised
properly, instead of leading to varying behaviour.

In most environments, you will get the same bogus value for repeated
runs of a program.

If you're going to initialise variables for this purpose, it would be
best to initialise them to values likely to cause errors when they are
used. For pointers, NULL works as far as dereferencing is concerned,
but of course NULL is often used in valid data to mark absent values,
ends of lists, and so on, so it will often not produce an error. On
some machines, (void *)1 would be a good choice: it's not NULL and it
will produce an error when dereferenced. It would be handy to have
a defined value intended for this purpose. For integers a suitable
value depends on the use of the integer. For floating point values,
a signalling NaN would be ideal.
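
A non-portable sketch of such poison values. (void *)1 is an assumption
about the platform, and C gives no portable way to request a signalling
NaN, so NAN from <math.h> (a quiet NaN, where the implementation
provides it) merely stands in for the idea:

#include <math.h>

struct node { struct node *next; };

void example(void)
{
    struct node *head = (struct node *)1; /* not NULL, so it cannot be
                                             mistaken for an end-of-list
                                             marker, yet dereferencing it
                                             traps on many machines */
    double reading = NAN;                 /* propagates through arithmetic
                                             and poisons any result */

    (void)head;
    (void)reading;
}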

As I said before, not initialising allows a good compiler to detect
your error at compile time. It might be possible in some cases for it
to detect an error if appropriate "bad" values are used for
initialisation: unconditional dereferencing of a pointer known to be
NULL for example.

-- Richard
 
