This is undefined, but is it legal?

jl_post

Hi,

I've heard that if you've declared a variable (such as a double or
an int) and not initialized it, then the result of printing out its
value is undefined.

I've also heard that "undefined behavior" can mean just about
anything, such as "flying monkeys shooting out of your nose." Sure,
that's an exaggeration, but normally I interpret that to mean that the
program can crash (or cease running) erratically, or even corrupt
data.

So my question is: Although I can never safely predict the printed
output of an uninitialized int or double, is it still safe (or legal)
to do so? In other words, if I run this program:

#include <iostream>

int main(int argc, char ** argv)
{
    int i;
    double d;

    std::cout << "i = " << i << std::endl; // safe?
    std::cout << "d = " << d << std::endl; // safe?

    return 0;
}

I may not be able to predict what will print out, but is there any
chance that the program can crash because of those lines?

If you're curious why I ask this, it's because in some code I'm
working through there is a structure with ints -- some of which are
never used nor initialized. However, this structure (with all its
ints) is getting written out to disk (and later read back in). But at
no time are the values of these uninitialized ints used for logic in
the code.

Because the code is writing out uninitialized values (and later
reading them in), is there a possibility that the program can self-
destruct (or corrupt anything) just because those ints weren't
initialized?

Thanks in advance.

-- Jean-Luc
 
Lionel B

Hi,

I've heard that if you've declared a variable (such as a double or
an int) and not initialized it, then the result of printing out its value
is undefined.

I've also heard that "undefined behavior" can mean just about
anything, such as "flying monkeys shooting out of your nose."

In fact the commonly acknowledged consequence is that your fridge defrosts.

Sure, that's an exaggeration, but normally I interpret that to mean that the
program can crash (or cease running) erratically, or even corrupt data.

So my question is: Although I can never safely predict the printed
output of an uninitialized int or double, is it still safe (or legal)
to do so?

It's certainly "legal" in the sense that it won't prevent your program
compiling.

In other words, if I run this program:

#include <iostream>

int main(int argc, char ** argv)
{
    int i;
    double d;

    std::cout << "i = " << i << std::endl; // safe?
    std::cout << "d = " << d << std::endl; // safe?

    return 0;
}

I may not be able to predict what will print out, but is there any
chance that the program can crash because of those lines?

You're ok. The *value* of the variable i may well be undefined, but
i is nonetheless an int; and outputting an int - any int, whatever its
value - should never crash your program. Ditto double, etc.
 
James Kanze

I've heard that if you've declared a variable (such as a
double or an int) and not initialized it, then the result of
printing out its value is undefined.

Anything you do with its *value* is undefined behavior. (You
can still take its address, or assign to it.) With the
exception of unsigned char and char.

I've also heard that "undefined behavior" can mean just about
anything, such as "flying monkeys shooting out of your nose."
Sure, that's an exaggeration, but normally I interpret that to
mean that the program can crash (or cease running)
erratically, or even corrupt data.

In non-privileged mode, under a modern general purpose OS,
that's what it normally means. On systems without a privileged
mode (or in kernel code)... I have seen it require the disk to
be reformatted.

So my question is: Although I can never safely predict the
printed output of an uninitialized int or double, is it still
safe (or legal) to do so?

No. It's undefined behavior.

In other words, if I run this program:

#include <iostream>

int main(int argc, char ** argv)
{
    int i;
    double d;

    std::cout << "i = " << i << std::endl; // safe?
    std::cout << "d = " << d << std::endl; // safe?

    return 0;
}

I may not be able to predict what will print out, but is there
any chance that the program can crash because of those lines?

Of course. It's not likely with int, on most modern machines
(but there is at least one where it is a distinct possibility).
With double, it's possible on every Windows or Unix machine I
know.

If you're curious why I ask this, it's because in some code
I'm working through there is a structure with ints -- some of
which are never used nor initialized. However, this structure
(with all its ints) is getting written out to disk (and later
read back in). But at no time are the values of these
uninitialized ints used for logic in the code.

Because the code is writing out uninitialized values (and
later reading them in), is there a possibility that the
program can self-destruct (or corrupt anything) just because
those ints weren't initialized?

Formally, yes, and any good debugging system will complain. (I
know Purify does, because I've had to deal with the same
problem.) Why don't you just initialize the structs?

And how are you writing them out? If you're just copying the
bits of a struct to disk, then you have no guarantee of being
able to read the data in the future.
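
As a minimal sketch of the "just initialize the structs" suggestion: the Record type and its member names below are hypothetical stand-ins for the original poster's structure, which is not shown in the thread. Value-initialization zeroes every member, so nothing indeterminate is left to be written out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical record type; the real structure is not shown in the thread.
struct Record {
    int used;
    int reserved1;  // never used by the program logic
    int reserved2;  // never used by the program logic
};

// Returns a Record in which every member has a determinate value.
// "Record()" value-initializes the object; for a class with no
// user-declared constructor this zero-initializes all members.
Record make_record(int used_value)
{
    Record r = Record();
    r.used = used_value;
    return r;
}

// Inspects the object byte by byte through unsigned char and reports
// whether every byte is zero.
bool all_bytes_zero(const Record& r)
{
    unsigned char bytes[sizeof(Record)];
    std::memcpy(bytes, &r, sizeof(Record));
    for (std::size_t i = 0; i < sizeof(Record); ++i)
        if (bytes[i] != 0) return false;
    return true;
}

// Usage example: the unused members are zero, not garbage.
void example()
{
    Record r = make_record(7);
    assert(r.used == 7);
    assert(r.reserved1 == 0 && r.reserved2 == 0);
}
```

With this in place the unused ints have known values, so a checker such as Purify should have nothing to flag when the struct is written.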
 
Andrew Koenig

You're ok. The *value* of the variable i may well be undefined, but
i is nonetheless an int; and outputting an int - any int, whatever its
value - should never crash your program. Ditto double, etc.

Not true. It's not true in theory for int, and definitely not true in
practice for double -- because IEEE floating-point, which most modern
computers use, has a notion of "signaling not-a-number" values that cause a
run-time error condition if accessed.

So even in the simple case where the implementation puts random bits into
uninitialized doubles and does no other checking, it is entirely possible
that those random bits will happen to be a signaling NaN value, which will
cause a run-time exception.

Even for int, there is no reason why an implementation cannot keep track of
whether variables have been initialized, and stop the program if it tries to
access an uninitialized variable. I am not aware of any such
implementations in widespread use today, but they have existed in the past.
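
To make the "random bits may happen to be a signaling NaN" point concrete, here is a small sketch that classifies a 64-bit pattern using integer operations only, so no floating-point value is ever loaded. It assumes the IEEE 754 binary64 layout and the common convention (used on x86 and ARM, for example) that a set top mantissa bit marks a quiet NaN; some older architectures use the opposite convention. The names NanKind and classify_bits are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

enum NanKind { NotNaN, QuietNaN, SignalingNaN };

// Classifies a bit pattern as it would be interpreted if stored in an
// IEEE 754 binary64 double.
// ASSUMPTION: the most significant mantissa bit set means "quiet".
NanKind classify_bits(std::uint64_t bits)
{
    const std::uint64_t exponent_mask = 0x7FF0000000000000ULL;
    const std::uint64_t mantissa_mask = 0x000FFFFFFFFFFFFFULL;
    const std::uint64_t quiet_bit     = 0x0008000000000000ULL;

    // A NaN has all exponent bits set...
    if ((bits & exponent_mask) != exponent_mask) return NotNaN;
    // ...and a non-zero mantissa (a zero mantissa is an infinity).
    if ((bits & mantissa_mask) == 0) return NotNaN;
    return (bits & quiet_bit) ? QuietNaN : SignalingNaN;
}
```

Under these assumptions a pattern such as 0x7FF0000000000001 is a signaling NaN, so garbage left in an uninitialized double can land on one.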
 
Marcel Müller

Pete said:
My reading of IEEE-754 is that accessing a signaling NaN causes an
invalid operation exception by default, and the result of that exception
is just to return a quiet NaN. Unless the program has installed a trap
handler, this is completely innocuous.

In theory.
In practice, I had a bug in some debug output a few days ago that
accidentally read a double from a union that was initialized as a
pointer. In many cases nothing strange happened. But under some
conditions the whole operating system immediately froze, even when
running in the debugger. I don't know what exception the random bits
caused, but obviously it was strange enough that nobody tested it before.


Marcel
 
Marcel Müller

Victor said:
Generally speaking, any *use* of an uninitialised object has undefined
behaviour.

You should exclude the operations 'taking the address of' and 'creating
a reference to'. They are obviously allowed unless you dereference the
pointers, again except for an assignment.
Strictly speaking this applies to PODs only. However, since C++ objects
usually are valid after construction, they make less trouble unless your
constructor leaves uninitialized members. This should be up to very
basic libraries only. E.g. some std::vector implementations do not
initialize the elements in the range [size(),capacity()[.


Marcel
 
James Kanze

You should exclude the operations 'taking the address of' and
'creating a reference to'. They are obviously allowed unless
you dereference the pointers, again except for an assignment.

The actual language says that it is using the value of an
uninitialised object which has undefined behavior. Roughly
speaking (or perhaps exactly speaking), it is the lvalue to
rvalue conversion which triggers the undefined behavior. (You
can still assign to the object, for example.)
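
A short sketch of that distinction, with a made-up function name: taking the address, binding a reference, and assigning never read the object's value, so they are fine on an uninitialized int. Only reading the value before it has been set is the undefined case.

```cpp
#include <cassert>

// Returns 42 without ever reading an indeterminate value.
int assign_through_pointer()
{
    int i;          // uninitialized: its value is indeterminate
    int* p = &i;    // fine: taking the address does not read the value
    int& r = i;     // fine: binding a reference does not read the value
    *p = 42;        // fine: assignment writes, it does not read
    // A "return i;" placed before the assignment above would perform an
    // lvalue-to-rvalue conversion on an indeterminate value, which is
    // the undefined behavior discussed in this thread.
    return r;       // fine now: i was given a value above
}
```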

Strictly speaking this applies to PODs only.

What makes you say that? It applies to everything. (Of course,
a lot of non-POD's have user defined constructors, which take
care of the initialization. But the standard doesn't require
it.)

However, since C++ objects usually are valid after
construction, they make less trouble unless your constructor
leaves uninitialized members. This should be up to very basic
libraries only. E.g. some std::vector implementations do not
initialize the elements in the range [size(),capacity()[.

Some? The standard doesn't allow them to initialize those
elements (and it doesn't allow you to access them).
 
Marcel Müller

James said:
The actual language says that it is using the value of an
uninitialised object which has undefined behavior. Roughly
speaking (or perhaps exactly speaking), it is the lvalue to
rvalue conversion which triggers the undefined behavior. (You
can still assign to the object, for example.)

Hmm, plausible definition.

What makes you say that? It applies to everything. (Of course,
a lot of non-POD's have user defined constructors, which take
care of the initialization. But the standard doesn't require
it.)

And that is the point. The standard cannot ensure that a user defined
class constructor initializes all members. E.g. if a pointer member is
not initialized correctly, the assignment operator may fail to clean up
the old content. On the other hand, there might be good reasons not to
initialize all PODs of a class in the constructor. So it is always up to
the user to ensure the defined behavior. I would not call it a good
design, when invoking the assignment operator for a default constructed
object is undefined behavior. But the standard does not generally
require defined behavior in this case. In some GUI libraries it is e.g.
required to call some initialization function after the constructor is
called, because some abstract base classes cannot be fully initialized
without the virtual functions of the derived class. Before that
initialization call nearly any other function call is invalid.

libraries only. E.g. some std::vector implementations do not
initialize the elements in the range [size(),capacity()[.

Some? The standard doesn't allow them to initialize those
elements (and it doesn't allow you to access them).

Oh, I always thought that this is some kind of optimization, since
objects in a vector have to be default constructable anyway.


Marcel
 
James Kanze

James said:
libraries only. E.g. some std::vector implementations do not
initialize the elements in the range [size(),capacity()[.

Some? The standard doesn't allow them to initialize those
elements (and it doesn't allow you to access them).

Oh, I always thought that this is some kind of optimization,
since objects in a vector have to be default constructable
anyway.

Since when? All the standard requires is copy constructable and
assignable. (Which is a good thing, because I have some
instances of vectors of objects which are not default
constructable.)
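
A small sketch of both vector points, using a hypothetical Id class that has no default constructor. reserve() only obtains raw storage, so no Id objects exist beyond size(), and none could be default-constructed there anyway.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type with no default constructor; it is copy
// constructible and assignable, which is all the vector needs here.
class Id {
public:
    explicit Id(int v) : value_(v) {}
    int value() const { return value_; }
private:
    int value_;
};

// Reserves room for eight elements, then constructs only two.
void fill_ids(std::vector<Id>& ids)
{
    ids.reserve(8);         // capacity() >= 8, size() unchanged
    ids.push_back(Id(1));   // elements are created only by insertion
    ids.push_back(Id(2));
}
```

After calling fill_ids on an empty vector, size() is 2 while capacity() is at least 8; the slots in between hold no objects and must not be accessed.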
 
Old Wolf

Anything you do with its *value* is undefined behavior. (You
can still take its address, or assign to it.) With the
exception of unsigned char and char.

And how are you writing them out? If you're just copying the
bits of a struct to disk, then you have no guarantee of being
able to read the data in the future.

Nobody's mentioned this explicitly yet :
AFAIK, it is not undefined to copy the bits
to disk and read them back, even if the variable
is uninitialized. (Of course the value is
still indeterminate after the garbage is read
back from disk).
 
James Kanze

Nobody's mentioned this explicitly yet :
AFAIK, it is not undefined to copy the bits
to disk and read them back, even if the variable
is uninitialized. (Of course the value is
still indeterminate after the garbage is read
back from disk).

That's a good point. I think it depends on how you do it, but
in most cases, it should be OK. All the C++ standard speaks
about directly is istream::read and ostream::write (and the
streambuf functions they use); for istream, ostream and
streambuf, the interface uses char*, so you're OK. For
wistream, wostream and wstreambuf? A wchar_t can have a
trapping representation. As for fread and fwrite, inherited
from C, they are required to do input and output "as if" using
fgetc and fputc, which means byte access as well (even if they
take a void*), and thus defined behavior.

For anything else, it depends on the implementation, but in
practice, it's hard to imagine a system level read or write
doing anything but byte accesses.
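
As a rough illustration of the char* route described above, the sketch below copies a struct's bytes through ostream::write and istream::read. DiskRecord and round_trip are hypothetical names, and a std::stringstream stands in for the disk file so the example is self-contained.

```cpp
#include <cassert>
#include <sstream>

// Hypothetical on-disk record; the real structure is not shown in the thread.
struct DiskRecord {
    int used;
    int reserved;
};

// Writes the object's bytes to a stream and reads them back into a new
// object. Both calls go through char*, i.e. plain byte access.
DiskRecord round_trip(const DiskRecord& in)
{
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    buffer.write(reinterpret_cast<const char*>(&in), sizeof in);

    DiskRecord out = DiskRecord();  // value-initialized before the read
    buffer.read(reinterpret_cast<char*>(&out), sizeof out);
    return out;
}
```

This only shows that the byte copy itself is well defined. As noted earlier in the thread, a raw image of a struct carries no guarantee of being readable by a different compiler, platform, or later version of the program.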
 
