jacob navia
In C, we have read-only memory (const), read/write memory
(normal data), and write only memory.
Let's look at the third one in more detail.
Write only memory is a piece of RAM that can only
be written to, since its contents are undefined.
The program is allocating a new piece of data, and
the previous contents aren't relevant. This memory
is returned by malloc and friends, or reserved
automatically by the compiler, which makes the processor
grow the stack area on entry to the function.
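To make the heap case concrete, here is a minimal sketch. The helper
names make_zeroed_ints and make_filled_ints are hypothetical, not from
any library. Among malloc's friends, calloc is the one that returns
memory with all bits zero, while memory from malloc must be written
before it is read.

```c
#include <stdlib.h>

/* Hypothetical helper: n ints that are safe to read immediately,
   because calloc returns memory with all bits set to zero. */
int *make_zeroed_ints(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Hypothetical helper: memory from malloc is "write only" until
   every element has been stored to. */
int *make_filled_ints(size_t n, int value)
{
    int *p = malloc(n * sizeof *p);   /* contents indeterminate here */
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        p[i] = value;                 /* write before any read */
    return p;
}
```

The caller owns the returned block in both cases and releases it
with free.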
When you read from write only memory (you use an
uninitialized variable) the behavior is undefined,
i.e. it is declared a big mistake.
It is big because the consequences are random. When
a value is read from those memory locations, its contents
are random. If we have just rebooted, the OS may have
cleaned up, and this memory is not quite random: it is
mostly zero. If the machine has been running for a while,
however, the contents are probably whatever data was
written to that address before, by this program or, on
systems without memory protection, by another one.
The symptoms are very vague. Programs that ran fine at
first stop working, a program that has just run crashes:
symptoms that *could* be related to this program or not.
Confusing symptoms.
This confusion comes from the random nature of the
data being introduced into the program.
The great remedy, of course, is to have default values
for all local variables and set them at the start. This
doesn't eliminate all bugs, but at least it
eliminates those of reading from write only memory.
As a first approach:
#include <string.h>   /* memset */

Data *Function(Data *d, index i)
{
    Data workCopy;
    DataIndex di;
    /* etc... */
    memset(&workCopy, 0, sizeof workCopy);
    memset(&di, 0, sizeof di);
    /* etc... */
}
Shorter would be:
Data *Function(Data *d, index i)
{
    Data workCopy = {0};
    DataIndex di = {0};
    /* etc... */
}
This is already much better, but it is still a problem
if you forget one.
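A self-contained sketch of the = {0} form follows. The Data and
DataIndex definitions here are hypothetical stand-ins, since the
real types are not shown, and locals_start_zeroed is just an
illustrative name. One difference from memset is worth noting: with
= {0} the remaining members are initialized as if they had static
storage, so pointers become null pointers and doubles become 0.0
even on a platform where those are not all-bits-zero.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the Data and DataIndex types. */
typedef struct {
    int id;
    double weight;
    const char *name;
} Data;

typedef struct {
    size_t position;
    Data *current;
} DataIndex;

/* Returns 1 if every member of the zero-initialized locals
   reads back as zero or NULL, 0 otherwise. */
int locals_start_zeroed(void)
{
    Data workCopy = {0};   /* first member 0, the rest zero-initialized */
    DataIndex di = {0};

    return workCopy.id == 0
        && workCopy.weight == 0.0
        && workCopy.name == NULL
        && di.position == 0
        && di.current == NULL;
}
```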
Even better would be if we could just write:
_Pragma(Stdc,Zeroinit,Function)
meaning that in the given function all local
data should be zeroed before use at function
entry.
But that is still too long...
Couldn't we just decide that by default all locals
are zeroed at entry of the function?
Only when you write:
_Pragma(Stdc, Nozeroinit, function)
would the zeroing of memory be avoided.
I think that would be the best: not to write
anything at all. This would slow software a bit
(maybe), but for *many* applications running
on PCs today it would not cause any real
performance loss.
Zero is used as the default value in many situations.
Pointers couldn't destroy another data item by chance,
since even uninitialized pointers would be NULL.
How nice. This would also mean no change to
existing programs. They would just run a few
microseconds slower and nobody would care.
The data must be brought into the L1 cache anyway,
and zeroing locals ensures that they do not
provoke an L1 cache miss later within
the code of the function. A burst write
can probably be used if available. This reduces
the cost of the cache misses by taking them all at once.
Most functions' local space is small anyway.
The problems arising from reading write-only
memory would be restricted to hard traps in
the program, very easy to pinpoint to a
specific line of code.
In most implementations a NULL dereference
traps, and the error is pinpointed exactly
where it arises. With bogus values in a
pointer there is some chance that the pointer
destroys other data structures. With NULL
there is none. The integrity of the program itself
is not destroyed.
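Besides trapping, a zeroed pointer can also be tested explicitly,
which is impossible with a garbage value. A minimal sketch, where
the Series type and the series_first function are hypothetical
names used only for illustration:

```c
#include <stddef.h>

/* Hypothetical record type with a pointer member. */
typedef struct {
    const int *values;
    size_t count;
} Series;

/* Stores the first value in *out and returns 0, or returns -1
   and leaves *out untouched if the series was never set up. */
int series_first(const Series *s, int *out)
{
    if (s->values == NULL || s->count == 0)
        return -1;          /* a zeroed pointer is detectable */
    *out = s->values[0];
    return 0;
}
```

A Series declared as Series s = {0}; is rejected cleanly by
series_first, whereas an uninitialized one could pass any check
and read from an arbitrary address.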