C
cr88192
CJ said:
Hello:
We know that C programs are often vulnerable to buffer overflows which
overwrite the stack.
But my question is: Why does C insist on storing local variables on the
stack in the first place?
as per the standard, it does not.
as per implementations:
it matches established practice and calling conventions (i.e., in 32-bit land,
code from one compiler very often links acceptably with code from another);
in the general case, this allows the greatest performance (the stack pointer
is usually a dedicated register pointing into a sliding region of memory, and
can thus be adjusted very quickly).
I can see two definite disadvantages with this:
1) deeply nested recursive calls to a function (especially if it defines
large local arrays) can easily overflow the stack
2) the problems described above of security vulnerabilities.
1. whatever is done, sufficiently deep recursion will break something (be it
a stack overflow or running out of heap).
2. security vulnerabilities will still exist, though they will be mildly
reduced.
My solution would be for C instead to store its local variables on the
heap - effectively separating data from executable code.
What do people think?
simply to partly address the security issues, a compiler could conceivably
treat arrays specially, namely by moving them off the main stack, and
possibly implementing bounds checking (for common cases). if done well, this
could potentially be done with only a minor performance impact. note that
ordinary local variables would likely remain on the stack.
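
as a rough sketch of what such a lowering might look like (written by hand here, all names are hypothetical, and the heap stands in for whatever separate region a compiler would actually use): the local array gets a descriptor with a length, stores are checked against it, and the scalar locals stay on the ordinary stack.

```c
#include <stddef.h>
#include <stdlib.h>

/* hypothetical descriptor a compiler might emit for a local array */
typedef struct {
    int   *data;  /* storage lives off the main stack (here: the heap) */
    size_t len;   /* element count, kept around for bounds checks */
} checked_array;

/* returns 0 on success, -1 on allocation failure */
static int checked_array_init(checked_array *a, size_t len)
{
    a->data = calloc(len, sizeof *a->data);
    a->len  = a->data ? len : 0;
    return a->data ? 0 : -1;
}

static void checked_array_free(checked_array *a)
{
    free(a->data);
    a->data = NULL;
    a->len  = 0;
}

/* bounds-checked store: 0 on success, -1 if idx is out of range */
static int checked_array_set(checked_array *a, size_t idx, int value)
{
    if (idx >= a->len)
        return -1;
    a->data[idx] = value;
    return 0;
}

/* roughly what "int buf[8]; for (i = 0; i < n; i++) buf[i] = i * i;"
 * could be lowered to: buf is off-stack, while i and sum remain
 * ordinary stack/register locals. returns -1 on any failure. */
static int sum_squares(size_t n)
{
    checked_array buf;
    int sum = 0;

    if (checked_array_init(&buf, 8) != 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (checked_array_set(&buf, i, (int)(i * i)) != 0) {
            checked_array_free(&buf);
            return -1;  /* would-be overflow caught before any write */
        }
    }
    for (size_t i = 0; i < n; i++)
        sum += buf.data[i];
    checked_array_free(&buf);
    return sum;
}
```

with this, an out-of-range write (say, sum_squares(9)) fails cleanly instead of scribbling over a return address. the cost is an allocation per call plus a compare per store, which a real compiler would presumably try to hoist or eliminate for the common cases.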
if one does move the locals off the stack (actually, I had considered partly
doing this eventually for the sake of implementing lexical closures), then
one could go "all the way", essentially ending nearly all use of the main
stack (apart from possibly temporary values or similar), which would allow
implementation of many features, such as closures, call/cc, more effective
use of tail-elimination, ...
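
to illustrate why off-stack locals make closures workable, here is a hand-written sketch (not output from any actual compiler; counter_frame, make_counter, and friends are made-up names): the "frame" holding the locals is allocated on the heap, so it outlives the call that created it and a function pointer paired with it behaves like a closure.

```c
#include <stdlib.h>

/* hypothetical heap-allocated activation record for make_counter() */
typedef struct counter_frame {
    int count;  /* captured local */
    int step;   /* captured parameter */
} counter_frame;

/* a closure is just code plus the frame it captured */
typedef struct {
    int (*fn)(counter_frame *env);
    counter_frame *env;
} counter_closure;

/* body of the inner function: reads and updates the captured locals */
static int counter_next(counter_frame *env)
{
    env->count += env->step;
    return env->count;
}

/* the locals live on the heap, so they survive this function's return;
 * env is NULL if the allocation failed */
static counter_closure make_counter(int start, int step)
{
    counter_closure c = { counter_next, NULL };
    counter_frame *f = malloc(sizeof *f);

    if (f) {
        f->count = start;
        f->step  = step;
    }
    c.env = f;
    return c;
}

static int closure_call(counter_closure *c)
{
    return c->fn(c->env);
}

static void closure_free(counter_closure *c)
{
    free(c->env);
    c->env = NULL;
}
```

had count and step been ordinary stack locals, the pointer handed back by make_counter would dangle as soon as it returned. the explicit closure_free here stands in for whatever the runtime would do (GC, reference counting, ...), which is also where the tight coupling to a runtime comes from.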
the big drawback is that, for a language like C, this would incur a notable
performance cost (and, very likely, tightly couple the compiled code to the
runtime, making compilation of stand-alone code very problematic).
however, for such a compiler, one "could" possibly make use of a hybrid
approach, using good old stack-frames wherever it can be "proven" that it is
safe to do so (functions that are leaves and don't use any advanced features,
or that can be verified not to use any such features and to only call
functions with this same property).
basically, we allow both performance and call/cc, by proving that call/cc,
closures, or anything like them, occur nowhere within the possible call
graph from this point downward (could be very difficult in practice, given
traceability issues, possible use of function pointers, ...).
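
a minimal sketch of that kind of check, assuming the compiler already has a whole-program call graph available (the call_graph layout, MAX_FUNCS, and classify_frames are all invented for illustration): a function is marked as needing heap frames if it uses an advanced feature itself, makes a call through a function pointer (treated conservatively, since the target is unknown), or calls anything already marked. the marking is repeated until nothing changes.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_FUNCS 16  /* arbitrary limit for this sketch */

typedef struct {
    size_t count;                        /* number of functions in use */
    bool calls[MAX_FUNCS][MAX_FUNCS];    /* calls[i][j]: i directly calls j */
    bool uses_advanced[MAX_FUNCS];       /* closures, call/cc, ... */
    bool has_indirect_call[MAX_FUNCS];   /* calls via function pointer */
} call_graph;

/* fills needs_heap[i] for every function i < g->count;
 * anything left false can safely keep a plain stack frame */
static void classify_frames(const call_graph *g, bool needs_heap[MAX_FUNCS])
{
    bool changed = true;

    /* seed: functions that are unsafe on their own */
    for (size_t i = 0; i < g->count; i++)
        needs_heap[i] = g->uses_advanced[i] || g->has_indirect_call[i];

    /* propagate upward from callees to callers until a fixed point */
    while (changed) {
        changed = false;
        for (size_t i = 0; i < g->count; i++) {
            if (needs_heap[i])
                continue;
            for (size_t j = 0; j < g->count; j++) {
                if (g->calls[i][j] && needs_heap[j]) {
                    needs_heap[i] = true;
                    changed = true;
                    break;
                }
            }
        }
    }
}
```

the conservative handling of indirect calls is what makes this expensive in practice: one function pointer call near the top of the program can force heap frames on everything above it, so a real implementation would likely need some points-to analysis to get much benefit.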
so, in short, this would be a very expensive feature (but still something I
may pursue at some point, noting that my compiler is primarily JIT-based so
this is acceptable, but likely not so in a more traditional stand-alone
compiler...).
or such...