Are destructors ever optimized away?

Jonathan Lee

Hello all,
I have a case where I need a destructor to be performed pretty much
exactly as coded, including zero-ing out all the data members. Since
the memory for the class is freed right away, I can imagine a compiler
optimizing this away. Is there any guarantee that things will happen
as I want them to? If not, is there anything I can do to be sure it
does happen?

An example of what I mean:

ARC4::~ARC4() { for (unsigned i = 0; i < 256; ++i) S[i] = 0; ii = 0; jj = 0; }

--Jonathan
 
Neelesh

Hello all,
  I have a case where I need a destructor to be performed pretty much
exactly as coded, including zero-ing out all the data members. Since
the memory for the class is freed right away, I can imagine a compiler
optimizing this away. Is there any guarantee that things will happen
as I want them to? If not, is there anything I can do to be sure it
does happen?

An example of what I mean:

    ARC4::~ARC4() { for (unsigned i = 0; i < 256; ++i) S[i] = 0; ii = 0; jj = 0; }


Since this destructor is explicitly defined, it won't be optimized
away in most cases. However, there are exceptions related to copying
and/or returning objects where such a destructor can be elided
(e.g. the return value optimization); see 12.8/15:

"Whenever a temporary class object is copied using a copy
constructor....an implementation is permitted to treat the original
and the copy as two different ways of referring to the same object and
not perform a copy at all, even if the class copy constructor or
destructor have side effects. For a function with a class return type,
if the expression in the return statement
is the name of a local object....an implementation is permitted to
omit creating the temporary object to hold the function return value,
even if the class copy constructor or destructor has side effects."

Of course, in those cases the destructor isn't called because the
extra copy is never created. So if your destructor doesn't do anything
other than "destruction", this should not be a problem. But if the
destructor also has side effects, like writing to a file, those might
not be performed when the destructor is optimized away in the cases
described above.
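
A tiny illustration of the elision Neelesh is describing (the class name
and the output here are made up for the example; whether the extra
destructor calls disappear depends on the compiler and its settings):

    #include <iostream>

    struct Tracer {
        ~Tracer() { std::cout << "~Tracer\n"; }  // side effect in the destructor
    };

    Tracer make() {
        Tracer t;
        return t;  // the copy (and its destructor) may be elided here (NRVO)
    }

    int main() {
        Tracer t = make();
        // With elision "~Tracer" prints once, at the end of main; without it,
        // the destructors of the intermediate copies would also print it.
    }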
 
Phlip

Jonathan said:
An example of what I mean:

ARC4::~ARC4() { for (unsigned i = 0; i < 256; ++i) S[i] = 0; ii = 0; jj = 0; }


Note the destructor is (apparently) out-of-line, outside its class, with
external linkage. Would that affect how the compiler might optimize it away?

Further, if the function called memset(), would that have better odds of staying
around? Or how about a user-defined method that sets its memory to 0? (Because
memset() is Standard and hence might have some back-door for optimizing _it_ away.)
 
robertwessel2

Hello all,
  I have a case where I need a destructor to be performed pretty much
exactly as coded, including zero-ing out all the data members. Since
the memory for the class is freed right away, I can imagine a compiler
optimizing this away. Is there any guarantee that things will happen
as I want them to? If not, is there anything I can do to be sure it
does happen?

An example of what I mean:

    ARC4::~ARC4() { for (unsigned i = 0; i < 256; ++i) S[i] = 0; ii = 0; jj = 0; }



Basically the as-if rule gives the compiler license to optimize the
destructor away if the program could not tell whether it ran or not,
or to optimize parts of it away if the application couldn't tell that
those parts were omitted. So in a typical case, say the RC4 S array
was allocated as an automatic and the destructor was invoked at the
end of the routine, there's nothing preventing the compiler from
removing it.

Having the destructor defined out of line and in a different module
won't prevent that either - many systems (can) do link-time code
generation, where a method defined in another module can be inlined
where it's called (and then optimized away).

Not even declaring it volatile is guaranteed to do that, since
volatile is pretty loosely defined. In particular, if the compiler
notices that nothing else *could* be seeing that storage (say, an
automatic variable whose address you've never taken), the as-if rule
applies again. I suspect that almost all compilers will honor
volatile even in that case, but it's probably not required.
Volatile often has a significant performance hit, although the
basically random access pattern in the RC4 S array might minimize
that. Writing your own secure_memset() function *is* a common
approach, having it internally cast the passed pointer to a pointer
to a volatile type. That would likely work on almost all real systems,
and would limit the volatile performance hit to the clear itself.
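
A rough sketch of that secure_memset() idea (the name is just
illustrative; the standard still doesn't strictly guarantee this
survives every optimizer, but the volatile stores make removal much
less likely):

    #include <cstddef>

    // Clear a buffer through a pointer-to-volatile so the stores are treated
    // as observable behavior and are much less likely to be optimized away.
    void secure_memset(void* p, unsigned char value, std::size_t n) {
        volatile unsigned char* vp = static_cast<volatile unsigned char*>(p);
        while (n--)
            *vp++ = value;
    }

The destructor could then call, e.g., secure_memset(S, 0, sizeof S);
so only the clear itself pays the volatile penalty.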

Probably the only thing you could do that would work in all cases is
to write the clear routine in a separate module, in such a way that
even link-time code generation could not see into it - perhaps an
assembler routine, or an operating system routine that could not be
optimized away. For example, Windows has an API called
SecureZeroMemory() for just that purpose. Although obviously none of
those options are portable.
 
joshuamaurice

Not even declaring it volatile is guaranteed to do that, since
volatile is pretty loosely defined, particularly if the compiler
notices that nothing else *could* be seeing that storage (say an
automatic variable that you've never taken the address of), then the
as-if rule applies again.

While partially true, I must disagree with parts. I do agree that
volatile is loosely defined. Reads and writes to volatile primitives
are defined as an observable side effect of the abstract machine. The
general description given is that the compiler shall not optimize them
away, *even if* it can prove no part of the program can otherwise
access them, thus including a stack variable.

However, the definition of an observable side effect of the abstract
machine is pretty open to interpretation. Is that an actual
instruction issued on the chip to load or store which may go to a
cache? Must it go to main memory? Is there even memory?

From what I've read, volatile in C and C++ as defined by the C and C++
standards has 3 valid uses (none of which include threading in the
slightest):
1- Memory-mapped IO.
2- setjmp/longjmp hackery, to make sure reads and writes aren't
optimized away across the jump boundary.
3- signal handlers, to make sure reads and writes aren't optimized
away between the normal code and the signal handler.

I'd suggest the OP more clearly explain what he's trying to accomplish
with this destructor, and why he's so worried about the compiler
optimizing it away in the case when the object is about to die.
Chances are, I'd guess he's trying to do something non-standard, like
using volatile as a threading primitive (which is totally broken on a
general platform).
 
robertwessel2

While partially true, I must disagree with parts. I do agree that
volatile is loosely defined. Reads and writes to volatile primitives
are defined as an observable side effect of the abstract machine. The
general description given is that the compiler shall not optimize them
away, *even if* it can prove no part of the program can otherwise
access them, thus including a stack variable.

However, the definition of an observable side effect of the abstract
machine is pretty open to interpretation. Is that an actual
instruction issued on the chip to load or store which may go to a
cache? Must it go to main memory? Is there even memory?

From what I've read, volatile in C and C++ as defined by the C and C++
standards has 3 valid uses (none of which include threading in the
slightest):
1- Memory-mapped IO.
2- setjmp/longjmp hackery, to make sure reads and writes aren't
optimized away across the jump boundary.
3- signal handlers, to make sure reads and writes aren't optimized
away between the normal code and the signal handler.


I more-or-less agree, but there is no defined precedence between the
as-if rule and the definition of volatile, and given that the as-if
rule is commonly interpreted to apply to everything, I think it's not
unreasonable to assume that the required effects of volatile can be
omitted if the compiler can prove that no other part of the system
*can* cause it to be "modified in ways unknown to the implementation,
or have other unknown side effect." An implementation could quite
reasonably insist that automatics are implemented in "ordinary" memory
(IOW, without special side effects), and if the address of that
automatic is never taken, it cannot reasonably be modified outside of
the routine that defines it in any meaningful way (and we can make
that even stronger if the compiler is smart enough to trace the use of
pointers taken to such objects, and eventually determines that nothing
allows those pointers to "escape").

But it's not really a big issue, since except in a (very) few special
cases, the conditions where I think a compiler could remove a volatile
reference under the as-if rule are all situations where there's
basically no way to use such a thing usefully in your program in the
first place.

I'd suggest the OP more clearly explain what he's trying to accomplish
with this destructor, and why he's so worried about the compiler
optimizing it away in the case when the object is about to die.
Chances are, I'd guess he's trying to do something non-standard, like
using volatile as a threading primitive (which is totally broken on a
general platform).


What he's trying to do is securely clear a crucial part of the RC4
encryption function's state after being done using it. So another
program could not come along and look into memory and perhaps be able
to break the encrypted message by reading that state. A similar
problem exists with programs that have to store passwords in memory
while processing them. Using a simple memset() to clear that very
often does not work because a memset() of an array that's not further
used can often be optimized away. A program like a debugger, or some
injected code that might look at the stack area at an (in)opportune
time might be able to pick that up.
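
A minimal illustration of the pattern being described (whether a
particular compiler actually drops the memset depends on its
optimizer, but it is allowed to):

    #include <cstring>

    void check_password() {
        char pw[64];
        // ... read and verify the password ...
        std::memset(pw, 0, sizeof pw);  // dead store: pw is never read again,
                                        // so the compiler may remove this call
    }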
 
James Kanze

[...]
I more-or-less agree, but there is no defined precedence
between the as-if rule and the definition of volatile, and
that the as-if rule is commonly interpreted to apply to
everything, I think it's not unreasonable to assume that the
required effects of volatile can be omitted if the compiler
can prove that no other part of the system *can* cause it to
be "modified in ways unknown to the implementation, or have
other unknown side effect."

The as-if rule applies if and only if the change does not affect
"observable behavior". Accesses to volatile objects are
observable behavior, exactly in the same way a read or write to
a file is observable behavior. A compiler cannot "optimize" it
out any more than it can optimize out reads and writes to files.

The problem in most cases is that "what constitutes an access"
is implementation defined. Which normally means that the
implementation is required to document it (but I've never seen
such documentation). From actual examination of generated code,
I've concluded that the definition used by g++, Sun CC and VC++
(at least in the versions I have access to) is that simply issuing
the load or store instruction constitutes the access. On most
modern machines, this doesn't mean that the value will actually
end up in physical memory---it does mean, however, that within
the executing thread, the old values will no longer be
available.

[...]
What he's trying to do is securely clear a crucial part of the
RC4 encryption function's state after being done using it.

If so, then overwriting the memory with 0's isn't going to
change much.
So another program could not come along and look into memory
and perhaps be able to break the encrypted message by reading
that state. A similar problem exists with programs that have
to store passwords in memory while processing them. Using a
simple memset() to clear that very often does not work because
a memset() of an array that's not further used can often be
optimized away. A program like a debugger, or some injected
code that might look at the stack area at an (in)opportune
time might be able to pick that up.

The real problem is that an older image of the memory exists in
virtual memory, on disk. And depending on the way virtual
memory is managed, the memory he's overwritten may never be
paged out (rewritten to disk), or it may be paged to a different
sector, leaving the earlier data on disk. If a program is
concerned about such issues, the first thing it has to do is to
lock the page with the buffer in RAM, so that it will never be
paged out. Having done that, memsetting the memory (perhaps by
calling a function in a different translation unit) is probably
sufficient, especially if you turn off optimization on the
translation unit(s) which does it.
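
A rough sketch of that approach on a POSIX system (error handling
omitted; VirtualLock would be the Windows equivalent, and the class
and member names here are made up; as noted above, the clear itself
may still want to go through a volatile pointer or a separate
translation unit):

    #include <cstring>
    #include <sys/mman.h>   // POSIX mlock()/munlock()

    struct KeyBuffer {
        unsigned char key[256];

        KeyBuffer()  { mlock(key, sizeof key); }    // keep the page resident in RAM
        ~KeyBuffer() {
            std::memset(key, 0, sizeof key);        // clear before releasing
            munlock(key, sizeof key);
        }
    };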
 
Jonathan Lee

First off, hello all and thanks for the information!

For those that asked: yup, I'm trying to securely clear memory used by
a cipher (several block and stream ciphers, RC4 as an example above).
It's not production code -- just something for me to fiddle around
with. Still, I'd like it to be as robust as possible.

I only had a quick chance to read through the e-mails, but I figure my
course of action will be:

1. make the copy constructor and assignment operator private to avoid
the situation Neelesh pointed out (should be done anyways).
2. clear the memory through a pointer-to-volatile so that the writes
"ought" to occur, and I won't have to suffer the performance penalty
the volatile will cause in general (a rough sketch of 1 and 2 is below).
3. Use placement new to ensure the memory is allocated as James
described.
4. Disallow local variables/construction via the Named Constructor Idiom.
I'll have to look into seeing if I can force the placement new.
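
A rough sketch of points 1 and 2 applied to the ARC4 class from the
first post (the member types are guesses based on the example
destructor):

    class ARC4 {
    public:
        ARC4();
        ~ARC4();
    private:
        ARC4(const ARC4&);             // 1. declared private and not implemented:
        ARC4& operator=(const ARC4&);  //    no copies, so no elided copy/destroy pairs
        unsigned char S[256];
        unsigned ii, jj;
    };

    ARC4::~ARC4() {
        // 2. clear through a pointer-to-volatile so the stores count as
        //    observable behavior and shouldn't be optimized away
        volatile unsigned char* p = S;
        for (unsigned i = 0; i < 256; ++i) p[i] = 0;
        ii = 0;  // (ii and jj could get the same volatile treatment if they matter)
        jj = 0;
    }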

Hopefully I didn't miss anything in this quick response. Any further
comments still welcome. And thanks again to all

--Jonathan
 
Phlip

robertwessel2 said:
What he's trying to do is securely clear a crucial part of the RC4
encryption function's state after being done using it. So another
program could not come along and look into memory and perhaps be able
to break the encrypted message by reading that state.

I thought secure OSes zilched memory with a blitter in between

Note that more than the encryptor must be secure - the data it encrypted also
lives somewhere in memory. I thought secure OSes emitted a hardware fault if
protected-mode programs tried to reach into each other's address space and read
these data...
 
Phlip

blargg said:
5. Write some tests which scan memory to ensure you really are cleaning up
sensitive data. In my opinion, if you don't have a test, you must assume
that you aren't clearing all sensitive items. You can't leave this up to
chance.

I tend to run tests in Debug mode. (Meaning tests that drive the design, not
that seek the raw envelope of the language.) Remember to run such memory-
scanning tests with Debug mode off and all production-mode optimizations
turned on!
 
