There's some confusion here. If we know that the given executable file
is the output of a particular compiler whose output is a deterministic
function of its input, then certainly a brute-force search will find, in
finite time, some source file which when compiled by that compiler
yields the desired output.
The more general problem, which is described in the papers cited,
appears to suppose that we are given an executable file and a compiler
for some high-level language, but the executable was not necessarily
produced by the compiler. We are asked to find a source file which,
when compiled, produces an executable which behaves equivalently to the
given one (let us say, it produces the same output as the given
executable for all possible inputs). Of course, the new executable need
not be byte-by-byte identical to the original one; the compiler may not
be capable of producing such a thing. I presume there are also some
restrictions needed on the language, so that the source really is a
high-level description of the program and not just a disassembly or
something. (Certainly we can disassemble the program, but we won't know
that the instructions in the disassembly are actually the instructions
that the machine executes. For instance, it may jump into the middle of
one of the "instructions" in the proposed disassembly.)
I don't doubt that the latter problem is reducible to the halting
problem. The former certainly is not, and it more accurately describes
the OP's question. Not that it's a practical solution, of course.
I think that this approach assumes that the compiler vendor is writing
the decompiler.
For instance, a compiler vendor can substitute inline assembly for a
function call and do other operations that are not actually attainable
from the C language itself. It is entirely possible that some of the
inline source code cannot be reversed to C (we can easily imagine
system specific things like file operations).
Compilers have bugs. I found a bug generated in the assembly of the
Intel compiler, for instance. Now, what is the decompiler going to do
with this incorrect assembly language when trying to turn it back into
C?
I think that in general, the problem is a lot more difficult than most
people think. Not to say that useful decompilers cannot be created.
Just that these decompilers will exhibit lots and lots of problems,
especially when you try do do something complex with them.
I have also found that even the much more simplistic job of converting
an executable into its assembly language can sometimes go awry, even
with the best available disassemblers. If this task can be
problematic, I guess that the far more difficult job of translating
that assembly into code that could have made it will be a lot more
difficult.
I have tried a few executable decompilers. My bottom line assessment
is that none of them work (their net value is close to zero). Now,
pcode decompilers work great. DotNet code or Java can be turned back
into reliable source straight away. The new source code won't look
much like the original, but it can be compiled again most of the time
and the new binary will pretty much behave like the old one.
IMO-YMMV.
[snip]