Self-reproducing program in Ken Thompson's Turing paper

lovecreatesbea...

Ken Thompson mentioned a self-reproducing program that produces an
exact copy of its source code as output in his 1983 Turing paper,
http://www.acm.org/classics/sep95/ . Can this be done in C? How can a
program at runtime know its compile-time source code?
 
Christopher Benson-Manica

Ken Thompson mentioned a self-reproducing program that produces an
exact copy of its source code as output in his 1983 Turing paper,
http://www.acm.org/classics/sep95/ . Can this be done in C? How can a
program at runtime know its compile-time source code?

Sure.

#include <stdlib.h>

int main( void ) {
    system( "links http://www.google.com" ); /* On DS9K, may open "link"
                                                to the Underdark, beware */
    return 0;
}

Surely you've posted here often enough to know that you'll get "STFW"
for an answer to this question? Certainly quines have been discussed
many times on this very newsgroup.
 
William Hughes

Ken Thompson mentioned a self-reproducing program that produces an
exact copy of its source code as output in his 1983 Turing paper,
http://www.acm.org/classics/sep95/ . Can this be done in C?

Gosh, I don't know. Maybe I should do a web search, or check the FAQ,
or try to run the strange-looking code that shows up in people's
signatures.

How can a
program at runtime know its compile-time source code?

In general it can't, but most programs are not self-reproducing.
However, given that the runtime behaviour is a function of the source
code, maybe it would be possible to choose the source code so that the
program, when run, prints out this source code. Maybe I should do a
web search, or check the FAQ, or try to run the strange-looking code
that shows up in people's signatures.
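
To make the hint about "choosing the source code" a little more concrete without handing over a finished quine, here is a minimal sketch of the self-substitution step most C quines rely on. The helper name self_expand and the '@' placeholder are assumptions made up for illustration; neither comes from this thread or from Thompson's paper. The template is plain data with a hole in it, and filling the hole with a quoted copy of the template yields text that contains the template twice: once as data and once as code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): copy `tmpl` into `out`,
 * replacing every '@' with a double-quoted copy of `tmpl` itself.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if `out` is too small.
 */
size_t self_expand(const char *tmpl, char *out, size_t cap)
{
    size_t len = strlen(tmpl);
    size_t used = 0;

    assert(out != NULL && cap > 0);
    for (const char *p = tmpl; *p != '\0'; ++p) {
        /* A placeholder costs the template plus two quote characters. */
        size_t need = (*p == '@') ? len + 2 : 1;
        if (used + need >= cap) {      /* keep one byte for the NUL */
            out[0] = '\0';
            return 0;
        }
        if (*p == '@') {
            out[used++] = '"';
            memcpy(out + used, tmpl, len);
            used += len;
            out[used++] = '"';
        } else {
            out[used++] = *p;
        }
    }
    out[used] = '\0';
    return used;
}
```

Called with the template `char *s = @; expand(s);`, this writes `char *s = "char *s = @; expand(s);"; expand(s);` into the buffer. A real quine goes one step further: the code around the string has to be exactly the code that performs the expansion, and quotes and newlines inside the string need escaping, which this sketch ignores.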
 
Lew Pitcher

Ken Thompson mentioned a self-reproducing program that produces an
exact copy of its source code as output in his 1983 Turing paper,
http://www.acm.org/classics/sep95/ . Can this be done in C? How can a
program at runtime know its compile-time source code?

Others have pointed you at various resources wrt writing C quines.
They have answered your basic question.

But, if that's all you got out of Ken T's paper, then you missed a
lot.

His paper wasn't so much about quines as stand-alone programs as about
the potential security exposures that quines (when incorporated
into C compilers) could have. He pointed out that a compiler could
embed a trap-door security breach into anything it compiles, and (if
the compiler was compiled with itself) that trap-door could be
hidden /in the program/ rather than in the source code.

The quine comes from the compiler compiling in the security breach
insertion code into the (2nd generation) compiler - a quine in object
code rather than a quine in source code.
 
Richard Bos

Lew Pitcher said:
But, if that's all you got out of Ken T's paper, then you missed a
lot.

His paper wasn't so much about quines as stand-alone programs as about
the potential security exposures that quines (when incorporated
into C compilers) could have. He pointed out that a compiler could
embed a trap-door security breach into anything it compiles, and (if
the compiler was compiled with itself) that trap-door could be
hidden /in the program/ rather than in the source code.

Mind you, that trap-door only works if you assume that the compiler will
only be used to compile itself. It doesn't work if you use it to compile
an unrelated compiler of another language, which is used to compile a
cross-compiler of a language using another paradigm, which is used to
compile - for a different CPU - an interpreter for yet another language,
which is used to interpret a re-cross-compiler of the original language.
Bonus points if any of the architectures involved is completely virtual;
more bonus points if any of the languages was designed specifically for
this purpose.

Richard
 
Army1987

Lew Pitcher said:
Others have pointed you at various resources wrt writing C quines.
They have answered your basic question.

But, if that's all you got out of Ken T's paper, then you missed a
lot.

His paper wasn't so much about quines as stand-alone programs as about
the potential security exposures that quines (when incorporated
into C compilers) could have. He pointed out that a compiler could
embed a trap-door security breach into anything it compiles, and (if
the compiler was compiled with itself) that trap-door could be
hidden /in the program/ rather than in the source code.

The quine comes from the compiler compiling in the security breach
insertion code into the (2nd generation) compiler - a quine in object
code rather than a quine in source code.

Huh?

--
#include <stdlib.h>
int main(int argc, char *argv[])
{
return system(argv[0]);
}
 
Tor Rustad

Richard said:
Mind you, that trap-door only works if you assume that the compiler will
only be used to compile itself.

Only??! It will work in more cases than that.

The serious part here is that system programs are typically written in
C, and C compilers are typically written in C.

If you develop a new C compiler, and use the infected compiler when
bootstrapping, the backdoor will be inherited. Also, if you create a C
cross compiler, the backdoor will be passed forward.

The payload can even be targeted at multiple platforms.
 
Tor Rustad

Ken Thompson mentioned a self-reproducing program that produces an
exact copy of its source code as output in his 1983 Turing paper,
http://www.acm.org/classics/sep95/ . Can this be done in C? How can a
program at runtime know its compile-time source code?

This is a brilliant paper; read it carefully. I have given this puzzle
to student and junior programmers multiple times.

If you really want to learn something, try hard to solve this puzzle in
C yourself. Seeing a solution gives you little.
 
Keith Thompson

Tor Rustad said:
Only??! It will work in more cases than that.

The serious part here is that system programs are typically written in
C, and C compilers are typically written in C.

If you develop a new C compiler, and use the infected compiler when
bootstrapping, the backdoor will be inherited. Also, if you create a C
cross compiler, the backdoor will be passed forward.

If I recall correctly (and it's entirely possible I don't), that's
unlikely. The trapdoor works by recognizing a certain piece of the
code being compiled, and generating special output in response to it.
An independently developed compiler is unlikely to contain anything
matching that pattern.
 
Tor Rustad

[Can't see my response, this is a repost & possibly a duplicate message]

Keith said:
>
> If I recall correctly (and it's entirely possible I don't), that's
> unlikely. The trapdoor works by recognizing a certain piece of the
> code being compiled, and generating special output in response to it.

Right. There are two parts, the "back door" and the "infecter". The back
door is targeted at some security-sensitive software, e.g. SSH, while
the "infecter" is targeted at the compiler, which will later be used to
compile e.g. SSH.

> An independently developed compiler is unlikely to contain anything
> matching that pattern.

Normally, compilers these days are not hand-coded; tools like lex & yacc
generate some of the C code of the compiler. These tools will leave a
signature. I have not done any prototyping of this, but it should not be
that hard to detect tokens like e.g. yyparse for the infecter.
 
