C decompiler/disassembler?

DMn2004404

I recently downloaded a program (*.EXE) written in C, and I discovered
it had a few bugs. The original author of the program is apparently
unreachable, so I figure it falls on me to try and crack into the
program and fix the bugs. Does anyone know of a good C decompiler or
disassembler I can use to do this, that will read an EXE program as
input and generate its source code?

Brandon Taylor
 
Keith Thompson

DMn2004404 said:
I recently downloaded a program (*.EXE) written in C, and I discovered
it had a few bugs. The original author of the program is apparently
unreachable, so I figure it falls on me to try and crack into the
program and fix the bugs. Does anyone know of a good C decompiler or
disassembler I can use to do this, that will read an EXE program as
input and generate its source code?

It's theoretically possible, given an executable file, to generate
C source code that will at least have the same behavior, if not
re-generate an identical executable if compiled. Do not expect
any such generated C source code to be readable or maintainable.

It's not theoretically possible to regenerate the original C
source file from an executable. Too much information, including
most identifiers, is discarded during compilation. It's like
unscrambling an egg.
 
Uno

Keith said:
It's theoretically possible, given an executable file, to generate
C source code that will at least have the same behavior, if not
re-generate an identical executable if compiled. Do not expect
any such generated C source code to be readable or maintainable.

Let me see if I can narrow the question to be topical and practical.
[my browser only sometimes lets me paste as quotation; how annoying:]
#include <stdio.h>
#include <math.h>

// C will be caller

int main(void)
{
    double d, pi;

    d = atan(1.0);
    pi = (4.0)*d;
    printf("pi is %f\n", pi);

    return 0;
}

// gcc -std=c99 -Wall -Wextra c_pi1.c -o out

If you were to pick apart the executable resulting from *this* program,
with access to the same compiler and running on the same platform, how
would you go about it?
It's not theoretically possible to regenerate the original C
source file from an executable. Too much information, including
most identifiers, is discarded during compilation. It's like
unscrambling an egg.

and there's an infinite number of programs that can produce the same
executable.
 
Billy Mays

program and fix the bugs. Does anyone know of a good C decompiler or
disassembler I can use to do this, that will read an EXE program as
input and generate its source code?
[...]
It's not theoretically possible to regenerate the original C
source file from an executable. Too much information, including
most identifiers, is discarded during compilation. It's like
unscrambling an egg.

I believe the expression is something like "you can't turn hamburger
back into a cow." :)

http://www.itee.uq.edu.au/~cristina/dcc.html

From the docs: "Dcc has a fundamental implementation flaw that limits
it to about 30KB of input binary program, i.e. it currently handles toy
programs only! The problem is that pointers are kept in many places;
many of these pointers point to elements of arrays. The arrays are all
of variable size; the realloc system call can and will change the
virtual addresses of these arrays, thus invalidating the pointers.
Because of this, results are unpredictable as soon as one array is resized."
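
The failure mode described in that quote is a general C pitfall, not something specific to dcc. Below is a minimal sketch of the usual workaround: keep indices into a growable array rather than raw pointers, because an index stays valid after realloc moves the storage. The names IntVec, intvec_push, intvec_get and intvec_free are hypothetical and chosen for illustration; this is not code from dcc.

```c
#include <stdlib.h>

/* A growable array of ints. Callers keep indices, never element pointers. */
typedef struct {
    int *items;
    size_t len;
    size_t cap;
} IntVec;

/* Append a value. Returns its index, or (size_t)-1 if allocation fails. */
size_t intvec_push(IntVec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->items, new_cap * sizeof *p);
        if (p == NULL)
            return (size_t)-1;
        /* realloc may have moved the block: any pointer previously
           taken into v->items must be treated as invalid from here on. */
        v->items = p;
        v->cap = new_cap;
    }
    v->items[v->len] = value;
    return v->len++;
}

/* Look an element up by index; the index survives any number of resizes. */
int intvec_get(const IntVec *v, size_t index)
{
    return v->items[index];
}

/* Release the storage and reset the vector to its empty state. */
void intvec_free(IntVec *v)
{
    free(v->items);
    v->items = NULL;
    v->len = 0;
    v->cap = 0;
}
```

Start from a zero-initialized IntVec (for example `IntVec v = {0};`), remember the index returned by intvec_push, and read the element back with intvec_get however many times the array has grown in between.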
 
Tim Harig

That operation has been broken (in the sense of "breaking the code"),
given the existence of DNA, and assuming you're not talking about
*cooked* hamburger (which, as I understand it from various "CSI"
TV shows, destroys the DNA). Of course, it's still more economical
to raise another cow rather than attempting to clone one.

You also do not get the same cow that you had before since the new cow will
not have the memories and experiences that the old cow did. You will
likely get something that is quite similar with similar tendencies; but,
some things are also likely to be different.
Executables do not contain anywhere near the hints provided by DNA.
You may or may not lose:
- auto variable names (if no debug symbols)
- types (if no debug symbols). If ints and longs are 4 bytes
each, you can't tell which was intended, which matters when you port
the program to a system where that isn't true. You also
can't tell for sure whether loading 2 bytes of variable
is due to the variable being two bytes, or it's an
optimization of variable&65535.
- structures. It's difficult to tell whether two variables
close to each other are part of the same structure or just
different variables
- function names (if no symbols)
- Macros. Calls to things like getc() may expand to weird stuff.
Also, the constant 1 might be SEEK_CUR, EXIT_FAILURE, SIGHUP,
SIG_IGN, or something else.
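
As a small illustration of the "two bytes versus variable&65535" point in the list above, here is a hedged sketch with two hypothetical functions (the names low16_by_type and low16_by_mask are made up for this example). They express different intent in the source but always return the same value, so a compiler is free to emit the same instructions for both, and the binary then carries no record of which one the author wrote.

```c
#include <stdint.h>

/* Intent in the source: "this quantity is a 16-bit variable". */
uint16_t low16_by_type(uint32_t v)
{
    uint16_t narrow = (uint16_t)v;   /* conversion keeps the low 16 bits */
    return narrow;
}

/* Intent in the source: "take a 32-bit value and mask off the top half". */
uint16_t low16_by_mask(uint32_t v)
{
    return (uint16_t)(v & 65535u);   /* explicit mask, same low 16 bits */
}
```

Whether a particular compiler really produces identical code for the two depends on the compiler and optimization level; comparing the assembly output (for example from gcc -S) is the way to check on a given system.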

Indeed, there is no such thing as a decompiler; but, there are
disassemblers, used for reverse engineering, that can provide guesses at
information that is missing in the binary. Depending on how optimized the
program is, they can sometimes identify things like loop constructs and
notice data that is always used and manipulated together. The generated
assembly code includes the detected meta-information in its comments, and
the tool automatically generates placeholder variable names for whatever
types it is able to detect, which you can search and replace as you figure
out their functionality. They still are not going to get you anywhere near the
original C source; but, the information that they can provide makes it
much easier to figure out how and what a program is doing.
 
Tim Harig

I recently downloaded a program (*.EXE) written in C, and I discovered
it had a few bugs. The original author of the program is apparently
unreachable, so I figure it falls on me to try and crack into the
program and fix the bugs. Does anyone know of a good C decompiler or
disassembler I can use to do this, that will read an EXE program as
input and generate its source code?

1. If you know how to do what the program does, it will be *much* easier
simply to write it from scratch, as you are unlikely to be able to
get the original source back through disassembly.

2. If you don't know how the program works and you cannot find any
documentation that allows you to implement the functionality
yourself, then as a last resort, you can try to reverse
engineer the program. There are disassemblers that can try to
guess about some of the structure and data of the program. (As well
as packet sniffers, hex data viewers, system call tracers, etc., that
help you infer what the program is doing without having to decode
the actual source.) These tools fall under the domain of reverse
engineering and you will be better off searching for reverse
engineering tools and asking in reverse engineering groups
directly. You might find some information in security groups as
finding security vulnerabilities often entails reverse
engineering.
 
Nick Keighley

nothing will generate "its source code". Compilation is information
lossy. You may be able to generate "some corresponding source code",
hopefully of a readable nature.
If you were to pick apart the executable resulting from *this* program,
with access to the same compiler and running on the same platform, how
would you go about it?

I fail to see your point. Do you mean "if I were the writer of a
decompiler how would I pick apart the executable"? Are you expecting a
lot of optimisation or something?

What would be wrong with

#include <stdio.h>

int main (void)
{
    printf ("pi is %f\n", 3.14159265);
    return 0;
}

and there's an infinite number of programs that can produce the same
executable.

so what?


--

"High Integrity Software: The SPARK Approach to Safety and Security"
Customers interested in this title may also be interested in:
"Windows XP Home"
(Amazon)
 
Nick Keighley

That operation has been broken (in the sense of "breaking the code"),
given the existence of DNA,

you might be surprised at the limitations of cloning...

[...] Of course, it's still more economical
to raise another cow rather than attempting to clone one.

Executables do not contain anywhere near the hints provided by DNA.

I think you might be surprised how little there is in the DNA!

Consider CC the Cloned Cat. He was a tortoise-shell but was marked
completely differently from his clone donor. This is because coat
markings are epigenetic.
 
jchl

Nick Keighley said:
there are things that claim to be decompilers

http://boomerang.sourceforge.net/

Or how about HexRays, one of the 'better' x86 decompilers out there
(ridiculously expensive though). But manual disassembly will always produce
better results.

Anyway, depending on how much debug info is left intact inside the
executable, one could reproduce the original source down to the correct line
number of the source file.

Though typically, the best you could do is just recreate a 'functional
equivalent' of the original.
 
Sachin Magdum

Or how about HexRays, one of the 'better' x86 decompilers out there
(ridiculously expensive though). But manual disassembly will always produce
better results.

Anyway, depending on how much debug info is left intact inside the
executable, one could reproduce the original source down to the correct line
number of the source file.

Though typically, the best you could do is just recreate a 'functional
equivalent' of the original.

Question: Is your .EXE compiled with the debugging option on? Does anyone
know how to check that? In theory it should be easy to extract code (maybe
not as is, but at least ...) out of an .EXE program compiled with debugging
on.
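
One rough way to check, sketched here under stated assumptions: a Windows .EXE in PE format has a COFF file header whose NumberOfSymbols field and IMAGE_FILE_DEBUG_STRIPPED characteristics flag (0x0200) give a hint about whether symbol information was left in the image. The function name pe_may_have_symbols and its return convention are hypothetical, and the result is only a heuristic: a nonzero answer does not prove that usable debug information is present, and debug data kept in a separate file is not detected at all.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian integers from a byte buffer. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * buf/len: the first bytes of the file (a few hundred bytes are enough
 * for typical images).
 * Returns -1 if this does not look like a PE image,
 *          0 if the header says symbols/debug info were stripped,
 *          1 if the header suggests some symbol info may remain.
 */
int pe_may_have_symbols(const unsigned char *buf, size_t len)
{
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;                        /* no DOS "MZ" header */

    uint32_t pe = rd32(buf + 0x3C);       /* e_lfanew: offset of "PE\0\0" */
    if (pe > len || len - pe < 24)
        return -1;                        /* signature + COFF header don't fit */
    if (memcmp(buf + pe, "PE\0\0", 4) != 0)
        return -1;

    const unsigned char *coff = buf + pe + 4;
    uint32_t nsyms = rd32(coff + 12);     /* NumberOfSymbols */
    uint16_t flags = rd16(coff + 18);     /* Characteristics */

    if (nsyms != 0)
        return 1;                         /* a COFF symbol table is present */
    return (flags & 0x0200) ? 0 : 1;      /* IMAGE_FILE_DEBUG_STRIPPED */
}
```

To use it, read the start of the file into a buffer with fread and pass the buffer and the number of bytes read. Dedicated tools that dump executable headers will report the same fields in more detail.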

Option: As explained, it is hard to regenerate the code from an EXE
without debugging info. Another feasible option might be to find
similar software that does not have these issues.

Option: It is even more feasible if you can specify the inputs and outputs
of the program so that someone could rewrite it. And even if you get the code
out of the buggy software, someone will still need to understand and fix it,
right?
 
Chris H

DMn2004404 said:
I recently downloaded a program (*.EXE) written in C, and I discovered
it had a few bugs. The original author of the program is apparently
unreachable, so I figure it falls on me to try and crack into the
program and fix the bugs. Does anyone know of a good C decompiler or
disassembler I can use to do this, that will read an EXE program as
input and generate its source code?

Brandon Taylor

To some extent it depends where in the world you are and the license
attached to the application. The license may not permit reverse
engineering and neither may the law where you are unless you have
explicit permission...
 
