Converting ASM to C

Glen Richards

Is there a way to do this? I mean, there is a company that converts asm to
their WSL language and then from that to C. Is there a way that we can do
this?
 
E. Robert Tisdale

Glen said:
Is there a way to do this. I mean, is there a company
who converts asm to their wsl language and then from that to c?
No.

Is there a way that we can do this?

No.

Information is discarded in the process
of compiling a higher level language to assembler
that cannot be recovered from the assembler alone.
 
Allin Cottrell

E. Robert Tisdale said:
No.

No.

Information is discarded in the process
of compiling a higher level language to assembler
that cannot be recovered from the assembler alone.

OK, this is off-topic here, but I'm not convinced by the impossibility
claim. What is surely impossible is to retrieve the _particular_ C
code that, when run through some (unknown) compiler, generated a given
glob of machine code. But given the asm, shouldn't it be possible
in principle to generate (non-unique) C code of equivalent effect?
Whether anyone provides this sort of service commercially, I have
no idea.
 
E. Robert Tisdale

Allin said:
OK, this is off-topic here
but I'm not convinced by the impossibility claim.
What is surely impossible is to retrieve the _particular_ C code
that, when run through some (unknown) compiler,
generated a given glob of machine code.
But given the assembler, shouldn't it be possible, in principle,
to generate (non-unique) C code of equivalent effect?

In general, no. You would be obliged to emulate
the machine architecture and the operating system (OS).
You would need to be able to recognize calls to the OS
for I/O for example. In other words, you would need information
about the program besides what remains in the assembler listing
to resolve all of these references.
Whether anyone provides this sort of service commercially,
I have no idea.

There are (or at least were) people in the KGB, CIA, NSA, etc.
who could do a fairly reasonable job of "reverse engineering"
machine code (assembler).
 
Mike Wahler

Glen Richards said:
Is there a way to do this.

Sure. Find out what the ASM program does, then
write the C code to do the same thing.
I mean there is a company who converts asm to
their wsl language and then from that to c is there a way that we can do
this?

There might indeed exist some 'automated' methods, but
their output (C source) would very likely be very cryptic,
usually meant only for consumption by a computer.

-Mike
 
Sidney Cadot

Allin said:
OK, this is off-topic here, but I'm not convinced by the impossibility
claim. What is surely impossible is to retrieve the _particular_ C
code that, when run through some (unknown) compiler, generated a given
glob of machine code. But given the asm, shouldn't it be possible
in principle to generate (non-unique) C code of equivalent effect?

In principle, yes. Enumerate all possible files (an infinite, but
countable set); compile them with all possible compiler/flags
combinations (a finite set); compare the results with the executable.
You will lose information (comments; symbol names; high-level
constructs) but it is guaranteed to work (given a rather large amount of
time).
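
As a rough sketch of that brute-force idea, the C below walks candidate "source files" in order of increasing length and stops at the first one whose "compiled" form matches the target. Everything named here is an assumption made for illustration: toy_compile is a stand-in checksum, not a compiler, and the tiny alphabet and length limit only keep the search finite. A real search would run a real compiler on each candidate and compare object code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for "compile the candidate": maps source text to a number.
   Purely hypothetical; a real search would invoke a real compiler. */
unsigned toy_compile(const char *src)
{
    unsigned h = 0;
    while (*src)
        h = h * 31u + (unsigned char)*src++;
    return h;
}

typedef unsigned (*compile_fn)(const char *src);

/* Advances buf to the next string in length-then-alphabet order.
   Returns 0 when the next string would be longer than max_len. */
static int next_candidate(char *buf, size_t max_len, const char *alphabet)
{
    size_t len = strlen(buf);
    size_t k = strlen(alphabet);

    /* Treat buf as an odometer in base k, rightmost position fastest. */
    for (size_t i = len; i-- > 0; ) {
        size_t d = (size_t)(strchr(alphabet, buf[i]) - alphabet);
        if (d + 1 < k) {
            buf[i] = alphabet[d + 1];
            return 1;
        }
        buf[i] = alphabet[0];              /* wrap and carry to the left */
    }
    if (len + 1 > max_len)
        return 0;
    memset(buf, alphabet[0], len + 1);     /* every digit wrapped: grow by one */
    buf[len + 1] = '\0';
    return 1;
}

/* Tries every candidate up to max_len characters. On success the match
   is left in buf (which must hold max_len + 1 bytes) and 1 is returned. */
int find_source(compile_fn compile, unsigned target, const char *alphabet,
                char *buf, size_t max_len)
{
    assert(compile && alphabet && alphabet[0] && buf);
    buf[0] = '\0';                         /* start with the empty file */
    do {
        if (compile(buf) == target)
            return 1;
    } while (next_candidate(buf, max_len, alphabet));
    return 0;
}
```

The first match is not necessarily the text that was originally compiled, only some text with the same result, which is the "non-unique" caveat from earlier in the thread.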

But of course, there are smarter ways. Searching for "decompilation" on
Google gives a couple of interesting hits.

A large amount of work has been done in this area; both in an academic
setting and a commercial setting. With regard to the latter: there's a
terrifying quantity of code out there that is in active use, but for
which the source code is no longer available (mostly COBOL). Some
companies specialize in semi-automatic reverse-engineering of this vital
software.

Best regards, Sidney
 
Papadopoulos Giannis

Mike said:
There might indeed exist some 'automated' methods, but
their output (C source) would very likely be very cryptic,
usually meant only for consumption by a computer.

-Mike

Yes, it might look like a BASIC program... (only gotos)...
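
To illustrate the "only gotos" point, here is a hypothetical example, not the output of any real tool: a function that sums an array, written the way a naive translator might emit it, followed by the structured version a person would write. The names sum_goto and sum_structured are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Translator-style output: one label per jump target, and control flow
   expressed only with if/goto. */
long sum_goto(const int *p, size_t n)
{
    assert(p != NULL || n == 0);
    long acc = 0;
    size_t i = 0;
loop_top:
    if (i >= n) goto loop_end;
    acc += p[i];
    i++;
    goto loop_top;
loop_end:
    return acc;
}

/* The same computation as a person would normally write it. */
long sum_structured(const int *p, size_t n)
{
    assert(p != NULL || n == 0);
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += p[i];
    return acc;
}
```

Both functions compute the same result; recovering the second form from the first is the part that takes either a smart tool or a patient person.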
 
Darrell Grainger

Is there a way to do this. I mean there is a company who converts asm to
their wsl language and then from that to c is there a way that we can do
this?

Normally people ask if you can convert machine language to C source. Two
reasons for this. First is that I have the binaries but lost the source
code (it happens even with backups). The second is that I have someone
else's binaries and I want to reverse engineer them.

If you want to go from machine code to C source there are programs out
there that will do something. All are operating system specific and most
are compiler specific as well. Just do a search on "reverse engineer <your
OS> <your compiler>" and you might find something. The source code they
produce is difficult to read and next to impossible to maintain. It is
often easier to reverse engineer the requirements and write the program
from scratch.

If you have actual assembly source code and want to turn it into C source
code that might actually be harder. The market for people who know C but
have some assembly code is a lot smaller than people who want to reverse
engineer binaries. It would also be specific to the assembler and the
operating system. Maybe the search for reverse engineering might find
something but the results will be about the same or worse than going from
binary to C source. If you cannot find an assembly language to C source
converter you can try getting an assembler, create a binary then use
a machine language to C source converter.

Bottom line, it is usually more effort to maintain the resulting source
code than it would be to write the application from scratch.
 
Kelsey Bjarnason

[snips]

There are (or at least were) people in the KGB, CIA, NSA, etc.
that could do a fairly reasonable job of "reverse engineering"
machine codes (assembler).

Umm... it's not all that hard to reverse-engineer machine code, actually.
It's just tedious, slow, and not amenable to algorithmic solutions.
 
Dan Pop

Kelsey Bjarnason said:
[snips]

There are (or at least were) people in the KGB, CIA, NSA, etc.
that could do a fairly reasonable job of "reverse engineering"
machine codes (assembler).

Umm... it's not all that hard to reverse-engineer machine code, actually.
It's just tedious, slow, and not amenable to algorithmic solutions.

It's an excellent exercise for anyone heavily involved in assembly
programming. And, occasionally, a must if a piece of software (or even
hardware) is not properly documented.

As a trivial example, it's usually easier to figure out how to interface
C code to a Fortran program by looking at the Fortran compiler output
than by digging into the documentation.

Dan
 
glen herrmannsfeldt

Darrell Grainger wrote:

(snip)
If you have actual assembly source code and want to turn it into C source
code that might actually be harder. The market for people who know C but
have some assembly code is a lot smaller than people who want to reverse
engineer binaries. It would also be specific to the assembler and the
operating system. Maybe the search for reverse engineering might find
something but the results will be about the same or worse than going from
binary to C source. If you cannot find an assembly language to C source
converter you can try getting an assembler, create a binary then use
a machine language to C source converter.

It was more popular some years ago when some assembly programs
needed Y2K fixes. Some decided if they were going to work on them
at all they might use more modern machines. The result might be C
that is about as readable as the assembly language. Maybe C variables
named after each register, and then operations are done to those
variables as they would be to the registers of the source machine.
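
A minimal sketch of that register-per-variable style follows. The source routine is an invented, ARM-flavoured pseudo-assembly shown in the comments, and the names mem, r0 to r3, and sum_bytes_translated are all hypothetical. Each C statement mirrors one or two instructions of the original.

```c
#include <assert.h>
#include <stdint.h>

/* Emulated machine state; all names here are illustrative. */
static uint8_t  mem[256];          /* the source machine's memory */
static uint32_t r0, r1, r2, r3;    /* one C variable per register */

/* Line-by-line translation of a hypothetical byte-summing routine.
   On entry r0 = address of the buffer and r1 = byte count.
   On exit r2 = sum of the bytes. */
void sum_bytes_translated(void)
{
    assert(r0 <= sizeof mem && r1 <= sizeof mem - r0);  /* stay inside mem[] */

    r2 = 0;                      /*        mov  r2, #0      */
loop:
    if (r1 == 0) goto done;      /* loop:  cmp  r1, #0      */
                                 /*        beq  done        */
    r3 = mem[r0];                /*        ldrb r3, [r0]    */
    r2 = r2 + r3;                /*        add  r2, r2, r3  */
    r0 = r0 + 1;                 /*        add  r0, r0, #1  */
    r1 = r1 - 1;                 /*        sub  r1, r1, #1  */
    goto loop;                   /*        b    loop        */
done:
    return;                      /* done:  ret              */
}
```

A caller fills mem, loads r0 and r1, calls the routine, and reads r2, exactly as the assembly caller would have used the registers. It runs on the new machine, but it reads no better than the assembly did.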

-- glen
 
glen herrmannsfeldt

Dan Pop wrote:

(snip regarding reverse engineering)
It's an excellent exercise for anyone heavily involved in assembly
programming. And, occasionally, a must if a piece of software (or even
hardware) is not properly documented.
As a trivial example, it's usually easier to figure out how to interface
C code to a Fortran program by looking at the Fortran compiler output
than by digging into the documentation.

Especially if the compiler will generate the assembly code
in people readable form, as most will. Though it might
take more work to find the special cases and exceptions.

-- glen
 
Dan Pop

glen herrmannsfeldt said:
Dan Pop wrote:

(snip regarding reverse engineering)



Especially if the compiler will generate the assembly code
in people readable form, as most will.

Even if it doesn't, there may be tools that "reverse engineer" object
files into highly readable assembly, because the symbol table is
present in the file. E.g. objdump from the GNU binutils, but I remember
using a similar tool under MSDOS, too.
Though it might take more work to find the special cases and exceptions.

The idea is that you investigate the cases that are relevant to you.
If you need to pass 2 double precision numbers and one integer number
to the C routine, you couldn't care less about how Fortran passes strings.
You simply write the function/subroutine call of interest to you and
compile it. By examining the generated code, you know what the C
function will receive.
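
To make this concrete, here is a minimal sketch of what the C side often ends up looking like once the generated code has been read. It assumes a convention that is common with Fortran compilers on Unix-like systems but is by no means guaranteed: the external name is lower-cased with a trailing underscore, every argument arrives as a pointer, and a default INTEGER matches a C int. The routine name scaleadd_ and its arithmetic are invented for illustration; the compiler output is what settles the real details.

```c
#include <assert.h>

/* Assumed convention (check your own compiler's output): the Fortran
   statement   CALL SCALEADD(X, Y, N)   with DOUBLE PRECISION X, Y and
   INTEGER N reaches C as a call to scaleadd_ with each argument
   passed by address. */
void scaleadd_(double *x, double *y, int *n)
{
    assert(x && y && n);
    *x = *x + (*n) * (*y);   /* result goes back through the first argument */
}
```

If the generated code shows the arguments being passed some other way, or the symbol spelled differently, the prototype changes to match; that is the whole point of looking.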

Dan
 
Bob Sheff

Normally people ask if you can convert machine language to C source. Two
reasons for this. First is that I have the binaries but lost the source
code (it happens even with backups). The second is that I have someone
else's binaries and I want to reverse engineer them.

Binary to asm is often difficult, requiring many person-guided passes
with a disassembler, making judgements (often for each byte!):
is that byte DATA or part of an INSTRUCTION?
When you decide it is an instruction, because of the flow, is the
referenced word a "long int", a "float", a pointer, or some struct or array
base address?
This process can only be verified when the resultant Assy source is
understandable, assembled, and then linked into the IDENTICAL core image the
original program had.
I have proposed an instruction interpreter which could mark each byte as it
is used by each instruction while running the code (but I've never seen
one!).
Most disassemblers don't work (by themselves)! Especially with
variable-length instruction formats (like x86!).
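
A minimal sketch of such a byte-marking interpreter, using a four-instruction toy machine invented purely for illustration (a real instruction set needs a full decoder). The opcodes, the name trace_image, and the UNSEEN/CODE/DATA tags are all assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tags are bit flags, so a byte used both ways ends up as CODE | DATA. */
enum { UNSEEN = 0, CODE = 1, DATA = 2 };

/* Toy instruction set, invented for this sketch:
     0x00        HALT
     0x01 imm8   LOAD  acc = imm8
     0x02 addr8  JMP   pc = addr8
     0x03 addr8  ADDM  acc += image[addr8]   */
enum { OP_HALT = 0x00, OP_LOAD = 0x01, OP_JMP = 0x02, OP_ADDM = 0x03 };

/* Runs the image from `entry`, tagging each byte it touches in marks[].
   Returns the accumulator on HALT, or -1 on a bad opcode, an
   out-of-range access, or when max_steps is exhausted. */
int trace_image(const uint8_t *image, size_t size, size_t entry,
                uint8_t *marks, size_t max_steps)
{
    assert(image && marks);
    uint8_t acc = 0;
    size_t pc = entry;

    for (size_t step = 0; step < max_steps; step++) {
        if (pc >= size) return -1;
        uint8_t op = image[pc];
        marks[pc] |= CODE;                  /* opcode byte */
        if (op == OP_HALT) return acc;
        if (op > OP_ADDM || pc + 1 >= size) return -1;

        uint8_t arg = image[pc + 1];
        marks[pc + 1] |= CODE;              /* operand is part of the instruction */

        switch (op) {
        case OP_LOAD: acc = arg; pc += 2; break;
        case OP_JMP:  pc = arg;           break;
        case OP_ADDM:
            if (arg >= size) return -1;
            marks[arg] |= DATA;             /* byte read as data */
            acc = (uint8_t)(acc + image[arg]);
            pc += 2;
            break;
        }
    }
    return -1;
}
```

Only bytes on paths that actually execute get marked; anything left UNSEEN after a run is still undecided, so in practice several runs with different inputs would be needed.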
If you want to go from machine code to C source there are programs out
there that will do something. All are operating system specific and most
are compiler specific as well.
A person who knows what a compiler will generate for each statement can
de-compile the assembly source fairly easily; that person can also write a
program to do the same thing more rapidly.
Just do a search on "reverse engineer <your
OS> <your compiler>" and you might find something. The source code they
produce is difficult to read and next to impossible to maintain. It is
often easier to reverse engineer the requirements and write the program
from scratch.
This depends a lot on the AMOUNT of code you need to de-compile:
certainly hundreds of lines, maybe a thousand lines, and NOT MILLIONS of
lines.
If you have actual assembly source code and want to turn it into C source
code that might actually be harder.
Applied Conversion Technologies (www.actworld.com) was originally started
to exploit the technology I developed to translate 45MB of DG NOVA assembly
to C to move a CAM system to the PC/AT platform in the '80s. Much of the
Assy source contained comments that were useful in maintaining the
translated C source; some did not. The main features were similarities in
the programs that could be recognized and consistently translated.
Another project, translating CDC 469 (Phalanx Gun) computer Assy to Ada,
required discarding lots of comments relating to fixed point arithmetic
magnitude, which were irrelevant once variables were re-cast to floating
point.
The market for people who know C but
have some assembly code is a lot smaller than people who want to reverse
engineer binaries.
RIGHT!

It would also be specific to the assembler and the
operating system. Maybe the search for reverse engineering might find
something but the results will be about the same or worse than going from
binary to C source.
Absolutely NOT; there is information in the Assy source that shouldn't be
lost! However, each case will be different and custom for each
programmer/compiler, and the effort expended to extract the design will be
a judgement call for the business people involved.
If you cannot find an assembly language to C source
converter you can try getting an assembler, create a binary then use
a machine language to C source converter.
one step forward, TWO or more back!
Bottom line, it is usually more effort to maintain the resulting source
code than it would be to write the application from scratch.
UNLESS you factor in the NEWLY introduced bugs while writing from SCRATCH.
Also: "better the bugs you know than the bugs you haven't met yet"

You must also factor in advances in interface design:
Does Visual XX replace all that code with a few mouse clicks and megabytes
of DLLs?

Further, you must consider the goodness in moving forward from previous
designs accurately translated (warts and all), and not "re-inventing the
wheel".

Bob Sheff; PBgeek at att dot net
Independent Consultant:
Software(Pascal,PL/M,CHILL,FORTRAN,..assy) Conversion to C/C++
please do not reply to (e-mail address removed) or
(e-mail address removed)
 
Joined Feb 8, 2010
Asm to C translator

A free project to translate assembler code into C will start up soon.

read more here: hxxp:// www . asm2c.gnx.at


Of course, the output (C source) will not look like the original code.
This tool is for your own programs, written in assembly,

but I think it will also work for disassembled code from binaries.
 
Joined Nov 1, 2010
It's not correct to say that converting assembly language code back to C is impossible, or that the results are always very hard to understand.

It's true that conversion is a hard problem to crack, but it is entirely possible with the right tools.

Our company, MicroAPL, has a software tool called Relogix which reverse-engineers assembly code and produces C. We aim to get close to what a human programmer might write - i.e. readable, maintainable code.

To judge how well we do, take a look at some of the examples on our web site. These are all automatic translations produced by Relogix, before our engineers perform a post-translation cleanup.
 
