Converting ASM to C

Discussion in 'C Programming' started by Glen Richards, Feb 11, 2004.

  1. Is there a way to do this? I mean, there is a company that converts asm to
    their WSL language and then from that to C. Is there a way that we can do
    this?
     
    Glen Richards, Feb 11, 2004
    #1

  2. Glen Richards wrote:

    > Is there a way to do this. I mean, is there a company
    > who converts asm to their wsl language and then from that to c?


    No.

    > Is there a way that we can do this?


    No.

    Information is discarded in the process
    of compiling a higher level language to assembler
    that cannot be recovered from the assembler alone.
     
    E. Robert Tisdale, Feb 11, 2004
    #2

  3. E. Robert Tisdale wrote:
    > Glen Richards wrote:
    >
    >> Is there a way to do this. I mean, is there a company who converts asm
    >> to their wsl language and then from that to c?

    >
    > No.
    >
    >> Is there a way that we can do this?

    >
    > No.
    >
    > Information is discarded in the process
    > of compiling a higher level language to assembler
    > that cannot be recovered from the assembler alone.


    OK, this is off-topic here, but I'm not convinced by the impossibility
    claim. What is surely impossible is to retrieve the _particular_ C
    code that, when run through some (unknown) compiler, generated a given
    glob of machine code. But given the asm, shouldn't it be possible
    in principle to generate (non-unique) C code of equivalent effect?
    Whether anyone provides this sort of service commercially, I have
    no idea.

    --
    Allin Cottrell
    Department of Economics
    Wake Forest University, NC
     
    Allin Cottrell, Feb 11, 2004
    #3
  4. Allin Cottrell wrote:

    > OK, this is off-topic here
    > but I'm not convinced by the impossibility claim.
    > What is surely impossible is to retrieve the _particular_ C code
    > that, when run through some (unknown) compiler,
    > generated a given glob of machine code.
    > But given the assembler, shouldn't it be possible, in principle,
    > to generate (non-unique) C code of equivalent effect?


    In general, no. You would be obliged to emulate
    the machine architecture and the operating system (OS).
    You would need to be able to recognize calls to the OS
    for I/O for example. In other words, you would need information
    about the program besides what remains in the assembler listing
    to resolve all of these references.

    > Whether anyone provides this sort of service commercially,
    > I have no idea.


    There are (or at least were) people in the KGB, CIA, NSA, etc.
    that could do a fairly reasonable job of "reverse engineering"
    machine codes (assembler).
     
    E. Robert Tisdale, Feb 11, 2004
    #4
  5. Mike Wahler

    "Glen Richards" <> wrote in message
    news:GYhWb.143769$U%5.658804@attbi_s03...
    > Is there a way to do this.


    Sure. Find out what the ASM program does, then
    write the C code to do the same thing.

    > I mean there is a company who converts asm to
    > their wsl language and then from that to c is there a way that we can do
    > this?


    There might indeed exist some 'automated' methods, but
    their output (C source) would very likely be very cryptic,
    usually meant only for consumption by a computer.

    -Mike
     
    Mike Wahler, Feb 11, 2004
    #5
  6. Sidney Cadot

    Allin Cottrell wrote:

    > E. Robert Tisdale wrote:
    >
    >> Glen Richards wrote:
    >>
    >>> Is there a way to do this. I mean, is there a company who converts
    >>> asm to their wsl language and then from that to c?

    >
    > >

    >
    >> No.
    >>
    >>> Is there a way that we can do this?

    >>
    >>
    >> No.
    >>
    >> Information is discarded in the process
    >> of compiling a higher level language to assembler
    >> that cannot be recovered from the assembler alone.

    >
    >
    > OK, this is off-topic here, but I'm not convinced by the impossibility
    > claim. What is surely impossible is to retrieve the _particular_ C
    > code that, when run through some (unknown) compiler, generated a given
    > glob of machine code. But given the asm, shouldn't it be possible
    > in principle to generate (non-unique) C code of equivalent effect?


    In principle, yes. Enumerate all possible files (an infinite, but
    countable set); compile them with all possible compiler/flags
    combinations (a finite set); compare the results with the executable.
    You will lose information (comments; symbol names; high-level
    constructs) but it is guaranteed to work (given a rather large amount of
    time).

    But of course, there are smarter ways. Searching for "decompilation" on
    Google gives a couple of interesting hits.

    A large amount of work has been done in this area; both in an academic
    setting and a commercial setting. With regard to the latter: there's a
    terrifying quantity of code out there that is in active use, but for
    which the source code is no longer available (mostly COBOL). Some
    companies specialize in semi-automatic reverse-engineering of this vital
    software.

    Best regards, Sidney
     
    Sidney Cadot, Feb 11, 2004
    #6
  7. Mike Wahler wrote:
    > There might indeed exist some 'automated' methods, but
    > their output (C source) would very likely be very cryptic,
    > usually meant only for consumption by a computer.
    >
    > -Mike
    >


    Yes, it might look like a BASIC program... (only gotos)...
     
    Papadopoulos Giannis, Feb 11, 2004
    #7
  8. On Wed, 11 Feb 2004, Glen Richards wrote:

    > Is there a way to do this. I mean there is a company who converts asm to
    > their wsl language and then from that to c is there a way that we can do
    > this?


    Normally people ask if you can convert machine language to C source. Two
    reasons for this. First is that I have the binaries but lost the source
    code (it happens even with backups). The second is that I have someone
    else's binaries and I want to reverse engineer them.

    If you want to go from machine code to C source there are programs out
    there that will do something. All are operating system specific and most
    are compiler specific as well. Just do a search on "reverse engineer <your
    OS> <your compiler>" and you might find something. The source code they
    produce is difficult to read and next to impossible to maintain. It is
    often easier to reverse engineer the requirements and write the program
    from scratch.

    If you have actual assembly source code and want to turn it into C source
    code that might actually be harder. The market for people who know C but
    have some assembly code is a lot smaller than people who want to reverse
    engineer binaries. It would also be specific to the assembler and the
    operating system. Maybe the search for reverse engineering might find
    something but the results will be about the same or worse than going from
    binary to C source. If you cannot find an assembly language to C source
    converter you can try getting an assembler, creating a binary, then using a
    machine language to C source converter.

    Bottom line, it is usually more effort to maintain the resulting source
    code than it would be to write the application from scratch.

    --
    Send e-mail to: darrell at cs dot toronto dot edu
    Don't send e-mail to
     
    Darrell Grainger, Feb 11, 2004
    #8
  9. [snips]

    On Tue, 10 Feb 2004 21:44:59 -0800, E. Robert Tisdale wrote:

    >> Whether anyone provides this sort of service commercially,
    >> I have no idea.

    >
    > There are (or at least were) people in the KGB, CIA, NSA, etc.
    > that could do a fairly reasonable job of "reverse engineering"
    > machine codes (assembler).


    Umm... it's not all that hard to reverse-engineer machine code, actually.
    It's just tedious, slow, and not amenable to algorithmic solutions.
     
    Kelsey Bjarnason, Feb 11, 2004
    #9
  10. Dan Pop

    In <> Kelsey Bjarnason <> writes:

    >[snips]
    >
    >On Tue, 10 Feb 2004 21:44:59 -0800, E. Robert Tisdale wrote:
    >
    >>> Whether anyone provides this sort of service commercially,
    >>> I have no idea.

    >>
    >> There are (or at least were) people in the KGB, CIA, NSA, etc.
    >> that could do a fairly reasonable job of "reverse engineering"
    >> machine codes (assembler).

    >
    >Umm... it's not all that hard to reverse-engineer machine code, actually.
    >It's just tedious, slow, and not amenable to algorithmic solutions.


    It's an excellent exercise for anyone heavily involved in assembly
    programming. And, occasionally, a must if a piece of software (or even
    hardware) is not properly documented.

    As a trivial example, it's usually easier to figure out how to interface
    C code to a Fortran program by looking at the Fortran compiler output
    than by digging into the documentation.

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email:
     
    Dan Pop, Feb 11, 2004
    #10
  11. Darrell Grainger wrote:

    (snip)

    > If you have actual assembly source code and want to turn it into C source
    > code that might actually be harder. The market for people who know C but
    > have some assembly code is a lot smaller than people who want to reverse
    > engineer binaries. It would also be specific to the assembler and the
    > operating system. Maybe the search for reverse engineering might find
    > something but the results will be about the same or worse than going from
    > binary to C source. If you cannot find an assembly language to C source
    > converter you can try getting an assembler, creating a binary, then using a
    > machine language to C source converter.


    It was more popular some years ago when some assembly programs
    needed Y2K fixes. Some decided if they were going to work on them
    at all they might use more modern machines. The result might be C
    that is about as readable as the assembly language. Maybe C variables
    named after each register, and then operations are done to those
    variables as they would be to the registers of the source machine.
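
    A minimal sketch of what such register-style output might look like. It
    assumes a hypothetical 16-bit source machine with an accumulator, an index
    register and a counter register; the routine, the register names and the
    mnemonics in the comments are all invented for illustration and do not
    come from any real translator.

```c
#include <stdint.h>

/* Hypothetical translator output: one C variable per machine register. */
static uint16_t r_acc;   /* accumulator    */
static uint16_t r_idx;   /* index register */
static uint16_t r_cnt;   /* loop counter   */

/* Sums `count` words starting at mem[base], written the way a
   line-by-line translation of the assembly might come out.
   The mnemonics in the comments are made up. */
uint16_t sum_words_translated(const uint16_t *mem, uint16_t base, uint16_t count)
{
    r_acc = 0;                               /* CLR  acc        */
    r_idx = base;                            /* LD   idx,base   */
    r_cnt = count;                           /* LD   cnt,count  */
loop:
    if (r_cnt == 0)                          /* TST  cnt        */
        goto done;                           /* JZ   done       */
    r_acc = (uint16_t)(r_acc + mem[r_idx]);  /* ADD  acc,[idx]  (wraps at 16 bits) */
    r_idx = (uint16_t)(r_idx + 1);           /* INC  idx        */
    r_cnt = (uint16_t)(r_cnt - 1);           /* DEC  cnt        */
    goto loop;                               /* JMP  loop       */
done:
    return r_acc;
}

/* What a programmer might write after working out what the routine does. */
uint16_t sum_words(const uint16_t *mem, uint16_t base, uint16_t count)
{
    uint16_t total = 0;
    for (uint16_t i = 0; i < count; i++)
        total = (uint16_t)(total + mem[base + i]);
    return total;
}
```

    Both functions compute the same result, but only the second one shows the
    intent; the first is about as readable as the assembly it mirrors.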

    -- glen
     
    glen herrmannsfeldt, Feb 13, 2004
    #11
  12. Dan Pop wrote:

    (snip regarding reverse engineering)

    > It's an excellent exercise for anyone heavily involved in assembly
    > programming. And, occasionally, a must if a piece of software (or even
    > hardware) is not properly documented.


    > As a trivial example, it's usually easier to figure out how to interface
    > C code to a Fortran program by looking at the Fortran compiler output
    > than by digging into the documentation.


    Especially if the compiler will generate the assembly code
    in people readable form, as most will. Though it might
    take more work to find the special cases and exceptions.

    -- glen
     
    glen herrmannsfeldt, Feb 13, 2004
    #12
  13. Dan Pop

    In <hZ_Wb.20166$jk2.64393@attbi_s53> glen herrmannsfeldt <> writes:

    >Dan Pop wrote:
    >
    >(snip regarding reverse engineering)
    >
    >> It's an excellent exercise for anyone heavily involved in assembly
    >> programming. And, occasionally, a must if a piece of software (or even
    >> hardware) is not properly documented.

    >
    >> As a trivial example, it's usually easier to figure out how to interface
    >> C code to a Fortran program by looking at the Fortran compiler output
    >> than by digging into the documentation.

    >
    >Especially if the compiler will generate the assembly code
    >in people readable form, as most will.


    Even if it doesn't, there may be tools that "reverse engineer" object
    files into highly readable assembly, because the symbol table is
    present in the file. E.g. objdump from the GNU binutils, but I remember
    using a similar tool under MSDOS, too.

    >Though it might take more work to find the special cases and exceptions.


    The idea is that you investigate the cases that are relevant to you.
    If you need to pass 2 double precision numbers and one integer number
    to the C routine, you couldn't care less about how Fortran passes strings.
    You simply write the function/subroutine call of interest to you and
    compile it. By examining the generated code, you know what the C
    function will receive.
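
    As a sketch of what that examination often reveals: many Fortran compilers
    pass every argument by address and append an underscore to the lower-cased
    routine name. That convention is an assumption here, not something every
    compiler does, which is exactly why looking at the generated code helps.
    The routine name cscale_ and the mapping of Fortran INTEGER to C int are
    likewise hypothetical.

```c
/* Assumed Fortran caller (hypothetical):
 *
 *       DOUBLE PRECISION X, Y
 *       INTEGER N
 *       CALL CSCALE(X, Y, N)
 *
 * Assumed convention: the external symbol is the lower-cased name with a
 * trailing underscore, and every argument arrives as a pointer.
 */
void cscale_(double *x, double *y, int *n)
{
    /* Store x * n into y; all three values are reached through pointers
       because the caller passed their addresses. */
    *y = *x * (double)(*n);
}
```

    If the generated assembly shows values rather than addresses being passed,
    or a different symbol name, the C prototype has to change to match.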

    Dan
    --
    Dan Pop
    DESY Zeuthen, RZ group
    Email:
     
    Dan Pop, Feb 13, 2004
    #13
  14. Bob Sheff

    On 11 Feb 2004 14:57:06 GMT, (Darrell
    Grainger) wrote:

    >On Wed, 11 Feb 2004, Glen Richards wrote:
    >
    >> Is there a way to do this. I mean there is a company who converts asm to
    >> their wsl language and then from that to c is there a way that we can do
    >> this?

    >
    >Normally people ask if you can convert machine language to C source. Two
    >reasons for this. First is that I have the binaries but lost the source
    >code (it happens even with backups). The second is that I have someone
    >else's binaries and I want to reverse engineer them.


    Binary to Asm is often difficult, requiring many person-oriented passes
    with a disassembler, making the judgements (often for each byte!):
    is that BYTE DATA or part of an INSTRUCTION?
    When you decide it is an instruction, because of the flow, is the
    referenced word a "long int", "float", pointer or some struct or array
    base-address?
    This process can only be verified when the resultant Assy source is
    understandable, assembled and then linked into the IDENTICAL core image the
    original program had.
    I have proposed an instruction interpreter which could mark each byte as it
    is used by each instruction while running the code (but I've never seen
    one!).
    Most disassemblers don't work (by themselves)! -- especially with variable
    length instruction formats (like x86!)
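
    A rough sketch of the byte-marking interpreter idea, for a toy machine
    invented purely for illustration: four made-up opcodes, a single 8-bit
    accumulator, and execution starting at address 0. Nothing here models a
    real instruction set; the function and constant names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Marks recorded per byte of the image (may be OR-ed together). */
enum { MARK_UNKNOWN = 0, MARK_CODE = 1, MARK_DATA = 2 };

/* Toy instruction set (entirely made up). */
enum {
    OP_HALT  = 0x00,  /* 1 byte:  stop                     */
    OP_LOADI = 0x01,  /* 2 bytes: acc = immediate operand  */
    OP_ADDM  = 0x02,  /* 2 bytes: acc += image[operand]    */
    OP_JMP   = 0x03   /* 2 bytes: pc = operand             */
};

/* Executes `image` from address 0 and marks every byte it fetches as an
   instruction (MARK_CODE) or reads as an operand in memory (MARK_DATA).
   `marks` must hold `size` bytes, zeroed by the caller.
   Returns the accumulator at HALT, or -1 on a bad fetch, an unknown
   opcode, or when `max_steps` is exhausted. */
int trace_run(const uint8_t *image, size_t size, uint8_t *marks, int max_steps)
{
    size_t pc = 0;
    uint8_t acc = 0;

    for (int step = 0; step < max_steps; step++) {
        if (pc >= size)
            return -1;
        uint8_t op = image[pc];
        marks[pc] |= MARK_CODE;            /* opcode byte */
        if (op == OP_HALT)
            return acc;

        if (pc + 1 >= size)
            return -1;                     /* operand byte missing */
        uint8_t arg = image[pc + 1];
        marks[pc + 1] |= MARK_CODE;        /* operand is part of the instruction */

        switch (op) {
        case OP_LOADI:
            acc = arg;
            pc += 2;
            break;
        case OP_ADDM:
            if (arg >= size)
                return -1;
            marks[arg] |= MARK_DATA;       /* byte read as data */
            acc = (uint8_t)(acc + image[arg]);
            pc += 2;
            break;
        case OP_JMP:
            pc = arg;
            break;
        default:
            return -1;                     /* unknown opcode */
        }
    }
    return -1;
}
```

    Bytes that the run never touches stay MARK_UNKNOWN, so a tracer like this
    only classifies the paths that were actually exercised; the remaining
    bytes still need the manual judgement described above.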

    >
    >If you want to go from machine code to C source there are programs out
    >there that will do something. All are operating system specific and most
    >are compiler specific as well.

    A person who knows what a compiler will generate for each statement can
    de-compile the assembly source fairly easily; that person can also write a
    program to do the same thing more rapidly.

    > Just do a search on "reverse engineer <your
    >OS> <your compiler>" and you might find something. The source code they
    >produce is difficult to read and next to impossible to maintain. It is
    >often easier to reverse engineer the requirements and write the program
    >from scratch.

    This depends a lot on the AMOUNT of code you need to de-compile:
    certainly Hundreds of lines, maybe a Thousand lines, and NOT MILLIONS of
    lines.

    >
    >If you have actual assembly source code and want to turn it into C source
    >code that might actually be harder.

    Applied Conversion Technologies (www.actworld.com) was originally started
    to exploit the technology I developed to translate 45MB of DG NOVA assembly
    to C to move a CAM system to the PC/AT platform in the 80's. Much of the
    Assy source contained comments that were useful in maintaining the
    translated C source, some were not. The main features were similarities in
    the programs that could be recognized and consistently translated.
    Another project, translating CDC 469 (Phalanx Gun) computer Assy to Ada,
    required discarding lots of comments relating to fixed point arithmetic
    magnitude, which was irrelevant when variables were re-cast to floating
    point.

    > The market for people who know C but
    >have some assembly code is a lot smaller than people who want to reverse
    >engineer binaries.


    RIGHT!

    > It would also be specific to the assembler and the
    >operating system. Maybe the search for reverse engineering might find
    >something but the results will be about the same or worse than going from
    >binary to C source.

    absolutely NOT, there is information in the Assy source that shouldn't be
    lost! However, each case will be different and custom for each
    programmer/compiler and the effort expended to extract the design will be
    a judgement of the business-persons involved.

    > If you cannot find an assembly language to C source
    >converter you can try getting an assembler, creating a binary, then using a
    >machine language to C source converter.

    one step forward, TWO or more back!

    >
    >Bottom line, it is usually more effort to maintain the resulting source
    >code than it would be to write the application from scratch.

    UNLESS you factor in the NEWLY introduced bugs while writing from SCRATCH.
    Also: "better the bugs you know than the bugs you haven't met yet"

    You must also factor in advances in interface design:
    Does Visual XX replace all that code with a few mouse CLICKS and MegaBytes
    of DLL?

    Further, you must consider the goodness in moving forward from previous
    designs accurately translated (warts and all), and not "re-inventing the
    wheel".

    Bob Sheff; PBgeek at att dot net
    Independent Consultant:
    Software(Pascal,PL/M,CHILL,FORTRAN,..assy) Conversion to C/C++
    please do not reply to or
     
    Bob Sheff, Feb 16, 2004
    #14
  15. m477hi45

    Asm to C translator

    A free project to translate assembler code into C will start up soon.

    Read more here: hxxp:// www . asm2c.gnx.at


    Of course,
    its output (C source) will not look like the original code.
    This tool is for your own programs, written in assembly,

    but I think it will also work for disassembled code from binaries.
     
    m477hi45, Feb 8, 2010
    #15
  16. Simon Marsden (East Sussex, UK)

    It's not correct to say that converting assembly language code back to C is impossible, or that the results are always very hard to understand.

    It's true that conversion is a hard problem to crack, but it is entirely possible with the right tools.

    Our company, MicroAPL, has a software tool called Relogix which reverse-engineers assembly code and produces C. We aim to get close to what a human programmer might write - i.e. readable, maintainable code.

    To judge how well we do, take a look at some of the examples on our web site. These are all automatic translations produced by Relogix, before our engineers perform a post-translation cleanup.
     
    Simon Marsden, Nov 1, 2010
    #16
