Interfacing C++ and assembler code

Discussion in 'C++' started by Rune Allnor, Dec 26, 2009.

  1. Rune Allnor

    Rune Allnor Guest

    Hi all.

    When one mixes C++ and assmbler code one needs to
    know a few details about the way the CPU works. In particular,
    one would need to know

    1) what registers and variables must be left alone
    2) which variables and registers might be manipulated
    inside the routine if the contents are restored before
    the routine terminates
    3) what registers and variables one is free to play with at will.

    Where can one find information about such details?
    I am particularly interested in stuff relating to VS2008.

    Rune
    Rune Allnor, Dec 26, 2009
    #1
    1. Advertising

  2. Rune Allnor

    Jonathan Lee Guest

    On Dec 26, 11:37 am, Rune Allnor <> wrote:
    > Hi all.
    >
    > When one mixes C++ and assmbler code one needs to
    > know a few details about the way the CPU works. In particular,
    > one would need to know
    >
    > 1) what registers and variables must be left alone
    > 2) which variables and registers might be manipulated
    >    inside the routine if the contents are restored before
    >    the routine terminates
    > 3) what registers and variables one is free to play with at will.
    >
    > Where can one find information about such details?
    > I am particularly interested in stuff relating to VS2008.
    >
    > Rune


    I don't use visual studio, but a Google search brought this
    up:

    http://msdn.microsoft.com/en-us/library/4ks26t93(VS.80).aspx

    --Jonathan
    Jonathan Lee, Dec 26, 2009
    #2
    1. Advertising

  3. Rune Allnor wrote:
    > Hi all.
    >
    > When one mixes C++ and assmbler code one needs to
    > know a few details about the way the CPU works. In particular,
    > one would need to know
    >
    > 1) what registers and variables must be left alone
    > 2) which variables and registers might be manipulated
    > inside the routine if the contents are restored before
    > the routine terminates
    > 3) what registers and variables one is free to play with at will.
    >
    > Where can one find information about such details?
    > I am particularly interested in stuff relating to VS2008.
    >
    > Rune


    This is implementation of my Window class, I think, everything is clear
    here, how you can do it in win32, unfortunately microsoft has banned
    assembler from 64 bit environment there you can only
    write separate assembler modules:

    #include "Window.h"

    class Window::Thunk{
    friend class Window;
    public:
    Thunk(Window*);
    void SetDestroyed(){ destroyed_ = true; }
    void AddRef()
    {
    ++ref_;
    }
    void ReleaseRef(bool dest = false)
    {
    --ref_;
    if(ref_ <= 0)
    if(dest)delete this;
    else destroyed_=true;
    }
    ~Thunk(){ delete[] code_; instances_--;}
    private:
    static void Destroy();
    Thunk(const Thunk&);
    Thunk& operator=(const Thunk&);
    BYTE* code_;
    bool destroyed_;
    int ref_;
    class Test{
    public:
    ~Test(){ if (Thunk::instances_)MessageBox(NULL,"THUNK
    INSTANCES","",MB_OK);}
    };
    static Test test;
    static int instances_;
    };

    Window::Thunk::Test Window::Thunk::test;
    Window::Test Window::test;

    int Window::instances_ = 0;
    int Window::Thunk::instances_ = 0;

    Window::Window(bool isDlg)
    :hwnd_(0),orig_proc_(0),wnd_class_(0),thunk_(0),ref_(1)
    {
    thunk_ = new Thunk(this);
    wnd_proc_ = (WNDPROC)thunk_->code_;
    instances_++;
    }

    Window::~Window()
    {
    instances_--;
    if(IsWindow())
    {
    if(orig_proc_)
    SetWindowLongPtr(GWLP_WNDPROC,(LONG_PTR)orig_proc_);
    else
    SetWindowLongPtr(GWLP_WNDPROC,(LONG_PTR)DefWindowProc);
    DestroyWindow();
    }
    if(wnd_class_)::UnregisterClass((LPCTSTR)wnd_class_,(HINSTANCE)GetWindowLongPtr(GWLP_HINSTANCE));

    if(thunk_)thunk_->ReleaseRef(true);
    }

    void Window::Create(DWORD dwExStyle,
    LPCTSTR lpClassName,
    LPCTSTR lpWindowName,
    DWORD dwStyle,
    int x,
    int y,
    int nWidth,
    int nHeight,
    HWND hWndParent,
    HMENU hMenu,
    HINSTANCE hInstance,
    LPVOID lpParam
    )

    {
    if(hwnd_)
    {
    MessageBox(NULL,"Window already created","",MB_ICONEXCLAMATION | MB_OK);
    return;
    }
    if(!lpClassName && !wnd_class_)
    {
    WNDCLASSEX wincl = RegisterClass(wnd_proc_);
    if (wnd_class_ = ::RegisterClassEx (&wincl),!wnd_class_)
    {
    DWORD error = GetLastError();
    char buf[4096];
    FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, NULL, error, 0, buf, sizeof
    buf,NULL);
    throw std::runtime_error(buf);
    }
    }

    hwnd_ = ::CreateWindowEx(dwExStyle,
    lpClassName?lpClassName:(LPCTSTR)wnd_class_,
    lpWindowName,
    dwStyle,
    x,
    y,
    nWidth,
    nHeight,
    hWndParent,
    hMenu,
    hInstance,
    lpParam);
    if (!hwnd_)
    {
    DWORD error = GetLastError();
    char buf[4096];
    FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, NULL, error, 0, buf, sizeof
    buf,NULL);
    throw std::runtime_error(buf);
    }
    WNDPROC proc =
    (WNDPROC)SetWindowLongPtr(GWLP_WNDPROC,(LONG_PTR)wnd_proc_);
    if (proc != wnd_proc_)orig_proc_ = proc;
    }

    LRESULT CALLBACK Window::WindowProc(Window* pThis,HWND hwnd,UINT
    message,WPARAM wParam,LPARAM lParam)
    {
    LRESULT rc;Thunk* pThunk=pThis->thunk_;
    if(pThis->hwnd_ != hwnd)pThis->hwnd_ = hwnd;
    try
    {
    AutoRef<Thunk> apThunk(pThunk);
    apThunk->AddRef();
    rc = pThis->ProcessMessages(hwnd,message,wParam,lParam);
    }catch(...)
    {
    if(pThunk->destroyed_)delete pThunk;
    throw;
    }
    return rc;
    }

    __declspec(naked) LRESULT CALLBACK Window::CodeProc(HWND hwnd,UINT
    message,WPARAM wParam,LPARAM lParam)
    {
    Window* pThis;Thunk* pThunk;
    LRESULT rc;
    __asm
    {
    mov eax, end;
    sub eax, begin; // eax == size
    mov ebx, tag;
    sub ebx, begin; // ebx == tag offset
    mov ecx, begin;
    ret;
    }
    begin:
    __asm
    {
    push ebp;
    mov ebp,esp;
    sub esp,__LOCAL_SIZE;
    }

    __asm mov dword ptr [pThis],0x0;
    tag:
    pThunk = pThis->thunk_;
    __asm // rc = WindowProc(pThis,hwnd,message,wParam,lParam);
    {
    push dword ptr[lParam];
    push dword ptr[wParam];
    push dword ptr[message];
    push dword ptr[hwnd];
    push dword ptr[pThis];
    mov edx, dword ptr[Window::WindowProc];
    call edx;
    mov dword ptr[rc], eax;
    }

    if(pThunk->destroyed_)
    {
    __asm
    {
    mov eax, dword ptr[rc];
    mov ecx, dword ptr[pThunk];
    mov esp,ebp;
    pop ebp;
    mov edx, dword ptr[Window::Thunk::Destroy];
    jmp edx; // we don't wont Destroy to return since this function will
    be deleted
    }
    }

    __asm
    {
    mov eax,dword ptr [rc];
    mov esp,ebp;
    pop ebp;
    ret 0x10; // __stdcall stack cleanup
    }
    end:;
    }

    Window::Thunk::Thunk(Window* pThis):destroyed_(false),ref_(1)
    {
    instances_++;
    size_t lsize,lmod,lbegin;
    __asm
    {
    call Window::CodeProc;
    mov dword ptr[lsize], eax;
    mov dword ptr[lmod], ebx;
    mov dword ptr[lbegin], ecx;
    }
    code_ = new BYTE[lsize];
    void *pDst = code_;

    __asm //memcpy(pDst,code,lsize);
    {
    mov esi, dword ptr[lbegin];
    mov edi, dword ptr[pDst];
    mov ecx, dword ptr[lsize];
    cld;
    rep movsb;
    }
    __asm // modify mov var,0 to mov var,address
    {
    mov eax, dword ptr[pThis];
    mov ebx, dword ptr[pDst];
    add ebx, dword ptr[lmod];
    mov dword ptr[ebx-4], eax;
    }
    DWORD oldprotect;
    if(!VirtualProtect(code_, lsize, PAGE_EXECUTE_READWRITE, &oldprotect))
    {
    DWORD error = GetLastError();
    char buf[4096];
    FormatMessage(FORMAT_MESSAGE_FROM_SYSTEM, NULL, error, 0, buf, sizeof
    buf,NULL);
    delete code_;
    throw std::runtime_error(buf);
    }
    FlushInstructionCache(GetCurrentProcess(), code_, lsize);
    }

    __declspec(naked) void Window::Thunk::Destroy()
    {
    Thunk* pThunk;
    __asm
    {
    push ebp;
    mov ebp,esp;
    sub esp,__LOCAL_SIZE;
    push eax; // save rc
    mov dword ptr[pThunk],ecx
    }
    delete pThunk;
    __asm
    {
    pop eax; // restore rc
    mov esp,ebp;
    pop ebp;
    ret 0x10; //// __stdcall stack cleanup from code proc
    }
    }

    Greets, hope this helps...

    --
    http://maxa.homedns.org/
    Branimir Maksimovic, Dec 26, 2009
    #3
  4. "Rune Allnor" <> wrote in message
    news:...
    > Hi all.
    >
    > When one mixes C++ and assmbler code one needs to
    > know a few details about the way the CPU works. In particular,
    > one would need to know
    >
    > 1) what registers and variables must be left alone
    > 2) which variables and registers might be manipulated
    > inside the routine if the contents are restored before
    > the routine terminates
    > 3) what registers and variables one is free to play with at will.
    >
    > Where can one find information about such details?
    > I am particularly interested in stuff relating to VS2008.
    >


    http://en.wikipedia.org/wiki/Calling_convention

    and:
    http://en.wikipedia.org/wiki/X86_calling_conventions

    these give a general overview and provides a few useful links.

    for example:
    http://www.agner.org/optimize/calling_conventions.pdf


    however, if you are intending to mix C++ and ASM, it is suggested to use the
    'extern "C"' modifier for both imported and exported code, as this makes the
    job a lot easier.

    extern "C" {
    void foo() //foo does not use name mangling
    {
    ...
    }
    };

    void bar() //but bar may be name-mangled
    {
    ...
    }


    particularly of note is 'cdecl' (Win32) and the Win64 calling convention.

    if 'extern "C"' is not used, then one has to deal with the C++ ABI, which is
    generally compiler specific, and also more complicated (may involve name
    mangling, additional "hidden" parameters, concern over things like object
    and vtable layout, ... which can be generally avoided by pretending all is
    C...).


    > Rune
    BGB / cr88192, Dec 27, 2009
    #4
  5. Rune Allnor

    Jorgen Grahn Guest

    On Sat, 2009-12-26, Branimir Maksimovic wrote:
    ....
    > This is implementation of my Window class, I think, everything is clear
    > here, how you can do it in win32, unfortunately microsoft has banned
    > assembler from 64 bit environment there you can only
    > write separate assembler modules:


    I don't do Win32 programming, but your example makes it reasonably
    clear that they did *not* ban assembly; they simply removed the C++
    extensions for inline assembly from their compiler. Some would say
    that was a wise choice.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Dec 27, 2009
    #5
  6. Jorgen Grahn wrote:
    > On Sat, 2009-12-26, Branimir Maksimovic wrote:
    > ...
    >> This is implementation of my Window class, I think, everything is clear
    >> here, how you can do it in win32, unfortunately microsoft has banned
    >> assembler from 64 bit environment there you can only
    >> write separate assembler modules:

    >
    > I don't do Win32 programming, but your example makes it reasonably
    > clear that they did *not* ban assembly; they simply removed the C++
    > extensions for inline assembly from their compiler. Some would say
    > that was a wise choice.


    They say they did that because of portability. Actually, assembler
    is not a problem there, rather their win32 api.
    Having to write separate assembler modules forces you to write
    lot of assembler, except just on places where winapi
    forces you to use assembler anyway ;)
    Xlib is better designed that that thing...

    Greets

    --
    http://maxa.homedns.org/
    Branimir Maksimovic, Dec 28, 2009
    #6
  7. Rune Allnor

    Bo Persson Guest

    Jorgen Grahn wrote:
    > On Sat, 2009-12-26, Branimir Maksimovic wrote:
    > ...
    >> This is implementation of my Window class, I think, everything is
    >> clear here, how you can do it in win32, unfortunately microsoft
    >> has banned assembler from 64 bit environment there you can only
    >> write separate assembler modules:

    >
    > I don't do Win32 programming, but your example makes it reasonably
    > clear that they did *not* ban assembly; they simply removed the C++
    > extensions for inline assembly from their compiler. Some would say
    > that was a wise choice.
    >


    Yes, most of the "interesting" instructions are available as compiler
    intrinsics, so you can still use them.

    Mixing a few assembly instructions in the middle of a C or C++
    function tended to disturb the optimizer so that the surrounding code
    was less well optimized. That took away most of the advantage of using
    assembly to gain extra speed, so any performance bottlenecks would
    generally have to be coded separately anyway.


    Bo Persson
    Bo Persson, Dec 28, 2009
    #7
  8. Rune Allnor

    Pavel Guest

    Bo Persson wrote:
    > Jorgen Grahn wrote:
    >> On Sat, 2009-12-26, Branimir Maksimovic wrote:
    >> ...
    >>> This is implementation of my Window class, I think, everything is
    >>> clear here, how you can do it in win32, unfortunately microsoft
    >>> has banned assembler from 64 bit environment there you can only
    >>> write separate assembler modules:

    >>
    >> I don't do Win32 programming, but your example makes it reasonably
    >> clear that they did *not* ban assembly; they simply removed the C++
    >> extensions for inline assembly from their compiler. Some would say
    >> that was a wise choice.
    >>

    >
    > Yes, most of the "interesting" instructions are available as compiler
    > intrinsics, so you can still use them.
    >
    > Mixing a few assembly instructions in the middle of a C or C++
    > function tended to disturb the optimizer so that the surrounding code
    > was less well optimized. That took away most of the advantage of using
    > assembly to gain extra speed

    I am not sure about Visual C++ in particular but in case of gcc it is
    currently not quite so. Gcc inline assembler provides a nice set of
    means for inline assembler snippet to cooperate with the optimizer.

    In my practice, I often use "GCC-Inline-Assembly-HOWTO" by Sandeep.S and
    I am usually satisfied with the generated code. See for example
    http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
    for the discussion of how to let the compiler know which exactly
    registers are clobbered by an inline assembler fragment.

    , so any performance bottlenecks would
    > generally have to be coded separately anyway.

    In my practice, the required assembler fragments are usually small (most
    often, I resort to them to use a particularly beneficial instruction
    that compiler would not be able to infer); putting them in separate
    included files is sometimes feasible but not to separate compilation
    units. At my desired optimization level, the compilers I use perform
    global in-lining within a compilation unit (I mean optimizer's
    opportunistic inlining which is not limited by C++ functions declared
    inline); having assembler snippets in separate units would prevent this
    from happening.

    >
    >
    > Bo Persson
    >
    >
    Pavel, Dec 29, 2009
    #8
  9. Rune Allnor

    Bo Persson Guest

    Pavel wrote:
    > Bo Persson wrote:
    >> Jorgen Grahn wrote:
    >>> On Sat, 2009-12-26, Branimir Maksimovic wrote:
    >>> ...
    >>>> This is implementation of my Window class, I think, everything is
    >>>> clear here, how you can do it in win32, unfortunately microsoft
    >>>> has banned assembler from 64 bit environment there you can only
    >>>> write separate assembler modules:
    >>>
    >>> I don't do Win32 programming, but your example makes it reasonably
    >>> clear that they did *not* ban assembly; they simply removed the
    >>> C++ extensions for inline assembly from their compiler. Some
    >>> would say that was a wise choice.
    >>>

    >>
    >> Yes, most of the "interesting" instructions are available as
    >> compiler intrinsics, so you can still use them.
    >>
    >> Mixing a few assembly instructions in the middle of a C or C++
    >> function tended to disturb the optimizer so that the surrounding
    >> code was less well optimized. That took away most of the advantage
    >> of using assembly to gain extra speed

    > I am not sure about Visual C++ in particular but in case of gcc it
    > is currently not quite so. Gcc inline assembler provides a nice set
    > of means for inline assembler snippet to cooperate with the
    > optimizer.
    > In my practice, I often use "GCC-Inline-Assembly-HOWTO" by
    > Sandeep.S and I am usually satisfied with the generated code. See
    > for example
    > http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
    > for the discussion of how to let the compiler know which exactly
    > registers are clobbered by an inline assembler fragment.
    > , so any performance bottlenecks would
    >> generally have to be coded separately anyway.

    > In my practice, the required assembler fragments are usually small
    > (most often, I resort to them to use a particularly beneficial
    > instruction that compiler would not be able to infer); putting them
    > in separate included files is sometimes feasible but not to
    > separate compilation units. At my desired optimization level, the
    > compilers I use perform global in-lining within a compilation unit
    > (I mean optimizer's opportunistic inlining which is not limited by
    > C++ functions declared inline); having assembler snippets in
    > separate units would prevent this from happening.


    Yes, but the interesting instructions that do not map to C or C++
    constructs are available as compiler intrinsics (equivalent to
    __builtin_x for gcc). These instructions are known to the optimizer,
    and that knowledge is taken advantage of.

    http://msdn.microsoft.com/en-us/library/azcs88h2(VS.100).aspx


    Therefore the lack of inline assembly for x64 is hardly ever
    important.


    Bo Persson
    Bo Persson, Dec 29, 2009
    #9
  10. Bo Persson wrote:
    > Pavel wrote:
    >> Bo Persson wrote:
    >>> Jorgen Grahn wrote:
    >>>> On Sat, 2009-12-26, Branimir Maksimovic wrote:
    >>>> ...
    >>>>> This is implementation of my Window class, I think, everything is
    >>>>> clear here, how you can do it in win32, unfortunately microsoft
    >>>>> has banned assembler from 64 bit environment there you can only
    >>>>> write separate assembler modules:
    >>>> I don't do Win32 programming, but your example makes it reasonably
    >>>> clear that they did *not* ban assembly; they simply removed the
    >>>> C++ extensions for inline assembly from their compiler. Some
    >>>> would say that was a wise choice.
    >>>>
    >>> Yes, most of the "interesting" instructions are available as
    >>> compiler intrinsics, so you can still use them.
    >>>
    >>> Mixing a few assembly instructions in the middle of a C or C++
    >>> function tended to disturb the optimizer so that the surrounding
    >>> code was less well optimized. That took away most of the advantage
    >>> of using assembly to gain extra speed

    >> I am not sure about Visual C++ in particular but in case of gcc it
    >> is currently not quite so. Gcc inline assembler provides a nice set
    >> of means for inline assembler snippet to cooperate with the
    >> optimizer.
    >> In my practice, I often use "GCC-Inline-Assembly-HOWTO" by
    >> Sandeep.S and I am usually satisfied with the generated code. See
    >> for example
    >> http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
    >> for the discussion of how to let the compiler know which exactly
    >> registers are clobbered by an inline assembler fragment.
    >> , so any performance bottlenecks would
    >>> generally have to be coded separately anyway.

    >> In my practice, the required assembler fragments are usually small
    >> (most often, I resort to them to use a particularly beneficial
    >> instruction that compiler would not be able to infer); putting them
    >> in separate included files is sometimes feasible but not to
    >> separate compilation units. At my desired optimization level, the
    >> compilers I use perform global in-lining within a compilation unit
    >> (I mean optimizer's opportunistic inlining which is not limited by
    >> C++ functions declared inline); having assembler snippets in
    >> separate units would prevent this from happening.

    >
    > Yes, but the interesting instructions that do not map to C or C++
    > constructs are available as compiler intrinsics (equivalent to
    > __builtin_x for gcc). These instructions are known to the optimizer,
    > and that knowledge is taken advantage of.
    >
    > http://msdn.microsoft.com/en-us/library/azcs88h2(VS.100).aspx
    >
    >
    > Therefore the lack of inline assembly for x64 is hardly ever
    > important.
    >
    >


    It is important, because with __declspec(naked) you could
    implement function in assembler, and use c++ as macro assembler.
    Problem is that if you want to write something that you cannot
    do in c++, like to thunk winprocs, in 64 bit environment
    you have to embed binary code, instead of assembly, which is
    step backwards...assembler is easier to read....

    You don;t wont optimizer to mess with code in this case anyway...
    For example my window remembers pointer in run time generated
    thunk (same thing is implemented in WTL, but they use
    binary code, instead of assembly)...
    This is faster way of implementing it than using global table of
    window pointers like mfc does, because there is no lookup
    every time proc is called...

    Greets

    --
    http://maxa.homedns.org/
    Branimir Maksimovic, Dec 29, 2009
    #10
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Alex Vinokur
    Replies:
    7
    Views:
    429
    Jerry Coffin
    Sep 21, 2003
  2. Neil Morris

    linking assembler code to C via C library

    Neil Morris, Nov 30, 2003, in forum: C Programming
    Replies:
    4
    Views:
    394
    glen herrmannsfeldt
    Dec 1, 2003
  3. Replies:
    3
    Views:
    350
  4. a_bhatt
    Replies:
    0
    Views:
    1,232
    a_bhatt
    Mar 8, 2011
  5. Mark Lawrence

    Re: interfacing with x86_64 assembler

    Mark Lawrence, Aug 31, 2012, in forum: Python
    Replies:
    7
    Views:
    209
    Ramchandra Apte
    Sep 2, 2012
Loading...

Share This Page