building extension modules, and linking

Discussion in 'Ruby' started by John Gabriele, Sep 13, 2006.

  1. When you load an extension module, what's the mechanism that makes
    those C calls (the ones that call into the Ruby API) actually get
    connected to the currently-running instance of ruby?

    When I build C code (not an extension module) that uses some external
    functions -- say, in a shared lib -- the compiler finds the headers at
    compile-time. At link-edit time, the compiler checks that the code
    fits the shared lib (i.e., makes sure the calls it saw declared in the
    headers (and defined in my source) match up with the shared lib's
    ABI). At runtime (dynamic link time), the OS hunts down the .so file,
    loads it, and gives my program a connection to it. I've got that much.

    But when we're building an extension module, how do you tell GCC
    (at link-edit time) that you want your code to link to (at runtime)
    what's already loaded and running -- that is, to link to the ruby
    interpreter -- rather than to some shared lib somewhere?

    Incidentally, I notice that I don't even have a libruby.so anywhere on
    my system. I've got an /opt/ruby-1.8.4/lib/libruby-static.a though.
    That makes sense to me I suppose, since there's only one app (i.e.,
    ruby) that will need to load that library, and you want each instance
    of ruby to have its own private memory structures anyway. But I don't
    think I'm supposed to link my extension module with that static
    library...

    Any insights or words of wisdom are most appreciated.

    Thanks,
    ---John
     
    John Gabriele, Sep 13, 2006
    #1

  2. John Gabriele

    Tim Becker Guest

    > But when we're building an extension module, how do you tell GCC
    > (at link-edit time) that you want your code to link to (at runtime)
    > what's already loaded and running -- that is, to link to the ruby
    > interpreter -- rather than to some shared lib somewhere?


    I think the point you're missing is that, at runtime, the interpreter
    loads your extension, not the other way around. Else the extension
    would just pop into existence and then stumble around looking for a
    running interpreter.

    If you're interested in how the interpreter goes about loading your
    extension technically, search around for 'dlopen' (at least on a Linux
    system).

    -tim
     
    Tim Becker, Sep 13, 2006
    #2

  3. John Gabriele

    Lyle Johnson Guest

    On 9/13/06, John Gabriele wrote:

    > When you load an extension module, what's the mechanism that makes
    > those C calls (the ones that call into the Ruby API) actually get
    > connected to the currently-running instance of ruby?


    When you use the "require" method to load an extension module, Ruby
    takes the name of the feature that you're trying to require and looks
    for a shared library of that name somewhere in its library load path.
    So for example, if you type:

    require 'foobar'

    Ruby's going to try to find foobar.so (or foobar.bundle, or whatever's
    appropriate for your operating system) somewhere in the $LOAD_PATH.

    Ruby uses an OS-specific function call to dynamically load that shared
    library into memory, and another OS-specific function call to obtain a
    pointer to a function in that library named "Init_foobar". (If you
    want more specifics about which functions Ruby uses, check out the
    dln.c file in the Ruby source code). If Ruby fails to get a pointer to
    that Init_foobar() function, the require operation's going to fail.

    Once Ruby has a pointer to the Init_foobar() function that's defined in
    the foobar.so shared library -- it calls it! And that's where you, the
    extension writer, come in. You are the person who actually defines the
    Init_foobar() function for initializing your extension module. In that
    function, you should make various calls into Ruby's extension API to
    define the modules, classes and methods that make up your extension
    module.
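
    The sequence described above can be sketched in C, roughly as follows.
    This is only an illustration and not the code in dln.c: the helper
    names init_symbol_name() and load_extension() are made up for this
    sketch, it assumes a POSIX system with dlopen()/dlsym(), and the
    RTLD_LAZY flag is an assumption too.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "Init_<feature>" from a path such as
 * "/some/dir/foobar.so". Returns 0 on success, -1 if buf is too small. */
int init_symbol_name(const char *path, char *buf, size_t buflen)
{
    const char *base = strrchr(path, '/');
    const char *dot;
    size_t len;
    int n;

    base = base ? base + 1 : path;                    /* strip the directory part */
    dot = strrchr(base, '.');
    len = dot ? (size_t)(dot - base) : strlen(base);  /* strip the file extension */

    n = snprintf(buf, buflen, "Init_%.*s", (int)len, base);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Hypothetical helper: load the shared object and call its Init function.
 * Returns 0 on success, -1 on any failure. */
int load_extension(const char *path)
{
    char name[256];
    void *handle;
    void (*init)(void);

    if (init_symbol_name(path, name, sizeof name) != 0)
        return -1;

    handle = dlopen(path, RTLD_LAZY);   /* flag choice is an assumption */
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* Idiom for storing dlsym's void* result in a function pointer. */
    *(void **)(&init) = dlsym(handle, name);
    if (init == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    init();   /* the extension registers its modules, classes and methods here */
    return 0;
}
```

    With these helpers, a path ending in foobar.so or foobar.bundle maps
    to the symbol name Init_foobar, and a missing file or missing Init
    function makes load_extension() return -1, which corresponds to the
    require failing. On older glibc versions the program has to be linked
    with -ldl.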

    Hope this helps,

    Lyle
     
    Lyle Johnson, Sep 13, 2006
    #3
  4. On 9/13/06, Tim Becker wrote:
    > > But when we're building an extension module, how do you tell GCC
    > > (at link-edit time) that you want your code to link to (at runtime)
    > > what's already loaded and running -- that is, to link to the ruby
    > > interpreter -- rather than to some shared lib somewhere?

    >
    > I think the point you're missing is that, at runtime, the interpreter
    > loads your extension, not the other way around.


    Right. But, before that -- at link-edit time, when I build the
    extension module with "gcc -shared", how exactly do I patiently
    explain to gcc, "yes there's functions in this code that start with
    rb_, yes they're described in ruby.h, no you may not look at the code
    they'll be calling at runtime. Sorry."? I guess I'm only familiar with
    the common case where you're telling gcc to link up with other libs
    (via "-L" to tell it non-standard places to look, and "-l" to specify
    the shared libs) -- at link-edit time, doesn't gcc always have to see
    the code that *your* code will later be calling?

    > Else the extension
    > would just pop into existence and then stumble around looking for a
    > running interpreter.
    >
    > If you're interested in how the interpreter goes about loading your
    > extension technically, search around for 'dlopen' (at least on a Linux
    > system).
    >


    Yes. I've glanced at dlopen in the past. C code uses it when it wants
    to load other libs at runtime. It looks to me like dlopen asks the OS
    for a given object, and then the OS hunts around, finds it, loads it,
    and then hands back some kind of opaque handle to it. So, I'm guessing
    ruby uses dlopen to load extension modules.

    Thanks,
    ---John
     
    John Gabriele, Sep 13, 2006
    #4
  5. On 9/13/06, Lyle Johnson wrote:
    > On 9/13/06, John Gabriele wrote:
    >
    > > When you load an extension module, what's the mechanism that makes
    > > those C calls (the ones that call into the Ruby API) actually get
    > > connected to the currently-running instance of ruby?

    >
    > When you use the "require" method to load an extension module,
    > [snip insightful explanation]
    > Hope this helps,


    Thanks for the explanation Lyle. I'm sorry, but perhaps I was unclear
    (and also still not understanding this). I'm looking to find out the
    mechanism at work *at link-edit time*, when you're building the
    extension module. I mean, what are the args necessary to pass to gcc
    to tell it that, when your code makes those rb_foo calls, those calls
    are actually supposed to bind to something besides a .so file sitting
    on your harddisk?

    I can at least guess that, when Ruby loads the extension module, at
    that point it can probably do some magic to make sure the C calls in
    the extension module actually call code that's already loaded in
    memory (from inside the ruby binary itself). But what I'd like to
    understand is how to tell gcc this is the way it's going to happen
    at runtime.

    Thanks,
    ---John
     
    John Gabriele, Sep 13, 2006
    #5
  6. John Gabriele wrote:
    > On 9/13/06, Lyle Johnson wrote:
    >> On 9/13/06, John Gabriele wrote:
    >>
    >> > When you load an extension module, what's the mechanism that makes
    >> > those C calls (the ones that call into the Ruby API) actually get
    >> > connected to the currently-running instance of ruby?

    >>
    >> When you use the "require" method to load an extension module,
    >> [snip insightful explanation]
    >> Hope this helps,

    >
    > Thanks for the explanation Lyle. I'm sorry, but perhaps I was unclear
    > (and also still not understanding this). I'm looking to find out the
    > mechanism at work *at link-edit time*, when you're building the
    > extension module. I mean, what are the args necessary to pass to gcc
    > to tell it that, when your code makes those rb_foo calls, those calls
    > are actually supposed to bind to something besides a .so file sitting
    > on your harddisk?


    I'm far from an expert on dynamic linking so I can't tell you how the
    mechanism really works, but I think what you're talking about is the
    -shared option to gcc. From man gcc:

    -shared
    Produce a shared object which can then be linked with other objects
    to form an executable. Not all systems support this option. For
    predictable results, you must also specify the same set of options
    that were used to generate code (-fpic, -fPIC, or model suboptions)
    when you specify this option.[1]

    This option is supplied automatically in the Makefile that is generated
    when you use the usual extconf.rb and mkmf.rb approach to build an
    extension.

    --
    vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
     
    Joel VanderWerf, Sep 13, 2006
    #6
  7. Hello !


    > But when we're building an extension module, how do you tell GCC
    > (at link-edit time) that you want your code to link to (at runtime)
    > what's already loaded and running -- that is, to link to the ruby
    > interpreter -- rather than to some shared lib somewhere?


    Well, this is highly platform-dependent. If I understand you right,
    you're asking how the system knows where to find the rb_* functions
    called from within your C code, even though you didn't link with the
    appropriate library? The answer is relatively simple. The Makefile
    produced by extconf.rb contains linker-specific flags that say: "don't
    bother to look for missing symbols at link-time, do this at run-time".
    What happens next is that, when your extension is loaded, the dynamic
    linker sees there are missing symbols and looks for them in the current
    namespace. From within a ruby interpreter, the symbols are already there
    and everything goes fine. But if for some reason it doesn't find the
    symbols, your program will crash with an undefined symbol problem.

    Just as an example, try this with a ruby extension:

    #include <dlfcn.h>

    int main(void)
    {
        void *a;
        void (*func)(void);

        a = dlopen("./ruby_extension.so", RTLD_LAZY);
        /* might already crash here; make sure to use the right Init_ function */
        func = (void (*)(void)) dlsym(a, "Init_extension");
        func(); /* will crash here if not before */
        return 0;
    }

    If you want to inspect this in more detail, I advise you to have a
    look at nm, and especially the output of nm -D on a shared object (a
    small one, if possible).

    Cheers !

    Vince
     
    Vincent Fourmond, Sep 14, 2006
    #7
  8. On 9/13/06, Lyle Johnson wrote:
    > On 9/13/06, John Gabriele wrote:
    >
    > > Thanks for the explanation Lyle. I'm sorry, but perhaps I was unclear
    > > (and also still not understanding this). I'm looking to find out the
    > > mechanism at work *at link-edit time*, when you're building the
    > > extension module. [snip]

    >
    > Oh, OK. Well, I can't tell you the *specific* arguments because
    > they're highly platform-dependent. But the easiest way (IMO) to find
    > out what they should be is to write an extconf.rb script (a standard
    > fixture for any Ruby extension) and then run that. [snip]


    Ok. I think I partly get it now. Details follow for anyone who's interested.

    To save some time, I grabbed the sample extension from
    http://www.rubyinside.com/how-to-create-a-ruby-extension-in-c-in-under-5-minutes-100.html
    and it builds and runs fine (note, I've got Ruby installed in /opt/ruby-1.8.4) :

    ==== snip ====
    module-experiment/MyTest$ ruby extconf.rb
    creating Makefile

    module-experiment/MyTest$ ls
    extconf.rb Makefile MyTest.c

    module-experiment/MyTest$ make
    gcc -fPIC -g -O2 -I. -I/opt/ruby-1.8.4/lib/ruby/1.8/i686-linux
    -I/opt/ruby-1.8.4/lib/ruby/1.8/i686-linux -I. -c MyTest.c
    MyTest.c:23:2: warning: no newline at end of file

    gcc -shared -L'/opt/ruby-1.8.4/lib' -Wl,-R'/opt/ruby-1.8.4/lib' -o
    mytest.so MyTest.o -ldl -lcrypt -lm -lc

    module-experiment/MyTest$ ls -lh mytest.so
    -rwxr-xr-x 1 john john 8.2K 2006-09-14 02:48 mytest.so

    module-experiment/MyTest$ cd ..

    module-experiment$ ls
    MyTest mytest.rb

    module-experiment$ ruby mytest.rb
    10
    ==== /snip ====


    The compile command is simple, though it contains some harmless redundancy.

    The fancy options in the linker command (this is Linux-/GCC-/ELF-specific) are:

    * The "-shared" tells the link-editor to build a shared object (a .so file).

    * The "-L" simply tells gcc where it can find libraries to link to at
    link-edit time. As you can see from the size of mytest.so (8.2 kB),
    it's certainly not statically linking in my libruby-static.a. Note
    that in the MyTest.c file, there's at least one call to a rb_foo
    function, so I'd think that gcc at least needs to *look* at
    libruby-static.a...

    * The "-Wl,-R..." option means to pass the "-R'/opt/ruby-1.8.4/lib'"
    option to the link-editor. Looking it up (see "man ld"), I see that it
    means for the link-editor to add the /opt/ruby-1.8.4/lib directory to
    the runtime search path, and also, as the docs say: "The -rpath option
    is also used when locating shared objects which are needed by shared
    objects explicitly included in the link; see the description of the
    -rpath-link option."

    * Then there's that "-ldl"...

    Anyhow, this is where things are still fuzzy for me (though it could
    be that it's 3:30 in the morning). How gcc builds mytest.so so it can
    later hook into ruby is probably accomplished by some combination of
    that -R option, that libdl.so shared lib (which seems to supply
    dlopen()), and maybe even the /opt/ruby-1.8.4/lib/libruby-static.a
    static lib. Not sure. Anyway, there seems to be some semi-deep
    GNU/Linux magic happening here.

    Thanks!
    ---John
     
    John Gabriele, Sep 14, 2006
    #8
  9. On 9/14/06, Vincent Fourmond wrote:
    >
    > Hello !


    Hi Vincent. Thanks for the reply.

    >
    > > But when we're building an extension module, how do you tell GCC
    > > (at link-edit time) that you want your code to link to (at runtime)
    > > what's already loaded and running -- that is, to link to the ruby
    > > interpreter -- rather than to some shared lib somewhere?

    >
    > Well, this is highly platform-dependent. If I understand you right,
    > you're asking how the system knows where to find the rb_* functions
    > called from within your C code, even though you didn't link with the
    > appropriate library? The answer is relatively simple. The Makefile
    > produced by extconf.rb contains linker-specific flags that say: "don't
    > bother to look for missing symbols at link-time, do this at run-time".


    Ohhhhhhhhhhhhhhhhhhhh.

    Ok. Hm. Well then. Maybe that's the point of that "-R" link-editor arg
    (mentioned in that other post I just made a few minutes ago before
    seeing this one). The ld docs on -R/-rpath don't specifically say what
    you so clearly express above, but they might *imply* that if read
    under the right intensity lights, wearing those cheap bi-colored
    3D-movie glasses, while howling under a full-moon... :)

    > What happens next is that, when your extension is loaded, the dynamic
    > linker sees there are missing symbols and looks for them in the current
    > namespace. From within a ruby interpreter, the symbols are already there
    > and everything goes fine. But if for some reason it doesn't find the
    > symbols, your program will crash with an undefined symbol problem.


    Got it.

    > Just as an example, try this with a ruby extension:
    >
    > #include <dlfcn.h>
    >
    > int main(void)
    > {
    >     void *a;
    >     void (*func)(void);
    >
    >     a = dlopen("./ruby_extension.so", RTLD_LAZY);
    >     /* might already crash here; make sure to use the right Init_ function */
    >     func = (void (*)(void)) dlsym(a, "Init_extension");
    >     func(); /* will crash here if not before */
    >     return 0;
    > }
    >
    > If you want to inspect this in more detail, I advise you to have a
    > look at nm, and especially the output of nm -D on a shared object (a
    > small one, if possible).
    >
    > Cheers !
    >
    > Vince


    Thanks again Vince! :)

    ---John
     
    John Gabriele, Sep 14, 2006
    #9
  10. Hi !

    > Ok. Hm. Well then. Maybe that's the point of that "-R" link-editor arg
    > (mentioned in that other post I just made a few minutes ago before
    > seeing this one). The ld docs on -R/-rpath don't specifically say what
    > you so clearly express above, but they might *imply* that if read
    > under the right intensity lights, wearing those cheap bi-colored
    > 3D-movie glasses, while howling under a full-moon... :)


    Did you try turning your screen upside down ?

    > Thanks again Vince! :)


    No problem - I've had lots of trouble (and gained experience) trying to
    port some stuff from Linux to MacOS, where the dynamic loader doesn't
    work the same way at all... Awful.

    Good day to all !

    Vince
     
    Vincent Fourmond, Sep 14, 2006
    #10
  11. If you run nm on any extension shared object, you'll generally see
    that the link-editor simply marks as unresolved the references in your
    code to functions in the Ruby runtime libraries. As someone else
    pointed out, these symbols are already available in-process (because
    the Ruby runtime defines them) when your extension is dynamically
    loaded. So the linker doesn't need to know where they're coming from.
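
    A small C sketch of what "already available in-process" means: a
    program can ask the dynamic loader for a symbol without naming any
    library file, and the loader searches what is already mapped into the
    process. The helper name find_in_running_process() is made up for this
    illustration, and a POSIX dlopen()/dlsym() is assumed. The dynamic
    linker performs a comparable search on its own for the undefined rb_*
    references when ruby loads an extension.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: look a symbol up among the symbols that are
 * already visible in the running process. Returns NULL if not found. */
void *find_in_running_process(const char *symbol)
{
    /* A NULL filename asks for a handle on the main program itself;
     * lookups through it also search the libraries loaded with it. */
    void *self = dlopen(NULL, RTLD_LAZY);
    void *addr;

    if (self == NULL)
        return NULL;

    addr = dlsym(self, symbol);
    dlclose(self);   /* drops our reference; the program stays loaded */
    return addr;
}
```

    In a plain C program, find_in_running_process("malloc") returns a
    non-NULL address because libc is already loaded, while a name that
    nobody defines returns NULL. Inside ruby, the rb_* names should be
    found the same way, assuming the interpreter binary exports its
    symbols in its dynamic symbol table.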

    By chance are you primarily a Java programmer? I ask because the Java
    build system is very picky about inspecting code that is referenced by
    your code, in ways that C is not.
     
    Francis Cianfrocca, Sep 14, 2006
    #11
  12. On 9/14/06, Francis Cianfrocca wrote:
    > If you run nm on any extension shared object, you'll generally see
    > that the link-editor simply marks as unresolved the references in your
    > code to functions in the Ruby runtime libraries.


    Ah! Neat! Thanks for pointing that out.

    > [snip]
    >
    > By chance are you primarily a Java programmer? I ask because the Java
    > build system is very picky about inspecting code that is referenced by
    > your code, in ways that C is not.


    I do some Java, but not enough to know where you're coming from in
    that comment. :) I only brought up the original question because I'm
    working on an extension module and wanted to know how the linker trick
    was done (and also thought others would benefit from knowing as well).

    ---John
     
    John Gabriele, Sep 15, 2006
    #12
  13. On 9/14/06, Vincent Fourmond wrote:
    >
    > Hi !
    >
    > > Ok. Hm. Well then. Maybe that's the point of that "-R" link-editor arg
    > > (mentioned in that other post I just made a few minutes ago before
    > > seeing this one). The ld docs on -R/-rpath don't specifically say what
    > > you so clearly express above, but they might *imply* that if read
    > > under the right intensity lights, wearing those cheap bi-colored
    > > 3D-movie glasses, while howling under a full-moon... :)

    >
    > Did you try turning your screen upside down ?


    Turns out (as was pointed out to me elsewhere) that, for gcc, the
    default seems to be to allow unresolved symbols in shared libraries,
    and that the -Rdir_name option to ld may not even be relevant here,
    since the symbols don't get resolved against that static lib anyway
    (I'll have to do more experimentation there).

    My guess that the ld docs on -rpath might've been less than clear
    seems to have been incorrect.

    Thanks,
    ---John
     
    John Gabriele, Sep 17, 2006
    #13