HOWTO: "catching" a segfault in a ruby/dl C library

Discussion in 'Ruby' started by Ken Bloom, Jul 26, 2006.

    For my research, I've written bindings for the link-grammar[1] library
    in ruby, and I am using them to parse all of the sentences in a corpus
    of text and insert the parses into a database. We begin with code
    which works roughly as follows:


    #$dbh = ...  the database handle; for simplicity, it doesn't matter what the Something is.
    #d = ...     a Dictionary, an object wrapping the C library

    def putindatabase link
      #for our purposes, it doesn't really matter what this does
      #except to say it uses $dbh which we opened before
    end

    def parse sentencetext
      sentence=d.parse(sentencetext) #sentence also wraps the C library
      sentence.linkage[0].links.each do |link|
        putindatabase link
      end
    end

    #sentences.each fetches every sentence from the database and yields each
    #one to the block. it keeps an open connection between yields
    sentences.each do |sentence|
      parse sentence
    end

    Now, the link-grammar library that I'm using has a bug. It
    segfaults[2] while freeing resources, under conditions that I
    haven't figured out well enough to fix in the C library itself.
    Arguably it would be nice to catch the segfault as though it were an
    exception, so that we could move on to the next sentence and get on
    with our lives. But ruby/dl doesn't let us do that, and even if
    it did, the crash could leave the link-grammar library in
    an inconsistent state. So we'll do the next best thing: fork a
    subprocess to handle each sentence. This will solve a few problems:

    * If a sentence fails, we'll be able to move on to the next one.
    * A sentence won't fail before being put in the database, since the
    problem occurs when freeing the resources used to parse the sentence.
    * We probably won't segfault at all because this only occurs under
    complicated circumstances which seem to involve the fact that you've
    parsed more than one sentence with the same dictionary.
    * We don't have to clean up properly at all, since the termination of the
    child process after each sentence automatically takes care of that
    for us. (The link-grammar library doesn't allocate any resources
    that the OS doesn't know how to dispose of.) This may make
    subprocess termination faster.

    So we would like to change our code to say:

    sentences.each do |sentence|
      Process.waitpid fork {parse sentence}
    end

    (For that last bullet point, we'd also need to edit the link-grammar
    bindings, but I won't worry you with the details of that. I haven't
    actually implemented it yet.)
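
    The fork-per-sentence loop can be tried without link-grammar at all. The
    sketch below is only an illustration under assumptions: parse is a
    hypothetical stand-in that kills its own process with SIGKILL where the
    real library would segfault, and parse_each_in_child is a made-up helper
    name. It shows that the parent survives a crashed child and can tell from
    the wait status how each child ended.

```ruby
# Hypothetical stand-in for the post's parse method. A sentence containing
# "crash" kills its own process, roughly as the segfault would.
def parse(sentence)
  Process.kill("KILL", Process.pid) if sentence.include?("crash")
  # parsing and database inserts would happen here
end

# Run each sentence in its own child, one at a time, and record how each
# child ended.
def parse_each_in_child(sentences)
  sentences.map do |sentence|
    pid = fork do
      parse(sentence)
      exit!(0) # leave at once, skipping at_exit handlers inherited from the parent
    end
    _, status = Process.waitpid2(pid)
    outcome = status.signaled? ? "signal #{status.termsig}" : "exit #{status.exitstatus}"
    [sentence, outcome]
  end
end

parse_each_in_child(["first sentence", "please crash", "third sentence"]).each do |sentence, outcome|
  puts "#{sentence}: #{outcome}"
end
```

    Process.waitpid2 is used here instead of plain Process.waitpid only
    because it hands back the Process::Status directly; checking $? after
    waitpid would work as well.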

    That's all, right?
    Oy, vey! Testing this, we quickly see that DBI can't put anything in
    the database, except for the first sentence. Why? Because when the
    child process exits, it closes the database connection, which affects
    the parent too. (i.e. DBI isn't fork-safe)
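
    Why a child's exit can hurt the parent is easier to see with a bare
    socket. The following is a sketch of one plausible mechanism, not a
    description of what DBI or any particular driver really does: it assumes
    the client library says goodbye to the server when the process holding
    the handle exits. The socket pair, the "QUIT" message and the method name
    goodbye_from_child are all invented for the illustration.

```ruby
require "socket"

# Returns what the "server" end of a connection receives after a forked
# child says goodbye on the handle it inherited from the parent.
def goodbye_from_child
  # client_end stands in for the database handle, server_end for the server.
  client_end, server_end = UNIXSocket.pair

  pid = fork do
    # The child holds the very same connection. A client library's cleanup
    # might send a goodbye at exit; here we write one by hand.
    client_end.write("QUIT\n")
    exit!(0)
  end
  Process.waitpid(pid)

  # The parent never asked to quit, yet the server has been told to hang up.
  server_end.gets
ensure
  client_end&.close
  server_end&.close
end

puts "server received: #{goodbye_from_child.inspect}"
```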

    But it turns out DRb is fork-safe, so I create another process of
    "middleware" and have that be responsible for the database connection:

    serverpid = fork do
      Signal.trap("INT"){exit}
      #... {deny all allow}) ...
    end
    sleep 1 #wait for the server to be set up before continuing

    ## all of the previous code like before
    ## and then at the end, we kill the DRb server process:
    Process.kill "INT", serverpid

    Now, we have a painless way to make DBI fork-safe. (Note that DBI
    still isn't *thread safe*, and this only works because I'm keeping all
    of the child processes serially ordered, but it's not a bad
    modification technique to handle this kind of error.)
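
    For anyone who wants to try the middleware arrangement end to end, here
    is a self-contained sketch. It is not the original server code. FakeStore
    is a hypothetical in-memory stand-in for the DBI handle,
    run_with_middleware is a made-up name, and the ACL's allowed address
    (127.0.0.1) is an assumption. Two things are deliberately swapped: a pipe
    handshake replaces sleep 1, and the trap calls exit! rather than exit so
    the server skips any at_exit hooks inherited from the parent.

```ruby
require "drb"
require "drb/acl"

# Hypothetical stand-in for the DBI handle: it only collects rows in memory.
class FakeStore
  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
    nil
  end

  def rows
    @rows.dup
  end
end

def run_with_middleware(sentences)
  reader, writer = IO.pipe

  # Middleware process: the only one that ever touches the store.
  server_pid = fork do
    reader.close
    Signal.trap("INT") { exit!(0) }
    DRb.install_acl(ACL.new(%w[deny all allow 127.0.0.1]))
    DRb.start_service("druby://127.0.0.1:0", FakeStore.new) # port 0: any free port
    writer.puts DRb.uri # tell the parent where the server is listening
    writer.close
    DRb.thread.join
  end
  writer.close
  uri = reader.gets.chomp # blocks until the server is up, in place of sleep 1
  reader.close

  # One child per sentence, strictly one after another.
  sentences.each do |sentence|
    pid = fork do
      DRbObject.new_with_uri(uri).insert("parsed: #{sentence}")
      exit!(0)
    end
    Process.waitpid(pid)
  end

  rows = DRbObject.new_with_uri(uri).rows
  Process.kill("INT", server_pid) # the server's INT trap makes it exit
  Process.waitpid(server_pid)
  rows
end

p run_with_middleware(["first sentence", "second sentence"])
```

    Each child opens its own DRb connection and the parent only talks to the
    server after the children are done, so no connection is shared across a
    fork in this sketch.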


    Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
    Department of Computer Science. Illinois Institute of Technology.