HOWTO: "catching" a segfault in a ruby/dl C library


Ken Bloom

For my research, I've written bindings for the link-grammar[1] library
in ruby, and I am using them to parse all of the sentences in a corpus
of text and insert the parses into a database. We begin with code
which works roughly as follows:

$dbh=DBI.connect(...)

#for simplicity, it doesn't matter what the Something is.
sentences=Something.new($dbh)

#Dictionary is an object wrapping the C library. It is a global so that
#the methods below can see it (a top-level local would not be visible
#inside a def).
$d=Dictionary.new

def putindatabase(link)
  #for our purposes, it doesn't really matter what this does
  #except to say it uses $dbh which we opened before
end

def parse(sentencetext)
  sentence=$d.parse(sentencetext) #sentence also wraps the C library
  sentence.linkage[0].links.each do |link|
    putindatabase link
  end
end

#sentences.each fetches every sentence from the database and yields each
#one to the block. It keeps an open connection between yields.
sentences.each do |sentence|
  parse sentence
end

Unfortunately, the link-grammar library that I'm using has a bug: it
segfaults[2] while freeing resources, under conditions that I
haven't figured out well enough to fix in the C library itself.
Arguably it would be nice to catch the segfault as though it were an
exception, so that we could move on to the next sentence and get on
with our lives. But ruby/dl doesn't let us do that, and even if
it did, the crash could leave the link-grammar library in
an inconsistent state. So we'll do the next best thing: fork a
subprocess to handle each sentence. This solves a few problems:

* If a sentence fails, we'll be able to move on to the next one.
* A sentence won't fail before being put in the database, since the
  problem occurs when freeing the resources used to parse the sentence.
* We probably won't segfault at all, because this only occurs under
  complicated circumstances which seem to involve the fact that you've
  parsed more than one sentence with the same dictionary.
* We don't have to clean up properly at all, since the termination of the
  child process after each sentence automatically takes care of that
  for us. (The link-grammar library doesn't allocate any resources
  that the OS doesn't know how to dispose of.) This may make
  subprocess termination faster.

So we would like to change our code to say:

sentences.each do |sentence|
  Process.waitpid(fork { parse sentence })
end

(For that last bullet point, we'd also need to edit the link-grammar
bindings so that they skip the cleanup, but I won't worry you with the
details of that. I haven't actually implemented it yet.)
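
To actually notice when a child died abnormally, the parent can inspect the status that waitpid collects. What follows is only a minimal sketch, not the original code: run_isolated is a hypothetical helper name, and the crash is simulated with SIGKILL rather than a real segfault in the C library (Ruby installs its own SIGSEGV handler, so a real crash may well show up as a different signal).

```ruby
# Run the given block in a child process and report how the child ended.
# Returns :ok, :failed (non-zero exit status) or :crashed (killed by a signal).
# run_isolated is a hypothetical helper, not part of any library.
def run_isolated
  pid = fork do
    ok = begin
      yield
      true
    rescue StandardError
      false
    end
    # exit! leaves the child immediately, without running at_exit handlers
    # that were registered in the parent before the fork.
    exit!(ok ? 0 : 1)
  end
  _, status = Process.waitpid2(pid)
  if status.signaled?
    :crashed
  elsif status.success?
    :ok
  else
    :failed
  end
end

# Stand-ins for real sentences; "bad" simulates a crash in the C library.
results = %w[good bad good].map do |sentence|
  run_isolated do
    if sentence == "bad"
      Process.kill("KILL", Process.pid)  # simulated crash, assumption only
      sleep                              # never returns; the signal ends us
    end
  end
end
```

The parent survives the middle "sentence" and carries on with the third, which is the behaviour we are after. In the real loop, a :crashed result would be the place to log the offending sentence.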

That's all, right?

Oy, vey! Testing this, we quickly see that DBI can't put anything in
the database after the first sentence. Why? Because when the
child process exits, it closes the database connection that it shares
with the parent, so the parent loses it too. (In other words, DBI isn't
fork-safe.)
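
The underlying issue is that a forked child inherits the parent's open descriptors, so anything the child sends on the way out arrives on the same connection the parent is using. The sketch below illustrates only that general mechanism with a socket pair. The "QUIT" message is an assumption standing in for whatever a database driver does at shutdown; I have not verified what DBI actually sends.

```ruby
require 'socket'

# Stand-in for a client/server database link: the parent keeps both ends
# here so that we can see what the "server" receives.
client, server = UNIXSocket.pair

pid = fork do
  # Simulates a driver saying goodbye on the inherited connection when
  # its process ends (hypothetical protocol message).
  client.write("QUIT\n")
  exit!(0)
end
Process.waitpid(pid)

# The parent never wrote anything, yet the server end has been told to quit.
seen = server.gets
```

After this, a real server would consider the session over, and the parent's next query on the same connection would fail.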

But it turns out DRb is fork-safe, so I create another process of
"middleware" and have that be responsible for the database connection:

require 'drb'
require 'drb/acl'

serverpid = fork do
  dbh=DBI.connect(...)
  Signal.trap("INT"){exit}
  #only accept connections from this machine
  acl=ACL.new(%w{deny all allow 127.0.0.1})
  DRb.install_acl(acl)
  DRb.start_service('druby://localhost:9001',dbh)
  DRb.thread.join
end
sleep 1 #wait for the server to be set up before continuing
DRb.start_service
$dbh=DRbObject.new(nil,'druby://localhost:9001')

## all of the previous code like before
## and then at the end, we kill the DRb server process:

Process.kill("INT",serverpid)

Now we have a fairly painless way to make DBI fork-safe. (Note that DBI
still isn't *thread-safe*, and this only works because I run the child
processes strictly one at a time, but it's not a bad
technique for working around this kind of error.)
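
For anyone who wants to try the pattern without a database, here is a self-contained sketch of the same arrangement. Everything in it beyond the DRb calls is an assumption for illustration: FakeStore is a hypothetical stand-in for the DBI handle, the server listens on a port chosen by the OS instead of 9001, and a pipe replaces the sleep 1 so that the parent waits exactly until the server is ready. The ACL is left out to keep it short.

```ruby
require 'drb'

# Hypothetical stand-in for the database handle: it just collects rows.
class FakeStore
  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
    nil                      # nothing worth sending back to the caller
  end

  def rows
    @rows
  end
end

reader, writer = IO.pipe

serverpid = fork do
  reader.close
  Signal.trap("INT") { exit!(0) }
  # Port 0 asks the OS for any free port; DRb.uri reports the one chosen.
  DRb.start_service('druby://127.0.0.1:0', FakeStore.new)
  writer.puts DRb.uri        # tell the parent where we are listening
  writer.close
  DRb.thread.join
end

writer.close
uri = reader.gets.chomp      # blocks until the server has started
reader.close
store = DRbObject.new_with_uri(uri)

# One child per "sentence", strictly one at a time, as in the real code.
%w[first second third].each do |sentence|
  pid = fork do
    store.insert(sentence.upcase)   # runs in the server process
    exit!(0)
  end
  Process.waitpid(pid)
end

rows = store.rows            # a marshalled copy of the server's array
Process.kill("INT", serverpid)
Process.waitpid(serverpid)
```

Each child opens its own connection to the server, so a child exiting (or crashing) no longer touches anything the parent depends on. The state lives in the server process and survives all of the children.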

Footnotes:
[1] http://www.link.cs.cmu.edu/link/
    http://www.abisource.com/downloads/link-grammar
[2] http://bugzilla.abisource.com/show_bug.cgi?id=10391
 
