Ruby 1.9, threads and FreeBSD 5

Eric Jacoboni

Hi,

Consider the following theoretical code:

require "thread"

ping = ConditionVariable.new
pong = ConditionVariable.new
mutex = Mutex.new

1.upto(10) do # 10 threads pong
  Thread.new do
    mutex.synchronize do
      ping.wait(mutex)
      puts("Pong...")
      pong.signal
    end
  end
end

1.upto(10) do # 10 threads ping
  Thread.new do
    mutex.synchronize do
      pong.wait(mutex)
      puts("Ping...")
      ping.signal
    end
  end
end

pong.signal # Go!

Thread.list.each { |t| t.join if t != Thread.main }


This code works as expected with Ruby 1.8 on FreeBSD and OS X:

% /usr/bin/ruby ping_pong_cond.rb
Ping...Pong...
Ping...Pong...
Ping...Pong...
Ping...Pong...
Ping...Pong...
Ping...Pong...
Ping...Pong...
Ping...Pong...
Ping...Pong...
Ping...Pong...


But it hangs with Ruby 1.9 on both operating systems:

% ruby ping_pong_cond.rb
^Cping_pong_cond.rb:30:in `join': Interrupt
from ping_pong_cond.rb:30:in `block in <main>'
from ping_pong_cond.rb:30:in `each'
from ping_pong_cond.rb:30:in `<main>'

Furthermore, it works fine with Ruby 1.9 on Vista.

Since I know the threading model changed between 1.8 and 1.9 (Ruby
threads vs. native threads), I suspect this change could be the culprit...

Any clue?

Thanks
 
MenTaLguY

Eric Jacoboni said:
1.upto(10) do # 10 threads ping
  Thread.new do
    mutex.synchronize do
      pong.wait(mutex)
      puts("Ping...")
      ping.signal
    end
  end
end

pong.signal # Go!

Your code is buggy: there is a race condition such that the initial #signal
can get called before any corresponding #waits. That it superficially appeared
to work consistently with 1.8 was an accident of thread scheduling. Adding
a 'sleep' beforehand, as another poster suggested, makes it more likely to
work but does not actually fix the bug.
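
To make the race concrete, here is a minimal sketch (not code from this thread) in which #signal is sent before anyone is waiting. It assumes a Ruby recent enough for ConditionVariable#wait to accept a timeout and for Process.clock_gettime to exist; the helper name wait_after_early_signal is made up for illustration. The waiter is not woken by the earlier signal and only returns once the timeout expires, because a condition variable does not remember a signal that nobody was waiting for.

```ruby
require "thread"

# Hypothetical helper: returns how long (in seconds) a waiter blocks when
# #signal was sent *before* the waiter started waiting.
def wait_after_early_signal(timeout = 0.2)
  mutex = Mutex.new
  cond  = ConditionVariable.new

  cond.signal # nobody is waiting yet, so this wakes no one and is not stored

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # The wait below cannot see the earlier signal; it gives up after `timeout`.
  mutex.synchronize { cond.wait(mutex, timeout) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

elapsed = wait_after_early_signal
puts "waiter blocked for about #{elapsed.round(2)}s despite the earlier signal"
```

Without the timeout, the waiter would block forever, which is the hang seen with 1.9 in the original script.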

The correct approach is to use a synchronization primitive which queues
notifications, for example a semaphore or a queue, rather than a condition
variable. A queued notification can't get "missed" like this.
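
As an illustration of the queue option, here is a rough sketch of the same ping-pong using two Queue objects as token mailboxes. The helper name queue_ping_pong, the :token symbol, and the extra log queue are assumptions made for this example, not part of the original code. A token pushed before any thread calls #pop simply stays in the queue, so the initial "Go!" cannot be missed.

```ruby
require "thread"

# Hypothetical helper: runs `rounds` Ping threads and `rounds` Pong threads
# that pass a token back and forth, and returns the messages in print order.
def queue_ping_pong(rounds = 10)
  ping = Queue.new # a token here lets one Pong thread proceed
  pong = Queue.new # a token here lets one Ping thread proceed
  log  = Queue.new # thread-safe record of what was printed

  event = lambda do |wait, notify, message|
    wait.pop          # blocks until a token arrives; queued tokens are kept
    puts message
    log << message
    notify << :token  # hand the token to the other side
  end

  threads = (1..rounds).flat_map do
    [Thread.new { event.call(pong, ping, "Ping...") },
     Thread.new { event.call(ping, pong, "Pong...") }]
  end

  pong << :token # Go! Safe even if no thread has reached #pop yet.
  threads.each(&:join)
  Array.new(log.size) { log.pop }
end

queue_ping_pong
```

Only the thread holding the token runs its body, so the messages alternate without any extra locking.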

-mental
 
MenTaLguY

The correct approach is to use a synchronization primitive which queues
notifications, for example a semaphore or a queue, rather than a condition
variable. A queued notification can't get "missed" like this.

Specifically, if "Foo::Semaphore" were a counted semaphore class:

ping = Foo::Semaphore.new(0)
pong = Foo::Semaphore.new(0)

def event(wait, notify, message)
  wait.down
  puts message
  notify.up
end

threads = (1..10).map {
  [ Thread.new { event(pong, ping, "Ping...") },
    Thread.new { event(ping, pong, "Pong...") } ]
}.flatten

pong.up

threads.each { |t| t.join }

Unfortunately there aren't any good 1.9/JRuby-friendly semaphore
implementations yet. Here is a very simple portable one (public domain):

require 'thread'

class PortableSemaphore
  def initialize(count=0)
    @lock = Mutex.new
    @nonzero = ConditionVariable.new
    @count = count
  end

  def down
    @lock.synchronize do
      @nonzero.wait @lock until @count.nonzero?
      @count -= 1
    end
    self
  end

  def up
    @lock.synchronize do
      @count += 1
      @nonzero.signal
    end
    self
  end
end

Note that this is how condition variables are intended to be used --
not directly, but as building blocks for more useful primitives.
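
The following sketch shows why wrapping the condition variable this way avoids the missed notification. The class is repeated only so that the snippet runs on its own, and the helper name early_up_is_remembered? is made up for this example. Because #down re-checks @count under the lock instead of relying on the signal alone, an #up that happens before any #down is still seen.

```ruby
require "thread"

# Same PortableSemaphore as above, repeated so this example is standalone.
class PortableSemaphore
  def initialize(count = 0)
    @lock = Mutex.new
    @nonzero = ConditionVariable.new
    @count = count
  end

  def down
    @lock.synchronize do
      # The counter, not the signal, carries the state: loop until it is nonzero.
      @nonzero.wait @lock until @count.nonzero?
      @count -= 1
    end
    self
  end

  def up
    @lock.synchronize do
      @count += 1
      @nonzero.signal
    end
    self
  end
end

# Hypothetical check: does a #down issued after an early #up complete?
def early_up_is_remembered?
  sem = PortableSemaphore.new(0)
  sem.up                          # no thread is waiting yet
  waiter = Thread.new { sem.down } # starts waiting only afterwards
  !waiter.join(2).nil?            # join returns nil only if the 2s limit expires
end

puts "early #up remembered: #{early_up_is_remembered?}"
```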

-mental
 
Eric Jacoboni

MenTaLguY said:
Your code is buggy: there is a race condition such that the initial #signal
can get called before any corresponding #waits.

Gosh... you're right.

Thanks to Guy for the sleep trick and for your PortableSemaphore
implementation: I'm gonna investigate it further.
 
MenTaLguY

Thanks to Guy for the sleep trick and for your PortableSemaphore
implementation: I'm gonna investigate it further.

Please note that you shouldn't ever use the sleep trick in
production code -- it merely hides problems during testing
when they can still occur under production load.

(I emphasize this, because the sleep trick seems to be fairly
popular as an "easy fix"; even I'm guilty of using it a lot
in the past...)

-mental
 
Eric Jacoboni

Mental said:
Please note that you shouldn't ever use the sleep trick in
production code -- it merely hides problems during testing
when they can still occur under production load.

Oh yes, I know synchronization should never rely on timing delays...
In this case, it was useful for exposing the bug you pointed out in my
use of #signal.

I had written a general semaphore implementation using IO.pipe, but your
PortableSemaphore is way more elegant... thanks a lot (I just wonder
whether #up/#down honor a FIFO policy).

BTW, as for Queues: I admit I never use them.
 
MenTaLguY

I had written a general semaphore implementation using IO.pipe

That can still be useful sometimes -- for example, I wrote a
concurrent-selectable gem which provides latch, semaphore, and
channel (queue) implementations which can be passed as arguments to
IO.select, libev, etc. because they use IO.pipe underneath.
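
For readers unfamiliar with the idea, here is a rough sketch of what a pipe-backed semaphore can look like. It is neither Eric's implementation nor the API of the concurrent-selectable gem; the class name PipeSemaphore and its methods are assumptions made for illustration. Each byte sitting in the pipe stands for one permit, so the read end becomes readable exactly when a permit is available, which is what lets IO.select wait on it.

```ruby
# Hypothetical pipe-backed semaphore: one byte in the pipe equals one permit.
class PipeSemaphore
  def initialize(count = 0)
    @reader, @writer = IO.pipe
    count.times { up }
  end

  # Release one permit by writing a single byte.
  # Caveat: this can block if the OS pipe buffer is completely full.
  def up
    @writer.syswrite("x")
    self
  end

  # Take one permit, blocking until a byte can be read.
  def down
    @reader.sysread(1)
    self
  end

  # IO.select calls #to_io, so the semaphore itself can be passed to it.
  def to_io
    @reader
  end
end

sem = PipeSemaphore.new(1)
ready, = IO.select([sem], nil, nil, 0) # zero timeout: just poll
puts "permit available" if ready
sem.down
puts "no permit left" if IO.select([sem], nil, nil, 0).nil?
```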

PortableSemaphore is way more elegant... thanks a lot (I just wonder
whether #up/#down honor a FIFO policy)

It depends upon the implementation of ConditionVariable#wait. Some
Ruby implementations will wake threads in the order the threads called
#wait, and some will not. Most are roughly FIFO but not 100% "fair".

Fairness actually involves a tradeoff: while unfair blocking
primitives can sometimes lead to starvation (as sufficiently
greedy threads could keep "jumping the queue"), fair primitives
are more likely to have problems with convoying[1].

-mental

[1] Google "lock convoying"
 
