Ruby & Threads

Michael Boutros

Hello all,

I'm building an application that has to branch out and call about 10
other Ruby scripts. Since each script will run for a few seconds,
waiting for each one to finish will take a while, which is too much. So,
I've been looking into threads and I have a system that's working (in
tests), but I have a few questions. First of all, the system:

require 'enumerator'

holder = []

array = (1..10).to_a
puts array.inspect

array.each_slice(3) do |group|
  group.each do |number|
    @thread = Thread.new do
      puts "Starting #{number}...\n"
      sleep(5)
      holder << number
    end
  end
end

@thread.join
puts holder.inspect

In theory, the output should look something like this:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Starting 1...
Starting 2...
Starting 3...
Starting 4...
Starting 5...
Starting 6...
Starting 7...
Starting 8...
Starting 9...
Starting 10...
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

However, sometimes the second thread may finish before the first, etc.,
but the order doesn't matter. What matters is that the script's
execution time just went from over 20 seconds to under two! However, as
you can see, I have to call '@thread.join'. I do this because if I
don't, the script will exit before all of the threads are done
executing, so holder is always an empty array. Am I right? Or is there
some other way to keep the main script from exiting until all the
threads are done? Is there anything else I'm doing wrong?

Thanks,
Michael Boutros
 
Joel VanderWerf

Michael said:
require 'enumerator'

holder = []

array = (1..10).to_a
puts array.inspect

array.each_slice(3) do |group|
  group.each do |number|
    @thread = Thread.new do
      puts "Starting #{number}...\n"
      sleep(5)
      holder << number
    end
  end
end

@thread.join

#join is definitely a good idea, because otherwise (as you observed) the
main thread will exit before the others have finished, but you are
overwriting the @thread variable on each iteration through the loop.

The usual idiom for this is something like:

threads = array.map { Thread.new {...} }
threads.each {|th| th.join}
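
Spelled out, the idiom might look like the sketch below. The names `holder` and `array` come from the original post; the short `sleep` is only a stand-in for the real work. Appending to a shared array from several threads works in practice on MRI, but that is an assumption about the interpreter, and a Mutex around the append would be the safer choice elsewhere.

```ruby
holder = []
array  = (1..10).to_a

# Start one thread per element and keep every Thread object,
# instead of overwriting a single variable.
threads = array.map do |number|
  Thread.new do
    sleep(0.1)          # stand-in for the real work
    holder << number
  end
end

# Wait for all of the threads, not just the last one created.
threads.each { |th| th.join }

# Completion order is not guaranteed, so sort before printing.
puts holder.sort.inspect
```

Because every thread is joined, the main script cannot exit while any of them is still running.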
 
Michael Boutros

Joel said:
#join is definitely a good idea, because otherwise (as you observed) the
main thread will exit before the others have finished, but you are
overwriting the @thread variable on each iteration through the loop.

The usual idiom for this is something like:

threads = array.map { Thread.new {...} }
threads.each {|th| th.join}

Joel,

Initially I meant to do that because I thought that I would only need to
"join" one thread to get them all to continue, until I realized that
some might finish before others, so I altered the code to the method
that you described.
 
Erik Veenstra

In plain Ruby, you might want to rewrite this to a more
functional style:

holder =
  array.collect do |number|
    Thread.new do
      puts "Starting #{number}...\n"
      sleep(5)
      number
    end
  end.collect do |thread|
    thread.value
  end
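
A smaller sketch of why this works: Thread#value joins the thread and returns the last expression of its block, so no shared array is needed. The doubling and the short `sleep` here are illustrative only, not part of the original code.

```ruby
array = (1..10).to_a

threads = array.collect do |number|
  Thread.new do
    sleep(0.1)      # stand-in for the real work
    number * 2      # the block's last expression becomes the thread's value
  end
end

# Thread#value waits for the thread to finish, then returns its result.
holder = threads.collect { |thread| thread.value }

puts holder.inspect
```

The results come back in the order of `array`, whichever thread happens to finish first, because the values are collected in the order the threads were created.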

And, using ThreadLimiter [1,2], you can reduce it to:

require "threadlimiter"

holder =
  array.threaded_collect do |number|
    puts "Starting #{number}...\n"
    sleep(5)
    number
  end
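
If the aim of each_slice(3) in the original post was to run no more than three threads at a time, a rough plain-Ruby sketch is to join each group before starting the next. This is not how ThreadLimiter is implemented, only an illustration that assumes batches of three are acceptable; the short `sleep` again stands in for the real work.

```ruby
array  = (1..10).to_a
holder = []

array.each_slice(3) do |group|
  # Start at most three threads for this group.
  threads = group.map do |number|
    Thread.new do
      sleep(0.1)    # stand-in for the real work
      number
    end
  end

  # Thread#value waits for each thread, so the whole group
  # finishes before the next one starts.
  holder.concat(threads.map { |th| th.value })
end

puts holder.inspect
```

The trade-off is that each group waits for its slowest thread before the next group begins.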

gegroet,
Erik V. - http://www.erikveen.dds.nl/

[1] http://www.erikveen.dds.nl/threadlimiter/doc/index.html
[2] http://rubyforge.org/projects/threadlimiter/
 
