John Stoffel
Hi,
I've been slowly hacking on my parallel recursive readdir()
implementation using a limited number of parallel processes, since as
you all know, Ruby threads are not truly parallel.
Anyway, I'm now getting weird errors like the following:
Starting DRb on server
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/stoffj-test_17986
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/stoffj-test_17986
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/stoffj-test_17986
counter.count=1
And it's not really clear to me *why* I'm getting these errors, or how I
can trap them so I'm not bothered by them. The basic code uses both DRb
and Slave libraries to make the interprocess communication simpler, and
to have a central counting server to regulate the number of
sub-processes I'll be using at any one point in time.
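To make the counting-server idea concrete, here is a minimal sketch (not the script itself): a front object served over a drbunix socket hands out a fixed number of slots. The SlotCounter class, its acquire/release methods and the socket name are made up for illustration; the DRb calls are the same ones the script below uses, plus DRb.stop_service.

```ruby
require 'drb'
require 'drb/unix'
require 'thread'
require 'tmpdir'

# Hypothetical stand-in for the Counter class further down:
# it hands out at most +max+ slots at any one time.
class SlotCounter
  def initialize(max)
    @max = max
    @count = 0
    @mutex = Mutex.new
  end

  # Returns true if a slot was taken, false if all slots are in use.
  def acquire
    @mutex.synchronize do
      return false if @count >= @max
      @count += 1
      true
    end
  end

  # Gives a slot back; never drops below zero.
  def release
    @mutex.synchronize { @count -= 1 if @count > 0 }
  end
end

# Per-process socket name (assumed naming scheme), so parallel runs do not collide.
uri = "drbunix://" + File.join(Dir.tmpdir, "slot-counter-#{Process.pid}")
DRb.start_service(uri, SlotCounter.new(2))

counter = DRbObject.new_with_uri(uri)
results = 3.times.map { counter.acquire }
p results   # two slots granted, the third request refused

DRb.stop_service
```

Returning true/false rather than 1/nil keeps the caller's `if ok` test unambiguous, since both values survive the trip through DRb unchanged.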
So, first off, here's a full set of output of my code on a test
directory.
Thanks for any hints or suggestions. I *think* I need to use begin ...
rescue ... end blocks possibly, but it's not clear where. Oh yeah, I'm
running this all on a CentOS 5.2 (RHEL5.2) Final system, against a small
350 MB test directory tree.
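One thing that may be worth checking before adding rescue blocks: the script sets $DEBUG = true, and in that mode Ruby reports every raised exception in the "Exception ... at file:line - message" form, including exceptions that are rescued and handled. The sketch below only illustrates that behaviour; stderr_while is a hypothetical helper, and ArgumentError merely stands in for DRb::DRbBadScheme.

```ruby
require 'stringio'

# Hypothetical helper: raise and rescue one exception with $DEBUG set to
# +debug+, and return whatever Ruby wrote to $stderr in the meantime.
def stderr_while(debug)
  old_debug, old_stderr = $DEBUG, $stderr
  $stderr = StringIO.new
  $DEBUG = debug
  begin
    raise ArgumentError, "bad scheme"   # stand-in for DRb::DRbBadScheme
  rescue ArgumentError
    # Handled here; the program carries on normally.
  end
  $stderr.string
ensure
  # Always restore the real $DEBUG and $stderr.
  $DEBUG, $stderr = old_debug, old_stderr
end

quiet = stderr_while(false)
noisy = stderr_while(true)

puts "without $DEBUG: #{quiet.inspect}"
puts "with $DEBUG:    #{noisy.inspect}"
```

If the messages in the log are of this kind, a rescue block would not silence them, because the report is printed at the moment of the raise, before any rescue clause runs.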
Thanks,
John
------------------------------- log ----------------------------------
$ ./readdir-drb.rb tmp
Starting DRb on server
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/stoffj-test_17986
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/stoffj-test_17986
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/stoffj-test_17986
counter.count=1
Threaded!
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/slave_proc_537506070_17986_17988_0_0.125356081811465
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/slave_proc_537506070_17986_17988_0_0.125356081811465
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/slave_proc_537506070_17986_17988_0_0.125356081811465
Threaded!
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/slave_proc_537512250_17988_17989_0_0.20115187038118
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/slave_proc_537512250_17988_17989_0_0.20115187038118
Exception `DRb::DRbBadScheme' at /usr/lib/ruby/1.8/drb/drb.rb:814 -
drbunix:///tmp/slave_proc_537512250_17988_17989_0_0.20115187038118
size = 372104960
size = 373477128
Exception `NameError' at /usr/lib/ruby/site_ruby/1.8/slave.rb:409 -
uninitialized constant Slave::DBb
Exception `RuntimeError' at /usr/lib/ruby/site_ruby/1.8/slave.rb:519 -
already shutdown
exit (SystemExit)
/usr/lib/ruby/site_ruby/1.8/slave.rb:265:in `exit'
/usr/lib/ruby/site_ruby/1.8/slave.rb:265:in `cling'
/usr/lib/ruby/site_ruby/1.8/slave.rb:259:in `call'
/usr/lib/ruby/site_ruby/1.8/slave.rb:259:in `on_cut'
/usr/lib/ruby/site_ruby/1.8/slave.rb:251:in `initialize'
/usr/lib/ruby/site_ruby/1.8/slave.rb:251:in `new'
/usr/lib/ruby/site_ruby/1.8/slave.rb:251:in `on_cut'
/usr/lib/ruby/site_ruby/1.8/slave.rb:265:in `cling'
/usr/lib/ruby/site_ruby/1.8/slave.rb:415:in `initialize'
/usr/lib/ruby/site_ruby/1.8/slave.rb:593:in `new'
/usr/lib/ruby/site_ruby/1.8/slave.rb:593:in `object'
/readdir-drb.rb:114:in `readdir'
/readdir-drb.rb:86:in `foreach'
/readdir-drb.rb:86:in `readdir'
/readdir-drb.rb:207
Total size: 373477128 B
Total size: 364723 KB
Total size: 356 MB
Total size: 0 GB
Exception `Errno::ENOENT' at /usr/lib/ruby/1.8/fileutils.rb:1281 - No
such file or directory -
/tmp/slave_proc_537506070_17986_17988_0_0.125356081811465
Exception `RuntimeError' at /usr/lib/ruby/site_ruby/1.8/slave.rb:519 -
already shutdown
rt3.taec.com:~/src/Tools/philesight-20081120$ Exception `NameError' at
/usr/lib/ruby/site_ruby/1.8/slave.rb:409 - uninitialized constant
Slave::DBb
Exception `Errno::ENOENT' at /usr/lib/ruby/1.8/fileutils.rb:1281 - No
such file or directory -
/tmp/slave_proc_537512250_17988_17989_0_0.20115187038118
Exception `IOError' at /usr/lib/ruby/1.8/drb/unix.rb:92 - stream closed
Exception `Errno::EBADF' at /usr/lib/ruby/1.8/drb/unix.rb:92 - Bad file
descriptor - ///tmp/slave_proc_537512250_17988_17989_0_0.20115187038118
/usr/lib/ruby/1.8/drb/unix.rb:92:in `close': Bad file descriptor -
///tmp/slave_proc_537512250_17988_17989_0_0.20115187038118
(Errno::EBADF)
from /usr/lib/ruby/1.8/drb/unix.rb:92:in `close'
from /usr/lib/ruby/1.8/drb/drb.rb:1433:in `run'
from /usr/lib/ruby/1.8/drb/drb.rb:1427:in `start'
from /usr/lib/ruby/1.8/drb/drb.rb:1427:in `run'
from /usr/lib/ruby/1.8/drb/drb.rb:1347:in `initialize'
from /usr/lib/ruby/1.8/drb/drb.rb:1627:in `new'
from /usr/lib/ruby/1.8/drb/drb.rb:1627:in `start_service'
from /usr/lib/ruby/site_ruby/1.8/slave.rb:396:in `initialize'
... 28 levels...
from ./readdir-drb.rb:114:in `readdir'
from ./readdir-drb.rb:86:in `foreach'
from ./readdir-drb.rb:86:in `readdir'
from ./readdir-drb.rb:207
--------------------------- source --------------------------------------
And here's my source code. Excuse my lack of Ruby knowledge; I'm a newb
to Ruby, though not to programming. Hopefully that shows. :]
#!/usr/bin/ruby
require 'getoptlong'
require 'thread'
require 'slave'
require 'drb'
require 'drb/unix'
$VERSION = "v1.0"
$max_slaves = 3
$count = 50
# Local socket - need to worry about multiple runs of this script, so
# add on a unique PID.
$URI = "drbunix:///tmp/stoffj-test_" + Process.pid.to_s
slave_cnt_mutex = Mutex.new
opts = GetoptLong.new(
  [ "--help", "-h", GetoptLong::NO_ARGUMENT],
  [ "--kids", "-k", GetoptLong::REQUIRED_ARGUMENT]
)
#---------------------------------------------------------------------
class Counter
  def initialize(max=1)
    @slaves = []
    @max = max
    @count = 1
    @count_mutex = Mutex.new
  end
  def count
    @count
  end
  def max
    @max
  end
  # Increment the count of slaves, returning 1 if incremented, nil if not.
  def increment
    ok = nil
    @count_mutex.synchronize do
      if (@count < @max) then
        @count += 1
        ok = 1
      end
    end
    ok
  end
  # Decrement the count of slaves (never below 1), returning the new count.
  def decrement
    @count_mutex.synchronize do
      if (@count > 1) then
        @count -= 1
      end
    end
    @count
  end
end
#---------------------------------------------------------------------
class ReadDir
  def initialize(server)
    @server = server
    # Slave pool for this level of readdir() recursion.
    @kids = []
  end
  def readdir(dir)
    #puts "readdir(#{dir})"
    size_file = {}
    size_dir = {}
    size_total = 0
    # Traverse the directory and collect the size of all files and
    # directories
    begin
      Dir.foreach(dir) do |f|
        #print " #{f},"
        if(f != "." && f != "..") then
          f_full = addpath(dir, f)
          stat = File.lstat(f_full)
          if(!stat.symlink?) then
            if(stat.file?) then
              #puts " File: #{f}"
              size = File.size(f_full)
              size_file[f] = size
              size_total += size
            end
            if(stat.directory?) then
              #puts "DIR= #{f}"
              if (@server.max <= 1) then
                puts " no threads."
                size = readdir(f_full)
                if (size > 0) then
                  size_dir[f] = size
                  size_total += size
                end
              else
                ok = @server.increment
                if (ok)
                  puts " Threaded!"
                  @kids << Slave.object(:async => true) {
                    size = readdir(f_full)
                    puts "size = #{size}"
                    # Duh... return the size from the slave properly
                    size
                  }
                else
                  #puts " no free threads, do anyway"
                  size = readdir(f_full)
                  if(size > 0) then
                    size_dir[f] = size
                    size_total += size
                  end
                end
              end
            end
          end
        end
      end
    end
    # Collect the results from the slaves started at this level.
    @kids.each { |kid|
      size_total += kid.value
    }
    #puts "Dir: #{dir} = #{size_total}"
    return size_total
  end
end
#---------------------------------------------------------------------
# Read a directory and add to the database; this function is recursive
# for sub-directories
#---------------------------------------------------------------------
def usage
  puts
  puts "usage: readdir-drb [--kids NUM] <dir>"
  puts " defaults to #{$max_slaves} children"
  puts
  puts " version: #{$VERSION}"
  puts
end
#---------------------------------------------------------------------
def addpath(a, b)
  return a + b if(a =~ /\/$/)
  return a + "/" + b
end
#---------------------------------------------------------------------
# Main
#---------------------------------------------------------------------
# With $DEBUG set, Ruby reports every raised exception, even rescued ones.
$DEBUG = true
opts.each do |opt,arg|
  case opt
  when "--kids"
    $max_slaves = arg.to_i
  else
    usage
    exit
  end
end
if ARGV.length != 1
  puts "Missing dir argument (try --help)"
  exit 1
end
dir = ARGV.shift
# Start the DRb service.
puts "Starting DRb on server"
DRb.start_service $URI, Counter.new($max_slaves)
# Child
# Fire up the first slave process which will do the work of readdir()
DRb.start_service
counter = DRbObject.new_with_uri $URI
puts "counter.count=#{counter.count}"
# Fire up a new Kid Class readdir.
kid = ReadDir.new(counter)
# Now let's try to do a recursive readdir() algorithm with threads.
size = kid.readdir(dir)
sizekb = size / 1024
sizemb = sizekb / 1024
sizegb = sizemb / 1024
puts ""
puts "Total size: #{size} B"
puts "Total size: #{sizekb} KB"
puts "Total size: #{sizemb} MB"
puts "Total size: #{sizegb} GB"
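For comparison, here is a rough sketch of the same idea using plain fork and pipes instead of Slave and DRb. It is a different technique from the script above, not a fix for it: only the top-level subdirectories are farmed out, in batches of at most max_children processes. The names tree_size and parallel_size are made up for this example, and it assumes a platform where fork is available.

```ruby
require 'tmpdir'
require 'fileutils'

# Sum the sizes of all regular files below dir. lstat is used, so symlinks
# are neither followed nor counted.
def tree_size(dir)
  total = 0
  Dir.foreach(dir) do |name|
    next if name == "." || name == ".."
    path = File.join(dir, name)
    stat = File.lstat(path)
    if stat.file?
      total += stat.size
    elsif stat.directory?
      total += tree_size(path)
    end
  end
  total
end

# Size of dir, with each top-level subdirectory measured in a forked child.
# At most max_children children run at any one time.
def parallel_size(dir, max_children)
  total = 0
  subdirs = []
  Dir.foreach(dir) do |name|
    next if name == "." || name == ".."
    path = File.join(dir, name)
    stat = File.lstat(path)
    total += stat.size if stat.file?
    subdirs << path if stat.directory?
  end

  subdirs.each_slice(max_children) do |batch|
    children = batch.map do |path|
      reader, writer = IO.pipe
      pid = fork do
        reader.close
        writer.write(tree_size(path).to_s)
        writer.close
        exit!(0)   # skip ensure/at_exit handlers inherited from the parent
      end
      writer.close   # parent keeps only the read end
      [pid, reader]
    end
    children.each do |pid, reader|
      total += reader.read.to_i   # the child sends its subtotal as text
      reader.close
      Process.wait(pid)
    end
  end
  total
end

# Demo on a throwaway tree: three subdirectories plus one top-level file.
serial = parallel = nil
Dir.mktmpdir do |root|
  %w[a b c].each_with_index do |name, i|
    FileUtils.mkdir_p(File.join(root, name, "deep"))
    File.open(File.join(root, name, "deep", "data"), "wb") { |f| f.write("x" * (100 * (i + 1))) }
  end
  File.open(File.join(root, "top"), "wb") { |f| f.write("y" * 7) }
  serial = tree_size(root)
  parallel = parallel_size(root, 2)
end
puts "serial=#{serial} parallel=#{parallel}"
```

Comparing the two totals on a small tree like this is a cheap way to check that no subdirectory is being counted twice or dropped.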