DRb Crashing

  • Thread starter James Edward Gray II

James Edward Gray II

If I launch this server:

#!/usr/local/bin/ruby

require "drb"
require "rinda/tuplespace"

tuplespace = Rinda::TupleSpace.new
DRb.start_service("druby://localhost:61676", tuplespace)

loop do
  nums = Array.new(rand(9) + 2) { rand(10) + 1 }
  ops = Array.new(nums.size - 1) { %w{+ - * /}[rand(4)] }
  problem = nums.zip(ops).flatten.compact.join(" ")

  tuplespace.write(["Problem", problem])
  puts tuplespace.take(["Result", String]).last
end

__END__

Then run this client:

#!/usr/local/bin/ruby -w

require "drb"
require "rinda/tuplespace"

DRb.start_service
tuplespace = Rinda::TupleSpaceProxy.new(
  DRbObject.new_with_uri("druby://localhost:61676")
)

while problem = tuplespace.take(["Problem", %r{^\d+(?: [-+*/] \d+)+$}])
  tuplespace.write(["Result", "#{problem.last} = #{eval problem.last}"])
end

__END__

The client crashes, generally within a few seconds. Adding a sleep
inside the server loop seems to resolve the issue.
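
For reference, the sleep workaround might look like the sketch below. The helper name `serve_problems`, the bounded `rounds` argument, and the 0.1 second default pause are assumptions made for illustration; the post does not say where the sleep went or how long it was.

```ruby
require "rinda/tuplespace"

# Hypothetical helper: the server loop from the post with a pause added.
# The pause length is an assumption; the thread does not state the value used.
def serve_problems(tuplespace, rounds, pause = 0.1)
  results = []
  rounds.times do
    nums = Array.new(rand(9) + 2) { rand(10) + 1 }
    ops = Array.new(nums.size - 1) { %w{+ - * /}[rand(4)] }
    problem = nums.zip(ops).flatten.compact.join(" ")

    tuplespace.write(["Problem", problem])
    results << tuplespace.take(["Result", String]).last
    sleep pause # throttle the loop between problems
  end
  results
end
```

The call blocks until something answers each problem, so it needs a solver (such as the client above) attached to the same TupleSpace.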

Anyone know why?

James Edward Gray II
 

Eric Hodel

If I launch this server: [...]

Then run this client: [...]

The client crashes, generally within a few seconds. Adding a sleep
inside the server loop seems to resolve the issue.

With what error?
Anyone know why?

I've had better luck with these kinds of errors by placing the
TupleSpace in its own process.
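
A minimal sketch of that arrangement, assuming the same URI as the scripts above. The helper name `start_tuplespace_service` is hypothetical; the process does nothing except host the TupleSpace, and both the problem generator and the solver would connect to it through `Rinda::TupleSpaceProxy`.

```ruby
require "drb"
require "rinda/tuplespace"

# Hypothetical helper: start a DRb service whose only job is to hold the TupleSpace.
# Returns the DRb server object so the caller can stop it later.
def start_tuplespace_service(uri)
  DRb.start_service(uri, Rinda::TupleSpace.new)
end

# In a standalone script, the helper would typically be followed by:
#   start_tuplespace_service("druby://localhost:61676")
#   DRb.thread.join   # keep the process alive for clients
```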
 

James Edward Gray II

With what error?

$ ruby client.rb
(druby://localhost:61676) /usr/local/lib/ruby/1.8/rinda/tuplespace.rb:332:in `move': undefined method `push' for :EDQUOT=:Symbol (NoMethodError)
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/monitor.rb:229:in `synchronize'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/rinda/tuplespace.rb:329:in `move'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1552:in `perform_without_block'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1512:in `perform'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1586:in `main_loop'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1582:in `main_loop'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1578:in `main_loop'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1427:in `run'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1424:in `run'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1344:in `initialize'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1624:in `start_service'
from (druby://localhost:61676) server.rb:7
from /usr/local/lib/ruby/1.8/rinda/rinda.rb:153:in `take'
from client.rb:11

James Edward Gray II
 

Eric Hodel

$ ruby client.rb
(druby://localhost:61676) /usr/local/lib/ruby/1.8/rinda/tuplespace.rb:332:in `move': undefined method `push' for :EDQUOT=:Symbol (NoMethodError)

I'm seeing similar, but only on the Mac.

I wrote [ruby-core:06629], so you may want to add your ruby versions
and any additional insight there, as I think this is a problem
somewhere in the guts of Ruby.
 

Hugh Sasse

If I launch this server: [...]
Then run this client: [...]

The client crashes, generally within a few seconds. Adding a sleep inside the
server loop seems to resolve the issue.

Anyone know why?

No, I don't. What happens if you match the old way? Like:
while problem = tuplespace.take(["Problem", nil])
Hugh
 

Eric Hodel

The client crashes, generally within a few seconds. Adding a sleep
inside the server loop seems to resolve the issue.

Anyone know why?

No, I don't. What happens if you match the old way? Like:
while problem = tuplespace.take(["Problem", nil])

It seems to be a problem only when OS X is involved.

I've been running the same two scripts using Ruby 1.8.3 on FreeBSD
for the past hour without error. I'll let them run for at least
another four or five to see if they fail.

See-also: [ruby-core:06629]
 

James Edward Gray II

If I launch this server: [...]

Then run this client: [...]

The client crashes, generally within a few seconds. Adding a sleep
inside the server loop seems to resolve the issue.

Anyone know why?

No, I don't. What happens if you match the old way? Like:
while problem = tuplespace.take(["Problem", nil])

The Regexp is pretty critical in this case. It validates the data
before an otherwise dangerous call to eval().
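
One caveat about relying on this pattern as a guard: in Ruby, `^` and `$` anchor at line boundaries, not at the ends of the string, so a multi-line string whose first line looks like arithmetic still matches. As far as I know, Rinda compares template elements with `===`, so the same applies to the template in `take`. The sketch below compares the pattern from the client with a `\A`/`\z` variant; that variant is a suggestion and does not appear in the thread.

```ruby
# The pattern from the client, with line anchors (^ and $).
LINE_ANCHORED = %r{^\d+(?: [-+*/] \d+)+$}

# The same pattern with string anchors (\A and \z); a suggested variant, not from the thread.
STRING_ANCHORED = %r{\A\d+(?: [-+*/] \d+)+\z}

well_formed = "3 + 4 * 2"
multi_line  = "3 + 4\nputs 'not arithmetic'"

# Both patterns accept the well-formed problem.
puts LINE_ANCHORED === well_formed
puts STRING_ANCHORED === well_formed

# Only the line-anchored pattern accepts the multi-line string,
# because its first line alone satisfies ^...$.
puts LINE_ANCHORED === multi_line
puts STRING_ANCHORED === multi_line
```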

James Edward Gray II
 

Hugh Sasse

If I launch this server: [...]

Then run this client: [...]

The client crashes, generally within a few seconds. Adding a sleep inside
the server loop seems to resolve the issue.

Anyone know why?

No, I don't. What happens if you match the old way? Like:
while problem = tuplespace.take(["Problem", nil])

The Regexp is pretty critical in this case. It validates the data before an
otherwise dangerous call to eval().

while problem = tuplespace.take(["Problem", nil])
  unless problem.last =~ %r{^\d+(?: [-+*/] \d+)+$}
    puts "Bogus input \'#{problem.last}\', Ted!" #:)
  else
    tuplespace.write(["Result", "#{problem.last} = #{eval problem.last}"])
  end
end
Hugh
 

James Edward Gray II

On Wed, 16 Nov 2005, James Edward Gray II wrote:



If I launch this server: [...]

Then run this client: [...]

The client crashes, generally within a few seconds. Adding a sleep
inside the server loop seems to resolve the issue.

Anyone know why?

No, I don't. What happens if you match the old way? Like:
while problem = tuplespace.take(["Problem", nil])

The Regexp is pretty critical in this case. It validates the data
before an otherwise dangerous call to eval().

while problem = tuplespace.take(["Problem", nil])
  unless problem.last =~ %r{^\d+(?: [-+*/] \d+)+$}
    puts "Bogus input \'#{problem.last}\', Ted!" #:)
  else
    tuplespace.write(["Result", "#{problem.last} = #{eval problem.last}"])
  end
end

This is not equivalent. You removed the problem from the TupleSpace
whereas my version leaves it for someone else to solve.
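
The difference can be seen with a local TupleSpace and no DRb involved. This is a small sketch; the helper `try_take` is hypothetical and passes a timeout of 0 so that `take` raises `Rinda::RequestExpiredError` instead of blocking when nothing matches.

```ruby
require "rinda/tuplespace"

PROBLEM = %r{^\d+(?: [-+*/] \d+)+$}

# Hypothetical helper: returns the tuple taken, or nil when nothing matches right away.
def try_take(space, template)
  space.take(template, 0)
rescue Rinda::RequestExpiredError
  nil
end

space = Rinda::TupleSpace.new
space.write(["Problem", "not arithmetic"])

# Regexp template: the malformed tuple does not match and stays in the space.
by_regexp = try_take(space, ["Problem", PROBLEM])
left_after_regexp = space.read_all(["Problem", nil]).size

# nil template: nil acts as a wildcard, so the malformed tuple is removed.
by_wildcard = try_take(space, ["Problem", nil])
left_after_wildcard = space.read_all(["Problem", nil]).size

puts "regexp take: #{by_regexp.inspect}, tuples left: #{left_after_regexp}"
puts "wildcard take: #{by_wildcard.inspect}, tuples left: #{left_after_wildcard}"
```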

I realize this isn't what you were originally asking for and I can
try the change, if you want to see it. To me it's irrelevant though,
because TupleSpace supports a Regexp search and it should not be
crashing when doing it.

This is just a simplified case I cooked up to show the issue.

James Edward Gray II
 

Hugh Sasse

On Nov 15, 2005, at 6:50 PM, Hugh Sasse wrote:


On Wed, 16 Nov 2005, James Edward Gray II wrote: [...]
The Regexp is pretty critical in this case. It validates the data before
an otherwise dangerous call to eval().

while problem = tuplespace.take(["Problem", nil])
  unless problem.last =~ %r{^\d+(?: [-+*/] \d+)+$}
    puts "Bogus input \'#{problem.last}\', Ted!" #:)
  else
    tuplespace.write(["Result", "#{problem.last} = #{eval problem.last}"])
  end
end

This is not equivalent. You removed the problem from the TupleSpace whereas
my version leaves it for someone else to solve.

Yes, that's true, but it's not really the point I was making -- one
could write it back, or whatever.

I realize this isn't what you were originally asking for and I can try the
change, if you want to see it. To me it's irrelevant though, because
TupleSpace supports a Regexp search and it should not be crashing when doing
it.

IIRC it didn't in the past. My point is that if the old way works then
maybe it narrows down where to swat the bug. My expectation is that
it won't make any difference, but if it isn't tested we won't know.

This is just a simplified case I cooked up to show the issue.

James Edward Gray II

Hugh
 

James Edward Gray II

IIRC it didn't in the past. My point is that if the old way works then
maybe it narrows down where to swat the bug. My expectation is that
it won't make any difference, but if it isn't tested we won't know.

Same issue.

$ ruby client.rb
(druby://localhost:61676) /usr/local/lib/ruby/1.8/rinda/tuplespace.rb:446:in `move': undefined method `push' for :push:Symbol (NoMethodError)
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/monitor.rb:229:in `synchronize'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/rinda/tuplespace.rb:443:in `move'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1552:in `perform_without_block'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1512:in `perform'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1586:in `main_loop'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1582:in `main_loop'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1578:in `main_loop'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1427:in `run'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1424:in `run'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1344:in `initialize'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1624:in `start_service'
from (druby://localhost:61676) server.rb:7
from /usr/local/lib/ruby/1.8/rinda/rinda.rb:229:in `take'
from client.rb:11
$ cat client.rb
#!/usr/local/bin/ruby -w

require "drb"
require "rinda/tuplespace"

DRb.start_service
tuplespace = Rinda::TupleSpaceProxy.new(
DRbObject.new_with_uri("druby://localhost:61676")
)

while problem = tuplespace.take(["Problem", nil])
  tuplespace.write(["Result", "#{problem.last} = #{eval problem.last}"])
end

__END__

James Edward Gray II
 

Hugh Sasse

Same issue.
Bother.

$ ruby client.rb
(druby://localhost:61676) /usr/local/lib/ruby/1.8/rinda/tuplespace.rb:446:in `move': undefined method `push' for :push:Symbol (NoMethodError)

which means the parameter port is set to :push

from (druby://localhost:61676) /usr/local/lib/ruby/1.8/monitor.rb:229:in `synchronize'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/rinda/tuplespace.rb:443:in `move'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1552:in `perform_without_block'

something to do with @obj and @argv. These are delivered by
__send__.

from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1512:in `perform'
from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1586:in `main_loop'

seems to come from client.recvfrom.

from (druby://localhost:61676) /usr/local/lib/ruby/1.8/drb/drb.rb:1582:in `main_loop' [...]
/usr/local/lib/ruby/1.8/drb/drb.rb:1624:in `start_service'
from (druby://localhost:61676) server.rb:7
from /usr/local/lib/ruby/1.8/rinda/rinda.rb:229:in `take'
from client.rb:11

All comments are from looking at the CVS, but the line numbers
agree, AFAICS.
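
The first frame can be reproduced in isolation. The sketch below is a simplified stand-in for the failing step, not the Rinda source: it assumes only what the trace shows, namely that `move` calls `push` on its port argument. The method name `deliver` is hypothetical.

```ruby
# Simplified stand-in for the failing step, not the Rinda source:
# the taken tuple is handed back by calling push on the port argument.
def deliver(port, tuple)
  port.push(tuple)
end

# With an Array as the port, the tuple arrives as expected.
inbox = []
deliver(inbox, ["Problem", "1 + 2"])
puts inbox.inspect

# With a Symbol where the port should be, push is undefined,
# which is the kind of NoMethodError shown in the trace.
failure = begin
  deliver(:push, ["Problem", "1 + 2"])
  nil
rescue NoMethodError => error
  error
end
puts failure.class
```

This only shows what the error means; it says nothing about how the port ended up being a Symbol on the server side.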
$ cat client.rb [...]
__END__

Looks fine to me.
I'm stumped.
Hugh
 

Eric Hodel

$ ruby client.rb [...]

Looks fine to me.

I'm stumped.

This is likely not a problem that can be fixed with more Ruby code.

I ran the two files on DRb for over 4 hours continuously on a FreeBSD
machine, which tells me that it is Mac-specific. Mixing FreeBSD and
Mac would always crash within 5 minutes at worst. My best guess is
that Marshal is not operating correctly or ObjectSpace#_id2ref is
looking up bad objects.
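
One cheap way to probe the Marshal half of that guess is an in-process round trip of the kinds of values these scripts send. This is only a sketch: the sample values are chosen for illustration, and a clean result would not rule Marshal out, since the crash is intermittent and only shows up with traffic flowing through DRb.

```ruby
# Round-trip values similar to those the test scripts pass through DRb.
samples = [
  ["Problem", "3 + 4 * 2"],
  ["Result", "3 + 4 * 2 = 11"],
  ["Problem", %r{^\d+(?: [-+*/] \d+)+$}],
  :push
]

# Keep any value that does not come back equal after dump and load.
mismatches = samples.reject { |value| Marshal.load(Marshal.dump(value)) == value }
puts "round-trip mismatches: #{mismatches.size}"
```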
 

Eric Hodel

$ ruby client.rb [...]

I ran the two files on DRb for over 4 hours continuously on a
FreeBSD machine, which tells me that it is Mac-specific. Mixing
FreeBSD and Mac would always crash within 5 minutes at worst. My
best guess is that Marshal is not operating correctly or
ObjectSpace#_id2ref is looking up bad objects.

I should also note that disabling the GC on the client does not
affect the frequency of crashes.
 

Phil Tomson

If I launch this server: [...]

Then run this client: [...]

The client crashes, generally within a few seconds. Adding a sleep
inside the server loop seems to resolve the issue.

Anyone know why?

James Edward Gray II

Just wondering if you've gotten any resolution on this? I'm seeing the same
thing on Tiger both with your test code and my own DRb code, while the same
code runs fine for hours on Linux and FreeBSD. It seems that DRb cannot be
used reliably on OSX. Anyone have any idea what might be going on?

Phil
 

Phil Tomson

$ ruby client.rb [...]

I ran the two files on DRb for over 4 hours continuously on a FreeBSD
machine, which tells me that it is Mac-specific. Mixing FreeBSD and
Mac would always crash within 5 minutes at worst. My best guess is
that Marshal is not operating correctly or ObjectSpace#_id2ref is
looking up bad objects.

I've seen similar results: runs for hours on Linux, very flakey on OSX. It's
kind of hard to believe that Marshal is broke on OSX, though. I'm thinking
it's something related to networking code.

Phil
 

Eric Hodel

Just wondering if you've gotten any resolution on this? I'm seeing the same
thing on Tiger both with your test code and my own DRb code, while the same
code runs fine for hours on Linux and FreeBSD. It seems that DRb cannot be
used reliably on OSX. Anyone have any idea what might be going on?

I haven't had the time to look into it yet. I'll try to get some
time in against the latest 1.8.4 preview RSN.
 

Eric Hodel

If I launch this server: [...]

Then run this client: [...]

The client crashes, generally within a few seconds. Adding a sleep
inside the server loop seems to resolve the issue.

Anyone know why?

Just wondering if you've gotten any resolution on this? I'm seeing the same
thing on Tiger both with your test code and my own DRb code, while the same
code runs fine for hours on Linux and FreeBSD. It seems that DRb cannot be
used reliably on OSX. Anyone have any idea what might be going on?

Compiling Ruby with GCC 3.3 seems to make the problem go away
[ruby-core:6825].

GCC 4 seems to build an ok Ruby on other systems [ruby-core:6827].
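
To see how a given Ruby was built before comparing notes, something like the sketch below can help. The helper name `build_report` is hypothetical. Note that `RbConfig::CONFIG["CC"]` holds the compiler command recorded at build time, which may not include a version number.

```ruby
require "rbconfig"

# Hypothetical helper: collect the build details relevant to this issue.
def build_report
  {
    "ruby version" => RUBY_VERSION,
    "platform"     => RUBY_PLATFORM,
    "compiler"     => RbConfig::CONFIG["CC"]
  }
end

build_report.each { |label, value| puts "#{label}: #{value}" }
```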
 
