DRbUndumped and GC

Ara.T.Howard

if one has a drb server that returns objects extended by DRbUndumped to the
client, how is garbage collection done? eg.

  class Proxy
    include DRbUndumped
  end

  class Server
    def method
      return Proxy::new
    end
  end

so the client will have a handle on the Proxy, and so will the server. how
will the server know when the client no longer needs the handle and gc the
object? POLS says the object would be gc'd in the client as normal and that
this would trigger the gc on the server. but what if more than one client has
a handle on the server side proxy? now we are reference counting across
remote nodes. my gut says this could get one in trouble quickly if many
DRbUndumped objects were being returned to many clients, or even if a single
DRbUndumped object was returned to many clients...

i'm about to dig into the code and run some tests to see - but am hoping some
of you drb experts out there may already have done this ;-)

regards.

-a
--
===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| A flower falls, even though we love it;
| and a weed grows, even though we do not love it.
| --Dogen
===============================================================================
 
Masatoshi SEKI

> so the client will have a handle on the Proxy, and so will the server. how
> will the server know when the client no longer needs the handle and gc the
> object? POLS says the object would be gc'd in the client as normal and that
> this would trigger the gc on the server. but what if more than one client has
> a handle on the server side proxy? now we are reference counting across
> remote nodes. my gut says this could get one in trouble quickly if many
> DRbUndumped objects were being returned to many clients, or even if a single
> DRbUndumped object was returned to many clients...
>
> i'm about to dig into the code and run some tests to see - but am hoping some
> of you drb experts out there may already have done this ;-)


How about TimerIdConv?

require 'drb/timeridconv'

....

DRb.install_id_conv(DRb::TimerIdConv.new)
DRb.start_service(....)
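
For readers who have not met the idea, here is a minimal plain-Ruby sketch of
what "keep a handed-out object alive for a while" can look like. It is not the
TimerIdConv source: the class name TimedHolder, the lease-renewal rule, and the
injectable clock are all assumptions made for illustration, so that the
behaviour can be tried without a running DRb server.

```ruby
# Hypothetical illustration only: this is NOT DRb::TimerIdConv.
# It holds a strong reference to each registered object until the
# object has gone unused for `timeout` seconds.
class TimedHolder
  def initialize(timeout = 600, clock: -> { Time.now.to_f })
    @timeout = timeout
    @clock   = clock   # injectable so expiry can be tested deterministically
    @held    = {}      # id => [object, expiry time]
    @mutex   = Mutex.new
  end

  # Register obj and return the id a remote reference would carry.
  def to_id(obj)
    id = obj.__id__
    @mutex.synchronize { @held[id] = [obj, @clock.call + @timeout] }
    id
  end

  # Look up a held object; each successful lookup renews its lease.
  def to_obj(id)
    @mutex.synchronize do
      sweep
      entry = @held[id] or raise RangeError, "stale or unknown reference #{id}"
      entry[1] = @clock.call + @timeout
      entry[0]
    end
  end

  private

  # Drop every entry whose lease has expired (caller holds the mutex).
  def sweep
    now = @clock.call
    @held.delete_if { |_, (_, expires_at)| expires_at <= now }
  end
end
```

The trade-off in a scheme like this is that a slow client can still lose its
object once the lease runs out, while a vanished client only pins memory for
one timeout period.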


I like the following simple approach.

  class Server
    def initialize
      @proxy = Proxy.new
    end
    def method
      @proxy
    end
  end
 
Eric Hodel

> if one has a drb server that returns objects extended by DRbUndumped to the
> client, how is garbage collection done? eg.
>
>   class Proxy
>     include DRbUndumped
>   end
>
>   class Server
>     def method
>       return Proxy::new
>     end
>   end
>
> so the client will have a handle on the Proxy, and so will the server. how
> will the server know when the client no longer needs the handle and gc the
> object? POLS says the object would be gc'd in the client as normal and that
> this would trigger the gc on the server. but what if more than one client has
> a handle on the server side proxy? now we are reference counting across
> remote nodes. my gut says this could get one in trouble quickly if many
> DRbUndumped objects were being returned to many clients, or even if a single
> DRbUndumped object was returned to many clients...

Nope, DRb is not that smart by default. Your Proxy instance is
immediately available for garbage collection. You want to use a
different IdConv class to ensure objects don't get GC'd before they're
supposed to, or don't get GC'd until your clients are done using them.

> i'm about to dig into the code and run some tests to see - but am hoping some
> of you drb experts out there may already have done this ;-)

http://segment7.net/projects/ruby/drb/idconv.html

Shows how the various id conversion classes work with DRb, letting you
control how and when objects get GC'd.

A ref-counting IdConv would be a nifty add-on to DRb.
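
As a rough idea of what such a converter could look like, here is a sketch. It
only relies on the two methods DRb's id converters provide, to_id and to_obj.
Everything else is an assumption: the class name RefCountingIdConv, the
release and count methods, and above all the premise that to_id is invoked
once per reference handed to a client and that clients call release
explicitly when they are done.

```ruby
# Hypothetical sketch of a reference-counting id converter.
# to_id and to_obj are the converter interface; release and count are
# made-up additions that a server would have to expose to its clients.
class RefCountingIdConv
  def initialize
    @objects = {}           # id => object (strong reference, blocks GC)
    @counts  = Hash.new(0)  # id => number of outstanding references
    @mutex   = Mutex.new
  end

  # An object is being handed out: hold it and bump its count.
  def to_id(obj)
    return nil if obj.nil?
    id = obj.__id__
    @mutex.synchronize do
      @objects[id] = obj
      @counts[id] += 1
    end
    id
  end

  # A call arrived for a previously issued id.
  def to_obj(ref)
    @mutex.synchronize do
      @objects.fetch(ref) { raise RangeError, "unknown reference #{ref.inspect}" }
    end
  end

  # Drop one reference; forget the object when the count reaches zero.
  # Returns the number of references still outstanding.
  def release(ref)
    @mutex.synchronize do
      return 0 unless @counts.key?(ref)
      remaining = (@counts[ref] -= 1)
      if remaining <= 0
        @counts.delete(ref)
        @objects.delete(ref)
        remaining = 0
      end
      remaining
    end
  end

  # Number of references currently outstanding for ref.
  def count(ref)
    @mutex.synchronize { @counts.fetch(ref, 0) }
  end
end
```

An instance would be installed with DRb.install_id_conv before
DRb.start_service. Note the weakness of any explicit-release scheme: a client
that dies without calling release leaves its count stuck above zero.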

--
Eric Hodel - http://segment7.net
 
Ara.T.Howard

> How about TimerIdConv?
>
>   require 'drb/timeridconv'
>
>   ....
>
>   DRb.install_id_conv(DRb::TimerIdConv.new)
>   DRb.start_service(....)
>
> I like the following simple approach.
>
>   class Server
>     def initialize
>       @proxy = Proxy.new
>     end
>     def method
>       @proxy
>     end
>   end

in my case the code is

  require 'drb'
  require 'detach'

  class JobRunner
  #{{{
    include DRbUndumped
    attr :job
    attr :jid
    attr :cid
    alias pid cid
    attr :shell
    attr :command
    attr :status
    def initialize job
    #{{{
      @status = nil
      @job = job
      @jid = job['jid']
      @command = job['command']
      @shell = job['shell'] || 'bash'
      @r,@w = IO.pipe
      @cid =
        #Util::fork do
        fork do
          @w.close
          STDIN.reopen @r
          if File::basename(@shell) == 'bash' || File::basename(@shell) == 'sh'
            exec [@shell, "__rq_job__#{ @jid }__#{ File.basename(@shell) }__"], '--login'
          else
            exec [@shell, "__rq_job__#{ @jid }__#{ File.basename(@shell) }__"], '-l'
          end
        end
      @r.close
    #}}}
    end
    def run
    #{{{
      @w.puts @command
      @w.close
    #}}}
    end
    def wait2 flags = 0
    #{{{
      pid, status = Process::waitpid2 @cid, flags
      @status = status
      [pid, status]
    #}}}
    end
    def wait flags = 0
    #{{{
      wait2(flags).last
    #}}}
    end
  #}}}
  end

  class JobRunnerDaemon
  #{{{
    class << self
    #{{{
      def new(*a,&b)
      #{{{
        super(*a,&b).detach(:background=>false)
      #}}}
      end
    #}}}
    end
    def runner(*a,&b)
    #{{{
      JobRunner::new(*a,&b)
    #}}}
    end
    alias new_runner runner
    alias runner_new runner
    %w( wait wait2 waitpid waitpid2 ).each do |m|
      eval "def #{ m }(*a,&b);Process::#{ m }(*a,&b);end"
    end
  #}}}
  end
  JobRunD = JobRunnerDaemon

i need an object that can fork for me in another process. i can't fork in the
current process due to some complications with sqlite and open files carried
across forks. so i'm creating potentially dozens of these JobRunner objects
and using

jrd.wait

to do a blocking wait on their completion (also some non-blocking waits but
that's an impl detail).

it works great now, i was just worrying about the GC - now i know i should be.
because i should always do some sort of wait on the JobRunner's i can probably
hook something into the 'def runner' method that registers the objects in a
class datastructure (preventing gc) and hook something into the wait* methods
that does the wait and removes them from this structure.
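
a minimal sketch of that register-on-create, unregister-on-wait idea, with drb
left out so it runs on its own. the names Runner and RunnerRegistry and the use
of Process.spawn with the current ruby interpreter are assumptions for
illustration only, not the code from the real project:

```ruby
require 'rbconfig'

# Hypothetical stand-in for a job runner: starts a child Ruby process
# that evaluates the given snippet and remembers the child's pid.
class Runner
  attr_reader :pid

  def initialize(code)
    @pid = Process.spawn(RbConfig.ruby, '-e', code)
  end
end

# Hypothetical registry kept in a class-level hash. The hash holds a
# strong reference to each runner, so the runner cannot be collected
# until it has been waited on.
class RunnerRegistry
  @live = {}   # pid => Runner

  class << self
    attr_reader :live

    # Create a runner and register it before handing it out.
    def runner(code)
      r = Runner.new(code)
      @live[r.pid] = r
      r
    end

    # Wait for the child, then drop the reference that kept it alive.
    # With Process::WNOHANG nothing may have been reaped yet, in which
    # case the runner stays registered.
    def waitpid2(pid, flags = 0)
      reaped, status = Process.waitpid2(pid, flags)
      @live.delete(reaped) if reaped
      [reaped, status]
    end
  end
end
```

the point of the design is that the wait call is the one place where it is
known that nobody needs the runner any more, so that is where the reference is
dropped.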

i'll check out DRb::TimerIdConv tomorrow and get back to you.

btw. thanks VERY much for the speedy reply! this mailing list is amazing!

cheers.

-a
 
Ara.T.Howard

> Nope, DRb is not that smart by default. Your Proxy instance is immediately
> available for garbage collection. You want to use a different IdConv class
> to ensure objects don't get GC'd before they're supposed to, or don't get
> GC'd until your clients are done using them.

my gut was right.

> http://segment7.net/projects/ruby/drb/idconv.html

awesome. i browsed quickly and will have to read completely tomorrow. thanks
for putting this together.

> Shows how the various id conversion classes work with DRb, letting you
> control how and when objects get GC'd.
>
> A ref-counting IdConv would be a nifty add-on to DRb.

that's essentially what i'm going to do (see my post to Masatoshi) -
simplified since i know there will only ever be 1 reference held.

i was just doing some searching and came across some other posts by you on
this matter - i'm very glad we have a resident drb expert on the list. thanks
a lot for the replies and docs.

more tomorrow...

cheers.

-a
 
Eric Hodel

> that's essentially what i'm going to do (see my post to Masatoshi) -
> simplified since i know there will only ever be 1 reference held.

DRb::NamedIdConv may be of use also. It allows clients to die, then
come back and pick their reference back up. Also look at TupleSpace,
it's a good place to store things like distributed refcounts since it
takes care of atomic updates. (How best to do this is not obvious, and
a great book on the subject is no longer in print. Give a holler if
you decide to use it and need clues.)
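
To make the atomic-update point concrete, here is one possible sketch using
Rinda::TupleSpace locally, with no DRb server started. The TupleRefCounter
class, its method names, and the [:refcount, key, n] tuple layout are
assumptions for illustration, not an established recipe.

```ruby
require 'rinda/tuplespace'

# Hypothetical helper: keeps one [:refcount, key, n] tuple per key.
class TupleRefCounter
  def initialize(space = Rinda::TupleSpace.new)
    @space = space
  end

  # Create the counter tuple for key, starting at zero.
  def register(key)
    @space.write([:refcount, key, 0])
  end

  # take removes the tuple from the space, so no other client can read
  # or change the count until the updated tuple is written back.
  # Blocks if no counter tuple exists for key.
  def adjust(key, delta)
    _, _, n = @space.take([:refcount, key, nil])
    n += delta
    @space.write([:refcount, key, n])
    n
  end

  # Read the current count without removing the tuple (nil is a wildcard).
  def count(key)
    @space.read([:refcount, key, nil])[2]
  end
end
```

A weakness of this sketch: a client that dies between the take and the write
takes the counter with it, so a real design would need some recovery story for
that case.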

> i was just doing some searching and came across some other posts by you on
> this matter - i'm very glad we have a resident drb expert on the list. thanks
> a lot for the replies and docs.

Thanks!

 
Ara.T.Howard

> DRb::NamedIdConv may be of use also. It allows clients to die, then come
> back and pick their reference back up. Also look at TupleSpace, it's a good
> place to store things like distributed refcounts since it takes care of
> atomic updates. (How best to do this is not obvious, and a great book on
> the subject is no longer in print. Give a holler if you decide to use it
> and need clues.)

eric-

here's what i ended up with. this is an excerpt from a much larger piece of
code, but it runs standalone and i think it's pretty clear what's happening
but here's a little explanation anyways:

the JobRunnerDaemon exists so that my process can fork without forking - it
forks in another process on my behalf. other than handling the reaping of the
forked children this is all it's really for. basically i just need a handle
to track the forked children and a way to reap them in the normal
blocking/non-blocking (WNOHANG, etc) way.

JobRunnerDaemon#gen_runner is the method of interest. it returns a new
JobRunner but maintains a reference by loading it into a hash (@runners). all the
various wait methods delete the runner from the hash. this daemon is used by
one, and only one, process at a time in a single threaded fashion so i think
this approach is safe:

- handle on returned DRbUndumped objects is maintained on both client and
  server so nothing should evaporate on me

- due to the use case (wait) there is a point in the code when i know the
  object can be recycled (reference lost) and this is leveraged by
  automatically discarding the reference at that point

the code:

  require 'drb'
  require 'detach'

  class JobRunner
  #{{{
    include DRbUndumped
    attr :job
    attr :jid
    attr :cid
    alias pid cid
    attr :shell
    attr :command
    attr :status
    def initialize job
    #{{{
      @status = nil
      @job = job
      @jid = job['jid']
      @command = job['command']
      @shell = job['shell'] || 'bash'
      @r,@w = IO.pipe
      @cid =
        #Util::fork do
        fork do
          @w.close
          STDIN.reopen @r
          if File::basename(@shell) == 'bash' || File::basename(@shell) == 'sh'
            exec [@shell, "__rq_job__#{ @jid }__#{ File.basename(@shell) }__"], '--login'
          else
            exec [@shell, "__rq_job__#{ @jid }__#{ File.basename(@shell) }__"], '-l'
          end
        end
      @r.close
    #}}}
    end
    def run
    #{{{
      @w.puts @command
      @w.close
      self
    #}}}
    end
  #}}}
  end

  class JobRunnerDaemon
  #{{{
    class << self
    #{{{
      def new(*a,&b)
      #{{{
        super(*a,&b).detach(:background=>false)
      #}}}
      end
    #}}}
    end
    attr :runners
    def initialize
    #{{{
      @runners = {}
    #}}}
    end
    def gen_runner(*a,&b)
    #{{{
      r = JobRunner::new(*a,&b)
      @runners[r.pid] = r
      r
    #}}}
    end
    def wait
    #{{{
      pid = Process::wait
      @runners.delete pid
      pid
    #}}}
    end
    def wait2
    #{{{
      pid, status = Process::wait2
      @runners.delete pid
      [pid, status]
    #}}}
    end
    def waitpid pid = -1, flags = 0
    #{{{
      pid = Process::waitpid pid, flags
      @runners.delete pid if pid
      pid
    #}}}
    end
    def waitpid2 pid = -1, flags = 0
    #{{{
      pid, status = Process::waitpid2 pid, flags
      @runners.delete pid if pid
      [pid, status]
    #}}}
    end
  #}}}
  end
  JobRunD = JobRunnerDaemon

  #
  # baby test - watch in top for memory leaks
  #
  if $0 == __FILE__
  #{{{
    STDOUT.sync = true
    d=JobRunnerDaemon::new
    loop do
      #
      # spawn a bunch of jobs on the server
      #
      rand(42).times do
        r=d.gen_runner 'jid'=>42,'command'=>'echo $$'
        r.run
      end
      #
      # reap them
      #
      loop do
        begin
          pid, status = d.waitpid2
          if pid
            p [pid, status]
          else
            break
          end
        rescue Errno::ECHILD
          break
        end
      end
    end
  #}}}
  end


any comments welcome.

kind regards.

-a
 
