Fault Tolerant DRb?

Kirk Haines · Jul 1, 2005

Just pondering different things this morning, and my mind came back to
something I've thought about now and again.

Assume you are using a DRb service for....something. It doesn't matter what.
The case is the same whether one is accessing an array via DRb or a Rinda
Ring. Is there some reasonably easy way of making a service work in a fault
tolerant way? That is, one could have two processes on two different
machines both offering the same service. If one process dies, the data is
still present on the other, and the clients of that service can continue
operating without data loss?

Kirk Haines

Ara.T.Howard · Jul 1, 2005

i've done tons of ha (high availability) setups before for stateful and
stateless machines. suffice it to say it is almost un-imaginably complex.
consider:

* how do you tell if one machine is down vs. the network just being slow?
for instance on our machines monthly backups might make any machine seem
dead (can't ping) for 20 minutes or more. typically this is solved via a
serial cable between nodes to ping on using real-time priorities.

* if you have the data on both machines and it can EVER be written to
(modified) how do you bring the data back in sync when a machine has died
but is now back up?
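
To make the first point concrete, here is a minimal Ruby sketch of a
timeout-based failure detector. The names (HeartbeatMonitor, beat,
suspected?) are hypothetical and not part of DRb or linux-ha. It can only
ever report suspicion: a node that has been silent for longer than the
timeout may be dead, or merely busy or behind a slow link.

```ruby
# Illustrative timeout-based failure detector (hypothetical names).
class HeartbeatMonitor
  # timeout: seconds of silence after which a node is suspected.
  # clock:   callable returning the current time in seconds (injectable for tests).
  def initialize(timeout:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @timeout = timeout
    @clock = clock
    @last_seen = {}
  end

  # Record that a heartbeat from +node+ has just arrived.
  def beat(node)
    @last_seen[node] = @clock.call
  end

  # True when +node+ has never been heard from, or has been silent for longer
  # than the timeout. This is suspicion only, not proof of a crash.
  def suspected?(node)
    last = @last_seen[node]
    last.nil? || (@clock.call - last) > @timeout
  end
end
```

Choosing the timeout is the trade-off described above: too short and a
20 minute backup looks like a crash, too long and a real failure goes
unnoticed.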

these problems are solved - but it's still amazingly hard to get right. check
out the linux-ha project (google it).

depending on your needs you may be able to code something simple that's 'good
enough' but you'll need some sort of distributed transaction capability and
the easiest way to get that is via a real rdbms like postgresql. however, once
you have that set up it's silly to use drb unless your data is terrible to
model within the relational model.

feel free to contact me offline if you want to set up an ha box(es).

hth.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================

Shashank Date · Jul 1, 2005

Hi Kirk,

I have written something like this a long time back to build fault tolerant
database clusters.
It became pretty messy pretty quick (of course I was not as proficient in
Ruby back then ;-)).
So I have some questions:

--- Kirk Haines said:
Assume you are using a DRb service for....something. It doesn't matter what.
The case is the same whether one is accessing an array via DRb or a Rinda
Ring. Is there some reasonably easy way of making a service work in a fault
tolerant way? That is, one could have two processes on two different
machines both offering the same service. If one process dies, the data is
still present on the other,

^^^^^^^^^^^^^^^^^^^^^^^^^^
How do you propose to ensure that? Is it on a shared file system (like NFS)?
If true, then take a look at Ara's rq package:

http://www.codeforpeople.com/lib/ruby/rq/rq-2.3.0/TUTORIAL

If false, then think of some "easy" way of replication.

and the clients of that service can continue
operating without data loss?

I had to worry about how the clients who were in the middle of a request
would know that the service is no longer available.
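
One way a client might notice a dead server and move on is sketched below.
This is only an illustration: FailoverClient and its methods are
hypothetical names, while DRbObject.new_with_uri and DRb::DRbConnError come
from Ruby's drb library. It assumes every replica already holds the same
data (via shared storage or replication), which is the hard part of the
problem, and it is only safe for idempotent calls, since a request that
failed midway may already have run on the server that died.

```ruby
require 'drb/drb'

# Hypothetical client-side wrapper that fails over between replicas.
# It does NOT replicate data; it only switches servers on connection errors.
class FailoverClient
  # backends: remote references, e.g. DRbObject.new_with_uri(uri).
  def initialize(backends)
    raise ArgumentError, 'need at least one backend' if backends.empty?
    @backends = backends
    @current = 0
  end

  # Convenience constructor from a list of druby:// URIs.
  def self.for_uris(uris)
    new(uris.map { |uri| DRbObject.new_with_uri(uri) })
  end

  # Forward +name+ to the current backend. On a connection error, advance to
  # the next backend and retry, giving up once every backend has been tried.
  def call(name, *args)
    @backends.size.times do
      begin
        return @backends[@current].public_send(name, *args)
      rescue DRb::DRbConnError
        @current = (@current + 1) % @backends.size
      end
    end
    raise DRb::DRbConnError, 'all replicas unreachable'
  end
end

# Usage (host names and port are placeholders):
#   client = FailoverClient.for_uris(%w[druby://host-a:9000 druby://host-b:9000])
#   client.call(:size)
```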

-- shanko

Kirk Haines · Jul 1, 2005

depending on your needs you may be able to code something simple that's 'good
enough' but you'll need some sort of distributed transaction capability and
the easiest way to get that is via a real rdbms like postgresql. however,
once you have that set up it's silly to use drb unless your data is
terrible to model within the relational model.

LOL. All valid points. You never know, though. Sometimes when one asks for
something magical and unlikely, someone else pipes up and delivers. It was
worth a shot. Thanks Ara (and Shashank) for the comments.

Kirk Haines

gwtmp01 · Jul 1, 2005

Assume you are using a DRb service for....something. It doesn't matter what.
The case is the same whether one is accessing an array via DRb or a Rinda
Ring. Is there some reasonably easy way of making a service work in a fault
tolerant way?

You might want to take a look at some of the software and ideas at
http://www.cse.cuhk.edu.hk/~xychen/GroupCS/gcs.htm

This page has a great summary of toolkits that implement
"process group communication" or "virtual synchrony". A variety of toolkits
have evolved and been released in various forms. While I don't know of any
ruby implementation or wrapper for these ideas/software it would be a great
project.

The goal of process group communication is to send a series of messages to a
named group of recipients and ensure that every member of the group receives
the messages in a globally consistent order in the presence of communication
and/or hardware failures. From this foundation you can build a variety of
fault tolerant systems.
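
As a rough illustration of the "globally consistent order" idea, here is a
toy, single-process Ruby sketch. The Sequencer and Member classes are
hypothetical and not taken from any of the toolkits on that page. A central
sequencer stamps each message, and every member delivers strictly in stamp
order, buffering anything that arrives early. Real toolkits also have to
survive loss of the sequencer and changes in group membership, which this
sketch ignores.

```ruby
# Hands out consecutive sequence numbers (hypothetical helper).
class Sequencer
  def initialize
    @next_seq = 0
    @lock = Mutex.new
  end

  # Returns [sequence_number, payload].
  def stamp(payload)
    @lock.synchronize do
      seq = @next_seq
      @next_seq += 1
      [seq, payload]
    end
  end
end

# A group member that delivers messages in sequence order,
# whatever order they arrive in.
class Member
  attr_reader :delivered

  def initialize
    @expected = 0    # next sequence number to deliver
    @pending = {}    # early arrivals, keyed by sequence number
    @delivered = []
  end

  def receive(seq, payload)
    @pending[seq] = payload
    # Deliver as long as the next expected message is available.
    while @pending.key?(@expected)
      @delivered << @pending.delete(@expected)
      @expected += 1
    end
  end
end
```

If every replica applies the delivered messages to its own copy of the data
in this order, all replicas pass through the same sequence of states, which
is what makes failover without data loss possible.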

Gary Wright

Ara.T.Howard · Jul 1, 2005

http://raa.ruby-lang.org/project/rb_spread/

cheers.

-a

Ara.T.Howard · Jul 1, 2005

Cool! After I posted my link I found the main Spread
site and have been reading about it for the last hour or so.

Now I have something to play with!

i think i may have a patched version of this around... seems like there was a
little buggette or two in it... let me know if you can't get it working and
i'll look for it.

cheers.

-a
