Recovering from failure of a Rinda ring server

Dido Sevilla · Jan 21, 2005

I'm in the process of designing a fault-tolerant distributed
application using Ruby, and am looking at whether Rinda will be
suitable for this purpose. I am wondering what strategies are
available for recovering from the failure of the ring server, which
seems to me like a critical single point of failure. It is possible to
run the ring server/tuplespace daemon as part of a Linux-HA heartbeat
cluster to guard against physical failure of the primary ring server,
but this requires restarting the ring server on the secondary node.
The new ring server instance running on the backup server is now
ignorant of all services that were previously registered on the old
ring server before the failure. This is unacceptable for the
distributed application. Is there a way for live services to
automagically detect failure of the ring server, and automatically
reregister themselves with it when it goes back up in that case, or
some way for a primary and backup ring server to communicate with each
other and share information about registered services transparently?

Eric Hodel · Jan 22, 2005

> I'm in the process of designing a fault-tolerant distributed
> application using Ruby, and am looking at whether Rinda will be
> suitable for this purpose. I am wondering what strategies are
> available for recovering from the failure of the ring server, which
> seems to me like a critical single point of failure. It is possible to
> run the ring server/tuplespace daemon as part of a Linux-HA heartbeat
> cluster to guard against physical failure of the primary ring server,
> but this requires restarting the ring server on the secondary node.
> The new ring server instance running on the backup server is now
> ignorant of all services that were previously registered on the old
> ring server before the failure.

Yup.

> This is unacceptable for the
> distributed application. Is there a way for live services to
> automagically detect failure of the ring server, and automatically
> reregister themselves with it when it goes back up in that case, or
> some way for a primary and backup ring server to communicate with each
> other and share information about registered services transparently?

1) Run more than one RingServer, and have each cross-register the
other's services. (The cross-registering service doesn't have to run on
the RingServer host itself, actually...)

2) The RingServer removes services automatically when a service's
renewer fails to respond. Renewers are invoked after some timeout. On
the service side, if the service's renewer has not been invoked within
the expected timeout, the service can assume the RingServer has gone
away and re-register itself, something like IRC's PING/PONG handshake.

--
Eric Hodel - (e-mail address removed) - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04
