
Re: High Availability Cluster Help



A few comments, below:

Leslie Rhorer wrote:
I need a little (or maybe more than a little) advice and guidance on setting up a High Availability cluster on some Debian machines.  I've read through the man pages and the config files, but I'm falling short of understanding everything I need to do.  I am still in the process of obtaining all the software, so I don't yet have the full process plan laid out, let alone everything configured, but I know I already need some help, so thanks in advance.

First of all, let me outline the situation.  I've written an HVAC control suite that works with a number of wireless thermostats to control the air handlers in my house.  I have one device which monitors the status lines from the thermostats, and a second device that controls relays that open and close the air vents as needed and turns on or off the appropriate air handlers as needed.  Each of those devices has an IP address and is controlled by a simple binary (a C program).  The binary sits in memory and polls the contact monitor every couple of seconds, then uses the data obtained from the monitor to set the relays and writes the information to a couple of small data files for use by some scripts that provide CL and Web status of the systems.  One of these scripts has to run once a minute or so in order to maintain a historical record of how much time each unit spends running.  I have the Web data online at

http://fletchergeek.homelinux.net

for anyone who would like to see.

Right now I have the binary and the CL scripts running on a little Raspberry Pi (hostname Thermostat), but I don't want to have my Air Conditioning fail if the little RPi is down or for whatever reason not talking to either of the two terminal devices.  I have two servers (hostnames RAID-Server and Backup), either of which can take over in that event.  If one or both of the terminal devices are unavailable to all three servers, then I want to be alerted to the fact.  By my understanding so far, this means the three servers need to be set up as cluster members and the two terminal devices set up as pseudo-cluster ping devices.  I have a set of scripts that restart the binary if it hangs and report to me if they have to restart the binary more than 5 times in a row, but I believe all that can be handled by the cluster manager (or is it directly handled by Heartbeat?).  Right now a cron job handles running the data collection correlation once a minute, but I take it that function will have to be taken over by a script that runs continuously on the active cluster node.  That's about as far as I have gotten though.

Both big machines are running Debian Jessie, while the RPi is running Raspbian, a Wheezy derivative.  To my understanding, Pacemaker is the best cluster management system for this purpose, but evidently one of the libraries used by Pacemaker did not make it to the Jessie distro, so the entire package has been removed from the distro.  Unless something has changed in the last couple of months, evidently I am going to have to compile from source on those two machines.  Pacemaker should be available on the RPi, but it will no doubt be a different version than that running on the Jessie machines.  Will that create issues?


My familiarity with pacemaker is with the more common cluster setup: mirroring a virtual machine stack (in my case, Xen) and its associated disks (using DRBD) for automatic failover of entire virtual machines. I haven't tried using pacemaker by itself for application-level failover.

One way to set things up would be simply to set up mirrored virtual machines, and let failover be handled at that level (you could use pacemaker or the Remus functionality of later Xen implementations).

You might also look at pure application layer redundancy - pacemaker might or might not help you - you might be better off just running two copies of your scripts with some basic synchronization and primary/secondary logic. (Personally, I'd do this in Erlang, which makes this kind of application rather trivial).
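
To give a rough idea of what that primary/secondary logic could look like, here is a minimal sketch in Python (used here only for illustration, not the Erlang I'd reach for myself). Everything in it is an assumption on my part: the priority order of the three hostnames, the 10-second timeout, and the idea that each node keeps a table of when it last heard a heartbeat from each peer. How those heartbeats get exchanged (UDP, a shared file, whatever) is left out entirely.

```python
import time

# Assumed priority order: the first host believed alive drives the relays.
PRIORITY = ["Thermostat", "RAID-Server", "Backup"]

# Assumed value: seconds of silence before a peer is presumed down.
HEARTBEAT_TIMEOUT = 10.0


def alive_nodes(last_seen, now, timeout=HEARTBEAT_TIMEOUT):
    """Return the set of hosts whose last heartbeat is recent enough."""
    return {host for host, ts in last_seen.items() if now - ts <= timeout}


def choose_active(priority, alive):
    """Return the highest-priority host that is alive, or None if none are."""
    for host in priority:
        if host in alive:
            return host
    return None


def should_run(me, last_seen, now=None):
    """True if this host should be the one running the control binary."""
    now = time.time() if now is None else now
    # A node always counts itself as alive, whatever its table says.
    alive = alive_nodes(last_seen, now) | {me}
    return choose_active(PRIORITY, alive) == me
```

Each node would call should_run() with its own hostname every few seconds and start or stop the control binary accordingly. Note that this does nothing about a network split: if the RPi is up but cut off from the two servers, both it and RAID-Server will decide they are primary. Dealing with that case (quorum, fencing) is exactly the part a cluster manager like pacemaker is meant to handle.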

For more pacemaker help, check out the resources at http://clusterlabs.org/, and maybe pose this query on the linux-ha email list.

Hope this helps,

Miles Fidelman

--
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra


