High Availability Cluster Help

To: debian-user@lists.debian.org
Subject: High Availability Cluster Help
From: Leslie Rhorer <lrhorer@mygrande.net>
Date: Sun, 2 Aug 2015 14:18:10 -0700 (PDT)
Message-id: <f9cea4e9-de12-48d7-b7b5-3fcf4b79f425@googlegroups.com>

I need a little (or maybe more than a little) advice and guidance on setting up a High Availability cluster on some Debian machines.  I've read through the man pages and the config files, but I'm falling short of understanding everything I need to do.  I am still in the process of obtaining all the software, so I don't yet have the full plan laid out, let alone everything configured, but I already know I need some help, so thanks in advance.

First of all, let me outline the situation.  I've written an HVAC control suite that works with a number of wireless thermostats to control the air handlers in my house.  I have one device which monitors the status lines from the thermostats, and a second device that controls relays that open and close the air vents and turn the appropriate air handlers on or off as needed.  Each of those devices has an IP address, and both are controlled by a simple binary (a C program).  The binary sits in memory and polls the contact monitor every couple of seconds, then uses the data obtained from the monitor to set the relays and writes the information to a couple of small data files for use by some scripts that provide command-line (CL) and Web status of the systems.  One of these scripts has to run once a minute or so in order to maintain a historical record of how much time each unit spends running.  I have the Web data online at

http://fletchergeek.homelinux.net

for anyone who would like to see.
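
To make the once-a-minute run-time record described above a little more concrete, here is a minimal sketch in Python.  It is only an illustration, not the actual script: the unit names, the 60-second interval and the function name accumulate_runtime are all assumptions, and the real suite reads its status from the data files the binary writes.

```python
from collections import defaultdict


def accumulate_runtime(totals, running_units, interval_seconds=60):
    """Credit one sampling interval to every unit that is running right now.

    totals           -- mapping of unit name to accumulated seconds
    running_units    -- names of the units reported as running in this sample
    interval_seconds -- time between samples (assumed: once a minute)
    """
    for unit in running_units:
        totals[unit] += interval_seconds
    return totals


# Hypothetical samples, one per minute; the unit names are made up.
totals = defaultdict(int)
samples = [["upstairs"], ["upstairs", "downstairs"], []]
for running in samples:
    accumulate_runtime(totals, running)

for unit, seconds in totals.items():
    print(unit, seconds, "seconds")
```

Whatever ends up running this on the active cluster node only needs to call the function once per interval and persist the totals somewhere that all nodes can read.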

Right now I have the binary and the CL scripts running on a little Raspberry Pi (hostname Thermostat), but I don't want to have my air conditioning fail if the little RPi is down or for whatever reason not talking to either of the two terminal devices.  I have two servers (hostnames RAID-Server and Backup), either of which can take over in that event.  If one or both of the terminal devices are unavailable to all three servers, then I want to be alerted to the fact.  By my understanding so far, this means the three servers need to be set up as cluster members and the two terminal devices set up as pseudo-cluster ping devices.  I have a set of scripts that restart the binary if it hangs and report to me if they have to restart it more than 5 times in a row, but I believe all that can be handled by the cluster manager (or is it handled directly by Heartbeat?).  Right now a cron job runs the data collection and correlation once a minute, but I take it that function will have to be taken over by a script that runs continuously on the active cluster node.  That's about as far as I have gotten, though.
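
For reference, the restart-and-report behaviour mentioned above boils down to something like the following Python sketch.  This is not the existing script set: the class name RestartTracker and the "ok" / "restart" / "alert" return values are hypothetical, and how the health check and the notification are actually performed is left out.

```python
class RestartTracker:
    """Count consecutive restarts and flag when the limit is exceeded."""

    def __init__(self, max_consecutive=5):
        # Threshold from the description: more than 5 restarts in a row.
        self.max_consecutive = max_consecutive
        self.consecutive = 0

    def record(self, healthy):
        """Process one health check and return the action to take.

        "ok"      -- the binary is responding; the counter is reset
        "restart" -- the binary hung; restart it
        "alert"   -- restart it and notify, because the limit is exceeded
        """
        if healthy:
            self.consecutive = 0
            return "ok"
        self.consecutive += 1
        if self.consecutive > self.max_consecutive:
            return "alert"
        return "restart"


tracker = RestartTracker()
# Six failed checks in a row: five plain restarts, then an alert.
actions = [tracker.record(False) for _ in range(6)]
print(actions)
```

Whichever layer ends up owning this (the cluster manager or Heartbeat itself), the behaviour to reproduce is the reset on a healthy check and the alert on the sixth consecutive failure.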

Both big machines are running Debian Jessie, while the RPi is running Raspbian, a Wheezy derivative.  To my understanding, Pacemaker is the best cluster management system for this purpose, but evidently one of the libraries used by Pacemaker did not make it into Jessie, so the entire package has been removed from the distro.  Unless something has changed in the last couple of months, I am going to have to compile from source on those two machines.  Pacemaker should be available on the RPi, but it will no doubt be a different version than the one running on the Jessie machines.  Will that create issues?

Below is what I have so far on one of the Jessie machines.  I know it is not complete, and it may not be entirely correct.  After confirming / fixing the files below, I think my next step is setting up haresources, but I'm quite unsure what needs to go in that file.  Once that gets done, where do I go from there?

ha.cf:
logfile /var/log/ha-log
logfacility     local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
auto_failback on
node    RAID-Server
node    Backup
node    Thermostat
ping    192.168.1.117
ping    192.168.1.118
respawn hacluster /usr/lib/heartbeat/ipfail
deadping 10

authkeys:
auth 2
2 sha1 HI!
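
For a real deployment the key in authkeys would presumably be a long random string rather than a short one.  A minimal sketch of producing a file in the same two-line layout as above, assuming Python 3.6 or later is available; the function name render_authkeys and the 20-byte key length are illustrative choices, not anything Heartbeat requires:

```python
import secrets


def render_authkeys(key_id=2, method="sha1", nbytes=20):
    """Return authkeys file contents with a freshly generated random key.

    The layout mirrors the listing above: an "auth <id>" line followed by
    an "<id> <method> <key>" line.
    """
    key = secrets.token_hex(nbytes)  # 2 * nbytes hexadecimal characters
    return f"auth {key_id}\n{key_id} {method} {key}\n"


print(render_authkeys(), end="")
```

The same file has to be copied to every node, and as far as I understand Heartbeat expects it to be readable by root only (mode 600).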
