
Re: Mirror applications on Linux Server



Rob Owens wrote:

> On Tue, Sep 28, 2010 at 07:47:29PM -0500, lrhorer wrote:
>> Michal wrote:
>> 
>> >   On 28/09/10 14:05, Miles Fidelman wrote:
>> >> lrhorer wrote:
>> >>> If there is a better forum for this, let me know and I will post
>> >>> my questions there.
>> >>>
>> >>> I am building an application which needs to have high
>> >>> reliability. I have two essentially identical Linux servers which
>> >>> can host the application.  Right now, I have the programs - a bash
>> >>> script and a C binary - running on one machine every minute in a
>> >>> cron job.  I also have an rsync cron job running to synchronize the
>> >>> files on the standby machine so the data (and binaries, of course)
>> >>> will be identical. What I need to do is have the standby machine
>> >>> take over operations if the applications on the primary machine
>> >>> quit working, for whatever reason. Of course I can easily ping the
>> >>> primary to make sure the machine is up, but what is going to be
>> >>> my best bet for having the standby machine wake up and start
>> >>> running the apps every minute until such time as the
>> >>> primary comes back online?  I'm wide open on how to implement.
>> >>> An external application would be great, or I could write either
>> >>> or both C or shell apps to have the two machines talk to one
>> >>> another.
>> >> It may be overkill, but take a look at Pacemaker and the Linux-HA
>> >> Project - http://www.linux-ha.org - it's specifically intended for
>> >> such applications.
>> >>
>> >> Also look at DRBD - www.drbd.org - which mirrors a disk (or
>> >> partition), in realtime across two machines.
>> >>
>> >> The combination gives you automated fail-over capability.
>> >>
>> >> Now, if you want to get really fancy, you can run your application
>> >> in a virtual machine, and use pacemaker and DRBD to fail-over the
>> >> entire VM.
>> >>
>> >> Be warned, it takes a while to get all of these working properly -
>> >> both individually and in combination.  You could also take a look
>> >> at ganeti - http://code.google.com/p/ganeti/ - which pulls a bunch
>> >> of the pieces together.
>> >>
>> >>
>> > DRBD and so forth might be overkill, or it might not. If you think
>> > it is and are happy with the rsync you can stick to that and use
>> > heartbeat to monitor the application. Easy to set up as well.
>> 
>> Thanks.  I skimmed the intro, and I'm not sure heartbeat is really
>> what I need.  These aren't server applications that run full-time on
>> the machine.  They are relatively simple programs that only take a
>> few seconds to run, and then terminate (hopefully with a return value
>> of 0), to be run again in 60 seconds.  I don't need the inter-machine
>> service to cause something to happen on the standby system if the
>> apps are no longer running, or at least not immediately so.  Rather,
>> I need the standby machine to take over if:
>> 
>> 1. The primary machine is no longer on the network (clearly the
>> heartbeat app can do this).
>> 
>> 2. The applications being run by cron fail to run, say, 5 or 6 times
>> in a row.
>> 
>> 3. One or more of the applications fails to terminate (hangs).
>> 
>> 4. One or more of the applications terminates with other than a 0
>> status.
>> 
> How about this:
> 
> Every time Server A does its thing, it signals success to Server B.
> 
> Server B runs a timer script that resets every time it receives the
> "success" signal from Server A.  If it doesn't receive the signal, it
> "does its thing" and signals success to Server A.

Yeah, I was thinking of writing a simple client / server pair in C that
opens a UDP socket and sends a status from the client on B to the
server on A.
 
> Server A is also running a timer script, and it resets when it gets
> the success signal from Server B.

Not needed.  Server B will be trying to run the process, regardless. If
it works, it signals A, which is inhibited from doing anything.  If it
fails for whatever reason, then A will take over when it doesn't get an
update.  If server A croaks completely, B doesn't really care.  It will
still be happily running the process.  If any aspect of server B's
processes fail, A does its bit.
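
A rough sketch of what server B's side of this could look like in shell.
This is only an illustration under stated assumptions: the function names
run_and_signal and signal_success are made up, the signal itself borrows
the "ssh serverA touch timerfile" idea quoted further down, and the
time limit relies on timeout(1) from GNU coreutils.  The point is that a
hang, a non-zero exit, or a job that never starts all look the same to A:
no success signal arrives.

```shell
#!/bin/sh
# Hypothetical wrapper for the active machine (server B in this thread).

# Assumed signal: refresh a timestamp file on the standby over ssh.
# Host name and file name are placeholders taken from the quoted example.
signal_success() {
    ssh serverA touch timerfile
}

# run_and_signal LIMIT JOB [ARGS...]
# Runs JOB under a time limit and signals the standby only if JOB
# finishes in time with exit status 0.  Returns JOB's exit status.
run_and_signal() {
    rs_limit=$1
    shift
    rs_status=0
    # GNU timeout kills a job that runs too long and exits with 124.
    timeout "$rs_limit" "$@" || rs_status=$?
    if [ "$rs_status" -eq 0 ]; then
        signal_success
    fi
    return "$rs_status"
}
```

Called from cron as, say, run_and_signal 50 followed by the real job and
its arguments, a missed signal is the only thing A ever has to notice.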

> So now Server B will act as "primary", and Server A will just wait
> until Server B fails to send a success signal.  ...Unless you adjust
> the timing of the script and the cron job.  For instance:

The process doesn't care in the least which server is actually running,
so there is no need to "flip-flop" server roles.  There are certainly
cases where one server must actually assume the mantle of the other in
order to take over, particularly, for example, if external processes
are seeking to initiate conversations with the server, but that is not
the case, here.  The active server initiates all conversations with the
subordinate host.  Indeed, for this application, it would mostly not
really hurt if both machines ran the processes, as long as both did not
attempt to run them at the exact same time.  There are a few
scenarios, however, where running the process on both machines a few
seconds apart could produce some undesirable results, so I am going to
inhibit machine A from doing anything as long as machine B is alive and
well.
 
> Server A could be set to do its thing every 5 minutes, assuming it has
> not received a success signal from Server B for the last 4 minutes.
> This gives Server A the opportunity to become primary again after
> Server B had been doing its thing.
> 
> Server B would be set to do its thing every 5 minutes, assuming it has
> not received a success signal from Server A for the last 5 minutes (or
> 6 minutes, or whatever seems reasonable).
> 
> Sorry I can't help w/ the timer script.  But I think something like
> "ssh serverA touch timerfile" would be sufficient to reset the timer.
> Then your script could check for the timestamp on "timerfile".

Yeah, you know, that would work.  Creating a timer is no big deal.  If I
use your ssh example, I could simply do something like 

find -cmin +5 -name timerfile -exec <Process> \;

Since both processes are running every minute under cron, nothing more
is needed.

As long as server B is alive and talking, timerfile will never be older
than 1 minute.  If B goes down for 4+ minutes, <Process> will run every
minute on A.  Once B comes back, timerfile will once again be young. 
As long as I add a little delay on A or B so the two processes don't
collide endlessly, it should work.
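
For the standby side, the find one-liner above could be wrapped roughly
as follows.  This is a sketch, not a tested setup: heartbeat_is_stale and
standby_tick are invented names, it uses -mmin instead of -cmin (touch
updates both timestamps), and it assumes a find that supports -mmin, as
GNU find does.  One deliberate difference from the one-liner: a missing
timerfile counts as stale here, whereas plain find would match nothing
and the process would never run.

```shell
#!/bin/sh
# Hypothetical cron helper for the standby machine (server A in this thread).

# heartbeat_is_stale FILE MINUTES
# Succeeds if FILE is missing or was last modified more than MINUTES ago.
heartbeat_is_stale() {
    hb_file=$1
    hb_max_age=$2
    # Assumption: no timerfile at all means the primary never signalled.
    [ -e "$hb_file" ] || return 0
    # find prints the path only when the age test matches.
    [ -n "$(find "$hb_file" -mmin +"$hb_max_age")" ]
}

# standby_tick FILE MINUTES DELAY JOB [ARGS...]
# Waits DELAY seconds so the two machines do not start together, then
# runs JOB only if the heartbeat file is stale.
standby_tick() {
    tick_file=$1
    tick_max_age=$2
    tick_delay=$3
    shift 3
    sleep "$tick_delay"
    if heartbeat_is_stale "$tick_file" "$tick_max_age"; then
        "$@"
    fi
}
```

Run every minute from cron on A with a threshold of 5 and a delay of a few
seconds, this does nothing while B keeps touching the file and starts
running the job once the file goes stale.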

Hmm. 

