Re: NFS Failover
On 06/26/2013 09:11 PM, David Parker wrote:
> Hello,
>
> I'm wondering if there is a way to set up a highly-available NFS share
> using two servers (Debian Wheezy), where the shared volume can failover
> if the primary server goes down. My idea was to use two NFS servers and
> keep the exported directories in sync using DRBD. On the client, mount
> the share via autofs using the "Replicated Server" syntax.
>
> For example, say I have two servers called server1 and server2, each of
> which is exporting the directory /export/data via NFS, and /export/data
> is a synced DRBD filesystem shared between them. On the client, set
> up an autofs map file to mount the share and add this line:
>
> /mnt/data server1,server2:/export/data
>
> This is close, but it doesn't do what I'm looking to do. This seems to
> round-robin between the two servers whenever the filesystem needs to be
> mounted, and if the selected server isn't available, it then tries the
> other one.
>
> What I'm looking for is a way to have the client be aware of both
> servers, and gracefully fail over between them. I thought about using
> Pacemaker and Corosync to provide a virtual IP which floats between the
> servers, but would that work with NFS? Let's say I have an established
> NFS mount and server1 fails, and the virtual IP fails over to server2.
> Wouldn't there be a bunch of NFS socket and state information which
> server2 is unaware of, therefore rendering the connection useless on the
> client? Also, data integrity is essential in this scenario, so what
> about active writes to the NFS share which are happening at the time the
> server-side failover takes place?
>
> In full disclosure, I have tried the autofs method but not the
> Pacemaker/Corosync HA method, so some experimentation might answer my
> questions. In the meantime, any help would be greatly appreciated.
>
> Thanks!
> Dave
I have also studied NFS fail-over with Pacemaker/Corosync/DRBD. It can
work with NFSv3, which is mostly stateless; NFSv4 is stateful and
TCP-only, which makes transparent fail-over much harder. But even with
NFSv3 I stumbled over strange situations whose details I no longer
remember, and the bottom line is that I decided NFS fail-over is too
fiddly and hard to control reliably.
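
For reference, a minimal sketch of what such a setup might look like in
crm shell syntax, assuming a DRBD resource named "data" backing
/export/data and a floating IP of 192.168.0.100 (all names and
addresses here are illustrative, not from a working configuration):

  # DRBD master/slave resource; only the Master may mount the filesystem
  primitive p_drbd_data ocf:linbit:drbd \
      params drbd_resource="data" \
      op monitor interval="15s" role="Master" \
      op monitor interval="30s" role="Slave"
  ms ms_drbd_data p_drbd_data \
      meta master-max="1" master-node-max="1" \
           clone-max="2" clone-node-max="1" notify="true"
  # Filesystem on the DRBD device, the NFS daemon, and the floating IP,
  # grouped so they start together on one node, in this order
  primitive p_fs_data ocf:heartbeat:Filesystem \
      params device="/dev/drbd0" directory="/export/data" fstype="ext4"
  primitive p_nfs lsb:nfs-kernel-server
  primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
      params ip="192.168.0.100" cidr_netmask="24"
  group g_nfs p_fs_data p_nfs p_ip_nfs
  # Run the NFS group where DRBD is Master, and only after promotion
  colocation col_nfs_on_drbd inf: g_nfs ms_drbd_data:Master
  order o_drbd_before_nfs inf: ms_drbd_data:promote g_nfs:start

The clients then mount only via the floating IP. Even so, the state
problems you describe (locks, open TCP connections) are exactly where
it gets fiddly.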
Now I'm studying Gluster for replicating data between nodes and
mounting the volumes on the clients via the native glusterfs client -
this seems like a much better, simpler and more robust approach. I
suggest you take a look at Gluster; it's an exceptionally good
technology.
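
To give you an idea, the basic setup is only a few commands. A sketch,
reusing your server1/server2 names; the brick path /srv/gluster/data
is illustrative:

  # On server1, after installing glusterfs-server on both nodes:
  gluster peer probe server2
  gluster volume create data replica 2 \
      server1:/srv/gluster/data server2:/srv/gluster/data
  gluster volume start data

  # On the client; the native client learns about both replicas from
  # the volume file and fails over between them by itself, no floating
  # IP needed:
  mount -t glusterfs server1:/data /mnt/data

If server1 happens to be down at mount time, the backupvolfile-server
mount option can point the client at server2 for fetching the volume
file.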
--
Adrian Fita