
Re: best way to keep web servers in sync



On Sun, Mar 03, 2002 at 11:20:26PM +0800, Patrick Hsieh wrote:
> I have some apache web servers under the server load-balancer.  To
> keep the static html and php files in sync among them, I have to use
> NFS to achieve the purpose.  My problem is, since NFS is not quite
> stable and is prone to be a network bottleneck, I have to find other
> methods to keep the files in sync.
> 
> I am considering using rsync, but I have to write scripts to automate
> the rsync behavior. 

rsync would be a good way of doing it.  the scripting involved is
probably very simple unless you have a very weird setup.

it could be as simple as something like:

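    #!/bin/bash

    SERVERS="server1 server2 server3 server4"
    DIRECTORIES="/var/www /usr/lib/cgi-bin"

    LOCKFILE="/var/run/rsync-in-progress"

    # exit if the script is already running.  with noclobber the
    # redirect fails if the file exists, so test-and-create is one step
    ( set -o noclobber; echo $$ > "$LOCKFILE" ) 2>/dev/null || exit 0

    # remove the lockfile when the script exits, even after an error
    trap 'rm -f "$LOCKFILE"' EXIT

    # -a already implies -r
    RSYNC_ARGS="-e ssh -a --blocking-io --delete"

    for host in $SERVERS; do
        for dir in $DIRECTORIES; do
            # trailing slash: copy the contents of $dir, not $dir itself
            rsync $RSYNC_ARGS "$dir/" "$host:$dir"
        done
    done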

it is important to make sure that only one instance of the rsync script
is running at a time.  this script uses a very simple lockfile method.

this script assumes that ssh is already set up to allow passwordless
access from the "master" server to the "slaves".
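the keys themselves are normally created with ssh-keygen and installed on
each slave with ssh-copy-id.  before relying on that, it can help to check
that every slave really accepts a login without prompting.  the following
is only a sketch: the function names and the SSH override variable are
made up for illustration, and BatchMode/ConnectTimeout are standard
OpenSSH client options.

```shell
#!/bin/bash
# SSH is overridable (hypothetical convenience) so the check can be stubbed.
SSH="${SSH:-ssh}"

# can_login HOST: succeed only if ssh gets in without asking for a password.
# BatchMode=yes makes ssh fail instead of prompting.
can_login() {
    $SSH -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# reachable_hosts HOST...: print the hosts that accept passwordless logins,
# and warn on stderr about the ones that do not.
reachable_hosts() {
    local host
    for host in "$@"; do
        if can_login "$host" </dev/null; then
            echo "$host"
        else
            echo "skipping $host: passwordless ssh failed" >&2
        fi
    done
}
```

for example, SERVERS=$(reachable_hosts server1 server2 server3 server4)
would leave only the slaves that can actually be synced.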

depending on the amount of data to be rsynced, the script above probably
won't scale beyond 5 or 10 servers...it syncs one host at a time, so it
would take too long to iterate through all servers.
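one way to stretch that limit a little is to push to the slaves in
parallel rather than one after another.  this is a rough sketch, not a
tested replacement for the script above: push_host, push_all and the RSYNC
override variable are hypothetical names, and it assumes the master has
enough bandwidth and disk i/o to serve several rsyncs at once.

```shell
#!/bin/bash
# RSYNC is overridable (hypothetical convenience) so the loop can be
# dry-run with "echo" instead of really copying anything.
RSYNC="${RSYNC:-rsync}"
RSYNC_ARGS="-e ssh -a --blocking-io --delete"

# push_host HOST DIR...: sync each directory to one host, in order.
push_host() {
    local host=$1 dir
    shift
    for dir in "$@"; do
        $RSYNC $RSYNC_ARGS "$dir/" "$host:$dir" || return 1
    done
}

# push_all "HOSTS" DIR...: start one background job per host, then wait
# for all of them.  the return value is the number of hosts that failed.
push_all() {
    local hosts=$1 host pid failed=0
    local pids=()
    shift
    for host in $hosts; do
        push_host "$host" "$@" &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}
```

it would be called as push_all "$SERVERS" $DIRECTORIES from inside the
locked section of the script.  with many slaves, starting every job at
once may simply move the bottleneck to the master's network link, so some
limit on the number of simultaneous jobs is probably still needed.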


> Another criterion is that the files could be updated on
> any server among them; after that, the first updated one should rsync
> to the others to make EVERYTHING in sync.

the major problem with using rsync for this job is, as you pointed out,
that the files might be updated on any server.  the best way to solve
that is by designating one of the machines as "master" and requiring all
uploads to go to that machine...by simplifying the problem from a
many-to-many situation to a one-to-many situation, you avoid all of the
more difficult synchronisation issues.


the second major problem is that rsyncing hundreds of megabytes to
multiple servers is a heavy load for a machine, so you can't be running
it out of cron every 5 minutes.  so the update is not "real-time".
there are several ways to minimise (but not eliminate) this
problem...here are a few ideas:

1. monitor /var/log/auth.log and trigger the rsync script when an ftp
session disconnects.

2. have a cgi script allowing users to click on a button to trigger the
rsync script.


also run the rsync script from cron every few hours.
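idea 1 and the cron fallback could look roughly like this.  it is only a
sketch under several assumptions: the script path and the function name
watch_log are made up, the 4-hour cron interval is just an example, and
above all the log pattern is a guess...check what your ftp daemon really
writes to the log when a session ends and adjust PATTERN to match.

```shell
#!/bin/bash
# hypothetical location of the rsync script shown earlier
SYNC_SCRIPT="${SYNC_SCRIPT:-/usr/local/sbin/sync-webservers}"

# ASSUMED log text for "ftp session ended"; this varies between ftp
# daemons, so treat it as a placeholder (extended regular expression)
PATTERN="${PATTERN:-ftpd.*session closed}"

# watch_log: read log lines on stdin and run the sync script once for
# every line that matches PATTERN.
watch_log() {
    local line
    while IFS= read -r line; do
        if [[ $line =~ $PATTERN ]]; then
            # </dev/null so the sync script cannot swallow log lines
            "$SYNC_SCRIPT" </dev/null
        fi
    done
}

# typical use, left commented out because it runs forever:
#   tail -F /var/log/auth.log | watch_log
#
# example crontab entry for the fallback run (every 4 hours):
#   0 */4 * * * /usr/local/sbin/sync-webservers
```

because the rsync script takes a lock and exits quietly when another copy
is running, it does not matter if the log watcher and cron happen to fire
at the same time.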



> Any good suggestion? 

read the rsync pages at http://rsync.samba.org/ (especially the
FAQ-o-Matic) and the rsync tutorial at http://everythinglinux.org/rsync/


another possibility is to use a different network file system.  i
haven't used it myself, but CODA is supposed to be pretty good, and IIRC
supports some kind of off-line operation (in case the file server is
unavailable) via a cache on each client.


craig

-- 
craig sanders <cas@taz.net.au>

Fabricati Diem, PVNC.
 -- motto of the Ankh-Morpork City Watch


