Re: high performance, highly available web clusters
On Fri, May 21, 2004 at 01:23:52AM +1000 or thereabouts, Russell Coker wrote:
> On Thu, 20 May 2004 15:48, David Wilk <email@example.com> wrote:
> > The cluster is comprised of a load-balancer, several web servers
> > connected to a redundant pair of NFS servers and a redundant pair of
> > MySQL servers. The current bottle-neck is, of course, the NFS servers.
> > However, the entire thing needs an increase in capacity by several
> > times.
> The first thing I would do in such a situation is remove the redundant NFS
> servers. I have found the NFS client code in Linux to be quite fragile and
> wouldn't be surprised if a cluster fail-over killed all the NFS clients (a
> problem I often had in Solaris 2.6).
In this case the webservers (the NFS clients) and NFS servers are FreeBSD. I
believe FreeBSD's NFS is a bit more reliable than Linux's. However,
for pure performance (and scalability) reasons, the NFS has got to go.
Local disks can be used for content that doesn't need to change in real
time; content that does is what the MySQL servers are for.
Now, here's the other question: once the web cluster can scale the
static content ad infinitum, what about the dynamic content? What can
be done with MySQL to load balance? Currently they do what everyone
does with two stand-alone MySQL servers: both are updated simultaneously,
with the client writing to each, so the client can read from the
backup MySQL server if the primary fails. I could just build two
massive stand-alone servers, but a cluster would be more scalable.
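For what it's worth, the dual-write scheme described above can be sketched
as a small shell wrapper. The hostnames db1/db2 and the paths are my own
placeholders, not anything from the actual setup, and run_sql only echoes
the mysql command line so the sketch is safe to run as-is:

```shell
#!/bin/sh
# Dual-write with failover-read, as described above.
# db1/db2 are placeholder hostnames; run_sql only echoes the mysql
# command so this sketch can run without touching a real server.
PRIMARY=db1
BACKUP=db2

run_sql() {
    # in production this would be:  mysql -h "$1" -e "$2"
    echo "mysql -h $1 -e '$2'"
}

# every update goes to both servers so the backup stays current
write_sql() {
    run_sql "$PRIMARY" "$1" && run_sql "$BACKUP" "$1"
}

# reads hit the primary; fall back to the backup if it fails
read_sql() {
    run_sql "$PRIMARY" "$1" || run_sql "$BACKUP" "$1"
}
```

The obvious weakness (and the reason I'm asking about clustering) is that
a write that succeeds on one server and fails on the other leaves the two
out of sync, with nothing to reconcile them.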
> > However, for alot less money, one could simply do away with the file
> > server entirely. Since this is static content, one could keep these
> > files locally on the webservers and push the content out from a central
> > server via rsync. I figure a pair of redundant internal web server
> > 'staging servers' could be used for content update. Once tested, the
> > update could be pushed to the production servers with a script using
> > rsync and ssh. Each server, would of course, require fast and redundant
> > disk subsystems.
> Yes, that's a good option. I designed something similar for an ISP I used to
> work for, never got around to implementing it though. My idea was to have a
> cron job watch the FTP logs to launch rsync. That way rsync would only try
> to copy the files that were most recently updated. There would be a daily
> rsync cron job to cover for any problems in launching rsync from ftpd.
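A minimal sketch of that push script might look like the following. The
staging path, docroot, and hostnames are all made up for illustration,
and push_cmd only prints the rsync command line rather than executing it:

```shell
#!/bin/sh
# Push tested content from the staging server to each production box.
# Paths and hostnames are placeholders; swap in your own.
STAGING_ROOT=/var/staging/htdocs
DOCROOT=/usr/local/www/htdocs
WEB_SERVERS="web1 web2 web3"

# Build the rsync-over-ssh command for one host.  --delete keeps the
# production copy an exact mirror of staging.
push_cmd() {
    echo "rsync -az --delete -e ssh $STAGING_ROOT/ www@$1:$DOCROOT/"
}

for host in $WEB_SERVERS; do
    # sketch: print the command; in production, run it instead
    push_cmd "$host"
done
```

Run from cron (or triggered from the FTP logs as you suggest), with ssh
keys set up so no password is needed, and --delete making sure removed
files disappear from production too.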
> With local disks you get much more bandwidth (even a Gig-E link can't compare
> with a local disk), better reliability, and you can use the kernel-httpd if
> you need even better performance for static content. Finally such a design
> allows you to have a virtually unlimited number of web servers.
Agreed. I think the last comment on scalability is key; I hadn't
thought of that. Removing the common storage makes adding more
webservers as easy as dropping more boxes into the cluster and updating
the load-balancer. Adding more storage isn't a chore either: servers
can be removed one at a time for disk upgrades, or you can simply add
new ones and retire the old, add more drives to the RAID, etc.
Thanks for the advice!
> http://www.coker.com.au/selinux/ My NSA Security Enhanced Linux packages
> http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
> http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
> http://www.coker.com.au/~russell/ My home page
Community Internet Access, Inc.