
Re: High volume mail handling architecture



On Thu, 9 Sep 2004 18:44, Marcin Owsiany <porridge@debian.org> wrote:
> On Thu, Sep 09, 2004 at 06:03:20AM +1000, Russell Coker wrote:
> > You have to either be doing something very intensive or very wrong to
> > need more than one server for 20K users.  Last time I did this I got 250K
> > users per server, and I believe that I could have easily doubled that if
> > I was allowed to choose the hardware.
>
> We have a little over 10K users, and the disk subsystem seems to be the
> bottleneck. When we reach about 600 read transactions + 150 write
> transactions per second (as reported by sar -b), the load average starts
> to grow exponentially instead of proportionally. There are about 20K
> sectors read, and 3K written per second. (That was before I turned noatime
> on. After that we had about 2K sector writes and 70 write transactions
> less, and load average dropped to a more sane value - about 3, instead
> of 20.)

Last time I was doing this I had some Dell 2U servers (2650 from memory) with
four 10K rpm U160 disks in a RAID-5 (a 5th disk was hot-spare) and something
like 4G of RAM.  The machines had almost no read access to the drives: less
than 10% of disk access was for reads because the cache worked really well
(the accounts that receive the most mail are the ones that have clients
checking them most often - in some cases people leave their email client on
24*7 checking every 5 mins).

The write bottleneck was just under 3MB/s; I don't recall how many
transactions that was.

To give better performance you may want to look at getting more RAM.  RAM is 
cheap and you can eliminate most read bottlenecks by caching lots of stuff.

3K sectors written per second isn't too good, but I guess that's because of 
the 20K sectors read.  Get some more cache and things should improve a lot.
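
For a rough sense of scale, the sar figures quoted above can be converted to
throughput and average request size.  This is only a sketch: it assumes the
sectors sar reports are 512 bytes each, and the function names are
illustrative, not taken from any tool.

```python
# Assumption: sar -b counts 512-byte sectors.
SECTOR_BYTES = 512


def throughput_mib(sectors_per_sec):
    """Convert a sectors-per-second rate to MiB per second."""
    return sectors_per_sec * SECTOR_BYTES / (1024 * 1024)


def avg_request_kib(sectors_per_sec, transactions_per_sec):
    """Average size of one I/O transaction in KiB."""
    return sectors_per_sec * SECTOR_BYTES / transactions_per_sec / 1024


# Figures quoted in this thread: (sectors/s, transactions/s).
quoted = {
    "read": (20_000, 600),
    "write": (3_000, 150),
}

for name, (sectors, transactions) in quoted.items():
    print(f"{name}: {throughput_mib(sectors):.2f} MiB/s, "
          f"{avg_request_kib(sectors, transactions):.1f} KiB per transaction")
```

Under that assumption the reads come to a little under 10 MiB/s and the
writes to roughly 1.5 MiB/s, which shows how much of the disk time the reads
are taking.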

Also, if you use a typical Unix mail server (Postfix, Sendmail, etc) then the
data is written synchronously somewhere under /var before being read from
there and written to the destination.  If you use an NVRAM card from UMEM
http://www.umem.com/ for /var/spool then you could possibly double mail
delivery performance.  If you use data=journal and put the journal for the
mail store file system on the umem device you could probably double
performance again.

> Also, did you implement virus/spam scanning on that box?

No!  Virus/spam scanning was on the front-end machines.  It was believed that 
the mail store machines were busy enough with doing the most basic work 
without virus scanning (also the number of licences for the anti-virus 
program didn't match the number of store machines that were planned).

You want to do as much work away from the mail store as possible.  Mail store
machines cannot be replaced without major inconvenience to everyone
(customers, staff, management).  Front-end anti-virus machines are
disposable.  If you have a traffic balancing device (such as a Cisco
LocalDirector or IPVS) in front of a cluster of anti-virus machines then an
anti-virus machine can go down for a few days without anyone bothering.

If (hypothetically) anti-virus were to take 10% of the performance from a
mail store then it could require another mail store machine (if you have 5-10
machines), and that's one more machine which can break and cause massive pain
to everyone.
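
The arithmetic behind that can be sketched as follows.  It assumes the
existing machines are fully used and that load spreads evenly across them,
which is a simplification; the function name and the 10% figure are purely
illustrative.

```python
def machines_needed(current_machines, overhead_percent):
    """Machines required to keep the same capacity after losing
    overhead_percent of each machine's performance.

    Assumes every machine is fully utilised and load spreads evenly.
    Integer ceiling division avoids floating-point rounding surprises.
    """
    remaining = 100 - overhead_percent
    return -(-current_machines * 100 // remaining)


for n in (5, 10):
    print(n, "machines become", machines_needed(n, 10))
```

With these assumptions a 10% loss turns 5 machines into 6 and 10 machines
into 12, so the cost is one or two extra mail stores depending on where you
start.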

Another thing, a mail store machine should require almost no CPU power.  Give 
it a single CPU that's not the fastest available.  It sucks when you have two 
almost unused CPUs which are both fast and hot and then one breaks down 
killing the machine.

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page


