
Re: Quality Assurance of mailserver?



Sat, 14 May 2011 at 12:11 -0300, Henrique de Moraes Holschuh wrote:
> On Sat, 14 May 2011, Camaleón wrote:
> > > I am running a mailserver with exim, courier-pop, courier-imap and
> > > squirrelmail. It has been running "fine" for about 18 months now. But how
> > > can I assure myself and my customers that I deliver a good-quality
> > > mail server?
> > 
> > You can offer an SLA (Service Level Agreement) that defines and measures
> > some basic aspects of your service.
> 
> Yes.  Assuming you're going to do it professionally (and not
> semi-professionally, or whatever):
> 
> 1. Measure the *SERVICE*, which OUGHT to be different from measuring a
> server (because really, you need at least two servers, active/active or
> active/passive, in a high-availability cluster to deliver something
> worth paying for).
> 
> 2. Measure the standard metrics:
>    a) service uptime (pingdom.com, custom scripts)
>       - this is not just SERVER uptime.
>       - service availability
>    b) service performance metrics
>       - delivery latency (incoming and outgoing.  Outliers are not
>         a problem; the average delay to deliver 90% of the email is).
>       - rejects (incoming and outgoing)
>       - spam/virus blocking
> 
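
The uptime and latency metrics in point 2 can be sketched as follows. This is only an illustration: it assumes probe results have already been collected (one boolean per service check, one delay in seconds per test mail), it reads "delay to deliver 90% of the email" as a 90th percentile using the nearest-rank method, and the function names are made up for this example.

```python
import math


def percentile_nearest_rank(delays, pct):
    """Return the smallest delay such that at least pct% of samples are <= it."""
    if not delays:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(delays)
    # 1-based nearest rank; multiply before dividing to limit rounding error.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def availability_percent(probe_ok):
    """Share of successful service probes (hypothetical input), in percent."""
    if not probe_ok:
        raise ValueError("no probes")
    return 100.0 * sum(1 for ok in probe_ok if ok) / len(probe_ok)
```

With ten sample delays of 1 to 9 seconds plus one 600-second outlier, `percentile_nearest_rank(delays, 90)` ignores the outlier and reports 9 seconds, which is the point of not measuring the worst case.
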
> 3. Measure per-user metrics
>    a) incoming/outgoing bandwidth usage
>    b) storage quota
> 
> 4. Have a data recovery and backup strategy, provide an "undelete" service
>    (your users will demand it), and test them often (or you're toast at the
>    first failure).
> 
> 5. Use the high-availability cluster capabilities to do rolling upgrades: an
>    outdated server is a compromised server... and this also makes sure the
>    HA is working well (which also means you NEED to do the failover in a
>    monitored way to make sure it won't go up in flames if it fails).  E.g.
>    whatever you do, don't use a filesystem that will not tolerate
>    active/active usage, even if normally you prefer to operate in
>    active/passive mode.
> 
> You will need scripts.  munin, nagios, cacti, net-snmp can help you a lot to
> monitor the servers.  queuegraph and mailgraph can help you monitor the MTA
> performance.  You will have to write scripts to monitor service latency
> (send email, measure time until it is delivered to a system pop account,
> measure time until it is delivered at a remote account in gmail, hotmail,
> yahoo, other providers).
> 
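
A service-latency probe of the kind described above could look roughly like this. It is a minimal sketch, not a finished monitor: `measure_delivery_latency`, `send` and `is_delivered` are hypothetical names, and the actual mail submission and mailbox check are passed in as callables so the timing logic stays independent of any particular server.

```python
import time
import uuid


def measure_delivery_latency(send, is_delivered, timeout=300.0, interval=5.0,
                             clock=time.monotonic, sleep=time.sleep):
    """Send a uniquely tagged probe and return seconds until it shows up.

    send(token) submits the probe mail (assumption: the token ends up in
    the subject or body). is_delivered(token) returns True once that mail
    is visible in the target mailbox. Returns None if the timeout expires.
    """
    token = uuid.uuid4().hex          # unique marker for this probe
    start = clock()
    send(token)
    while True:
        if is_delivered(token):
            return clock() - start    # resolution is limited by `interval`
        if clock() - start >= timeout:
            return None               # count this as a failed delivery
        sleep(interval)
```

In practice `send` might wrap `smtplib` and `is_delivered` might poll the local POP account or a remote test account with `poplib` or `imaplib`; running one probe per destination gives a latency series for each provider.
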
> You will need to enroll in yahoo/hotmail/etc spam feedback loops, and keep
> strict monitoring of all RBLs, and take a lot of care with the reputation of
> your outgoing servers and email domain.  You will need a way to temporarily
> switch the outgoing server to a different IP if you get blacklisted (but do
> so only after you *HAVE* fixed the issue that resulted in the blacklisting).
> 
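
For the RBL monitoring, a check can be sketched as below. DNS blacklists are conventionally queried by reversing the octets of the IPv4 address and appending the list's zone. The zone name in the example is a placeholder, the function names are made up, and treating every resolver failure as "not listed" is a simplification.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name for an IPv4 address: reversed octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the zone answers for this IP, False if the name does not resolve.

    Simplification: a DNS timeout also ends up as False here, and the
    returned 127.0.0.x code (whose meaning differs per list) is ignored.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, `dnsbl_query_name("192.0.2.1", "dnsbl.example.org")` gives `1.2.0.192.dnsbl.example.org`. Looping `is_listed` over each outgoing IP and each list you care about, from cron or as a nagios check, would cover the "strict monitoring" part.
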
> The list goes on and on.  SPAM, and the "email attestation" industry
> fomented by the big email services like hotmail and yahoo, are causing a big
> problem.
> 
> -- 
>   "One disk to rule them all, One disk to find them. One disk to bring
>   them all and in the darkness grind them. In the Land of Redmond
>   where the shadows lie." -- The Silicon Valley Tarot
>   Henrique Holschuh
> 
> 

Hi all,
Thank you for your fine answers. I got a lot of inspiration. I know now
that some kind of HA cluster is my target, but the first steps will be
more monitoring with nagios, munin and pingdom. Also I will write some
kind of SLA that my users can use to track my level of service.
Thanks again, and more suggestions are always welcome! :-)

Regards
Lars Nielsen
www.lfweb.dk

