Re: Quality Assurance of mailserver?

On Sat, 14 May 2011, Camaleón wrote:
> > I am running a mailserver with exim, courier-pop, courier-imap and
> > squirrelmail. It has been running "fine" for about 18 months now. But how
> > can I assure myself and my customers that I deliver a good quality
> > mail-server? 
> You can commit to an SLA (Service Level Agreement) that defines and
> measures some basic aspects of your service.

Yes.  Assuming you're going to do it professionally (and not
semi-professionally, or whatever):

1. Measure the *SERVICE*, which OUGHT to be different from measuring a
server (because really, you need at least two servers, active/active or
active/passive, in a high-availability cluster to deliver something
worth paying for).

2. Measure the standard metrics:
   a) service uptime (pingdom.com, custom scripts)
      - this is not just SERVER uptime.
      - service availability
   b) service performance metrics
      - delivery latency (incoming and outgoing.  Outliers are not
        a problem; the delay within which 90% of the email gets
        delivered is what matters).
      - rejects (incoming and outgoing)
      - spam/virus blocking

3. Measure per-user metrics
   a) incoming/outgoing bandwidth usage
   b) storage quota

4. Have a data recovery and backup strategy, provide an "undelete" service
   (your users will demand it), and test them often (or you're toast at the
   first failure).

5. Use the high-availability cluster capabilities to do rolling upgrades:
   an outdated server is a compromised server... and this also makes sure
   the HA is working well (which also means you NEED to do the failover in
   a monitored way, to make sure it won't go up in flames if it fails).
   E.g. whatever you do, don't use a filesystem that will not tolerate
   active/active usage, even if normally you prefer to operate in
   active/passive mode.
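
To make the metrics in item 2 concrete, here is a minimal sketch of how the
"90% of the email" latency figure and the service availability could be
computed from probe data.  It assumes the 90% figure is read as a
90th-percentile (nearest-rank) latency, which is one reasonable
interpretation and not the only one.  The function names and the sample
numbers are made up for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def availability(probe_results):
    """Fraction of service probes that succeeded (True means the
    service answered correctly, not merely that the host was up)."""
    if not probe_results:
        raise ValueError("no probes")
    return sum(1 for ok in probe_results if ok) / len(probe_results)


# Hypothetical delivery latencies in seconds, including one outlier.
latencies = [2.1, 3.4, 1.9, 2.8, 600.0, 2.2, 3.0, 2.5, 2.7, 2.4]

# The outlier does not move the 90th percentile, unlike a plain mean.
print("p90 latency:", percentile(latencies, 90))
print("availability:", availability([True, True, False, True]))
```

A percentile is used here instead of a mean because a single stuck message
would otherwise dominate the number you report.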

You will need scripts.  munin, nagios, cacti, net-snmp can help you a lot to
monitor the servers.  queuegraph and mailgraph can help you monitor the MTA
performance.  You will have to write scripts to monitor service latency
(send email, measure time until it is delivered to a system pop account,
measure time until it is delivered at a remote account in gmail, hotmail,
yahoo, other providers).
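
A rough sketch of such a latency script, assuming Python's standard smtplib
and poplib modules and a dedicated probe mailbox.  The host names, account
and function names are placeholders, not anything specific to exim or
courier.  The timing loop takes the send and check steps as parameters, so
the same loop can be reused for the local pop account and for remote
accounts.

```python
import poplib
import smtplib
import time
import uuid
from email.message import EmailMessage


def send_probe(smtp_host, sender, recipient, token):
    """Send one probe message carrying a unique token in the subject."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "latency-probe " + token
    msg.set_content("probe")
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


def probe_arrived(pop_host, user, password, token):
    """Return True if a message with the token is in the POP3 mailbox.
    The probe is deleted once seen, so the mailbox does not fill up."""
    box = poplib.POP3_SSL(pop_host)
    try:
        box.user(user)
        box.pass_(password)
        count = len(box.list()[1])
        for i in range(1, count + 1):
            headers = box.top(i, 0)[1]  # header lines only, as bytes
            if any(token.encode() in line for line in headers):
                box.dele(i)
                return True
        return False
    finally:
        box.quit()


def measure_latency(send, arrived, timeout=300.0, interval=5.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Send a probe, poll until it shows up, return seconds elapsed.
    Returns None if the probe did not arrive within the timeout."""
    token = uuid.uuid4().hex
    start = clock()
    send(token)
    while clock() - start < timeout:
        if arrived(token):
            return clock() - start
        sleep(interval)
    return None
```

You would bind the real hosts with something like functools.partial (for
example partial(send_probe, "smtp.example.org", "probe@example.org",
"probe@example.org") as the send argument) and feed the results to whatever
graphs your monitoring already draws.  The resolution of the measurement is
limited by the polling interval.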

You will need to enroll in yahoo/hotmail/etc spam feedback loops, and keep
strict monitoring of all RBLs, and take a lot of care with the reputation of
your outgoing servers and email domain.  You will need a way to temporarily
switch the outgoing server to a different IP if you get blacklisted (but do
so only after you *HAVE* fixed the issue that resulted in the blacklisting).
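
For the RBL monitoring, a minimal sketch of the usual DNS-based check: the
octets of the IPv4 address are reversed, the list's zone is appended, and
the resulting name is looked up.  An answer means "listed", a failed lookup
is treated as "not listed".  The zone and the address used below are
placeholders; check each list's usage policy before querying it from a
script.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the DNSBL zone returns an address for this IP.
    Note: a resolver failure also raises gaierror, so False can mean
    "lookup failed" as well as "not listed"."""
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


# Placeholder values; 192.0.2.0/24 is a documentation-only range.
print(dnsbl_query_name("192.0.2.10", "dnsbl.example.org"))
```

Run something like this from cron for every outgoing IP and every list you
care about, and alert on any change from "not listed" to "listed".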

The list goes on and on.  SPAM, and the "email attestation" industry
fomented by the big email services like hotmail and yahoo, are causing a big

