Re: default MTA
- To: debian-devel@lists.debian.org
- Subject: Re: default MTA
- From: Tollef Fog Heen <tfheen@err.no>
- Date: Sat, 01 Jun 2013 11:34:22 +0200
- Message-id: <m238t2maxt.fsf@rahvafeir.err.no>
- Mail-followup-to: debian-devel@lists.debian.org
- In-reply-to: <87hahlz659.fsf@windlord.stanford.edu> (Russ Allbery's message of "Wed, 29 May 2013 17:02:42 -0700")
- References: <20130528010222.GA14069@bongo.bofh.it> <20130529190659.GB32022@angband.pl> <8761y1603c.fsf@windlord.stanford.edu> <201305291945.11107.Chris.Knadle@coredump.us> <87hahlz659.fsf@windlord.stanford.edu>
]] Russ Allbery
> Basically, what we're looking for here is the equivalent of a check engine
> light (except, of course, with better user-visible diagnostics available).
> That's what the end user actually wants: something clear and visible
> indicating that something is wrong, which they can drill down and see the
> details and dismiss the error condition if they want, or have all the
> details available to consult someone who knows more about computers if
> they don't know what to do with it themselves. Historically, root cron
> mail has been exactly that, and that's still a great way of handling it
> for servers, since that mail can be sent off somewhere centrally, analyzed
> and assigned to sysadmins, used to open internal trouble tickets, etc.
I don't think it's a good way at all, since far too often cron mails
aren't actionable. I'll get a mail from some automated process that
tried to run apt-get update and failed (in the middle of the night).
Since that process runs every hour, it'll have succeeded afterwards,
and there's nothing I can do about the mail.
I wish we had a better system where some, but not all, errors would latch
and need acknowledgment. There would be correlation between hosts and
between messages, so if the router's down, you get one message about data
centre A not being able to complete $process, rather than a zillion
individual messages. There would also be merging of identical messages,
so I get one message about $process being broken for the last $time
period (or having a failure rate above $threshold), rather than a
thousand mails because of some error.
Oh, and a pony. Don't forget the pony. Or an otter, I like otters.
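As a rough illustration of the merging and threshold parts of that wish
list, here is a minimal Python sketch. Everything in it is hypothetical:
the RunResult record, the summarize() function and the 0.5 default
threshold are made up for the example and do not describe any existing
tool. It collapses many per-run results into at most one line per
process, and stays silent when the failure rate is at or below the
threshold. Latching and acknowledgment are not covered.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunResult:
    """One run of a job on one host (hypothetical record format)."""
    host: str
    process: str
    ok: bool
    message: str = ""


def summarize(results, threshold=0.5):
    """Merge per-run results into one summary line per failing process.

    A process is reported only if its failure rate over the given
    results is strictly above `threshold`.
    """
    runs = defaultdict(list)
    for r in results:
        runs[r.process].append(r)

    summaries = []
    for process, items in sorted(runs.items()):
        failures = [r for r in items if not r.ok]
        rate = len(failures) / len(items)
        if rate <= threshold:
            continue  # occasional failures that recovered stay quiet
        # Identical hosts and messages are merged via sets.
        hosts = sorted({r.host for r in failures})
        messages = sorted({r.message for r in failures})
        summaries.append(
            f"{process}: {len(failures)}/{len(items)} runs failed "
            f"on {len(hosts)} host(s) ({', '.join(hosts)}): "
            f"{'; '.join(messages)}"
        )
    return summaries
```

With this, an hourly job that fails once in 24 runs produces no output,
while thirty identical failures across three hosts produce a single
line naming the process, the hosts and the shared error message.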
--
Tollef Fog Heen
UNIX is user friendly, it's just picky about who its friends are