Can we build a proper email cluster? (was: Re: Why is debian.org email so unreliable?)
- To: debian-isp@lists.debian.org
- Subject: Can we build a proper email cluster? (was: Re: Why is debian.org email so unreliable?)
- From: Henrique de Moraes Holschuh <hmh@debian.org>
- Date: Tue, 12 Oct 2004 18:29:22 -0300
- Message-id: <20041012212922.GA10921@khazad-dum.debian.net>
- In-reply-to: <20041012184336.GA3092@yzma.clarkk.net>
- References: <20040921141306.GY642@parcelfarce.linux.theplanet.co.uk> <Pine.LNX.4.58.0409241218360.1184@gradall.private.brainfood.com> <Pine.LNX.4.58.0409241230280.1184@gradall.private.brainfood.com> <20041012132627.GA1194@tonelli.sns.it> <20041012184336.GA3092@yzma.clarkk.net>
So, one way or another, the problem is that our email infrastructure
is inadequate at the server level?
We have a lot of resources, so why can't we invest some of them in a small
three- or four-machine cluster to handle all Debian email (MLs included), and
tune the entire thing from the ground up just for that? And use it *only* for
that? That would be enough for two MX, one ML expander and one extra
machine for whatever else we need. Maybe we need more, but going from two
(master + murphy) to four optimized, exclusive-for-email machines should be a
good start :)
It is not as if proper email processing were not one of the top three most
critical infrastructure parts of the project. Too much of Debian relies on
it. The BTS needs it working or it is effectively read-only. Our
collaborative work needs the MLs in tip-top shape, or it suffers a LOT. Way,
way too many developers use @debian.org as their primary Debian contact
address (usually the ONLY well-advertised one), and get out of the loop
every time master.d.o croaks.
One of the obvious things that comes to mind is that we should have MX
machines with very high disk throughput, of the kind we need RAID 0 on top
of RAID 1 to get. Proper HW RAID (defined as something as good as a
fully-fitted Intel SRCU42X) would help, but even LVM+MD combined with proper
SCSI U320 hardware would give us more than 120MB/s read throughput (I have
done that). Maybe *external* journals on the performance-critical filesystems
would help (although data=journal makes that a *big* maybe for the spools,
logging on /var always benefits from an external journal). In that case we
would obviously need two IO-independent RAID arrays. That means at least 6
disks, but all of them can be small.
The second is to use a filesystem that copes very well with power failures,
and tune it for spool work. IMHO a properly tuned ext3 would be best, as
XFS has data integrity issues on crashes even if it is faster (and maybe the
not-even-data=ordered XFS way of life IS the reason it is so fast). I don't
know about ReiserFS 3, and ReiserFS 4 is too new to trust IMHO.
The third is to not use LDAP for lookups, but rather to cache them all in a
local, extremely fast DB (I hope we are already doing that!). That alone
could get us a big speed increase on address resolution and rewriting,
depending on how the MTA is configured.
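To make the local-cache idea concrete, here is a minimal sketch in Python. It
is only an illustration, not how the debian.org machines actually work: the
table name "forwards", the two helper functions and the (local_part, target)
pairs are all hypothetical, and the step that would periodically export those
pairs from LDAP is assumed to exist elsewhere. SQLite stands in for whatever
fast local DB the MTA can query.

```python
import sqlite3


def build_cache(conn, entries):
    """Replace the cache with fresh (local_part, target) pairs.

    `entries` is assumed to come from a periodic LDAP export (hypothetical).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS forwards ("
        "local_part TEXT PRIMARY KEY, target TEXT NOT NULL)"
    )
    with conn:  # the DELETE and the INSERTs are committed together
        conn.execute("DELETE FROM forwards")
        conn.executemany("INSERT INTO forwards VALUES (?, ?)", entries)


def lookup(conn, local_part):
    """Return the forwarding target for local_part, or None if unknown."""
    row = conn.execute(
        "SELECT target FROM forwards WHERE local_part = ?", (local_part,)
    ).fetchone()
    return row[0] if row else None
```

With something like this, each address lookup becomes a single indexed read
on the local machine, and the LDAP server is contacted only when the cache is
rebuilt.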
Others in here surely have more experience than I do in this area, and
I am told exim can be *extremely* fast for mail hubs. Why can't we work to
have an email infrastructure that can do 40 messages/s sustained?
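For a sense of scale, the 40 messages/s target can be turned into a daily
volume and a per-message time budget. This is plain arithmetic, sketched in
Python; the parallel_deliveries parameter is a hypothetical knob for how many
messages are in flight at once, not a measured property of any Debian
machine.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def messages_per_day(rate_per_second):
    """Messages handled in 24 hours at a constant sustained rate."""
    return rate_per_second * SECONDS_PER_DAY


def budget_ms_per_message(rate_per_second, parallel_deliveries=1):
    """Average wall-clock time (ms) each message may take.

    parallel_deliveries is an assumed number of messages processed
    concurrently; with 1, messages are handled strictly one at a time.
    """
    return 1000.0 * parallel_deliveries / rate_per_second


print(messages_per_day(40))          # 3456000
print(budget_ms_per_message(40))     # 25.0
print(budget_ms_per_message(40, 8))  # 200.0
```

So 40 messages/s sustained is about 3.5 million messages a day, and a strictly
serial pipeline would have 25 ms per message for all lookups and spool writes
combined.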
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh