
Re: Very huge email service



On Tue, 23 Nov 1999, Baltazar Quinterno wrote:
>Does anyone had to configure mail servers to serve 30 millon users,
>with smtp and pop3, if so can you give any clue , because i dont know
>from where to start.

Let's start by doing some maths.  Assume that an average user checks their
email through a single POP connection once a day.  Some users check their
email every 5 minutes for several hours a day, some only once a week or less.
Anyway, if that averages out to 30M POP connections per day then it's:
30,000,000/24/60/60 = 347 POP connections per second.

Presume that an average user receives 3 emails per day, each delivered over
its own SMTP connection (347 * 3):
1041 SMTP connections per second.

Each POP connection receives on average 3 emails.

Presume that an average email is 10K:
10410KB of data written to disk through SMTP per second.

Presume that 80% of email is received by users and 20% is removed from
inactive accounts:
8328KB of data read from disk and sent out the net per second.

Those numbers are all averages.  Double them for peak usage.
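
The arithmetic above can be reproduced with a few lines of Python.  This is
only a rough sketch: the function and parameter names are made up for
illustration, and the inputs are the same guesses as above (one POP check and
3 emails of 10K per user per day, 80% of mail collected).

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_load(users, emails_per_user=3, email_kb=10, collected_percent=80):
    """Return average per-second load figures for the whole mail system."""
    # One POP connection per user per day, rounded down as in the text.
    pop_per_sec = users // SECONDS_PER_DAY
    # One SMTP connection per delivered email.
    smtp_per_sec = pop_per_sec * emails_per_user
    # Every delivered email is written to disk once.
    write_kb_per_sec = smtp_per_sec * email_kb
    # Only the collected share is read back and sent out over POP.
    read_kb_per_sec = write_kb_per_sec * collected_percent // 100
    return {
        "pop_per_sec": pop_per_sec,
        "smtp_per_sec": smtp_per_sec,
        "write_kb_per_sec": write_kb_per_sec,
        "read_kb_per_sec": read_kb_per_sec,
    }


average = estimate_load(30_000_000)
# Peak load is taken as double the average.
peak = {name: 2 * value for name, value in average.items()}

print(average)
print(peak)
```

Changing any of the guesses (for example 5 emails per user per day) shows how
quickly the disk and network figures move.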

From this we can see that if you have a single machine you will need at least
2 fast-Ethernet connections to the mail system for average use and at least 4
fast-Ethernet connections for peak use.
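
As a hedged sketch of how those link counts fall out: the code below adds the
inbound SMTP and outbound POP traffic together and divides by the usable
capacity of one 100Mbit/s link.  The 80% usable-capacity figure is my own
assumption for protocol overhead (it is not a measurement), 1KB is taken as
1000 bytes, and full-duplex operation is ignored to stay on the safe side.

```python
# Raw capacity of one fast-Ethernet link: 100 Mbit/s = 12500 KB/s.
FAST_ETHERNET_KB_PER_SEC = 100_000_000 // 8 // 1000
# Assumption: only about 80% of the raw capacity is usable for mail data.
USABLE_PERCENT = 80


def links_needed(traffic_kb_per_sec):
    """Number of fast-Ethernet links needed for the given traffic."""
    usable = FAST_ETHERNET_KB_PER_SEC * USABLE_PERCENT // 100
    # Integer ceiling division: a partly used link still has to be installed.
    return -(-traffic_kb_per_sec // usable)


average_traffic = 10410 + 8328  # KB/s in via SMTP plus KB/s out via POP

print(links_needed(average_traffic))      # 2
print(links_needed(2 * average_traffic))  # 4
```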

To sustain that sort of disk throughput you will probably need a RAID-10
array (do not consider RAID-5 - you need more write bandwidth than RAID-5 can
provide) that's capable of delivering 20MB/s writes, 16MB/s reads, and maybe
1000 seeks per second.  Doing this on a single array may be impossible. 
However if you are going to do it then you will probably need at least 12
hard drives (6 way striping with two copies of the data).  Two copies of the
data doubles read throughput and gives reliability.  6 way striping gives 6
times the bandwidth under really heavy load.  Each drive is capable of maybe
80 seeks per second so 12 drives may handle it.  12 drives will handle the
throughput (I've seen much more throughput with fewer drives).
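
A small sketch of that RAID-10 budget, using the figures above (6-way
striping, 2 copies, maybe 80 seeks per second per drive).  The function name
and the simple model are illustrative assumptions only: every copy is assumed
to receive every write, reads are assumed to be spread evenly over all
drives, and controller and bus limits are ignored.

```python
def raid10_budget(stripes, copies, seeks_per_drive, write_mb_s, read_mb_s):
    """Rough per-drive load for a striped and mirrored array."""
    drives = stripes * copies
    return {
        "drives": drives,
        # Upper bound on seeks if all drives work independently.
        "total_seeks_per_sec": drives * seeks_per_drive,
        # Writes go to every copy, so only striping divides the write load.
        "write_mb_s_per_drive": write_mb_s / stripes,
        # Reads can be served by any copy, so all drives share the read load.
        "read_mb_s_per_drive": read_mb_s / drives,
    }


budget = raid10_budget(stripes=6, copies=2, seeks_per_drive=80,
                       write_mb_s=20, read_mb_s=16)
print(budget["drives"], budget["total_seeks_per_sec"])  # 12 960
```

960 seeks per second is just short of the 1000 estimated above, which is why
12 drives only "may" handle it.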

Now calculate storage.  If each email on average sits on your server for a
day before being collected (the users who get the most mail check their mail
most often) and the average is 30K per user per day (3 emails of 10K each)
then the disk space requirement is:

30M*30K = 900GB

As the biggest drive I'm aware of is IBM's 72G drive that means we need 13 *
72G drives to store the data, and another 13 for RAID-1 mirroring.  Given
that my first calculations suggested that 12 drives might provide the
bandwidth, using 26 seems like it should be OK.  We need some benchmarks
though.
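
The storage sum, as a minimal sketch (variable names are illustrative; 1GB is
taken as 1,000,000KB, and file system overhead and free-space headroom are
ignored):

```python
users = 30_000_000
kb_per_user = 30   # average mail waiting on the server per user
drive_gb = 72      # largest drive mentioned above

total_gb = users * kb_per_user // 1_000_000
# Integer ceiling division: a partly filled drive still has to be bought.
data_drives = -(-total_gb // drive_gb)
# RAID-1 mirroring doubles the drive count.
total_drives = 2 * data_drives

print(total_gb, data_drives, total_drives)  # 900 13 26
```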

Now there's the issue of disk layout.  You can't use a standard
/var/spool/mail because there is no file system for Linux which can be used
with 30M files in a directory (I know this for a fact).  There may be a file
system for some other Unix which can do this, but I have doubts.
The best thing to do is use Maildir for delivery, then maybe use a Netapp
Filer to store the data.  Your Netapp can have a couple of gigabit-Ethernet
ports and then talk to multiple Linux machines via NFS.  The Linux machines
can then be in a redundant configuration.  Netapp may be able to support 30M
home directories directly under /home.  However I'd suggest separate
partitions for manageability (you do not want to back up 900G at one go).
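
One possible way to spread the accounts over separate partitions and keep
each directory small is to derive the path from a hash of the user name.
This is only a sketch of the idea, not something tested at this scale: the
/mail root, the partition count of 16, the two hashed directory levels and
the function name are all assumptions made up for the example.

```python
import hashlib
from pathlib import PurePosixPath


def maildir_path(username, partitions=16, root="/mail"):
    """Map a user name to a Maildir on one of several partitions."""
    digest = hashlib.sha256(username.encode("utf-8")).hexdigest()
    # The first 8 hex digits pick the partition (one file system each).
    partition = int(digest[:8], 16) % partitions
    # Two more hashed levels (256 entries each) keep directories small.
    return (PurePosixPath(root) / f"p{partition:02d}"
            / digest[8:10] / digest[10:12] / username / "Maildir")


path = maildir_path("someuser")
print(path)
# A Maildir holds its messages in three subdirectories.
print([str(path / sub) for sub in ("tmp", "new", "cur")])
```

The same user name always maps to the same path, so the SMTP and POP
machines can compute it independently, and each partition can be backed up
on its own.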

Then we come to authentication.  LDAP is the solution I would recommend.  I
have not got it working properly for shell accounts, but for email-only
accounts it passes all my tests.
Hopefully nscd will work well and we won't get 1200 LDAP requests per second
at peak times.

----

I hope this information helps.  Now I have to say, what you should do is hire
some professionals who have worked on moderate-sized ISP servers before.

The largest system I have worked on was for 600,000 users.  Not as large as
what you are doing, but in the same ballpark.  I have colleagues with similar
experience who I can call on (setting this up will take 6 people 3 months of
work if all hardware is available immediately and if all tests show that
things will work - unexpected problems are likely to occur and may blow it
out to 6 months).
Quite seriously, my best advice for you is to contact me via private email
with regard to setting this up for you.


Russell Coker

