
Re: Upgrade a mail server




Craig Sanders wrote:
> 
> On Mon, Feb 18, 2002 at 06:02:38PM -0600, Rich Puhek wrote:
> > Craig Sanders wrote:
> > > i'd love to convert it over to Maildir/ but haven't yet found any way
> > > that doesn't involve many hours of downtime while converting the
> > > mailboxes from mbox format to Maildir.
> > >
> > > one of these days i'll have the time to sit down and work out a good
> > > solution to the problem.  i've got some ideas but no time to work them
> > > out.
> >
> > Look at qmail's site. They've got some nice tools for scripting the
> > switch. Some will need a little bit of cooking for your specific
> > implementation, but basically it's there.
> 
> yes, i've seen all that.
> 
> none of the available scripts do the job in what i consider to be a safe
> manner while the system is STILL RUNNING as a mail server (handling both
> incoming mail AND pop & imap connections from hundreds of users).
> 
Yeah, there are some limitations. Remember though, once you switch to
Maildir, your script can safely plop data into the Maildirs while MDAs
are dropping mail in and MUAs are removing it. That's a key to avoiding
disaster, and one of my favorite Maildir features.
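
As a rough sketch of why that is safe (Python, illustrative only; the
function name and the simplified file-naming scheme are assumptions, not
taken from any particular MDA): a Maildir writer puts each message in its
own file under tmp/ and then renames it into new/, so a reader never sees
a half-written message and no lock file is needed.

```python
import itertools
import os
import socket
import time

_counter = itertools.count()  # keeps names unique within one process


def deliver_to_maildir(maildir, message_bytes):
    """Write one message into a Maildir via tmp/ and rename it into new/."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Simplified unique name: time, pid, per-process counter, hostname.
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    name = "%d.P%dQ%d.%s" % (int(time.time()), os.getpid(), next(_counter), host)
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)

    # O_EXCL: refuse to overwrite if the name somehow already exists.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(message_bytes)
        handle.flush()
        os.fsync(handle.fileno())

    # rename() within one filesystem is atomic, so the message appears whole.
    os.rename(tmp_path, new_path)
    return new_path
```

A conversion script that adds messages this way can run alongside the
MDA and the POP daemon, since each message is a separate file.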

> > Downtime will be the time it takes to install procmail (or other
> > Maildir happy LDA) and restart sendmail (assuming home directories are
> > setup properly.) Also need to restart your POP3 daemon of course.
> > Then, you can fire off a script to convert the mailboxes.
> 
> no, there is a LOT more to it than that...and the fact that NONE of the
> existing conversion scripts take that into account is why i don't
> consider any of them safe enough to use as is.
> 
There's definitely more to it, but your downtime (in terms of not being
reachable via SMTP or POP) is of short duration. Downtime in terms of
mail not being accessible will obviously depend on the quantity of data
and the machine performance.

I agree with the safety issue of the conversion scripts. That's why I
added a few wrinkles to the scripts I used.
  1) Cooked in some sanity checking (does $HOME/Maildir exist? who owns
it? etc. Skip the mbox on failure and print a warning so I can handle it
later).
  2) Added a lot of verbosity to the scripts.
  3) Ran on small chunks of the user list (~100 users at a crack, just
so I had less mess to repair if the stuff hit the fan).
  4) Before shutting down SMTP and POP: made a copy of /var/spool/mail.
  5) Immediately after shutting down SMTP and POP: rsynced the copy
(less downtime than a full copy at that point).
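
Points 1 to 3 could be sketched roughly like this in Python, using the
standard mailbox module. This is an illustration, not the scripts
referred to above: the function names, the (name, home, uid) tuple
layout, and the warning text are all assumptions. It also leaves out
fixing ownership of the new files, which matters if it runs as root.

```python
import mailbox
import os


def maildir_problems(home, uid):
    """Return reasons to skip this user; an empty list means it looks sane."""
    maildir = os.path.join(home, "Maildir")
    if not os.path.isdir(maildir):
        return ["no Maildir at %s" % maildir]
    problems = []
    for sub in ("", "tmp", "new", "cur"):
        path = os.path.join(maildir, sub)
        if not os.path.isdir(path):
            problems.append("missing %s" % path)
        elif os.stat(path).st_uid != uid:
            problems.append("%s not owned by uid %d" % (path, uid))
    return problems


def convert_mbox(mbox_path, maildir_path):
    """Copy every message from an mbox file into an existing Maildir."""
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.Maildir(maildir_path, create=False)
    count = 0
    try:
        for message in src:
            dst.add(message)
            count += 1
    finally:
        src.close()
        dst.close()
    return count


def batches(users, size=100):
    """Yield the user list in chunks so a failure affects fewer accounts."""
    for start in range(0, len(users), size):
        yield users[start:start + size]


def convert_batch(users, spool_dir):
    """Convert one chunk of (name, home, uid) tuples; return skipped names."""
    skipped = []
    for name, home, uid in users:
        problems = maildir_problems(home, uid)
        mbox_path = os.path.join(spool_dir, name)
        if not os.path.isfile(mbox_path):
            problems.append("no mbox at %s" % mbox_path)
        if problems:
            # Verbose on purpose: skipped users get handled by hand later.
            print("WARNING: skipping %s: %s" % (name, "; ".join(problems)))
            skipped.append(name)
            continue
        converted = convert_mbox(mbox_path, os.path.join(home, "Maildir"))
        print("converted %s: %d messages" % (name, converted))
    return skipped
```

Running convert_batch() on one chunk from batches() at a time, and
reading the output before starting the next, keeps any mess small.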

> you might want to risk it on a home system with only a few dozen
> accounts or less, but nobody in their right mind would convert a
> professional mail server with thousands of accounts without first
> planning out every step of the transition and taking the time to think
> of (and work around) every little thing that *could* go wrong.
> 
Definitely true, that's what I did on the last cutovers I did (~1500
users on the last one, ~3000 users on the one prior). And I got to
cut over from Post.Office on NT to Linux/Sendmail/courier-pop. Ugh!

> end-users care about their mail, and any fault with the mail system is
> highly-visible....they WILL call the helpdesk if they can't get their
> mail.
> 
> there are several race-conditions which must be avoided, and oddities
> that WILL cause users to call the helpdesk and whine that they can't
> read their mail (e.g. when the pop daemon is Maildir/ but their mailbox
> hasn't been converted yet because it takes a long time  to convert
> several gigabytes of other users mbox files to maildir.  users will not
> tolerate being unable to access their mail for an hour, let alone a day
> or several days)
> 
You definitely will get some calls if mail is not showing up. My most
recent conversion (the P.O. to Sendmail) had to haul ~2GB of mail up
over a 3MB link (moved the mail server from one POP to another while we
were at it). Not only did I have to transfer the mail, I got to strip
the damn ^Ms out (did I mention how much I miss that Post.Office
server?)  Did have some complaints the next morning, but everyone was
happy when we told them that an upgrade was in progress and the mail was
on its way.

That (lag in conversion) is the biggest hurdle (assuming you already
have user accounts on the machine). You should be alright if you have a
reasonably fast machine. Also worth considering is exploiting any
maintenance windows you may have. You didn't state if you're an ISP or
another institution. If you're a business, university, etc, you most
likely have a defined window when "the system may be down". Even if
you're an ISP (for some reason ISPs got the shaft, and are supposed to
have everything working at full capacity, latest version of everything,
but are never allowed to work on anything), you will have an overnight
period of very low activity. This time could be perfect for such a
cutover.

> the problem isn't unsolvable.  it's just tedious to get it right.  i've
> already mapped out ways around these problems (*) - or at least the
> problems that i've thought of - one of these days when i get the time
> i'll implement it on a test system, and then run it on my real system
> when it is working to my satisfaction.
> 
> in short: if you plan the transition right then the users won't even
> notice that you've changed anything, it will all happen seamlessly
> behind the scenes.   if you don't plan it properly then your helpdesk
> will be overloaded with calls about mail problems.
> 
> (*) it involves using semaphore files in each users homedir to indicate
> whether they are maildir or still mbox, plus procmail/maildrop rules to
> deliver accordingly, and a pop proxy which chooses whether to connect to
> the mbox or Maildir capable pop daemon.  when that's done it is possible
> to safely convert all the mboxes to maildirs
> 
Yow. That might be overkill. I'd try some testing to determine how long
the migration process will take. If you're talking about anything under
4 hours, you can probably count on running from, say, 2AM to 6AM. Set
your, ahem, pickier users to be transferred first. Safety isn't going to
be an issue with Maildirs; the only concern I'd have is making sure that
before you run your script to convert the mboxes you have disabled any
other process that will touch an mbox. Last conversion I did, our
customer (an ISP with ~3000 users) had accepted a maintenance window
where POP and SMTP services would be unavailable, so I simply made sure
sendmail and qpopper were dead, then for good measure I believe I
changed permissions, ownership, or the directory name of
/var/spool/mail, just in case.
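
A back-of-the-envelope way to do that testing, as a Python sketch: time
the conversion of a sample of mailboxes, then extrapolate to the whole
spool. The function names, the 4-hour window default, and the 1.5x safety
factor are assumptions for illustration, and it assumes conversion time
scales roughly with bytes of mail.

```python
import os


def spool_bytes(spool_dir):
    """Total size in bytes of the regular files directly under spool_dir."""
    return sum(entry.stat().st_size
               for entry in os.scandir(spool_dir)
               if entry.is_file())


def estimate_hours(total_bytes, sample_bytes, sample_seconds):
    """Extrapolate full conversion time from a timed sample run."""
    rate = sample_bytes / sample_seconds  # bytes converted per second
    return total_bytes / rate / 3600.0


def fits_window(estimated_hours, window_hours=4.0, safety_factor=1.5):
    """True if the estimate, padded by a safety factor, fits the window."""
    return estimated_hours * safety_factor <= window_hours
```

For example, if a 100 MB sample converts in 60 seconds, a 2 GB spool
works out to roughly 20 minutes, which fits a 2AM to 6AM window easily.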

If preliminary testing indicates you can do the changeover in a matter
of very few hours (which wouldn't surprise me with decent hardware), you
may be better off in terms of effort required than you think. Not to
knock the proposed solution (which is rather elegant, BTW), but we held
up a conversion based on a similar set of concerns. Turns out, the wait
for us to develop the tools necessary created much more of an impact on
our users than the actual server upgrade.

If tests indicate a longer cutover time, then you've got more issues,
and the procmail magic will be the direction you'll have to go...
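
For reference, the core of the semaphore-file idea quoted above is small.
Here is a Python sketch of just the decision logic; the marker file name
and the backend addresses are made up for illustration, and a real setup
would put the same test in the procmail/maildrop rules and in the POP
proxy.

```python
import os

# Hypothetical semaphore file name; its presence means "already converted".
MARKER = ".maildir-converted"

# Hypothetical backend POP daemons, one per mailbox format.
BACKENDS = {
    "mbox": ("127.0.0.1", 1110),
    "maildir": ("127.0.0.1", 1111),
}


def mailbox_format(home):
    """Return 'maildir' if the user's marker file exists, else 'mbox'."""
    if os.path.exists(os.path.join(home, MARKER)):
        return "maildir"
    return "mbox"


def pop_backend(home):
    """Pick the POP daemon a proxy should connect this user to."""
    return BACKENDS[mailbox_format(home)]


def mark_converted(home):
    """Create the marker; call this only after the user's mbox is converted."""
    fd = os.open(os.path.join(home, MARKER), os.O_WRONLY | os.O_CREAT, 0o644)
    os.close(fd)
```

The ordering is the important part: the marker is created only after a
user's conversion has finished, so delivery and POP access switch over
per user instead of for the whole server at once.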

Here's another angle to consider: perhaps a watchdog script to fire off
a conversion script as soon as the user logs in (PPP-wise, this does
assume you're in a dialup-ISP type environment). By the time they finish
connecting, negotiating, and resolving the POP server IP address, most
of their mail will be converted. On the downside... not sure I trust a
script to run that unattended for such a critical process.

Best of luck when you finally convert. You're on the right track to make
sure it doesn't end in disaster.

> craig
> 
> --
> craig sanders <cas@taz.net.au>
> 
> Fabricati Diem, PVNC.
>  -- motto of the Ankh-Morpork City Watch

-- 

_________________________________________________________
                         
Rich Puhek               
ETN Systems Inc.         
_________________________________________________________


