
Re: Mass update deployment strategy



I would say that if you let your machines blindly do an "apt-get update;
apt-get upgrade" every day, most of the time it won't be a problem, but
someday it may be a problem and you might render half your cluster
unbootable.  There are various modifications to this "blind update" theme
as others have suggested, but I think the basic issue remains.  Some
packages will ask a lot of questions during the upgrade.  The thing I think
you really need to worry about is kernel packages.  A few times I have
used stock kernels (not ones I compiled), and when a new update comes out
the apt-get upgrade tries to install the new kernel, update lilo.conf, run
lilo, and advises you to reboot asap.  My personal track record with this
has been less than perfect, and a few times I've needed to revert to an old
kernel, or use an emergency boot CD to fix the problem before I was all
set again.  Therefore, kernel upgrades are something I want to do manually at
this point in my life, and I tend to stick with the same kernel for long
periods of time.
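
One way to keep a routine "apt-get upgrade" away from the kernel is to put
the kernel packages on hold.  The following is only a sketch: the helper
name kernel_holds is made up for illustration, and the package name
patterns (kernel-image-*, linux-image-*) are an assumption about how the
kernel packages are named on your release.

```shell
#!/bin/sh
# Sketch: mark installed kernel image packages as "hold" so that
# "apt-get upgrade" leaves them alone until you upgrade them by hand.
# Assumption: kernel packages are named kernel-image-* or linux-image-*.

# Read "dpkg --get-selections" style lines on stdin and print a
# "package hold" line for every installed kernel image package.
kernel_holds() {
    awk '$2 == "install" && $1 ~ /^(kernel|linux)-image-/ { print $1, "hold" }'
}

# Typical use, as root:
#   dpkg --get-selections | kernel_holds | dpkg --set-selections
```

To release a package again, feed "package install" to
dpkg --set-selections in the same way.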

I am getting ready to deploy hundreds of small embedded devices running
Debian, and keeping them up to date is a potential nightmare, so I am
considering it carefully.  Here is my strategy thus far:
- My devices are the same, hardware wise, and it's my intent to keep them
the same software wise.
- Remove anything I don't need.  Anything that's purged won't need to be
updated.  A samba server doesn't need gcc, or X, or frozen bubble, or
apache, or LaTeX.  Eliminate those packages, and you remove the maintenance
concerns on those packages.
- General security approaches which reduce exposure.  Eliminate services,
use tight firewall rules, network isolation, read-only filesystems, and all
that.
- Look at the security updates and decide for yourself how exposed you are,
and if it's really necessary to upgrade a particular package right now.  If
they find a bug which is only exploitable at the console, and none of my
systems have a console, I don't need to worry about it really, or at least
I can put it off until I have a bunch that matter.  Or in the case of some
exotic (potential) risk with a race condition where my local users would
have to do something ridiculously complex, and I know they aren't smart enough
to know how to do that, I can weigh the cost/benefit of inaction and
potentially postpone this patch until I really need it.
- When I finally do have to make some updates, I will do a couple of
machines by hand, make sure it works, and then write a script which hits
each of the other boxes and does exactly the same thing.
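
The last step could look roughly like the sketch below.  Everything named
here is an assumption for illustration: the function names upgrade_cmd and
rollout, a hosts.txt file with one hostname per line, root login over ssh
with keys, and the package=version arguments, which are meant to be the
exact versions already tested by hand.

```shell
#!/bin/sh
# Sketch: repeat a hand-tested upgrade on every remaining box.
# Assumptions: root ssh key access to each box; hosts come in on stdin.

# Build the remote command from "package=version" arguments so every box
# installs exactly the versions that were tested by hand.
upgrade_cmd() {
    echo "apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y $*"
}

# Run the given command on each host read from stdin, stopping at the
# first failure.  With DRY_RUN=1, only print what would be run.
rollout() {
    cmd=$1
    while read -r host; do
        [ -n "$host" ] || continue          # skip blank lines
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "ssh root@$host $cmd"
        else
            # -n keeps ssh from swallowing the rest of the host list
            ssh -n "root@$host" "$cmd" || { echo "FAILED on $host" >&2; return 1; }
        fi
    done
}

# Typical use (package and version are placeholders):
#   DRY_RUN=1; rollout "$(upgrade_cmd somepackage=1.2-3)" < hosts.txt
```

Stopping at the first failure is deliberate: if one box breaks, I would
rather look at it than repeat the same mistake on the rest.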

That's my plan, we'll see how well it works.  These systems are small data
collection appliances, with no proprietary data, and if one of these gets
hijacked we have taken steps to prevent it from spreading, so the
consequences of a vulnerability are rather small.  If you have a public web
server, that's a whole 'nother story.

Good luck.
Joe



