
Re: how to make Debian less fragile (long and philosophical)

On Tue, Aug 17, 1999 at 12:56:17AM -0400, Justin Wells wrote:
>    #5 -- a hardware error occurs and it corrupts a few files. you
>          don't know how extensive the problem is, but libc is
>          at least one of the files that's been hosed

So you copy libc from your backup disk. What's the big deal?

> You're talking to someone who knows how to list files when "ls" is
> broken ("echo *" is remarkably effective).

I'm duly impressed. 
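
For anyone following along: the trick works because the shell expands
the glob itself, so no external binary has to load. A minimal sketch of
the same idea, assuming a shell such as bash or dash where "echo" and
"[" are builtins; the name list_dir is purely illustrative and not a
standard command:

```shell
# List a directory without running ls: the shell expands the glob itself.
# list_dir is a hypothetical helper name, not a standard command.
list_dir() {
    for entry in "${1:-.}"/*; do
        # Skip the literal pattern left behind when the directory is empty.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        # Strip the leading directory part, printing one name per line.
        echo "${entry##*/}"
    done
}
```

Like "echo *", this skips dotfiles, since a bare "*" does not match
names starting with a dot.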

> > 1) A backup root disk. It's easy, it's cheap in today's market, and it
> > buys me protection from both hardware and human error. (Even static
> > binaries quail in the face of rm.)
> > 2) A serial console. If some disaster is pulling the rug out from under
> > my server, I don't want to rely on its network interface. There's too
> > much that can go wrong, and I can't boot over it.
> > 3) A failover machine. Sometimes things really do break.
> All three of your points assume that it is OK to reboot. 

Are you intentionally being obstinate? You mount your backup disk
read-only, then you copy what you need off of it. You use your serial
line to do it. The serial line's getty is going to be there regardless
of what you do to your libs because it's already running.

If you delete sash, you're in the same boat you'd be in if you deleted
any other shell. Of course, if you're paranoid, you can have another
login using a sash in another location as its shell. Depending on your
particular needs, the serial line can be tied to a sash with no login
at all, which will be even more highly available.

This is a solution that will work in any case where recovery is
possible, as opposed to static linking, which will work if someone
fscks things up a little bit, but not too much. You're putting too much
emphasis on static bins, pure and simple. I've got machines that don't
even have dynamic libs, and let me assure you that they're not
failure-proof.
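
To make the no-reboot recovery concrete, here is a rough sketch of the
copy step. The helper name restore_file, the device /dev/sdb1, the mount
point /mnt/backup and the library path are all assumptions for
illustration; substitute whatever your own backup disk looks like.

```shell
# Hypothetical helper: restore one file from a read-only backup tree.
# Usage: restore_file BACKUP_ROOT LIVE_ROOT RELATIVE_PATH
restore_file() {
    backup_root=$1
    live_root=$2
    rel=$3
    # Copy next to the target first, so the live file is never half-written.
    cp -p "$backup_root/$rel" "$live_root/$rel.new" || return 1
    # Renaming within one filesystem is atomic; processes that already
    # have the old file open keep using the old inode.
    mv -f "$live_root/$rel.new" "$live_root/$rel"
}

# On the damaged machine (device and paths are assumptions):
#   mount -o ro /dev/sdb1 /mnt/backup
#   restore_file /mnt/backup / lib/libc.so.6
```

If the dynamically linked cp and mv are themselves unusable, the same
two steps can be typed by hand from a sash prompt using its built-in
copy and move commands.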

> I know of a machine that was tracking satellite data, and rebooting 
> was absolutely unacceptable. It might lose track of the satellite, 
> and cost millions of dollars in re-orientation. It was only one of
> many redundant machines, but the thought of losing a little bit 
> of that redundancy for even a brief while gave everyone heartburn.

Duly impressed again.

Mike Stone
