
Re: how to make Debian less fragile (long and philosophical)



On Mon, Aug 16, 1999 at 05:32:09AM -0700, tony mancill wrote:
[snip]
> top-notch.  It is.  I've had a dozen boxes in production since early 1997,
> and have had no serious failures attributable to Debian.  (I run only
> stable.)  Things that have failed are either the kernel itself or
> hardware.  And this, I think, is key to Mr. Wells' point.  When the /usr
> partition turns to mush, you'd like to be able to boot your system and
> have enough tools to try to salvage something, or maybe even start a
> restore.  

Granted. And static binaries really don't buy you much here. Read on. 

> Instead of trying to have a superfloppy with 30MB of tools on
> it, if you have statically linked binaries, you can run them from
> /target/bin/tool and not have to worry about dependencies on libraries,
> which may (also) be whacked.  

Most of the stuff in /sbin only relies on a couple of libraries. If
those couple of libraries get nailed, it's very likely that at least one
of the static binaries you need will also get nailed. If you actually
need to recover from a massive hardware failure you need a reliable
solution--you don't need to be playing guessing games about which
binaries are toast and which are still working. Or can you think of a
massive hardware failure that only targets files with "lib" in the
filename?
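(You can see for yourself how few libraries are actually involved. A quick sketch with ldd; the exact output and paths vary by system, but on a typical box it boils down to libc plus one or two others:)

```shell
# List the shared libraries needed by everything in /sbin.
# Non-ELF files (scripts) make ldd complain, hence 2>/dev/null.
for bin in /sbin/*; do
    ldd "$bin" 2>/dev/null
done | awk '{print $1}' | sort -u
```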

> I think that the worst thing I ever had to do on a Debian box was
> recreate enough device files in /dev so that the box could boot. I
> couldn't tell you what happened to them (some other form of
> makedevbadness? ;), but my system was toast, and MAKEDEV wouldn't run.

This is a nice anecdote, but how does it relate to the need for a set of
static binaries? (I apologize for the harshness, but this thread keeps
growing without anything substantial being added. I'm half tempted to
just dig through the list archives and post a 500k tarball of the last
couple of times this argument came up so we could all save our breath...)
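(For what it's worth, when MAKEDEV won't run, recreating a bare-minimum /dev by hand is mostly a matter of knowing the standard Linux major/minor assignments. A sketch, run as root; adjust the disk device for your hardware:)

```shell
#!/bin/sh
# Recreate a minimal /dev by hand (run as root).
# Major/minor numbers are the standard Linux assignments.
mknod /dev/console c 5 1
mknod /dev/tty     c 5 0
mknod /dev/null    c 1 3
mknod /dev/zero    c 1 5
mknod /dev/hda     b 3 0   # first IDE disk; substitute your own
```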


On Mon, Aug 16, 1999 at 10:11:51AM -0400, Justin Wells wrote:
> Debian has callously thrown away 30 years of hard won knowledge here, 
> because for some reason people believe the intricate dependency manager
> is a replacement for common sense.

No. Using dynamic libs was a decision made after weighing the advantages
and disadvantages of static linking. It wasn't done on a whim. It wasn't
done by a pack of fools with no common sense. You know, the more I
reread that paragraph, the more insulting it becomes.

You apparently ignored my subtle hinting and didn't bother to dig
through the archives. Since you wouldn't spend your time on it, I'll
waste my own (and that of everyone else on the list.) 

Dynamic linking provides these benefits:
1) Saves space.
2) Removes the need for separate rescue and run-time binaries.
3) Easier to update critical binaries in the event that a library flaw
is discovered. (E.g., security problem or correctness issue.)

and has this flaw:
4) Library damage can prevent binaries from working.

You can pooh-pooh point 1. For most people this isn't a big issue.
Another 20-30M isn't a huge chunk on a modern hard disk. We do have some
users with older systems, but they can cope. Point 2 is a little harder.
This is a volunteer effort. It's hard enough sometimes for people to
maintain their packages without maintaining an extra set of them. You
could put the static versions in /sbin instead of breaking them off into
a separate dir, but then you waste RAM instead of hd space. (A dynamic
lib is loaded once and shared; statically linked code is loaded again
with each binary.) It's a tradeoff, just like everything else in a
distribution. Point 3 is more thought-provoking. If you statically link
everything, then any libc update means updating 30 packages by 30
different maintainers rather than updating a single libc package.
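(The RAM half of that tradeoff is easy to see on a running system: every dynamically linked binary maps the same single copy of libc, where static binaries would each drag in their own. Illustrative; paths differ by distribution and architecture:)

```shell
# Every dynamic binary on the system points at the same libc.
file /bin/ls             # reports "dynamically linked" on most systems
ldd /bin/ls | grep libc  # the shared copy that all of them map
```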

Against the pros of dynamic linking, you have a single con: a damaged
library can prevent programs from loading. But as I said earlier: what
are the odds that a massive failure will affect only libraries? If even
one of your static binaries is destroyed, you're in the same place that
you were with a broken library. (E.g., a disk problem or a bad case of
the dumbs wiped out /dev and /lib. You've got static bins, but mknod
also got wiped out. Bummer.) For static bins to be useful you need a
particular combination of disaster and luck.  Optimizing for that
combination is like writing an algorithm for best-case performance: I
can't say that it never helps, but it buys you nothing when you really
need it. If a particular machine has to be highly available, it needs
more than static binaries. 

What do I think a machine needs to be reliable? Here are a couple for
starters:
1) A backup root disk. It's easy, it's cheap in today's market, and it
buys me protection from both hardware and human error. (Even static
binaries quail in the face of rm.)
2) A serial console. If some disaster is pulling the rug out from under
my server, I don't want to rely on its network interface. There's too
much that can go wrong, and I can't boot over it.
3) A failover machine. Sometimes things really do break.
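(Point 1 doesn't need anything exotic: keep a second disk with a copy of the root filesystem and a boot entry for it. A sketch, using hypothetical device names that you'd adjust for your hardware:)

```shell
#!/bin/sh
# Refresh a backup root partition from the live one (run as root).
# /dev/hdc1 is a hypothetical second disk; substitute your own.
mount /dev/hdc1 /mnt/backup-root
cp -ax / /mnt/backup-root   # -a preserves everything, -x stays on one fs
umount /mnt/backup-root
# Then make sure your boot loader has an entry for root=/dev/hdc1.
```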

I know that 3) can be prohibitive. But 1 & 2 don't really cost that much
if reliability is important. (And if it's not, then why worry about
static binaries at all?) There is no combination of static binaries that
can give me the reliability of booting off a known good drive. That's
the reasoning that really underlies the choice of dynamic libs--there is
no benefit to pinning anyone's hopes on a false promise of reliability,
even if there weren't some drawbacks inherent in static linking.

[snip]
> You do NOT replace trusted, well tested, and simple precautions with 
> complicated, not well tested, and fancy ones. You do not need to have
> a high-performance multi-threaded dynamically linked fsck--you need
> one that works reliably when you really need it most.

That's exactly right. That's why you don't screw around with a
production system unless you have a way out. And that's why you don't
run a production system unless you have a way to compensate for
catastrophic failure. And that has nothing to do with static binaries.

Mike Stone
