
Re: how to make Debian less fragile (long and philosophical)



Justin Wells <justin@semiotek.com> writes:

> I *do* care that I can get a root shell. I *do* care that I can install
> packages. I *do* care that I can fsck my disk. I *do* care that I can
> edit configuration files. I *do* care that I can partition my disk. 
> I *do* care whether I can bring my network back up or not (to ftp or
> nfs mount something important). 

That's the reason why Debian has a rescue disk. For the same reason I
always keep a swap partition, so I can stick a base.tgz onto it in an
emergency. You can also boot a rescue system from a CD or a removable
medium (like a Zip disk).

The chance that you destroy your base system during an upgrade of
stable is so low that a rescue disk is good enough; there is no need
to waste disk space and memory, and to slow the system down, by having
only static binaries on /. Besides, what would be the point? As soon
as a broken libc is released, all the binaries would be statically
linked against it. If the library is broken, all the binaries are
broken as well, and fixing that would be much more work than just
copying the libc from the rescue disk.

> This is not FUD. This is something that 30 years of Unix experience has
> taught us that we need, and that every other decent OS provides. Look
> at Solaris, SunOS, FreeBSD, NetBSD, BSDI, OSF/1, RedHat, Caldera, 
> HPUX, SCO, and for that matter, I'm willing to bet NT has some 

I think 99.99% of the binaries on those systems are dynamically
linked, too.

> statically linked system tools just for system repair--certainly
> it's predecessor VMS did.
> 
> Debian has callously thrown away 30 years of hard won knowledge here, 
> because for some reason people believe the intricate dependency manager
> is a replacement for common sense.

30 years of hard-won knowledge also state "Never change a running
system" and "Always make backups". People keep making the same
mistakes again. :)
 
> This is similar to when the world trade center replaced basic, ordinary
> emergency lights (which turn on by laws of physics when the power fails)
> with a centrally controlled computer run emergency light system. That 
> is why the whole building went black when the computer got blown up. 
> 
> You do NOT replace trusted, well tested, and simple precautions with 
> complicated, not well tested, and fancy ones. You do not need to have
> a high-performance multi-threaded dynamically linked fsck--you need
> one that works reliably when you really need it most.
> 
> There is only ONE advantage in dynamic linking, and that is a performance
> advantage: dynamic binaries are smaller, and load faster, and use fewer
> system resources. 

Your suggestion has more drawbacks: it is not only slower and consumes
much more disk space and much more RAM, it also introduces many points
of failure instead of only one. It is far easier to fix one broken
dynamic library than >50 binaries.
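
To make the size point concrete, here is a trivial C program you can
build both ways and compare. The exact numbers depend on your
toolchain and libc, so treat it purely as an illustration; the file
names and commands in the comment are just examples:

    /* hello.c - compare the size of a dynamic and a static binary.
     *
     * Example build (flags/paths may differ on your system):
     *
     *   gcc -o hello-dynamic hello.c
     *   gcc -static -o hello-static hello.c
     *   ls -l hello-dynamic hello-static
     *
     * The dynamic binary stays at a few KB because the libc code lives
     * in the shared library; the static one carries its own copy of
     * every libc routine it uses and is typically hundreds of KB.
     * Multiply that by every tool in /bin and /sbin to see the cost.
     */
    #include <stdio.h>

    int main(void)
    {
        printf("hello, world\n");
        return 0;
    }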

> Now if apt-get, fsck, dpkg, /bin/sh, ifconfig, route, ping, fsck,
> mount, umount, mke2fs, dump, restore, ps, ln, and dd were somehow
> performance-critical applications you might have some kind of point:

They are memory-critical. sh, ifconfig, route, fsck, mount, ... are
run before swap gets activated, and with 500 KB of static garbage in
each of them instead of one shared library, low-memory machines won't
work anymore. And it's not only low-memory machines that have the
problem: every time fsck runs on my RAID during bootup, the boot fails
with out-of-memory errors, and I have 128 MB of RAM. Granted, my RAID
is big, but it shows that memory at that stage is precious.
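
The reason one shared library is so much cheaper is that the kernel
maps the same libc file into every dynamically linked process and
keeps only one copy of its read-only pages in RAM. A minimal sketch
(Linux-specific, assumes /proc is mounted) that lets a process print
its own memory map, where libc shows up as such a file-backed mapping:

    /* maps.c - print this process's own memory map via Linux /proc.
     * In the output, libc appears as a mapping of the shared library
     * file; those read-only text pages are shared by every dynamically
     * linked process on the system, so the code is in RAM only once.
     * A statically linked binary instead carries a private copy.
     */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/self/maps", "r");
        char line[256];

        if (f == NULL) {
            perror("fopen /proc/self/maps");
            return 1;
        }
        while (fgets(line, sizeof(line), f) != NULL)
            fputs(line, stdout);        /* look for the libc.so lines */
        fclose(f);
        return 0;
    }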

> running 1000 simultaneous copies of mke2fs might be a problem if
> it were statically linked. All of these binaries are very small, 
> though, most only one or two hundred K at most--so even then I 
> would guess your average modern machine could handle it.

But statically linked, they would grow a lot.
 
> Let me know when you set up an apt-get server; and when you start 
> hosting a machine that allows hundreds of users to run thousands
> of copies of fsck and fdisk.

As I said above, one fsck is enough to exhaust 128 MB of RAM.
...
> Note that "I goofed up and had to copy libC from another machine, it
> took five minutes" is bad. "I goofed up, had to reboot from boot floppies,
> needed to re-install part of my OS, and hunt down my backup tapes" is
> a fu#!king disaster.

Installing something on a production system without an on-the-fly
backup/restore is stupidity. If you don't have the five minutes it
takes to restore a libc from the rescue disks, use a second system or
a chroot to check whether the update works first.
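
Such a chroot test can be as simple as unpacking a spare base system
somewhere and running the upgrade inside it first. A rough C sketch of
entering such a tree (must be run as root; the path used here is only
an example, not any standard location):

    /* chroot-test.c - drop into a spare, unpacked Debian tree to try
     * an upgrade there before touching the real system.  Run as root.
     * "/mnt/testroot" is an example path, not anything standard.
     */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        if (chroot("/mnt/testroot") != 0 || chdir("/") != 0) {
            perror("chroot");
            return 1;
        }
        /* start a shell inside the test tree; upgrade libc in there */
        execl("/bin/sh", "sh", (char *) NULL);
        perror("execl");
        return 1;
    }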

> The system should strive to guarantee the availablity of anything 
> that you might need in single user mode, and you are much more able
> to guarantee that when it's statically linked. 

That's not true. If something like the libc is broken, all the
statically linked binaries will be broken as well, and then it is much
more work. And if something gets deinstalled by mistake, it's gone, no
matter whether it was linked dynamically or statically.

May the Source be with you.
                        Goswin

