Re: Crash recovery.

on Wed, Nov 01, 2000 at 11:01:28PM +1000, paul@netwise.net.au (paul@netwise.net.au) wrote:
> Hi Everyone,
> Had a fun experience today. Debian hung and refused any input. The
> only option I had was to power it down. The reboot failed dropping me
> at the command prompt to run fsck manually. This fixed this and that
> and place a number of things in to /lost+found. Rebooting again and
> this time we get stuck starting syslogd. Recreating the dir /var/run
> got me past that but there are obviously still issues from the
> problem. What I need to know is a way to either work out what was
> crunched and placed in /lost+found, whether or not knowing this is
> good enough, and some way of checking the current status of the system
> to repair what is now broken.
> Any input is appreciated.

Backups would be good to have at this point, wouldn't they?  I ran mine
this morning.

First measure at this point is to decide what's valuable to you on the
system, and make sure that it's safe.  This would include personal data,
system configuration data (/etc and your dpkg selections, also
/var/backups), and anything under /usr/local. 
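A rough sketch of pulling those directories into a single archive before
touching anything else. The function name, the target list, and the idea of
a `root` argument (for a damaged system mounted somewhere other than /) are
my own assumptions for illustration, not a standard tool. It does not cover
personal data or the dpkg selections; add those to suit.

```python
import tarfile
from pathlib import Path

# Directories named in the message; adjust to what matters on your system.
DEFAULT_TARGETS = ["/etc", "/var/backups", "/usr/local"]


def archive_targets(targets, dest, root="/"):
    """Write a gzip-compressed tar of each existing target to dest.

    `root` is where the damaged system is visible (an assumption: "/" if
    you are running on it, or a mount point if booted from rescue media).
    Returns the list of targets that were actually archived.
    """
    root = Path(root)
    archived = []
    with tarfile.open(dest, "w:gz") as tar:
        for target in targets:
            relative = target.lstrip("/")
            src = root / relative
            if src.exists():
                # Store paths relative to root so the archive can be
                # unpacked under any directory later.
                tar.add(str(src), arcname=relative)
                archived.append(target)
    return archived
```

Write the archive to a different disk or removable media, not to the
filesystem you are trying to rescue.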

I would *avoid booting from the system* until I've had time to assess
the extent of the damage.  Your Debian installation disk, Tom's
Root/Boot, or the LinuxCare Bootable Business card are all bootable
GNU/Linux systems on removable media, and are strongly recommended.  My
personal favorite is the LinuxCare bootable; you can get the ISO image
by rooting around at http://www.linuxcare.com/.

I would run an e2fsck on all filesystems, *without* allowing correction
of defects first.  If you're dealing with only a few damaged files,
it's probably OK to go ahead and allow the corrections.  If there are
lots (more than a dozen or so) of damaged files, chances are pretty good
that there's a more fundamental problem.  In that case, if possible,
you may want to salvage more of the system before letting e2fsck change
anything (take backups with the filesystems mounted read-only).
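A minimal sketch of the report-only pass, wrapped in Python so the output
can be kept for later. The function names and the device name in the usage
note are placeholders of mine. The flags are from the e2fsck man page: -n
opens the filesystem read-only and answers "no" to every question, and -f
forces a check even if the filesystem is marked clean.

```python
import subprocess


def readonly_check_command(device):
    """Build an e2fsck invocation that reports problems but fixes nothing."""
    # -f: check even if the filesystem is flagged clean.
    # -n: open read-only and answer "no" to all prompts.
    return ["e2fsck", "-f", "-n", device]


def run_readonly_check(device):
    """Run the check and return (exit_status, combined_output)."""
    result = subprocess.run(
        readonly_check_command(device),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode, result.stdout
```

Something like `status, report = run_readonly_check("/dev/hda1")` on an
unmounted partition would do it. Per the man page, an exit status of 0
means no errors were found and 4 means errors were left uncorrected, which
is what you'd expect from a -n run on a damaged filesystem.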

If the system isn't too badly hosed, you will need to identify which
files under the /lost+found directories are what.  You're basically going
to have to do some forensics here; there's no straightforward way of doing
this.  Until you've identified what is or isn't damaged, boot the system
in single-user mode *only*.  You can spawn additional terminals if
necessary (man getty), but it's a good idea to leave the system as
quiescent as possible until you've patched things up.  You may want to
reinstall packages associated with more seriously damaged files.  You
should also look for any possible indication of a hack or other attack
-- usually this will involve changed files under /bin, /sbin, /usr/bin,
or /usr/sbin, and sometimes odd directory names like '...' or '.. '
(note the space).
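A starting point for the forensics, as a sketch only. e2fsck names the
entries it drops in lost+found after their inode numbers (#12345 and so
on), so the first job is guessing what each one was. The signature table
below is a tiny hand-picked sample, and all of the function names are made
up for this example. The last function looks for directories whose names
are nothing but dots and spaces.

```python
import os

# A few well-known file signatures; far from exhaustive.
MAGIC = [
    (b"\x7fELF", "ELF binary or library"),
    (b"\x1f\x8b", "gzip data"),
    (b"!<arch>\n", "ar archive (possibly a .deb)"),
    (b"#!", "script"),
]


def guess_type(path):
    """Guess what a recovered entry is from its type and first bytes."""
    if os.path.islink(path):
        return "symlink"
    if os.path.isdir(path):
        return "directory"
    if not os.path.isfile(path):
        # Device nodes, FIFOs, sockets: don't try to read them.
        return "special file"
    with open(path, "rb") as fh:
        head = fh.read(16)
    for magic, label in MAGIC:
        if head.startswith(magic):
            return label
    return "unknown"


def survey(lost_found):
    """Map each entry in a lost+found directory to (guess, size in bytes)."""
    report = {}
    for name in sorted(os.listdir(lost_found)):
        path = os.path.join(lost_found, name)
        report[name] = (guess_type(path), os.lstat(path).st_size)
    return report


def odd_directories(top):
    """Yield directories named only with dots and spaces ('...', '.. ')."""
    for dirpath, dirnames, _ in os.walk(top):
        for name in dirnames:
            if name.strip(". ") == "":
                yield os.path.join(dirpath, name)
```

Run `survey("/lost+found")` once per filesystem, since each ext2 filesystem
has its own lost+found at its top level. Recovered directories are usually
the easiest to place: their contents tend to give away where they came
from. A hit from `odd_directories` isn't proof of a break-in, just
something to look at by hand.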

If the system *is* badly damaged, salvage what you can, buy a new disk
drive, confirm that you don't have problems somewhere else down the line
(controller, motherboard, CPU -- diagnostics, anyone?), and try a fresh
installation using your existing package selection, then restore
personal and local data.

Good luck.

Karsten M. Self <kmself@ix.netcom.com>     http://www.netcom.com/~kmself
 Evangelist, Opensales, Inc.                    http://www.opensales.org
  What part of "Gestalt" don't you understand?      There is no K5 cabal
   http://gestalt-system.sourceforge.net/        http://www.kuro5hin.org
