
Re: /usr broken, will the machine reboot ?



jacques wrote:
> by error most of the binaries in /usr are erased (killing rm :-(

Everyone has made that mistake at some point.  I know I have!

> The server is still up.

Good.  You can probably do a lot of good recovery in that case.  You
will find lots of stories of how people have recovered systems with
far less available.

> Most of the services are restarted either by copying (rsync)
> the binaries from another squeeze server (both are running Squeeze)
> or uninstalling/installing packages. (apt-*, dpkg and suite are
> restored and running ok)

That sounds like you have done a lot of good work and made good
progress down the recovery path.

> Now the question is: will this machine *reboot* properly,
> including the network?

That depends upon how well you have recovered any missing components.
So the answer is that it depends.  Maybe yes.  Maybe no.  Let me run
some more ideas by you and then you can decide.

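The /var/backups/ directory holds backup copies of the dpkg status
file, which record what was installed on the machine previously.  With
the grep-dctrl and grep-status tools from the dctrl-tools package you
can get a list of installed packages with: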

  $ grep-dctrl -s Package -n "install ok installed" /var/backups/dpkg.status.0
  ...dumps a list of previously installed packages...

  $ grep-status -s Package -n "install ok installed"
  ...dumps the current list of packages installed now...

Putting that information to use, you can see what differs between the
backup file and now.

  $ grep-dctrl -s Package -n "install ok installed" /var/backups/dpkg.status.0 | sort > /tmp/list.prev

  $ grep-status -s Package -n "install ok installed" | sort > /tmp/list.now

  $ comm -3 /tmp/list.prev /tmp/list.now
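
With -3, comm prints names found only in the first file in the left
column and names found only in the second file in an indented column.
If you only care about packages that have gone missing then -23 narrows
it down to the first column.  A minimal sketch, where the function name
missing_packages is my own invention and both files are assumed to have
been sorted the same way:

```shell
# Hypothetical helper: print names that appear in the first sorted list
# but not in the second (here: installed before, not installed now).
missing_packages() {
    # -2 drops lines unique to the second file, -3 drops common lines
    comm -23 "$1" "$2"
}

# Usage, with the two list files created by the commands above:
#   missing_packages /tmp/list.prev /tmp/list.now
```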

But your problem isn't /var being removed; it is /usr being removed.
In that case I would check the directories in /usr/share/doc.  By
policy every package must have a directory there.

  $ ls /usr/share/doc > /tmp/list.doc
  $ comm -3 /tmp/list.now /tmp/list.doc

Then inspect that list and decide how to repair.  You may want to
install more packages.  A very few entries in /usr/share/doc do not
have the same name as a package.  On my machine I have two files and
one directory that are obviously okay, and two directories that should
have been removed but are cruft left behind.  Something for me to
clean.  So don't expect the match to be 100% but it should be very
close to correct.  Enough to cross-check that you have reinstalled the
right packages.
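
If you would rather not read two-column comm output, the same
cross-check can be sketched as a loop over the package list.  The
function name packages_without_docdir and its arguments are
hypothetical, and the same caveat applies: a handful of mismatches are
expected and harmless.

```shell
# Hypothetical helper: print every package name from a list file that
# has no matching entry under the documentation directory.
# $1 = file with one package name per line
# $2 = documentation directory (defaults to /usr/share/doc)
packages_without_docdir() {
    while IFS= read -r pkg; do
        # -e follows symlinks, so a doc directory that is a symlink to
        # another package's directory still counts as present
        [ -e "${2:-/usr/share/doc}/$pkg" ] || printf '%s\n' "$pkg"
    done < "$1"
}

# Usage, with the list file created earlier:
#   packages_without_docdir /tmp/list.now
```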

The 'debsums' program is useful for checking installed package
integrity.  I would definitely run it to check the integrity of the
/usr directory.
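
As a rough sketch of how I might run it: the -s (--silent) option makes
debsums report only errors, and filtering for /usr keeps the output
focused on the damaged tree.  The wrapper name usr_problems and the
DEBSUMS variable are my own assumptions, there only so the command can
be swapped out when experimenting.

```shell
# Command name is overridable so the function can be tried against a stub.
DEBSUMS=${DEBSUMS:-debsums}

# Hypothetical wrapper: print only the debsums complaints that mention
# a path under /usr.  stderr is folded into stdout in case complaints
# are written there.
usr_problems() {
    # "|| true" keeps the exit status at 0 when nothing is reported
    "$DEBSUMS" -s 2>&1 | grep '/usr/' || true
}

# Usage (run as root so every file can be read):
#   usr_problems
```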

From what you say I think you already know how to reinstall individual
packages.  But I will say it here because that is how we all learn
these things.  Since only files in /usr were removed, I think
reinstalling every package should be a good way to recover.

  # apt-get install --reinstall PACKAGENAME
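
To do that for every package in one go, something like the following
sketch could work.  The function name reinstall_all and the APT_GET
variable are assumptions of mine, and the list file is expected to hold
one package name per line.  Setting APT_GET=echo first gives a dry run
that only prints the command.

```shell
# Command name is overridable; APT_GET=echo prints instead of installing.
APT_GET=${APT_GET:-apt-get}

# Hypothetical helper: reinstall every package named in the file $1.
reinstall_all() {
    # The command substitution is deliberately unquoted so that each
    # package name becomes its own argument.
    "$APT_GET" install --reinstall $(cat "$1")
}

# Usage (as root), with the list file created earlier:
#   reinstall_all /tmp/list.now
```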

If you have a second machine running the same OS version, and the list
of installed packages is the same on both machines, then you should be
able to recover most of /usr from the good machine to the problem one.
Or at least compare files and file lists between them.  Run a find
across both machines, sort the output, and then compare the lists.

  $ find /usr -print | sort > /tmp/usr.file.list
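
Once the list from the good machine has been copied over, the
comparison could be sketched like this.  The function name
files_missing_here is hypothetical.  It sorts both lists again under
the C locale because the two machines may not share the same locale,
and comm needs both inputs in the same order.

```shell
# Hypothetical helper: print paths that are in the good machine's list
# ($1) but not in the local list ($2).
files_missing_here() {
    good=$(mktemp)
    here=$(mktemp)
    # Re-sort both lists identically so comm sees consistent ordering
    LC_ALL=C sort "$1" > "$good"
    LC_ALL=C sort "$2" > "$here"
    LC_ALL=C comm -23 "$good" "$here"
    rm -f "$good" "$here"
}

# Usage, assuming the other machine's list was saved as
# /tmp/usr.file.list.good (a made-up name):
#   files_missing_here /tmp/usr.file.list.good /tmp/usr.file.list
```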

I think with the above then you should be fine and the machine should
reboot okay.

If you are very paranoid and pedantic, like me, then I would set up a
scratch machine for testing purposes.  Install everything that should
be installed.  Then remove /usr to recreate the failure.  Then recover
as above and check that the test machine reboots.  Learn of any
problems there first and then improve and repeat the recovery process
until it works.  It would be quite a bit of work!  But it all depends
upon how important this is to you and how much trouble it will be if it
fails.

Good luck!
Bob
