Upgrade Sarge > Etch: segfault upgrading libc6: system hosed
Hi all,
The subject says it all.
I've submitted an upgrade report with all the requisite attachments but
wanted to summarize here.
System: IBM PS/ValuePoint 6492X5C 486DX4-100, 32 MB ram, 1200 MB and 840
MB drives.
Was running up-to-date Sarge but used Woody's xserver-s3 driver. Since
Etch uses Xorg I figured I'd take the system down to a non-X state so I
removed everything related to X.
I followed the release notes (retrieved and printed 2007-04-08) to the
letter.
During the step
# aptitude install initrd-tools
which upgrades libc6, something segfaulted. Couldn't do anything. I'm
assuming that libc was hosed and so was everything that uses it.
Everything failed including shutdown. Let it sit for 10 minutes and
then turned off the power.
Rebooted with Woody boot floppies (neither Sarge nor Etch installer will
run on this box). The root fs was totally hosed, e2fsck couldn't fix
it. Was able to mount the /var partition (A good reason to have
separate partitions) and look at the logs. No log entries were made
after the segfault which itself doesn't appear in the logs. Looking at
the logs, it's like the system fell off a cliff.
During the upgrade, I had a top session going in one VT and a watch of
df going in another. While some swap was being used (aptitude will do
that) it wasn't much and disk space was ample.
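For anyone repeating this kind of upgrade on a low-memory box, the same top/df watch could also be appended to a file on removable media, so the last readings survive even if the root filesystem does not. A minimal sketch, assuming a Linux /proc/meminfo; the function name mem_summary and the log path /zip/monitor.log are made up for illustration, and this is not what was run during the upgrade described here:

```shell
# Print free RAM and free swap (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; the function name is hypothetical.
mem_summary() {
    awk '/^(MemFree|SwapFree):/ { printf "%s %s kB\n", $1, $2 }' "${1:-/proc/meminfo}"
}

# Example loop (commented out so the snippet does not block when sourced);
# /zip/monitor.log is an assumed path on the zip disk:
# while :; do date; mem_summary; df -k / /var; sleep 30; done >> /zip/monitor.log
```

Run from a spare VT, the loop would leave a timestamped trail right up to the moment things stop working.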
I'll attach here my plan.txt which was both a plan for the upgrade and
my notes made during it. It was created on the system but on a zip disk
so it survived.
I'll welcome comments and ideas but I'll probably do an old-hardware
juggle and make a new system.
Thanks all,
Doug.
-- plan.txt
Plan for upgrading the 486 box (pluto) from Sarge to Etch.
=========================================================
1. Considerations:
----------------------
Pluto has two roles. Under normal circumstances, it functions
as a remote terminal upstairs to access the main athlon box
(titan) downstairs via ssh for both CLI and X11. It also acts
as a repository for some of the backup information from titan.
Should there be a problem with titan, pluto has its own modem
and tools to allow connection to the internet for browsing,
downloading and email, as well as local reading of
documentation, although it can't burn CDs nor use USB. The
reverse also applies; pluto and titan provide a backup rescue
service to each other. Titan burns CDs and uses USB.
Therefore, prior to upgrading pluto, titan must be prepared to
effect a repair of both itself and pluto in the event of a
simultaneous failure of the upgrade on pluto and the running
system on titan.
Titan is running Etch amd64. The most recent install media we
have is for a daily build long before RC1.
The Debian Installer does not run on pluto due to memory
limitations. The last installer to work was Woody's
boot-floppies. Thus the only way to get pluto to Etch is by
upgrading. The installer requires 48 MB minimum while pluto
only has 32 MB. Etch release notes do not specify a minimum
memory amount for an upgrade. When going from Woody to Sarge,
the apt cache size had to be increased and this will have to be
increased at the appropriate time again. Of more concern,
with only aptitude running on pluto/Sarge, the system is already
using 8 MB swap. Will aptitude for Etch even run properly?
Should the upgrade fail due to lack of memory, the backup plan
for pluto is to try out OpenBSD and NetBSD.
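The cache increase mentioned above is normally done through APT's APT::Cache-Limit setting. A minimal sketch follows; the helper name set_cache_limit, the file name 90cache-limit, and the default size are assumptions for illustration, and the value actually needed on pluto is not recorded here.

```shell
# Write an apt.conf.d fragment that raises APT's cache limit.
# Arguments: target directory, then size in bytes (optional).
# The file name and the default size are assumptions, not recommendations.
set_cache_limit() {
    conf_dir=$1
    bytes=${2:-16777216}
    printf 'APT::Cache-Limit "%s";\n' "$bytes" > "$conf_dir/90cache-limit"
}
```

As root, something like `set_cache_limit /etc/apt/apt.conf.d` would put the fragment where apt reads it; pointing it at a scratch directory first lets you inspect the result.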
Currently, pluto runs Woody's xserver-s3 driver because Sarge's
xservers do not work on this hardware.
Pluto does not use devfs. It uses kernel 2.6.8-3-386.
2. Preliminary tasks: [ALL DONE]
-------------------------
The following preliminary tasks can be completed in any order
but must be completed prior to proceeding.
- Verify that grub-disk works.
[DONE]
- Download, burn, and test Etch amd64 CD1
[DONE]
- Download, burn, and test Etch amd64 netinst.iso
[DONE]
- Download, burn, and test Etch i386 netinst.iso
[DONE]
- Download, burn, and test NetBSD i386, make boot
floppies, and test that the installer boots.
[DONE]
- Download, burn, and test OpenBSD i386, make boot
floppies, and test that the installer boots.
[DONE]
- Print out Etch release notes and put its
instructions in this document.
[DONE]
- Make all preparations for submitting an upgrade report.
Data can be maintained on a zip disk on pluto which can
be read on titan if necessary.
[DONE]
- Determine name of the 486 kernel package that will be
installed.
[DONE]
Currently has:
kernel-image-2.6-386 -->
kernel-image-2.6.8-3-386
Will use:
linux-image-2.6-486 -->
linux-image-2.6.18-4-486
3. Scheduled tasks:
-----------------------
The following tasks must be completed in order unless noted
otherwise.
- Complete a final aptitude update of the sarge system.
[DONE]
- Complete a final backup of pluto and transfer to titan
and then to CD.
[DONE]
- Complete a backup of titan and burn to CD.
[DONE]
- Copy anything relevant from /var/local/backup to zip
disk.
[DONE: nothing. have it on CD]
- Remove everything large from /var/local/backup
[DONE]
- Check for extraneous files in /home, /usr/local/,
/var/local, and /var/log
[DONE]
- Copy to zip disk (directory: pre-shrink):
/var/lib/dpkg/status
/var/lib/aptitude/pkgstates
/var/local/backup/apt_inst.sel
(output of aptitude search '~i!~M')
[DONE]
- Sanitize those files of anything sensitive.
[DONE]
- Purge packages not required for updating. This includes
the X window system and all X apps.
[DONE]
- Ensure directories that should now be empty or gone (per
release notes, esp re X) are so.
[/usr/X11R6 still exists, with directories under ./lib and one
file /usr/X11R6/lib/X11/fonts/75dpi/fonts.alias existing. dpkg
-S doesn't find it. **** Removed the whole /usr/X11R6 directory
tree. ]
[ /etc/X11 directory not empty with a symlink X pointing to the
now missing /usr/bin/X11/XF86_S3, rxvt.menu, and XF86Config.
***Removed the whole /etc/X11 directory.]
- Copy to zip disk (directory: pre-upgrade):
/var/lib/dpkg/status
/var/lib/aptitude/pkgstates
output of aptitude search '~i!~M'
[DONE]
- dpkg --audit
[DONE: no output]
- change sources.list to etch
[DONE]
- apt-cdrom add any etch i386 CDs (netinst or CD1)
[DONE]
- mount /zip
[DONE]
- script -t 2>/zip/upgrade-etch.time -a
/zip/upgrade-etch.script
From here on, any activity that should not go into the
log (e.g. a vi session) should be done from a different
VT.
[DONE]
- pon from titan
[DONE]
- aptitude update
[DONE]
- aptitude upgrade
verify size of download and disk space
requirements.
[DONE. netkit-ping caused a problem, had to run aptitude
interactively to fix it.]
[ after each of these steps, check in /etc/apt for updated
config files. May need to increase cache limit: see release
notes section 4.5.8.]
- aptitude install initrd-tools
[Segmentation fault after debconf asked if I wanted to continue with
upgrade to libc6 re NSS and services needing restart. It logged me
right out. Now I can't log in, don't get a password prompt.
Even when logged in, get spit out as soon as I try something,
e.g. ssh to titan or su. Will try a reboot. Tried to umount
/zip and get segmentation fault.]
Reboot failed, just sitting there after giving everything the
TERM signal. Had to power off. Powered back on, grub menu OK,
loading the kernel failed with crc error and 'system halted'.
Will try tomorrow to boot a rescue floppy and see if I can
access the logs and perhaps get them copied to zip.
-----
root filesystem hosed, e2fsck unable to fix. Able to access
/var (good reason to have separate fs) but _NO_ entries in
syslog or other logs after trouble started.
-----
- aptitude install linux-image-2.6-486
- aptitude dist-upgrade
- aptitude update
uses the newly installed apt to get package
signatures.
- check /boot/grub/menu.lst for updated info,
- exit the script session.
- umount /zip
- shutdown to reboot.
- Assuming all is well, run aptitude interactively and
remove obsolete kernel and generally tidy up.
- Run aptitude again and start reinstalling packages.
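The two "Copy to zip disk" steps above (pre-shrink and pre-upgrade) could be wrapped in one helper so both snapshots are taken the same way. This is only a sketch: the function name save_pkg_state and its optional second argument (an alternate root, handy for a dry run) are hypothetical, and it does not cover the sanitizing step.

```shell
# Copy dpkg/aptitude state files and the list of manually installed
# packages into a destination directory (e.g. /zip/pre-upgrade).
# $1 = destination directory, $2 = optional alternate root (assumption).
save_pkg_state() {
    dest=$1
    root=${2:-}
    mkdir -p "$dest" || return 1
    cp "$root/var/lib/dpkg/status" "$dest/status" || return 1
    cp "$root/var/lib/aptitude/pkgstates" "$dest/pkgstates" || return 1
    # Installed and not automatically installed, as in the plan above.
    if command -v aptitude >/dev/null 2>&1; then
        aptitude search '~i!~M' > "$dest/apt_inst.sel" || return 1
    else
        echo "aptitude not found; skipping package list" >&2
    fi
}
```

With /zip mounted, `save_pkg_state /zip/pre-shrink` before the purge and `save_pkg_state /zip/pre-upgrade` after it would give the two snapshots the plan calls for.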