
Upgrade Sarge > Etch: segfault upgrading libc6: system hosed



Hi all,

The subject says it all.

I've submitted an upgrade report with all the requisite attachments but
wanted to summarize here.

System: IBM PS/ValuePoint 6492X5C 486DX4-100, 32 MB RAM, 1200 MB and 840
MB drives.

Was running up-to-date Sarge but used Woody's xserver-s3 driver.  Since
Etch uses Xorg I figured I'd take the system down to a non-X state so I
removed everything related to X.

I followed the release notes (retrieved and printed 2007-04-08) to the
letter.

During the step 
	# aptitude install initrd-tools
which upgrades libc6, something segfaulted.  Couldn't do anything.  I'm
assuming that libc was hosed and so was everything that uses it.
Everything failed including shutdown.  Let it sit for 10 minutes and
then turned off the power.

Rebooted with Woody boot floppies (neither Sarge nor Etch installer will
run on this box).  The root fs was totally hosed, e2fsck couldn't fix
it.  Was able to mount the /var partition (a good reason to have
separate partitions) and look at the logs.  No log entries were made
after the segfault, which itself doesn't appear in the logs.  Looking at
the logs, it's like the system fell off a cliff.

During the upgrade, I had a top session going in one VT and a watch of
df going in another.  While some swap was being used (aptitude will do
that) it wasn't much and disk space was ample.

I'll attach here my plan.txt which was both a plan for the upgrade and
my notes made during it.  It was created on the system but on a zip disk
so it survived.

I'll welcome comments and ideas but I'll probably do an old-hardware
juggle and make a new system.

Thanks all,

Doug.

-- plan.txt

Plan for upgrading the 486 box (pluto) from Sarge to Etch.
=========================================================

1.	Considerations:
----------------------

	Pluto has two roles.  Under normal circumstances, it functions
	as a remote terminal upstairs to access the main athlon box
	(titan) downstairs via ssh for both CLI and X11.  It also acts
	as a repository for some of the backup information from titan.

	Should there be a problem with titan, pluto has its own modem
	and tools to allow connection to the internet for browsing,
	downloading and email, as well as local reading of
	documentation, although it can't burn CDs nor use USB.  The
	reverse also applies; pluto and titan provide a backup rescue
	service to each other.  Titan burns CDs and uses USB.

	Therefore, prior to upgrading pluto, titan must be prepared to
	effect a repair of both itself and pluto in the event of a
	simultaneous failure of the upgrade on pluto and the running
	system on titan.

	Titan is running Etch amd64.  The most recent install media we
	have is for a daily build long before RC1.  

	The Debian Installer does not run on pluto due to memory
	limitations.  The last installer to work was Woody's
	boot-floppies.  Thus the only way to get pluto to Etch is by
	upgrading.  The installer requires 48 MB minimum while pluto
	only has 32 MB.  Etch release notes do not specify a minimum
	memory amount for an upgrade.  When going from Woody to Sarge,
	the apt cache size had to be increased and this will have to be
	increased at the appropriate time again.  Of more concern,
	with only aptitude running on pluto/Sarge, the system is already
	using 8 MB swap.  Will aptitude for Etch even run properly?

	Should the upgrade fail due to lack of memory, the backup plan
	for pluto is to try out OpenBSD and NetBSD.

	Currently, pluto runs Woody's xserver-s3 driver because Sarge's
	xservers do not work on this hardware.

	Pluto does not use devfs.  It uses kernel 2.6.8-3-386.
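
	The apt cache size mentioned above is controlled by the
	APT::Cache-Limit option in apt's configuration.  A minimal
	sketch of raising it follows; the helper name set_cache_limit
	and the default value are assumptions for illustration, not
	figures taken from this plan or checked against the release
	notes.

```shell
#!/bin/sh
# Append an APT::Cache-Limit setting to an apt configuration file unless
# one is already present.  The default limit is an illustrative assumption.
set_cache_limit() {
    conf="${1:-/etc/apt/apt.conf}"
    limit="${2:-12500000}"
    if [ -f "$conf" ] && grep -q 'APT::Cache-Limit' "$conf"; then
        echo "Cache-Limit already set in $conf" >&2
        return 0
    fi
    printf 'APT::Cache-Limit "%s";\n' "$limit" >> "$conf"
}

# Example use (as root; commented out so sourcing has no side effects):
# set_cache_limit /etc/apt/apt.conf
```

	An existing setting is left alone rather than overwritten, so
	a value already tuned during the Woody to Sarge upgrade would
	be kept.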


2.	Preliminary tasks: [ALL DONE]
-------------------------

	The following preliminary tasks can be completed in any order
	but must be completed prior to proceeding.

	-	Verify that grub-disk works. 
	[DONE]

	-	Download, burn, and test Etch amd64 CD1 
	[DONE]

	-	Download, burn, and test Etch amd64 netinst.iso
	[DONE]

	-	Download, burn, and test Etch i386 netinst.iso
	[DONE]

	-	Download, burn, and test NetBSD i386, make boot
		floppies, and test that the installer boots.
	[DONE]

	-	Download, burn, and test OpenBSD i386, make boot
		floppies, and test that the installer boots.
	[DONE]

	-	Print out Etch release notes and put its
		instructions in this document.
	[DONE]

	-	Make all preparations for submitting an upgrade report.  
		Data can be maintained on a zip disk on pluto which can
		be read on titan if necessary.
	[DONE]

	-	Determine name of the 486 kernel package that will be
		installed. 
	[DONE]
			Currently has:
				kernel-image-2.6-386 -->
					kernel-image-2.6.8-3-386

			Will use:
				linux-image-2.6-486 -->
					linux-image-2.6.18-4-486
	

3.	Scheduled tasks:
-----------------------

	The following tasks must be completed in order unless noted
	otherwise.  

	-	Complete a final aptitude update of the sarge system.
	[DONE]

	-	Complete a final backup of pluto and transfer to titan
		and then to CD.
	[DONE]

	-	Complete a backup of titan and burn to CD.
	[DONE]

	-	Copy anything relevant from /var/local/backup to zip
		disk.
	[DONE: nothing. have it on CD]

	-	Remove everything large from /var/local/backup
	[DONE]

	-	Check for extraneous files in /home, /usr/local/,
		/var/local, and /var/log
	[DONE]

	-	Copy to zip disk (directory: pre-shrink):
			/var/lib/dpkg/status 
			/var/lib/aptitude/pkgstates 
			/var/local/backup/apt_inst.sel
				(output of aptitude search '~i!~M')
	[DONE]

	-	Sanitize those files of anything sensitive.
	[DONE]

	-	Purge packages not required for updating.  This includes
		the X window system and all X apps.
	[DONE]

	-	Ensure directories that should now be empty or gone (per
		release notes, esp re X) are so.

	[/usr/X11R6 still exists, with directories under ./lib and one
	file /usr/X11R6/lib/X11/fonts/75dpi/fonts.alias existing.  dpkg
	-S doesn't find it.  **** Removed the whole /usr/X11R6 directory
	tree. ] 

	[ /etc/X11 directory not empty with a symlink X pointing to the
	now missing /usr/bin/X11/XF86_S3, rxvt.menu, and XF86Config.
	**** Removed the whole /etc/X11 directory. ]

	

	-	Copy to zip disk (directory: pre-upgrade):
			/var/lib/dpkg/status
			/var/lib/aptitude/pkgstates
			output of aptitude search '~i!~M'
	[DONE]

	-	dpkg --audit
	[DONE: no output]
	
	-	change sources.list to etch
	[DONE]

	-	apt-cdrom add any etch i386 CDs (netinst or CD1)
	[DONE]

	-	mount /zip
	[DONE]

	-	script -t 2>/zip/upgrade-etch.time -a
		/zip/upgrade-etch.script

		From here on, any activity that should not go into the
		log (e.g. a vi session) should be done from a different
		VT.
	[DONE]

	-	pon from titan
	[DONE]
	
	-	aptitude update
	[DONE]

	-	aptitude upgrade 
			verify size of download and disk space
			requirements.
	[DONE.  netkit-ping caused a problem, had to run aptitude 
	interactively to fix it.]

	[ after each of these steps, check in /etc/apt for updated
	config files.  May need to increase cache limit: see release
	notes section 4.5.8.]

	-	aptitude install initrd-tools
	[Segmentation fault after debconf asked if I wanted to continue with
	upgrade to libc6 re NSS and services needing restart.  It logged me
	right out.  Now I can't log in, don't get a password prompt.  
	Even when logged in, get spit out as soon as I try something, 
	e.g. ssh to titan or su.  Will try a reboot.  Tried to umount 
	/zip and get segmentation fault.]

	Reboot failed, just sitting there after giving everything the
	TERM signal.  Had to power off. Powered back on, grub menu OK,
	loading the kernel failed with crc error and 'system halted'.

	Will try tomorrow to boot a rescue floppy and see if I can
	access the logs and perhaps get them copied to zip.

	-----
	root filesystem hosed, e2fsck unable to fix.  Able to access
	/var (good reason to have separate fs) but _NO_ entries in
	syslog or other logs after trouble started.

	-----

	-	aptitude install linux-image-2.6-486

	-	aptitude dist-upgrade

	-	aptitude update 
			uses the newly installed apt to get package
			signatures.

	-	check /boot/grub/menu.lst for updated info.
		
	-	exit the script session.

	-	umount /zip

	-	shutdown to reboot.

	- 	Assuming all is well, run aptitude interactively and
		remove obsolete kernel and generally tidy up.

	-	Run aptitude again and start reinstalling packages.



