Bug#534466: Booting a Debian Lenny ia64 install cdrom on a Sarge system corrupts raid array
Package: cdrom
Version: 5.0.1
Severity: critical
After booting a Debian 5.0.1 install cdrom (debian-501-ia64-netinst.iso) on an Itanium 2 server with Debian 3.1 (ia64) installed on a software raid 5 root partition, the raid array gets corrupted, leading to a kernel panic when the installed Sarge system boots.
The raid 5 array containing the root partition is made of 3 partitions on 3 scsi disks (sda2, sdb2, sdc2) and ran without problems for years. Here is the boot output since the Lenny cdrom was booted:
md: invalid superblock checksum on sdb2
md: sdb2 has invalid sb, not importing!
md: md_import_device returned -22
mdadm: failed to add /dev/sdb2 to /dev/md0: Invalid argument
md: invalid superblock checksum on sdc2
md: sdc2 has invalid sb, not importing!
md: md_import_device returned -22
mdadm: failed to add /dev/sdc2 to /dev/md0: Invalid argument
md: invalid superblock checksum on sda2
md: sda2 has invalid sb, not importing!
md: md_import_device returned -22
mdadm: failed to add /dev/sda2 to /dev/md0: Invalid argument
md: bug in file drivers/md/md.c, line 1513
md:o**********************************
md:o* <COMPLETE RAID STATE PRINTOUT> *
md:o**********************************
md0:
md:o**********************************
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
EXT3-fs: unable to read superblock
pivot_root: No such file or directory
/sbin/init: 432: cannot open /dev/console: No such file
Kernel panic: Attempted to kill init!
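The members rejected in a log like the one above can be listed with a small shell sketch. The function name rejected_members is hypothetical, and the pattern assumes the exact "invalid superblock checksum on" wording shown in this log, with no timestamp prefix in front of "md:".

```shell
# Hypothetical helper: list the md members rejected for a bad superblock checksum.
# Reads kernel log text on stdin; assumes the message wording shown in this report.
rejected_members() {
    sed -n 's/^md: invalid superblock checksum on \([A-Za-z0-9]*\).*/\1/p' | sort -u
}
```

It could be fed from a saved console capture, for example "rejected_members < boot.log" (the file name boot.log is an assumption). For the log above it would list sda2, sdb2 and sdc2, i.e. every member of the array.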
When I boot on the Lenny cdrom again in rescue mode, I can successfully reassemble the array and mount the root partition on it with no problem at all: the data is accessible and "mdadm --detail /dev/md0" reports no error.
However, if I boot a Sarge install cdrom (debian-31r8-ia64-netinst.iso) and try to reassemble the raid manually from a console, mdadm fails:
$ mdadm --assemble /dev/md0 /dev/sda2 /dev/sdb2 /dev/sdc2
mdadm: failed to add /dev/sdb2 to /dev/md0: Invalid argument
mdadm: failed to add /dev/sdc2 to /dev/md0: Invalid argument
mdadm: failed to add /dev/sda2 to /dev/md0: Invalid argument
The output of "mdadm --examine" on sda2, sdb2 and sdc2 looks correct though: all partitions the array is made from are listed correctly, no failure is reported, and the checksum is reported correct on every partition. I tried to force reassembly in several ways, using the options --force, --update=resync and --update=summaries, without success.
The server is running Debian 3.1 Sarge ia64, kernel 2.6.8-mckinley-smp.
Apart from fixing this bug, I would be grateful if you could suggest a safe way to make the server bootable again. I was thinking about booting a Sarge install cdrom and trying to re-create the raid array with the option "--assume-clean" or, if that fails, re-creating the array and restoring its content from a tar backup.
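Before re-creating anything, it seems prudent to record what the current superblocks say. The following is only a sketch of that first step, not a tested recovery procedure: the function name save_superblocks and the variables OUT_DIR and DRY_RUN are hypothetical, and the default output directory is an assumption (on an install cdrom /tmp usually lives in RAM, so a directory on persistent storage would be a better choice).

```shell
# Sketch: save the "mdadm --examine" output of each member before re-creating the array.
# OUT_DIR and DRY_RUN are hypothetical names; with DRY_RUN=1 (the default here)
# the commands are only printed, nothing is read or written.
save_superblocks() {
    out_dir=${OUT_DIR:-/tmp/md0-superblocks}
    for dev in "$@"; do
        name=${dev##*/}    # e.g. /dev/sda2 -> sda2
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "mdadm --examine $dev > $out_dir/$name.examine"
        else
            mkdir -p "$out_dir" && mdadm --examine "$dev" > "$out_dir/$name.examine"
        fi
    done
}
```

Called as "save_superblocks /dev/sda2 /dev/sdb2 /dev/sdc2" it prints the three commands it would run; setting DRY_RUN=0 actually runs them. The saved files keep a record of the values reported by "mdadm --examine", which would be needed to check any re-created array against the original one.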
Thank you very much.