hi all,

i have new information.

the problem comes up when the old kernel is still running (i am not sure whether it only occurs with 2.6.8 from sarge, i currently have no other one to test), the new mdadm package gets installed, and write access to the disks takes place.

so one solution is:

modify /etc/apt/sources.list to point to etch
aptitude update; aptitude upgrade
aptitude install aptitude
aptitude install linux-image-2.6-??? grub
aptitude sync
aptitude install mdadm
check that /boot/initrd-2.6.18 was created correctly (problems are possible on /tmp, which is a separate xfs filesystem)
run update-grub and reboot
now run aptitude dist-upgrade

in the meantime i thought the problem was caused by our partition scheme:

part1 (md1)  /boot
part2 (md0)  /
part3 (md2)  pv for the volume group
part4 ---    swap

because i once had a problem where mdadm swapped md0 and md1. but the first machine had the partitions and md devices in the same sequence.

maybe this helps
thomas

Thomas Stegbauer wrote:
> Thomas Stegbauer wrote:
>> hi all,
>>
>> sorry for cross-posting.
>>
>> i upgraded two machines from the latest sarge 3.1r5 (kernel 2.6.8) to debian etch.
>> the machines are completely different: one is a celeron or sempron with two ide hard disks, the
>> other is an fsc econel 50 server with an intel chipset and a pentium 4 with four sata drives.
>> what was identical?
>> debian sarge 3.1r5
>> md raid1 devices
>> lvm2
>> xfs on all lvs and root (on /dev/md0)
>>
>> i found a similar problem on the internet:
>> http://www.debianhelp.org/node/6006
>> it has different hardware, but the software looks identical.
>>
>> while upgrading, filesystems on lvm get shut down.
>> kern.log shows the following:
>>
>> Apr 10 16:18:05 hornet kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c.
>>   Caller 0xf8935305
>> Apr 10 16:18:05 hornet kernel: [<f8934091>] xfs_free_ag_extent+0x471/0x7a0 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f8935305>] xfs_free_extent+0xe5/0x110 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f8935305>] xfs_free_extent+0xe5/0x110 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f89977fc>] kmem_zone_alloc+0x4c/0xa0 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f8968686>] xfs_efd_init+0x86/0x90 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f898bee8>] xfs_trans_get_efd+0x38/0x50 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f8948b8f>] xfs_bmap_finish+0x13f/0x1e0 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f8992e7e>] xfs_remove+0x2fe/0x500 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<f899f0f0>] linvfs_unlink+0x30/0x70 [xfs]
>> Apr 10 16:18:05 hornet kernel: [<c017267a>] vfs_unlink+0x10a/0x1e0
>> Apr 10 16:18:05 hornet kernel: [<c01727fe>] sys_unlink+0xae/0x130
>> Apr 10 16:18:05 hornet kernel: [<c0175b60>] sys_getdents64+0xa0/0xaa
>> Apr 10 16:18:05 hornet kernel: [<c01759c0>] filldir64+0x0/0x100
>> Apr 10 16:18:05 hornet kernel: [<c01061eb>] syscall_call+0x7/0xb
>> Apr 10 16:18:05 hornet kernel: xfs_force_shutdown(dm-2,0x8) called from line 4049 of file fs/xfs/xfs_bmap.c. Return address = 0xf89a244b
>> Apr 10 16:18:05 hornet kernel: Filesystem "dm-2": Corruption of in-memory data detected. Shutting down filesystem: dm-2
>> Apr 10 16:18:05 hornet kernel: Please umount the filesystem, and rectify the problem(s)
>> Apr 10 16:22:06 hornet kernel: xfs_force_shutdown(dm-2,0x1) called from line 353 of file fs/xfs/xfs_rw.c. Return address = 0xf89a244b
>> Apr 10 16:25:04 hornet kernel: xfs_force_shutdown(dm-2,0x1) called from line 353 of file fs/xfs/xfs_rw.c. Return address = 0xf89a244b
>>
>> on the fsc machine this happened on /tmp and /var.
>> on the other server it destroyed /tmp, and after rebooting with 2.6.18-4, /usr was gone after a while.
>>
>> the only solution was: unmount the partition (if possible, otherwise boot a rescue system) and run
>> xfs_repair. on the "other server" (not the fsc one ;) i can't log in currently), xfs_repair failed
>> because it complained about an unreplayed log, and mounting/unmounting didn't replay it. so i had to
>> recover with xfs_repair -L, which zeroes the log. happily all data in /tmp still exists; of course,
>> because it wasn't important ;)
>>
>> i already checked the bugs for xfsprogs and linux-2.6: there was nothing for xfsprogs and several xfs
>> bugs for the 2.6 kernel. maybe the nearest was
>> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=410204
>> but there the problem seems to be in dm-crypt.
>>
>> any ideas?
>>
>> thomas
>
> hi all,
>
> i upgraded some more machines. there wasn't any problem where kernel 2.6.18 was already running
> before, because of xen or a hardware raid controller.
>
> today there was a sempron 2600, adaptec, 2gb ram, running the latest sarge with kernel 2.6.8, root on md0
> (software raid1), lvm, all filesystems sgi xfs.
>
> i upgraded the minimal way:
> aptitude upgrade
> aptitude install linux-image-2.6-k7 initrd-tools
> aptitude install libfam0
> aptitude dist-upgrade
>
> it worked until dist-upgrade: download and unpacking worked, and when the postinstall scripts ran, it
> stopped, this time at openvpn, with an i/o error.
>
> running dmesg, there were xfs errors on the usr lv.
>
> ok, booted with gparted and ran xfs_check on
> /     /dev/md0
> /usr  /dev/vg0/usr
> /var  /dev/vg0/var
> /tmp  /dev/vg0/tmp
>
> no errors on /home, /var/log, /usr/src
>
> so all filesystems that are under higher read/write i/o during the upgrade are damaged.
> today "the best" happened: on / there were errors that xfs_repair could not recover! it fixed
> the same inodes every time. so i rsynced / over to another machine, reformatted it, and synced it
> back. / kept crashing before reformatting, mostly on shutdown.
>
> because of these errors on /, which also happened on the machine above with ide hard disks, i think lvm
> and device-mapper are not the cause.
>
> i suspect the error is in xfs in conjunction with software raid.
>
> in all cases the error occurred while the sarge kernel 2.6.8 was still running.
>
> are there any ideas what i can do to examine the problem further?
>
> greetings
> thomas

--
# Thomas Stegbauer
# https://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0x9A3F1866FC68E91D
# Key fingerprint = 5A2D FEDC 8A50 F1BB 25FB 967B 9A3F 1866 FC68 E91D
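
p.s. a minimal sketch (not tested on the affected machines) of how one could check, before installing the new mdadm package, whether a box matches the conditions described at the top of this mail: a 2.6.8 kernel still running and xfs filesystems mounted. the function names is_sarge_kernel and has_xfs_mounted are made up for illustration, and the check only warns, it changes nothing.

```shell
#!/bin/sh
# Hypothetical pre-upgrade check; the function names are illustrative only.

# Succeeds (exit status 0) if the given kernel release string is a 2.6.8
# kernel, i.e. the sarge kernel under which the corruption was observed.
is_sarge_kernel() {
    case "$1" in
        2.6.8|2.6.8-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if the mounts table passed as $1 (default: /proc/mounts) lists
# at least one filesystem of type xfs (third column).
has_xfs_mounted() {
    awk '$3 == "xfs" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

# Only print a warning; nothing is installed or modified here.
if is_sarge_kernel "$(uname -r)" && has_xfs_mounted; then
    echo "warning: kernel 2.6.8 is running and xfs is mounted;" >&2
    echo "consider booting the etch kernel before installing mdadm" >&2
fi
```

the idea is simply to run this before "aptitude install mdadm" and to reboot into the new kernel first if the warning shows up.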