Hey everyone :)

the filesystem of my Hurd box died some time ago, only the root filesystem, probably because of the sync problems on shutdown. The box is an Athlon 1.5GHz-ish, 512 MByte of RAM, 40 gig hard drive, two NICs: one rtl8139c that works fine with gnumach, and one sis900 that works both with gnumach (well, I think it does, not sure though) and with a DDE userspace driver (that was a nice experience, unfortunately the server died every now and then).

So I downloaded the current (2011-07-01) version of Samuel's images and booted the system. Impressions:

* Xorg failed to use the correct resolution / refresh rates, the monitor was driven out of spec (19" flat screen). I never tried Xorg on that box before, so I can't say whether this is specific to the installer or not.

* Locating the cdrom drive takes a long time, and some error messages are printed to stderr ('end_request: I/O error, dev 02:00, sector 0', once more for 02:01)

* wow! network autoconfiguration using dhcp worked fine :)

* strange though, /var/log/syslog shows:

    May 1 12:30:28 kernel: rtl8139.c:v1.23a 8/24/2003 Donald Becker, becker@scyld.com.
    May 1 12:30:28 kernel: http://www.scyld.com/network/rtl8139.html
    May 1 12:30:28 kernel: eth0: RealTek RTL8139C Fast Ethernet at 0xd000, IRQ 5, 0:c0:df:10:fc:30.
    May 1 12:30:28 kernel: sis900.c: modified v1.06.11 4/30/2002
    May 1 12:30:28 kernel: eth1: SiS 900 PCI Fast Ethernet at 0xd400, IRQ 5, 0: b:6a:56:98:c5.
    May 1 12:30:28 kernel: eth1: Realtek RTL8201 PHY transceiver found at address 1.
    May 1 12:30:28 kernel: eth1: Using transceiver found at address 1 as default

  Does the sis900 card have a Realtek PHY or is the log wrong? The Realtek card works fine though, I haven't tried the sis900.

* The cdrom isn't shown in the partition list like it was in a previous version I tried, this is nice :)

* I tried the normal installation about five times, without restarting the installer, just running the partitioner and base install over and over again.
  * It failed at various points within the debootstrap process, for example 'Warning: Failure trying to run: chroot /target /usr/lib/hurd/setup-translators -k'. The beginning of that file contained a small number of garbage bytes, the rest of the file looked fine. /var/log/messages contained 'debootstrap: /usr/lib/hurd/setup-translators /usr/lib/hurd/setup-translators: cannot execute binary file'.

  * the filesystem is not unmounted if the installation fails. I'm not sure what the consequence was; maybe the partitioner was failing or it wasn't possible to format the partition, and keeping the files was not my intention...

  * at one point the ext2fs translator of the target partition died, resulting in many farms being bought by the computer.

* I discovered that I could load additional installer components and I enabled the network-console. It was working fine, nice touch! Some minor points though:

  * If you load the module it shows instructions on how to use it and also prints the ssh command to log in, but the ip / hostname was missing ('ssh installer@').

  * The debian installer, which is invoked as the login shell, prints 'Hurd console not started; disabling graphical frontend'. I'm not sure whether it should even consider that when invoked by sshd.

  * I'm using xterm from unstable here and the line art is messed up, and while navigating through *some* menus the screen got distorted. I think it happened only in the partition editor.

* So now that I had a comfortable way of copying text, I chrooted to the target installation and decided to take over where debootstrap gave up. I enabled some repositories and installed file.
  dpkg was trying to configure some packages, one of the three failing ones was groff-base:

    root@ganymede:/# file /var/lib/dpkg/info/groff-base.postinst
    /var/lib/dpkg/info/groff-base.postinst: timezone data, version 2, 9 gmt time flags, 9 std time flags, 24 leap seconds, 203 transition times, 9 abbreviation chars
    root@ganymede:/# dpkg -i /var/cache/apt/archives/groff-base_1.21-6_hurd-i386.deb
    [...]
    root@ganymede:/# file /var/lib/dpkg/info/groff-base.postinst
    /var/lib/dpkg/info/groff-base.postinst: POSIX shell script text executable

* Okay, something really fishy is going on with the filesystem...

* btw, some specs:

    May 1 12:30:28 kernel: ide: SiS 5513 (dual FIFO) DMA Bus Mastering IDE
    May 1 12:30:28 kernel: Controller on PCI bus 0 function 21
    May 1 12:30:28 kernel: ide0: BM-DMA at 0xff00-0xff07
    May 1 12:30:28 kernel: ide1: BM-DMA at 0xff08-0xff0f
    May 1 12:30:28 kernel: hd0: got CHS=4865/255/63 CTL=8 from BIOS
    May 1 12:30:28 kernel: hd0: WDC WD400EB-75CPF0, 38166MB w/2048kB Cache, CHS=4865/255/63
    May 1 12:30:28 kernel: hd2: CRD-8484B, ATAPI CDROM drive
    May 1 12:30:28 kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
    May 1 12:30:28 kernel: ide1 at 0x170-0x177,0x376 on irq 15

* at one point aptitude and apt-get were segfaulting while reading the package lists, aptitude update fixed this.

* whoa! top is showing meaningful cpu usage (it was 0% or 99.9% before).

* hm, aptitude is segfaulting again and I still haven't installed gdb...

    root@ganymede:/etc/apt# aptitude dist-upgrade
    [100%] Building dependency treeSegmentation fault

* I never could convince the crash-dump-core server to produce core dumps, and it didn't work this time either.
* Hm, I ran find on / and this killed the ext2fs server

    root@ganymede:/tmp# ls
    bash: /bin/ls: Computer bought the farm
    [ctrl+d]
    ~ # ps | grep ext2
    3 root 489m S ext2fs --multiboot-command-line=root=gunzip:device:r
    4066 root 146m S grep ext2

* uh, trying to fsck my filesystem I hit tab to complete the path to the device; this took a very long time and so did ls /dev.

    /dev # fsck.ext2 hd0s1
    e2fsck 1.41.12 (17-May-2010)
    ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether hd0s1 is mounted.
    hd0s1 was not cleanly unmounted, check forced.
    Pass 1: Checking inodes, blocks, and sizes
    Deleted inode 33 has zero dtime. Fix<y>? yes
    Deleted inode 65 has zero dtime. Fix<y>? yes
    Inode 97 is in use, but has dtime set. Fix<y>?
    hd0s1: e2fsck canceled.
    hd0s1: ***** FILE SYSTEM WAS MODIFIED *****
    /dev # fsck.ext2 -a hd0s1
    ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether hd0s1 is mounted.
    hd0s1 was not cleanly unmounted, check forced.
    hd0s1: Inode 289 has EXTENTS_FL flag set on filesystem without extents support.
    hd0s1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
    (i.e., without -a or -p options)

  The connection died at this point. When I walked to the box I was greeted by grub. Let's see whether I can fire up the expert install, configure the network and get the installation going again.

  Yepp, net-console is up and running again.

    ~ # fsck.ext2 -y /dev/hd0s1
    e2fsck 1.41.12 (17-May-2010)
    ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether /dev/hd0s1 is mounted.
    /dev/hd0s1 contains a file system with errors, check forced.
    Pass 1: Checking inodes, blocks, and sizes
    Inode 40517 is in use, but has dtime set. Fix? yes
    Inode 40517 has imagic flag set. Clear? yes
    Inode 40518 is in use, but has dtime set. Fix? yes
    Inode 40518 has imagic flag set. Clear? yes
    Inode 40519 is in use, but has dtime set. Fix? yes
    Inode 40519 has imagic flag set. Clear? yes
    Inode 40520 has EXTENTS_FL flag set on filesystem without extents support.
    Clear? yes
    Inode 40901 is in use, but has dtime set. Fix? yes
    Segmentation fault
    ~ # fsck.ext2 -y /dev/hd0s1
    [...]
    ext2fs_write_inode: Illegal inode number while writing inode 2497 in pass1
    e2fsck: aborted
    ~ # fsck.ext2 -y /dev/hd0s1
    [...]
    Error storing directory block information (inode=11, block=898, num=0): Wrong magic number for directory block list structure
    e2fsck: aborted
    ~ # fsck.ext2 -y /dev/hd0s1
    Segmentation fault
    ~ # tail /var/log/syslog
    Segmentation fault
    ~ # true

* The ssh session died at this point, reconnecting gave me a new shell, nothing in /dev/klog or syslog.

* I just noticed: if I spawn a shell from the net-console, it tells me that I'll get back to the menu once I exit the shell, but now my ssh connection goes down when the shell exits. I *think* it worked before the reboot though.

    ~ # settrans -ag /target /hurd/ext2fs --writable /dev/hd0s1
    ext2fs: /dev/hd0s1: warning: FILESYSTEM NOT UNMOUNTED CLEANLY; PLEASE fsck
    ext2fs: /dev/hd0s1: warning: MOUNTED READ-ONLY; MUST USE `fsysopts --writable'
    ext2fs: /dev/hd0s1: panic: main: no root node!

Hm, not sure how to proceed from here. It could also be the aging hardware. memtest86+ didn't find any problems; maybe I should try to install a Debian Linux on there to test the controller and hard disk. The last Debian Hurd installation was created using my bootdisk and installer and it lasted over a year, so I'm pretty sure that the hardware is properly supported by gnumach.

Let me say that despite the fact that the installation failed I am very impressed, and it has been a nice adventure to explore the state of the debian installer on the Hurd :)

Any thoughts?

Justus