
Installation report of the July 1st version of the debian installer image



Hey everyone :)

The filesystem of my Hurd box died some time ago (only the root
filesystem), probably because of the sync problems on shutdown. The box
is an Athlon at roughly 1.5 GHz with 512 MB of RAM, a 40 GB hard drive,
and two NICs: one rtl8139c that works fine with gnumach, and one sis900
that works both with gnumach (well, I think it does, not sure though)
and with a DDE userspace driver (that was a nice experience;
unfortunately the server died every now and then). So I downloaded the
current (2011-07-01) version of Samuel's images and booted the system.

Impressions:

 * Xorg failed to use the correct resolution / refresh rates, monitor
   driven out of spec (19" flat screen). I never tried Xorg on that box
   before, so I can't say whether this is specific to the installer or
   not.
 * Locating the cdrom drive takes a long time, some error messages are
   printed to stderr ('end_request: I/O error, dev 02:00, sector 0',
   once more for 02:01)
 * wow! network autoconfiguration using dhcp worked fine :)
 * strange though, /var/log/syslog shows:

May  1 12:30:28 kernel: rtl8139.c:v1.23a 8/24/2003 Donald Becker, becker@scyld.com.
May  1 12:30:28 kernel:  http://www.scyld.com/network/rtl8139.html
May  1 12:30:28 kernel: eth0: RealTek RTL8139C Fast Ethernet at 0xd000, IRQ 5,  0:c0:df:10:fc:30.
May  1 12:30:28 kernel: sis900.c: modified v1.06.11  4/30/2002
May  1 12:30:28 kernel: eth1: SiS 900 PCI Fast Ethernet at 0xd400, IRQ 5,  0: b:6a:56:98:c5.
May  1 12:30:28 kernel: eth1: Realtek RTL8201 PHY transceiver found at address 1.
May  1 12:30:28 kernel: eth1: Using transceiver found at address 1 as default

Does the sis900 card have a Realtek PHY, or is the log wrong? The
Realtek card works fine though; I haven't tried the sis900.

 * The cdrom isn't shown in the partition list like it was in a
   previous version I tried, this is nice :)
 * I tried normal installation about five times, without restarting the
   installer, just running the partitioner and base install over and
   over again.
 * It failed at various points within the debootstrap process, for
   example 'Warning: Failure trying to run:
   chroot /target /usr/lib/hurd/setup-translators -k'. The beginning of
   that file contained a small number of garbage bytes, the rest of the
   file looked fine. /var/log/messages contained
   'debootstrap: /usr/lib/hurd/setup-translators /usr/lib/hurd/setup-translators:
   cannot execute binary file'.
 * The filesystem is not unmounted if the installation fails. I'm not
   sure what the consequences were; maybe the partitioner was failing,
   or it wasn't possible to format the partition, and keeping the files
   was not my intention...
 * at one point the ext2fs translator of the target partition died
   resulting in many farms being bought by the computer.
 * I discovered that I could load additional installer components and I
   enabled the network-console. It was working fine, nice touch! Some
   minor points though:
  * If you load the module it shows instructions on how to use it and
    also prints the ssh command to log in, but the ip / hostname was
    missing ('ssh installer@').
  * The debian installer, which is invoked as the login shell, prints
    'Hurd console not started; disabling graphical frontend'. I'm not
    sure whether it should even consider that when invoked by sshd.
  * I'm using xterm from unstable here and the line art is messed up,
    and while navigating through *some* menus the screen got distorted.
    I think it happened only in the partition editor.
 * So now that I got a comfortable way of copying text I chrooted to
   the target installation and decided to take over where debootstrap
   gave up. I enabled some repositories and installed the file utility.
   dpkg was trying to configure some packages; one of the three failing
   ones was groff-base:

root@ganymede:/# file /var/lib/dpkg/info/groff-base.postinst
/var/lib/dpkg/info/groff-base.postinst: timezone data, version 2, 9 gmt time flags, 9 std time flags, 24 leap seconds, 203 transition times, 9 abbreviation chars
root@ganymede:/# dpkg -i /var/cache/apt/archives/groff-base_1.21-6_hurd-i386.deb
[...]
root@ganymede:/# file /var/lib/dpkg/info/groff-base.postinst
/var/lib/dpkg/info/groff-base.postinst: POSIX shell script text executable
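
A quick way to look for other files damaged like this one is to check
the first two bytes of each maintainer script. This is only a sketch:
check_shebang is a hypothetical helper name, and it assumes that every
file passed in is supposed to be a script starting with '#!' (a
legitimately binary maintainer script would show up as a false
positive).

```shell
#!/bin/sh
# Hypothetical helper: report files that do not start with "#!".
# Assumption: every file given is meant to be an interpreted script.
check_shebang() {
    rc=0
    for f in "$@"; do
        # Skip anything that is not a regular file (e.g. an unmatched glob).
        [ -f "$f" ] || continue
        # Read only the first two bytes and print them as characters,
        # so binary garbage cannot confuse the shell.
        magic=$(dd if="$f" bs=1 count=2 2>/dev/null | od -An -c | tr -d ' \n')
        if [ "$magic" != "#!" ]; then
            printf 'suspicious: %s\n' "$f"
            rc=1
        fi
    done
    return $rc
}
```

Inside the chroot it could be run as
check_shebang /var/lib/dpkg/info/*.postinst; it returns non-zero if at
least one file looks wrong.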

 * Okay, something really fishy is going on with the filesystem...
 * btw, some specs:

May  1 12:30:28 kernel: ide: SiS 5513 (dual FIFO) DMA Bus Mastering IDE 
May  1 12:30:28 kernel:     Controller on PCI bus 0 function 21
May  1 12:30:28 kernel:     ide0: BM-DMA at 0xff00-0xff07
May  1 12:30:28 kernel:     ide1: BM-DMA at 0xff08-0xff0f
May  1 12:30:28 kernel: hd0: got CHS=4865/255/63 CTL=8 from BIOS
May  1 12:30:28 kernel: hd0: WDC WD400EB-75CPF0, 38166MB w/2048kB Cache, CHS=4865/255/63
May  1 12:30:28 kernel: hd2: CRD-8484B, ATAPI CDROM drive
May  1 12:30:28 kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
May  1 12:30:28 kernel: ide1 at 0x170-0x177,0x376 on irq 15

 * at one point aptitude and apt-get were segfaulting while reading the
   package lists, aptitude update fixed this.
 * whoa! top showing meaningful cpu usages (was 0% or 99.9% before).
 * hm, aptitude is segfaulting again and I still haven't installed
   gdb...

root@ganymede:/etc/apt# aptitude dist-upgrade
[100%] Building dependency treeSegmentation fault

 * I never could convince the crash-dump-core server to produce core
   dumps, and it didn't work this time either.
 * Hm, I ran find on / and this killed the ext2fs server

root@ganymede:/tmp# ls
bash: /bin/ls: Computer bought the farm
[ctrl+d]
~ # ps | grep ext2
    3 root      489m S    ext2fs --multiboot-command-line=root=gunzip:device:r
 4066 root      146m S    grep ext2

 * uh, trying to fsck my filesystem I hit tab trying to complete
   the path to the device, this took a very long time and so did
   ls /dev.

/dev # fsck.ext2 hd0s1
e2fsck 1.41.12 (17-May-2010)
ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether hd0s1 is mounted.
hd0s1 was not cleanly unmounted, check forced.
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 33 has zero dtime.  Fix<y>? yes

Deleted inode 65 has zero dtime.  Fix<y>? yes

Inode 97 is in use, but has dtime set.  Fix<y>?

hd0s1: e2fsck canceled.

hd0s1: ***** FILE SYSTEM WAS MODIFIED *****
/dev # fsck.ext2 -a hd0s1
ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether hd0s1 is mounted.
hd0s1 was not cleanly unmounted, check forced.
hd0s1: Inode 289 has EXTENTS_FL flag set on filesystem without extents support.


hd0s1: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)

The connection died at this point. When I walked over to the box I was
greeted by grub. Let's see whether I can fire up the expert install,
configure the network and get the installation going again. Yep,
net-console is up and running again.

~ # fsck.ext2 -y /dev/hd0s1
e2fsck 1.41.12 (17-May-2010)
ext2fs_check_if_mount: Can't check if filesystem is mounted due to missing mtab file while determining whether /dev/hd0s1 is mounted.
/dev/hd0s1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 40517 is in use, but has dtime set.  Fix? yes

Inode 40517 has imagic flag set.  Clear? yes

Inode 40518 is in use, but has dtime set.  Fix? yes

Inode 40518 has imagic flag set.  Clear? yes

Inode 40519 is in use, but has dtime set.  Fix? yes

Inode 40519 has imagic flag set.  Clear? yes

Inode 40520 has EXTENTS_FL flag set on filesystem without extents support. Clear? yes

Inode 40901 is in use, but has dtime set.  Fix? yes

Segmentation fault
~ # fsck.ext2 -y /dev/hd0s1
[...]
ext2fs_write_inode: Illegal inode number while writing inode 2497 in pass1
e2fsck: aborted
~ # fsck.ext2 -y /dev/hd0s1
[...]
Error storing directory block information (inode=11, block=898, num=0): Wrong magic number for directory block list structure
e2fsck: aborted
~ # fsck.ext2 -y /dev/hd0s1
Segmentation fault
~ # tail /var/log/syslog 
Segmentation fault
~ # true
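
Since each of these runs fails in a different way, it can help to
record the exit status of every fsck run. The sketch below decodes it.
The bit values are the ones listed in the e2fsck manual page;
describe_fsck_status is a hypothetical helper name, and treating any
value above 128 as "killed by a signal" is an assumption based on the
usual shell convention (139 would then be a segmentation fault).

```shell
#!/bin/sh
# Hypothetical helper: turn an e2fsck exit status into readable text.
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    # Assumption: the shell reports death by signal as 128 + signal number.
    if [ "$code" -gt 128 ]; then
        echo "killed by signal $((code - 128))"
        return 0
    fi
    msg=""
    # Exit status bits as listed in the e2fsck manual page.
    for pair in \
        "1:errors corrected" \
        "2:errors corrected, reboot advised" \
        "4:errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:canceled by user request" \
        "128:shared library error"
    do
        bit=${pair%%:*}
        text=${pair#*:}
        # The status is a bitmask, so several conditions can be set at once.
        if [ $((code & bit)) -ne 0 ]; then
            msg="${msg:+$msg; }$text"
        fi
    done
    echo "$msg"
}
```

It would be called right after a run, for example
fsck.ext2 -y /dev/hd0s1; describe_fsck_status $?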

 * ssh session died at this point, reconnection gave me a new shell,
   nothing in /dev/klog or syslog.
 * I just noticed: if I spawn a shell from the net-console it tells me
   that I'll get back to the menu once I exit the shell, but now my ssh
   connection goes down when the shell exits. I *think* it worked
   before the reboot though.

~ # settrans -ag /target /hurd/ext2fs --writable /dev/hd0s1
ext2fs: /dev/hd0s1: warning: FILESYSTEM NOT UNMOUNTED CLEANLY; PLEASE fsck
ext2fs: /dev/hd0s1: warning: MOUNTED READ-ONLY; MUST USE `fsysopts --writable'
ext2fs: /dev/hd0s1: panic: main: no root node!

Hm, not sure how to proceed from here. It could also be the aging
hardware, although memtest86+ didn't find any problems. Maybe I should
try to install Debian GNU/Linux on there to test the controller and
hard disk.
The last debian hurd installation was created using my bootdisk and
installer and it lasted over a year, so I'm pretty sure that the
hardware is properly supported by gnumach.

Let me say that despite the fact that the installation failed I am very
impressed and it has been a nice adventure to explore the state of the
debian installer on the Hurd :)

Any thoughts?
Justus
