
Possible SILO config bug in installation program on Lenny/SPARC NetInstall ISO image



Hello folks,

I believe I have found a bug in the Lenny installer on UltraSPARC.  It occurs during the installation of SILO.  When selecting "install" from the installer's boot menu, the installation proceeds normally to the "Select and install software" screen.  At 77%, the installation appears to "hang", displaying "Configuring silo", and will sit there forever.  This is on the 2008-05-06 weekly build of the Netinstall CD-ROM image (the 110MB image).

HARDWARE:

Sun Ultra 5 with 270MHz CPU, 256MB DRAM, 15GB hard disk, and the built-in Happy Meal Ethernet.

TO REPRODUCE:

Go through the installation program and choose "install" mode, i.e. not "expert".
Install normally, choosing sane stuff (basically the defaults, except for partitioning).
Do not select anything but the "Standard system" during software selection.
Sit back and watch the installation proceed.

Partitioning is as follows:

/dev/hda1 = /boot
/dev/hda2 = /
/dev/hda4 = swap
/dev/hda5 = /var
/dev/hda6 = /home

WHAT HAPPENS:

The "Select and install software" screen comes up, and software starts getting installed.  The software installation proceeds until 77%, "Configuring silo", at which time the installation appears to "hang".  It will sit there forever.  There is no option to break out of the installation.

I tried going to vty4 (Ctrl-Alt-F4) to see if there were any log messages.  Indeed, there are.  I see the following:

May 13 23:14:25 in-target: SETTING UP SILO (1.4.13A+GIT20070930-2) ...^M
May 13 23:14:25 in-target: SILO, THE SPARC IMPROVED LOADER, SETS UP YOUR SYSTEM TO BOOT LINUX^M
May 13 23:14:25 in-target: DIRECTLY FROM YOUR HARD DISK, WITHOUT THE NEED FOR A BOOT FLOPPY OR A NET^M
May 13 23:14:25 in-target: BOOT.^M
May 13 23:14:25 in-target:
May 13 23:14:25 in-target: YOU ALREADY HAVE A SILO CONFIGURATION IN THE FILE /BOOT/SILO.CONF^M
May 13 23:14:25 in-target: INSTALL A BOOT BLOCK USING YOUR CURRENT SILO CONFIGURATION? [YES]

ATTEMPTED WORKAROUND:

It is possible to work around this issue.  I tried two methods, one of which worked.  The one that works is simply to use the "expert" installation, which completely avoids the issue.

The second method is considerably longer and no longer seems to work for me (it did work on the 2008-04-01 image).  It consists of going into another vty and repeatedly killing the siloconfig process that Debian runs.  Of course, I then have to run silo manually from another vty (vty3, in this case) to make sure that the system will actually boot.  :-)

There are apparently two processes that deal with SILO configuration during installation.  Going to vty3 and doing a "ps ax" yields the following two processes:

/bin/sh /var/lib/dpkg/info/silo.postinst configure 1.
/usr/bin/perl /usr/sbin/siloconfig
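
For anyone trying to repeat this, here is a minimal sketch of picking the siloconfig PID out of "ps ax"-style output.  The function name find_siloconfig_pids is hypothetical, and the sketch assumes awk is available in the shell you are using, which may not be true inside the installer environment:

```shell
# find_siloconfig_pids: hypothetical helper, not part of the installer.
# Reads `ps ax`-style output on stdin and prints the PID (first column)
# of every line whose command mentions siloconfig.
find_siloconfig_pids() {
    # The !/awk/ guard keeps this filter from matching its own process.
    awk '/siloconfig/ && !/awk/ { print $1 }'
}

# Usage on a live system (assumption: ps and kill behave as usual):
#   ps ax | find_siloconfig_pids | while read -r pid; do kill "$pid"; done
```

Note that the silo.postinst line is not matched, so only the siloconfig process itself would be killed.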

If I kill the second process (/usr/bin/perl /usr/sbin/siloconfig), then the installation proceeds with exim, etc., and the logs on vty4 do reflect this.  However, at 94%, "Installed iamerican", the installation "hangs" again, and I see the following reappear (in mixed case this time) on vty4:

May 14 00:02:47 in-target: A package failed to install.  Trying to recover:
May 14 00:02:47 in-target:
May 14 00:02:47 in-target: Setting up silo (1.4.13a+git20070930-2) ...
May 14 00:02:47 in-target: SILO, the Sparc Improved LOader, sets up your system to boot Linux
May 14 00:02:47 in-target: directly from your hard disk, without the need for a boot floppy or a net
May 14 00:02:47 in-target: boot.
May 14 00:02:47 in-target:
May 14 00:02:47 in-target: You already have a SILO configuration in the file /boot/silo.conf
May 14 00:02:47 in-target: Install a boot block using your current SILO configuration? [Yes]


Again, the following processes appear on "ps ax":

/bin/sh /var/lib/dpkg/info/silo.postinst configure 1.
/usr/bin/perl /usr/sbin/siloconfig

I killed the /usr/sbin/siloconfig process (Process #17021), and I got the expected messages:

/var/lib/dpkg/info/silo.postinst: line 2: 17021 Terminated    /usr/sbin/siloconfig
dpkg: error processing silo (--configure):
subprocess post-installation script returned error exit status 143
Errors were encountered while processing:
  silo
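
The exit status 143 above is what a shell reports for a process terminated by SIGTERM (128 plus signal number 15), which is the default signal sent by kill.  A small sketch that reproduces the number with a stand-in process (sleep is used here purely for illustration; it is not what the installer runs):

```shell
# Start a long-running child, terminate it with SIGTERM, and look at
# the status the shell reports for it.
sleep 30 &
child=$!
kill -TERM "$child"

status=0
wait "$child" || status=$?

echo "exit status: $status"               # 128 + 15 (SIGTERM) = 143
echo "signal number: $((status - 128))"   # 15
```

So the dpkg error is simply the result of my kill, not a separate failure inside siloconfig.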


Then, the installation continued for about one second, and again I got the "Setting up silo (1.4.13a+git20070930-2) ..." messages asking to install a boot block using my current SILO configuration.  This time, I killed both the /var/lib/dpkg/info/silo.postinst process *and* the /usr/sbin/siloconfig process.

Finally, I got the screen with the bright red background saying "Installation step failed" and advising me that I can try to run the failing item again from the menu, or skip it and choose something else.  The failing step is: Select and install software.

At this point, I switch back to vty3, cd to /target, and issue the following commands:

chroot /target
silo


Go back to vty1 and choose "Continue without installing boot loader".  We get to the "Finishing installation" screen, where it hangs at 15%, "Storing language".  A look over at vty4 shows the above SILO messages once again.  I have to kill the /usr/sbin/siloconfig process again for things to continue on.  Then I do get the "Installation complete" box.  I choose Continue.

The box reboots, SILO comes up, and I get a "Cannot find /vmlinux (Unknown ext2 error)" message.  When I looked at /etc/silo.conf, it appeared to be pointing to the wrong partition.  Furthermore, there was no "vmlinux" in /boot, but there was a "vmlinuz", with a "z".

To attempt to recover, I booted the Netinstall CD in "rescue" mode, deleted /etc/silo.conf (which symlinks to /boot/silo.conf), and ran siloconfig manually, followed by running silo.  SILO still can't find /vmlinuz.  Now, that's probably due to my not having played with SILO very much (I'm mostly on x86), and I'll teach myself SILO.  But I'd imagine folks doing a basic Debian install aren't expected to have to do all of this.  :-)
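
One way to narrow down a "Cannot find" error like this is to compare the image= lines in silo.conf against the files actually present on the partition SILO reads from.  The following is only a rough sketch: check_silo_images is a hypothetical helper, and it assumes the image= paths are plain paths relative to the root of the partition named by partition= (with /boot on its own partition, that root is the mounted /boot).  It does not handle image paths that carry a partition prefix.

```shell
# check_silo_images CONF PARTROOT
# Hypothetical helper: for every image= line in CONF, report whether
# the named file exists under PARTROOT (the mount point of the
# partition SILO is told to read).
check_silo_images() {
    conf=$1
    partroot=$2
    # Strip "image =" (with optional spaces) and print what remains.
    sed -n 's/^[[:space:]]*image[[:space:]]*=[[:space:]]*//p' "$conf" |
    while read -r img; do
        if [ -e "$partroot$img" ]; then
            echo "ok: $img"
        else
            echo "missing: $img"
        fi
    done
}

# Usage from a rescue shell (paths are assumptions for this layout):
#   check_silo_images /target/boot/silo.conf /target/boot
```

A "missing" line for /vmlinux next to an existing vmlinuz would match what I saw after the reboot.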

POSSIBLE REGRESSION INFO:

This issue has also occurred on the 2008-04-01 version of the Netinstall SPARC ISO image.  When choosing the "Expert" install on either the 2008-04-01 or the 2008-05-06 ISO images, this issue does not occur, and SILO installs just fine.  This issue also does not happen on Debian Etch using either type of install (normal or expert).

--TP
______________________________________
Terrell Prudé, Jr., Network Analyst
Fairfax County Public Schools, Dept. of IT
Network Management Services (NMS)
(703) 329-7575

