problems installing jessie on Dell R815 and C6145
I have a Debian farm of 24 machines running wheezy that I am upgrading to
jessie. Four of the machines are identical Dell R815 servers, each with 6
disks. The others are Dell T5500s with 4 disks each (12 machines), Dell
C6145s with 4 disks each (4 machines), and HP ProLiant DL165 G5p servers with
3 disks each (4 machines). All of the machines of each type are identical.
Upgrading the T5500, C6145, and DL165 machines is progressing with only minor
issues, which I can describe if they matter. But I am having difficulty with
the R815. So far, I have
only attempted to upgrade one machine. The other 3 are still running wheezy.
All have been running wheezy reliably for about 5 years.
I have made a USB flash drive with:
   http://cdimage.debian.org/cdimage/unofficial/non-free/cd-including-firmware/8.5.0+nonfree/amd64/iso-cd/firmware-8.5.0-amd64-netinst.iso
   http://ftp.nl.debian.org/debian/dists/jessie/main/installer-amd64/current/images/hd-media/boot.img.gz
I have used this to successfully install jessie on six of the T5500, one of
the DL165, and one of the C6145. But on the R815, I notice nondeterministic
behavior. I have tried the install about 7 times. All 7 successfully boot into
the install screen. I select
  Install (default)
  English (default)
  United States (default)
  American English (default)
It then searches for hardware and an ISO image. Four of the seven times, I get
a red screen that claims that it can't find an ISO image. Three of the seven
times, it finds one, proceeds with the install, but fails during the grub
install. Each of those three times, it failed in a different way. I can
describe the exact failures in a follow-up email if that would help. I'm not
including the details here to avoid clutter.
The machine was running wheezy reliably until a few days ago.
The identical install media (a USB flash drive) has successfully been used to
install jessie on the T5500, DL165, and C6145, both before and after the
attempts on the R815. So I believe that the media is OK.
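One further way to back up the belief that the media is OK would be to compare
the ISO's SHA-256 digest against the published one. The following is only a
minimal sketch, assuming a SHA256SUMS file is available alongside the image;
the function names and any path passed in are hypothetical and not something
I have run on these machines.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large ISO is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with an expected hex string (assumed to be
    copied by hand from the image's SHA256SUMS entry)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling matches_expected() with the path to the downloaded ISO and the digest
from SHA256SUMS returns True only if the two agree. This checks the download,
not the flash drive itself.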
Because of the nondeterministic behavior, and the fact that the machine was
running reliably, I believe that there are no hardware problems, at least not
ones that were tickled by wheezy. Perhaps there are hardware problems that are
tickled by jessie. I conjecture that there may be kernel or installer changes
that no longer work reliably on R815 hardware.
What also may help in diagnosing the issues is the following observation.
While the installs on six T5500s and one C6145 were all successful, the
install on the C6145 took about 3 hours, while each install on a T5500 took
about 15 minutes. The T5500s and C6145 both have the same number of disks of
the same size. And all are partitioned and configured identically. So one
would not expect much difference in install time. Yet when installing on the
C6145, the steps that involve probing hardware take hours while those steps on
the T5500 are instantaneous. This suggests that there have been software
issues introduced into the installer and/or kernel that affect the R815 and
C6145 platforms.
FWIW, the Ctrl-Alt-F4 virtual terminal during the install on the C6145 shows
a huge number of messages like:
   reset high-speed usb device number 6 using ehci-pc
   reset high-speed usb device number 7 using ehci-pc
Even after the successful install, dmesg on that machine gives a huge number
of lines like:
[18660.329918] usb 1-5.2: reset high-speed USB device number 5 using ehci-pci
I have nothing plugged into USB on that machine.
That machine was running wheezy reliably for 5 years prior to the upgrade to
jessie.
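To put a number on "huge", the reset messages could be tallied per device.
This is a minimal sketch, not something from the installer or kernel tooling:
the function name count_usb_resets is hypothetical, and the pattern is assumed
from the message format quoted above.

```python
import re
from collections import Counter

# Pattern assumed from the quoted messages, e.g.
#   [18660.329918] usb 1-5.2: reset high-speed USB device number 5 using ehci-pci
# IGNORECASE also covers the lowercase "usb device" form seen on the installer vt.
RESET_RE = re.compile(
    r"reset high-speed usb device number (?P<num>\d+) using (?P<driver>\S+)",
    re.IGNORECASE,
)


def count_usb_resets(log_text):
    """Count reset messages, keyed by (device number, driver name)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = RESET_RE.search(line)
        if m:
            counts[(int(m.group("num")), m.group("driver"))] += 1
    return counts


# Small demonstration on the line quoted above; in practice the text would be
# the saved output of dmesg.
sample = "[18660.329918] usb 1-5.2: reset high-speed USB device number 5 using ehci-pci\n"
for (num, driver), n in count_usb_resets(sample).most_common():
    print(f"device {num} ({driver}): {n} reset(s)")
```

Running the same count on the wheezy machines that have not been upgraded
would show whether the resets are new with jessie.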
    Jeff (http://engineering.purdue.edu/~qobi)