Re: jessie won't install/boot on a Dell Poweredge R815
On Tue, 21 Jun 2016, Jeffrey Mark Siskind wrote:
Are you certain that there isn't a PERC H700 in this machine? [Sort of
odd that mpt2sas is triggering a state error in your screenshot if there
actually isn't one.]
> I don't believe that I have any add-in cards. The machine was
> purchased straight from Dell. It has six SATA disks and 4 gigabit
> ethernet ports. It has four 12-core AMD CPUs and 128GB RAM. The output
> of lspci on an identical machine purchased at the same time that is
> still running wheezy is enclosed below.
> 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [IDE mode]
This makes me think that the SATA controller is in IDE/Legacy mode instead of
AHCI. In theory, this shouldn't matter, but it's possible that this is
also a problem. I'd try switching it in the BIOS and see what happens.
> What does the kernel output while it is detecting the disks and
Remove the quiet option from the kernel command line by editing it in grub.
> Do all of the drives show up properly?
echo /dev/sd*; should give you an idea of what is there in the initramfs.
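If the bare glob is hard to read on the console, something along these lines prints one device node per line. This is only a sketch: list_disks is a name I made up, and I'm assuming the initramfs shell is a POSIX-style sh (busybox or similar) with /proc mounted.

```shell
# list_disks is a hypothetical helper, not a standard command.
# It prints every sd* node under the given directory (default: /dev).
list_disks() {
    for dev in "${1:-/dev}"/sd*; do
        # With no match the glob stays literal, so check the node exists.
        if [ -e "$dev" ]; then
            echo "$dev"
        fi
    done
}

list_disks

# The kernel's own view of disks and partitions, if /proc is mounted.
if [ -r /proc/partitions ]; then
    cat /proc/partitions
fi
```

With six disks you would expect sda through sdf plus their partitions; anything missing narrows down where detection goes wrong.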
> When the boot fails, can you read from the underlying block
more /dev/sda; should work, I believe.
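Since more will dump binary to the screen, a quieter check is to read a single sector with dd and look only at the exit status. Again a sketch: can_read is a hypothetical helper name, and I'm assuming dd is available in the initramfs.

```shell
# can_read is a hypothetical helper: it succeeds if the first 512 bytes
# of the given device (or file) can be read.
can_read() {
    dd if="$1" of=/dev/null bs=512 count=1 2>/dev/null
}

# Try every sdX whole-disk node that is actually a block device.
for dev in /dev/sd?; do
    if [ -b "$dev" ]; then
        if can_read "$dev"; then
            echo "$dev: readable"
        else
            echo "$dev: READ FAILED"
        fi
    fi
done
```

A disk that shows up as a node but fails the read would point at the controller or driver rather than at the md or filesystem layers.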
> I don't know what one can do at the initramfs command prompt. If you give
> me some commands, I will try them out and post the output.
> Does specifying delay=20 or similar result in a successful boot?
> I will try this.
This should actually be rootdelay=20; sorry.
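To confirm that the option really reached the kernel, you can check /proc/cmdline from the initramfs prompt. A minimal sketch, assuming a POSIX-style sh; cmdline_value is a name I invented for illustration.

```shell
# cmdline_value is a hypothetical helper: given a parameter name and a
# kernel command-line string, it prints the value of name=value if present.
cmdline_value() {
    for word in $2; do
        case "$word" in
            "$1"=*) echo "${word#*=}" ;;
        esac
    done
}

# Prints 20 if the kernel was booted with rootdelay=20, nothing otherwise.
if [ -r /proc/cmdline ]; then
    cmdline_value rootdelay "$(cat /proc/cmdline)"
fi
```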
> I will try to get this info. It will require me to redo the exercise
> of a fresh jessie install from USB. I'll have to take and post screen
> pictures because I have no way to capture the console output.
I believe the R815 still has a serial port; you can just plug in a
serial cable and append an appropriate serial tty option to the kernel
command line to get output as text.
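As a rough illustration of what the edited command line would look like, the sketch below drops quiet and appends console options. The port and speed (ttyS0 at 115200) are assumptions on my part; the R815 may expose the port as a different ttyS device, and serial_cmdline is a made-up helper name.

```shell
# serial_cmdline is a hypothetical helper: it prints the given kernel
# command line with "quiet" and any old console= options removed, and
# with VGA plus serial console options appended.
serial_cmdline() {
    out=""
    for word in $1; do
        case "$word" in
            quiet|console=*) ;;          # drop these words
            *) out="$out $word" ;;
        esac
    done
    # ttyS0,115200n8 is an assumed port and speed; adjust for the machine.
    echo "${out# } console=tty0 console=ttyS0,115200n8"
}

serial_cmdline "root=/dev/md0 ro quiet"
```

The last console= entry is the one that becomes /dev/console, so the serial port gets the full boot output while the VGA console still shows kernel messages.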
> But again note, that I do not believe that there are any disk hardware
> errors. And I do not believe that there are any data errors in the
> layout of the ext3 file system, the layout of the md0 raid array, or
> the partition tables. The reason is that after the failed jessie
> install, I reinstall a fresh wheezy from USB. I don't repartition.
> And I don't rebuild md1 and don't rebuild /aux. But I do rebuild md0
> and / as part of the fresh install. And it works.
Yes; it's possible that a change in one of the drivers between the
wheezy and jessie kernels is exposing a firmware bug (or there's a bug
in the kernel itself) which is causing this issue.
What I'm trying to do is get enough information so that the error is
Don Armstrong https://www.donarmstrong.com
What I can't stand is the feeling that my brain is leaving me for
someone more interesting.