Re: jessie won't install/boot on a Dell Poweredge R815



On Mon, 2016-06-27 at 08:07 -0400, Jeffrey Mark Siskind wrote:
[...]
>     Whenever I observe any of the behavior reported in this email, it is
>     almost always associated with dmesg reporting the same error on the same
>     sector 2056 (sometimes 2058 or 2062). Given the dozens of attempted
>     reinstalls and reboots, at this point, I have seen this on almost all, if
>     not all, of the six disks on each of the four machines. I don't believe
>     that 24 disks all have the same bad sectors.

The first partition probably starts at an offset of 1 MiB, which is
sector 2048 with 512-byte sectors.  So these errors are presumably
occurring while reading a filesystem label near the start of that
partition, which is pretty much the first thing that will happen after
the array is assembled.
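
To make the arithmetic concrete, here is a rough sketch of how the
reported sector numbers map to byte offsets inside the first partition.
It assumes 512-byte logical sectors and a 1 MiB partition start, both of
which are guesses rather than values read from your machines; the
function names are made up for this illustration.

```python
SECTOR_SIZE = 512                     # bytes; assumed logical sector size
PARTITION_START_BYTES = 1024 * 1024   # assumed 1 MiB partition alignment


def partition_start_sector(start_bytes=PARTITION_START_BYTES,
                           sector_size=SECTOR_SIZE):
    """Return the first sector of a partition starting at start_bytes."""
    return start_bytes // sector_size


def offset_into_partition(sector, start_sector, sector_size=SECTOR_SIZE):
    """Return how many bytes into the partition the given sector begins."""
    return (sector - start_sector) * sector_size


start = partition_start_sector()
print("partition starts at sector", start)
# Sector numbers taken from the dmesg errors quoted above.
for sector in (2056, 2058, 2062):
    print("sector", sector, "is",
          offset_into_partition(sector, start), "bytes into the partition")
```

Under those assumptions all three failing sectors fall within the first
8 KiB of the partition, which is where on-disk metadata would be
expected rather than file data.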

[...]
>  D. In step (4), there appears to be nondeterminism in the serial numbers of
>     the disks that get reported in the menu of options of where to install
>     grub. Sometimes, the disks get reported as ata-*, sometimes as scsi-*.
>     Note that all of my disks are SATA so the ones reported as scsi-* are
>     clearly in error. If I do fresh installs multiple times on the same
>     machine, each time it reports different serial numbers for the disks.

Linux uses an ATA/SCSI translation layer (libata), so that each ATA
drive is also seen as a SCSI drive and has two such identifiers.  The
non-determinism in which identifiers are shown might be a bug in the
installer, or it might be caused by failure of ID commands to the
drives.
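
If you want to check this on an installed system, the sketch below
groups the udev-created names under /dev/disk/by-id by the device node
each one points to, so an ata-* name and a scsi-* name for the same
drive end up side by side.  I am assuming that the installer menu is
showing names from that directory; the function name is invented for
this example.

```python
import os
from collections import defaultdict


def group_ids_by_device(by_id_dir="/dev/disk/by-id"):
    """Map each resolved device path to the sorted by-id names pointing at it."""
    groups = defaultdict(list)
    for name in sorted(os.listdir(by_id_dir)):
        link = os.path.join(by_id_dir, name)
        if not os.path.islink(link):
            continue  # only symlinks are expected here; skip anything else
        groups[os.path.realpath(link)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Only attempt the listing where the directory actually exists.
    if os.path.isdir("/dev/disk/by-id"):
        for device, names in sorted(group_ids_by_device().items()):
            print(device)
            for name in names:
                print("   ", name)
```

A drive that shows up with only one of the two names would be worth
comparing against the kernel log for failed identify commands.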

[...]
> Note that there is a lot of nondeterministic behavior (all cases above where I
> say "sometimes"). In all cases, I do exactly the same thing over and over to
> the same machine and get different behavior.

This is an unfortunate effect of doing multiple things in parallel,
which is really the only way to make them go fast.

I think most of the problems you're still having must be caused by a
bug in the RAID driver, mpt2sas (or its firmware, if that's not
embedded in the BIOS).

Ben.

-- 

Ben Hutchings
Humour is the best antidote to reality.
