Re: Support for mirroring of EFI System Partition



Hi

On 2015-09-19, Francesco Poli wrote:
> On Sat, 19 Sep 2015 01:01:28 +0100 Steve McIntyre wrote:
> > On Fri, Sep 18, 2015 at 07:18:56PM +0200, Francesco Poli wrote:
> > >On Tue, 15 Sep 2015 23:54:36 +0200 Francesco Poli wrote:
> > >> On Sat, 12 Sep 2015 11:05:28 +0200 Francesco Poli wrote:
[...]
> That's understandable, I just have my new box sitting there waiting for
> me to install Debian stretch on it and I would like to avoid botching
> the installation plan and having to start over multiple times...

As Steve McIntyre mentioned, many UEFI firmwares are deeply flawed and
buggy. If you want to rely on yours behaving sanely, you'll have to do a
dry run of the failure case on the mainboard in question (OVMF helps a
lot for experimenting with the basics beforehand).

[...]
> > To be honest, I think you're making it sound more complicated than it
> > needs to be there. If the normal "master" disk has failed, simply pick
> > by hand the first of the replicas and call that the new master.
> 
> What do you mean "pick by hand"?
> If I understand correctly, you mean that, after one drive breaks, the
> user would have to:

In the best case, the disk is totally dead and this is apparent to the
mainboard firmware, so the firmware won't even try booting from it.
Unfortunately this is the least likely case of drive breakage...

"Pick by hand" usually means invoking the UEFI firmware boot menu and
selecting the non-default boot entry by hand, for two reasons: on the
one hand, your primary disk might not be totally dead and still get in
the way; on the other hand, I wouldn't expect every UEFI firmware to
fall back silently if the default UEFI entry can't be found.

Yes, this is a problem for remote systems where you can't just go there
and select a different boot entry on the local console.

>  • learn about it (by, e.g., receiving local mail about degraded
> arrays, the usual stuff) *before* the next grub-efi-amd64 upgrade

yes.
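
Besides the usual mdadm mails, a quick manual check is easy enough. A
minimal sketch, assuming Linux software RAID (md) and the usual
/proc/mdstat layout, where the member status field looks like "[UU]" and
"_" marks a missing member. The function name is just for illustration:

```python
import re


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that report a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # Array header lines look like "md0 : active raid1 sdb2[1] sda2[0]".
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status field such as "[UU]" or "[U_]"; "[2/1]" does not match.
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None
    return degraded
```

Feed it the contents of /proc/mdstat, e.g.
degraded_arrays(open("/proc/mdstat").read()), and hold back grub
upgrades while the returned list is not empty.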

>  • manually check whether the broken drive is the one hosting the
> "master" ESP

Not really, just don't upgrade grub while you know about a failed drive
in your array; after all, the old grub should have been good enough to
boot the system.

>  • in case it is, manually alter /etc/fstab to mount one of the "slave"
> ESPs on /boot/efi

I wouldn't even do this. The ESP is only needed by the mainboard
firmware at boot time; it doesn't need to be mounted under Linux at all
(unless you want to upgrade grub, install a new kernel with gummiboot or
access the ESP in some other way). Therefore I'd just mount it nofail
and only react to drive failures (by changing the mountpoint) when it's
actually necessary to do so.
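
To illustrate what I mean by "mount it nofail", here is a small sketch
that builds such an /etc/fstab line. The helper name and the example
UUID are made up, and the umask option and fsck pass number are only my
assumptions about a reasonable entry; the point is the "nofail" option,
which lets the boot continue when the ESP's disk is missing:

```python
def esp_fstab_line(uuid, mountpoint="/boot/efi"):
    """Build an fstab entry for an ESP that may be absent at boot."""
    # "nofail" is the relevant part; "umask=0077" is an assumed extra.
    options = ["umask=0077", "nofail"]
    fields = [
        f"UUID={uuid}",     # filesystem UUID of the ESP (placeholder)
        mountpoint,
        "vfat",             # ESPs are FAT filesystems
        ",".join(options),
        "0",                # dump field, unused
        "2",                # fsck pass for a non-root filesystem (assumed)
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    print(esp_fstab_line("ABCD-1234"))
```

When the drive holding the ESP dies, only the UUID needs to be changed
to the one of a surviving ESP, and only once you actually need to write
to it.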

>  • manually mount that "slave" ESP on /boot/efi
> 
>  • manually instruct grub-efi-amd64 to consider this "slave" ESP as the
> "temporarily-master" one and begin to sync other remaining ESPs (if
> any) to this one (maybe this may be automated, by having grub-efi-amd64
> consider the ESP mounted on /boot/efi as the "master" one, but the
> package needs to find the accessible "slave" ESPs anyway and this may
> be tricky)

IMHO, just ignore it until you have installed the new drive, unless
replacing the disk will take more than a couple of days. But this depends
hugely on how reliable your system needs to be and how sure you are not
to make things worse by doing uncommon things in a hurry.

[...]
> > The proliferation of mount points looks messy to me, I'll be
> > honest. In fundamental terms, there's no real difference between what
> > you'll get on all the ESPs but you'll forever have more noise in "df"
> > and friends.
> 
> As I said, the extra "noise" in the output of df and friends seems to
> me a more than acceptable price for avoiding all the additional manual
> operations described above!

Finding a common policy would be great, but in general I wouldn't consider
the existence of additional mountpoints particularly messy (all the
virtual filesystems produce more noise in that regard).

> > Hmmm, pondering... Is there any way to hide RAID metadata in some way
> > so we really *could* just do RAID1?
> 
> I am afraid I don't understand what you mean by this sentence: could
> you please elaborate a bit?
> 
> [...]
> > we're still seeing implementors get things
> > wrong, either by incompetence or sheer laziness.
> [...]
> > 
> > *However*, don't let my glib warning about broken implementations put
> > you off trying to do something clever and useful here.
> 
> I'll try to do my best, but I was a little scared by this mess of
> broken UEFI implementations: after all, one wants RAID1 (or some more
> sophisticated RAID level) to get data redundancy and survive a drive
> failure; if the system fails to boot, when one drive breaks, then the
> usefulness of RAID1 is seriously reduced!

I have to agree here. Yes, in theory pushing the whole RAID question to
the hardware and using its very own RAID methods sounds tempting.
Personally I see the same problems here as with hardware-specific RAID
implementations in general: what happens if the old mainboard dies...

I last tried (Intel) fakeraid with RAID1 when Sandy Bridge was brand new;
back then the current grub2 versions in unstable couldn't deal with it
properly (as in not managing to install on it). That has probably changed
in the meantime, but there's still the issue of "what happens to the
fakeraid array when I replace my existing (hypothetically broken) Sandy
Bridge mainboard with a new Skylake specimen (or something completely
different, AMD)..." This simply shows that fakeraid just isn't a reliable
option, unless you're in a commercial environment with identical spares
on hand (and identical mainboard firmware versions, etc. pp.).

> > >> > > I'm not sure how well most are likely to
> > >> > > deal with actual hardware failure. We'll find out, I guess... :-)
> > >> > 
> > >> > That's not comforting!  :-(
> > >> > 
> > >> > What I can do to test the setup, is (at most) try to boot with one
> > >> > drive disconnected and see what happens.
> > >> > One thing's sure: I will *not* intentionally break one drive [1], just
> > >> > to test how the UEFI firmware implementation deals with actual hardware
> > >> > failure!
> > >> > 
> > >> > 
> > >> > [1] how, by the way? with a hammer?!?   ;-)
> > 
> > What I'd be tempted to do to start with is simply unplug a drive,
> > either physically or logically.
> 
> The first test I had in mind was just that: unplug one drive, attempt
> to boot the system and see what happens.

That's the best (and only) thing you can test. The worst failure modes,
"disk no longer working properly, but also not being registered as dead
by the mainboard firmware", are close to impossible to test proactively
anyway.

> > For most UEFI development, using
> > qemu/KVM and OVMF as a firmware binary is really useful. You get to
> > work directly on your development system, and it's possible to debug
> > things much more quickly and easily. If you're not sure how to do
> > that, shout.
> 
> I have unfortunately zero experience with KVM.
> It could be useful to test a modified grub-efi-amd64 package, when we
> reach that point of development.
> Hence, I'll sure get back to you and ask for help later.
> But now I need to install Debian stretch on physical hardware,
> instead...
[...]

qemu/kvm are invaluable tools for this kind of testing. While they can't
completely replace testing these things on real hardware (real hardware
shows different kinds of bugs and brokenness; yes, when it comes to
mainboard firmware, BIOS or UEFI alike, you really see totally braindead
failures), you can't overestimate qemu's usefulness. It's very much
worth investing some time in familiarising yourself with qemu before
spending too much time on this on the real hardware; being able to
quickly assess something in a virtual machine is a huge time saver.

[ personally I use qemu/kvm directly, as libvirt-bin and other frontends
  never really managed to convince me ]
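
For the "unplug a drive" test in a VM, something along these lines
should be enough to get started. It is only a sketch: the helper name,
the image file names and the firmware path (/usr/share/ovmf/OVMF.fd) are
assumptions that depend on your setup, and as far as I know booting OVMF
via -bios doesn't keep firmware variables (boot entries) reliably across
runs, so take results about boot entry handling with a grain of salt:

```python
def qemu_ovmf_argv(disks, firmware="/usr/share/ovmf/OVMF.fd",
                   memory_mb=2048, use_kvm=True):
    """Build a qemu command line booting OVMF with the given raw disks."""
    argv = ["qemu-system-x86_64", "-m", str(memory_mb), "-bios", firmware]
    if use_kvm:
        # Hardware acceleration; drop it if /dev/kvm isn't available.
        argv.append("-enable-kvm")
    for image in disks:
        # One -drive per disk image; leaving an image out of the list
        # is the virtual equivalent of unplugging that drive.
        argv += ["-drive", f"file={image},format=raw"]
    return argv


if __name__ == "__main__":
    # Both RAID1 members present, then only the second one.
    print(" ".join(qemu_ovmf_argv(["disk-a.img", "disk-b.img"])))
    print(" ".join(qemu_ovmf_argv(["disk-b.img"])))
```

The returned list can be passed to subprocess.run() as is. Install onto
both images first, then boot with only one of them attached and see
whether the firmware still finds a bootloader.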

Regards
	Stefan Lippers-Hollmann
