Re: Support for mirroring of EFI System Partition



On Fri, Sep 18, 2015 at 07:18:56PM +0200, Francesco Poli wrote:
>On Tue, 15 Sep 2015 23:54:36 +0200 Francesco Poli wrote:
>> On Sat, 12 Sep 2015 11:05:28 +0200 Francesco Poli wrote:
>> > 
>> > Please tell me if my reasoning makes sense to you or, otherwise,
>> > explain where I am being naive.
>> 
>> Please clarify whether my reasoning is flawed...
>
>I am trying hard to address this issue, but I need some explanations:
>that's why I would like to discuss my ideas...
>Please help me to help you!

Apologies for the delayed response - the last week has been hellishly
busy with $dayjob stuff and I've had almost no time at all for
discussions elsewhere.

So:

>The problem is: if one ESP is considered to be the "master" one, and
>the other ESPs are "slave" ones, kept in sync with the "master", what
>happens when the drive hosting the "master" ESP breaks? The system
>should be able to boot from one "slave" ESP (assuming boot priorities
>are set with efibootmgr), but it won't be able to mount /boot/efi
>(since the fstab will refer to the inaccessible "master" ESP); at
>that point, if an upgrade of grub-efi-amd64 has to be performed
>before the dead drive is replaced, a new "temporarily-master" ESP has
>to be found and selected, mounted on /boot/efi, its content updated,
>and any remaining ESPs (if present) have to be synced to this
>"temporarily-master" ESP...

To be honest, I think you're making this sound more complicated than
it needs to be. If the normal "master" disk has failed, simply pick
the first of the replicas by hand and call that the new master.
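
Roughly, I'd expect the recovery to look something like this (a sketch
only - /dev/sdb1 here stands in for whichever replica ESP survived,
and you'd want to fix up /etc/fstab to match, ideally by UUID):

  # Promote the surviving replica (here /dev/sdb1) to be the new master
  umount /boot/efi || true   # the dead master may already be unmounted
  mount /dev/sdb1 /boot/efi
  grub-install               # refresh the boot files on the new master
  update-grub
  # And make sure the firmware knows to boot from it:
  efibootmgr -c -d /dev/sdb -p 1 -L debian -l '\EFI\debian\grubx64.efi'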

>Instead, if all ESPs are mounted on distinct mount points (/boot/efi
>, /boot/efi2 , /boot/efi3 , and so forth) and updated independently,
>there should be no need for special tricks whenever one of them is
>inaccessible (and thus not mounted).
>
>Please tell me if my reasoning makes sense to you or, otherwise,
>explain where I am being naive.

The proliferation of mount points looks messy to me, I'll be
honest. In fundamental terms there's no real difference in what ends
up on the ESPs, but you'll forever have more noise in "df" and
friends.
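
If you do go that way, the fstab entries would at least want
"nofail", so that a dead drive doesn't hang the boot waiting for a
missing ESP. Something like (UUIDs invented for illustration):

  # one entry per ESP; nofail lets the boot continue if a drive dies
  UUID=ABCD-1234  /boot/efi   vfat  umask=0077,nofail  0  1
  UUID=ABCD-5678  /boot/efi2  vfat  umask=0077,nofail  0  1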

Hmmm, pondering... Is there any way to hide RAID metadata in some way
so we really *could* just do RAID1?
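
One thought: mdadm's 1.0 metadata format puts the superblock at the
*end* of each member device, so the firmware should just see a plain
FAT filesystem. Completely untested for ESPs, but roughly:

  # superblock at the end of the members, invisible to the firmware
  mdadm --create /dev/md0 --level=1 --raid-devices=2 --metadata=1.0 \
        /dev/sda1 /dev/sdb1
  mkfs.vfat /dev/md0         # then mount /dev/md0 on /boot/efi

The obvious catch is that anything the firmware writes directly to
one member would go behind md's back.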

>> > > >  e) is there anything else missing?
>> > > 
>> > > It's quite likely that we'll find broken UEFI firmware implementations
>> > > that will get this all wrong.
>> > 
>> > Even for boot priorities on multiple ESPs?!?
>> > If this is the case, it looks like a major regression with respect to
>> > BIOS times, where one would just install GRUB on multiple MBRs and be
>> > fine!    :-(
>> 
>> Could someone please elaborate on the fragility of UEFI?
>Once again, is there anyone willing to explain a little more about this?

One of the problems that we're facing at this point and earlier in the
cycle of BIOS->UEFI upgrades is that UEFI is fundamentally larger and
more complex than BIOS. This is to be expected, as it's also vastly
more capable. However, we're still seeing implementors get things
wrong, either by incompetence or sheer laziness. The lack of real
compliance testing has been a major issue here, and things that
*should* work according to the spec still sometimes don't. The further
you go out from the core functionality, the more likely that is.

*However*, don't let my glib warning about broken implementations put
you off trying to do something clever and useful here.

>> > > I'm not sure how well most are likely to
>> > > deal with actual hardware failure. We'll find out, I guess... :-)
>> > 
>> > That's not comforting!  :-(
>> > 
>> > What I can do to test the setup, is (at most) try to boot with one
>> > drive disconnected and see what happens.
>> > One thing's sure: I will *not* intentionally break one drive [1], just
>> > to test how the UEFI firmware implementation deals with actual hardware
>> > failure!
>> > 
>> > 
>> > [1] how, by the way? with a hammer?!?   ;-)

What I'd be tempted to do to start with is simply unplug a drive,
either physically or logically. For most UEFI development, using
qemu/KVM and OVMF as a firmware binary is really useful. You get to
work directly on your development system, and it's possible to debug
things much more quickly and easily. If you're not sure how to do
that, shout. I'm planning on adding a second UEFI page in the wiki
when I get some time, detailing how I do this kind of
development. Hopefully it will help others too...
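
For reference, the basic invocation is something like this (the OVMF
path is where Debian's ovmf package puts it; adjust for other
distros):

  # two virtual disks so degraded boots can be tested; drop one of
  # the -drive options to simulate a failed drive
  qemu-system-x86_64 -m 1024 \
      -bios /usr/share/ovmf/OVMF.fd \
      -drive file=disk1.img,format=raw,if=virtio \
      -drive file=disk2.img,format=raw,if=virtio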

-- 
Steve McIntyre, Cambridge, UK.                                steve@einval.com
Google-bait:       http://www.debian.org/CD/free-linux-cd
  Debian does NOT ship free CDs. Please do NOT contact the mailing
  lists asking us to send them to you.

