Re: AWS boot failure: symbol 'grub_calloc' not found

To: <debian-cloud@lists.debian.org>
Cc: "Noah Meyerhans" <noahm@debian.org>
Subject: Re: AWS boot failure: symbol 'grub_calloc' not found
From: "Phil Endecott" <phil_itxjl_endecott@chezphil.org>
Date: Tue, 11 Aug 2020 20:19:10 +0100
Message-id: <1597173550026@dmwebmail.dmwebmail.chezphil.org>
In-reply-to: <1597167744040@dmwebmail.dmwebmail.chezphil.org>
References: <1597167744040@dmwebmail.dmwebmail.chezphil.org>

(I've just seen Noah's message with the link to the reddit thread -
there should be something useful in there - but I'd written the
following already, so I'm sending it anyway.)

Following up on my earlier message:

Phil Endecott wrote:
> I've just rebooted an AWS instance running buster and it
> has failed to restart, with the message "symbol 'grub_calloc'
> not found" shown on the console screenshot.

I have attempted to fix it by attaching and mounting the failed
instance's root device on another temporary instance, applying
the grub fixes suggested in bug 966575, and re-attaching to
the original instance and restarting. This has not been
successful; I could really do with some help!

It's straightforward to detach an instance's root volume in
the EC2 console and attach it to another temporary instance
(in the same availability zone). I then followed the advice
from Colin Watson in message 184 of bug 966575: bind-mount
/dev, /proc and /sys, chroot into the problematic filesystem,
and run dpkg-reconfigure grub-pc. That did not look successful:

# dpkg-reconfigure grub-pc
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.19.0-10-amd64
Found initrd image: /boot/initrd.img-4.19.0-10-amd64
Found linux image: /boot/vmlinuz-4.19.0-9-amd64
Found initrd image: /boot/initrd.img-4.19.0-9-amd64
Found linux image: /boot/vmlinuz-4.9.0-5-amd64
Found initrd image: /boot/initrd.img-4.9.0-5-amd64
WARNING: Device /dev/nvme0n1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme0n1p1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme0n1p14 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme0n1p15 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme1n1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme1n1p1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme0n1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme0n1p1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme0n1p14 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme0n1p15 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme1n1 not initialized in udev database even after waiting 10000000 microseconds.
WARNING: Device /dev/nvme1n1p1 not initialized in udev database even after waiting 10000000 microseconds.
Found Debian GNU/Linux 10 (buster) on /dev/nvme0n1p1
done

Note that /dev/nvme0n1 is the temporary instance's root disk;
the device I'm trying to repair is /dev/nvme1n1.
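
For reference, the bind-mount and chroot steps described above can be sketched roughly as follows. This is only a sketch under assumptions: the mount point /mnt/rescue is a hypothetical name, /dev/nvme1n1p1 is taken to be the root partition of the volume under repair (as in the output above), and the script defaults to a dry run that prints the commands rather than executing them.

```shell
#!/bin/sh
# Sketch: chroot into a rescued root volume and reconfigure grub-pc.
# ROOT_PART and MNT are assumptions; adjust them to your own setup.
ROOT_PART="/dev/nvme1n1p1"   # root partition of the volume being repaired
MNT="/mnt/rescue"            # hypothetical mount point on the temporary instance
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = run them (as root)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

rescue_chroot() {
    run mkdir -p "$MNT"
    run mount "$ROOT_PART" "$MNT"
    # Bind-mount the kernel filesystems so that grub can see the block devices.
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$MNT/$fs"
    done
    # The step suggested in bug 966575. Whether an explicit grub-install
    # on the whole disk (here /dev/nvme1n1) is also needed is an assumption
    # that this thread does not confirm.
    run chroot "$MNT" dpkg-reconfigure grub-pc
}

rescue_chroot
```

Run with DRY_RUN=0 only after checking that the printed device names match the volume you intend to repair and not the temporary instance's own root.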

I decided to try it anyway to see what happened, and at this
point I realised that detaching an EC2 instance's root device
is probably *not a good idea* (and I'm surprised there isn't
a warning when doing so). It seems that you can't just
re-attach it where it was; when you try, it is attached as a
non-root block device, and the instance refuses to launch.
To use it you need to 1. make a snapshot of the volume, 2. make
an AMI from the snapshot, 3. launch a new instance from the
AMI. If anyone knows of a better way, please tell me!
Anyway I did all that and the new instance fails to boot in
the same way. Here is the complete content of the screenshot:

SeaBIOS (version 1.7375.34-g83e4d3a3f79e.18)
Machine UUID ec2123456........
Booting from Hard Disk 0...
error: symbol 'grub_calloc' not found.
Entering rescue mode...
grub_rescue> _
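
For anyone repeating the snapshot, AMI and launch sequence described above from the command line rather than the console, it might look roughly like this with the AWS CLI. This is a sketch, not what I actually ran: every ID is a placeholder, the root device name /dev/xvda and the instance type are assumptions, and in practice the snapshot and image IDs come from the output of the preceding commands. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch: turn a repaired root volume into a new instance via snapshot + AMI.
# All IDs below are hypothetical placeholders; substitute your own.
VOLUME_ID="vol-EXAMPLE"
SNAPSHOT_ID="snap-EXAMPLE"    # really taken from the create-snapshot output
IMAGE_ID="ami-EXAMPLE"        # really taken from the register-image output
ROOT_DEV="/dev/xvda"          # assumption: root device name of the original AMI
INSTANCE_TYPE="t3.micro"      # assumption: any type matching the original
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

relaunch_from_volume() {
    # 1. Snapshot the repaired volume and wait for the snapshot to complete.
    run aws ec2 create-snapshot --volume-id "$VOLUME_ID" --description repaired-root
    run aws ec2 wait snapshot-completed --snapshot-ids "$SNAPSHOT_ID"
    # 2. Register an AMI whose root device is backed by that snapshot.
    run aws ec2 register-image --name repaired-root \
        --architecture x86_64 --virtualization-type hvm --ena-support \
        --root-device-name "$ROOT_DEV" \
        --block-device-mappings "DeviceName=$ROOT_DEV,Ebs={SnapshotId=$SNAPSHOT_ID}"
    # 3. Launch a new instance from the AMI.
    run aws ec2 run-instances --image-id "$IMAGE_ID" --instance-type "$INSTANCE_TYPE"
}

relaunch_from_volume
```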

Thanks, Phil.
