Hi all,
I'm having problems with one of my machines, a Pine RockPro64.
Debian bookworm has been running very stably on it for some time. I
rebooted it a couple of weeks ago for maintenance, having applied
updates, after 108 days up. I have an encrypted LVM volume, containing
root, swap, and data LVs, with /boot on an MMC card, and I use
cryptsetup-initramfs to allow me to log in and unlock the volume at
boot time, via ssh.
I rebooted it this morning, due to a crash, and it didn't come back up.
I can still connect to dropbear (running in the initramfs context),
and the MoTD (as always) prompts me to run cryptroot-unlock, which
appears to do its job (i.e. the LVM volume is unlocked / decrypted).
However, it does not proceed to switch root (and drop the ssh
connection) as it used to, with complete reliability.
At first I suspected a problem with the rootfs, but this appears not
to be the issue: the volume is present at the expected location and
can be manually mounted.
Executing the following allows me to enter a chroot on the rootfs:
mount -t ext4 /dev/vg0/root /root
mount --bind /dev /root/dev
mount --bind /proc /root/proc
mount --bind /sys /root/sys
chroot /root /bin/bash --login
and following this I can run:
mount /boot
which correctly mounts the MMC card containing the /boot partition in
the chrooted environment.
Inspecting /boot, everything appears to be in order. I have issued
update-initramfs a few times, even completely removing the existing
initramfs and recreating it. I have also inspected the initramfs
built by update-initramfs, and can see nothing out of the ordinary.
crypttab is copied from the host, and the UUID matches that displayed
by lsblk -f - which is not surprising given that executing
cryptroot-unlock does, in fact, decrypt the volume.
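The initramfs inspection can be made repeatable with something along
these lines. This is only a sketch: check_listing is a made-up helper
name, and the paths passed to it in the usage comment are assumptions
about where cryptsetup-initramfs places its files, not something
confirmed on this machine.

```shell
#!/bin/sh
# Hypothetical helper: read an initramfs file listing on stdin and report
# whether each path given as an argument appears in it (exact line match).
check_listing() {
    listing=$(cat)
    missing=0
    for want in "$@"; do
        if printf '%s\n' "$listing" | grep -qxF -- "$want"; then
            echo "ok       $want"
        else
            echo "MISSING  $want"
            missing=1
        fi
    done
    return $missing
}

# Usage inside the chroot (lsinitramfs ships with initramfs-tools; the
# paths below are assumed locations and may differ on your system):
# lsinitramfs /boot/initrd.img-<version> \
#     | check_listing cryptroot/crypttab scripts/local-top/cryptroot
```

A non-zero return means at least one of the expected paths was not in
the listing.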
Once chrooted, I can see that /sbin/init is a symlink to
/lib/systemd/systemd, which exists and is executable, but obviously
cannot be executed as anything other than PID 1. Attempting to execute
it results in it complaining of a missing argument, or, if one is
provided, an error that it is ignoring the request due to running in a
chroot.
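The symlink check above amounts to roughly the following, run from
inside the chroot. It is a sketch only, and check_init is a made-up
helper name.

```shell
#!/bin/sh
# Hypothetical helper: resolve an init path (default /sbin/init) through
# any symlinks and report whether the final target is executable.
check_init() {
    init=${1:-/sbin/init}
    target=$(readlink -f "$init") || { echo "cannot resolve $init"; return 1; }
    if [ -x "$target" ]; then
        echo "$init -> $target (executable)"
    else
        echo "$init -> $target (missing or not executable)"
        return 1
    fi
}

# Usage inside the chroot:
# check_init /sbin/init
```

It has to be run inside the chroot, because an absolute symlink target
such as /lib/systemd/systemd would otherwise be resolved against the
initramfs root rather than the real rootfs.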
The kernel command line (cat /proc/cmdline) contains a correct root=
entry, which points at /dev/mapper/vg0-root.
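A quick way to double-check that the root= value names a device that
actually exists after unlocking, as a sketch (root_from_cmdline is a
made-up helper name, and the sample command line in the comment is
illustrative rather than copied from this machine):

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last root= entry found in
# the kernel command line string passed as $1 (prints nothing if absent).
root_from_cmdline() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^root=//p' | tail -n 1
}

# Usage from the initramfs shell, after running cryptroot-unlock:
# dev=$(root_from_cmdline "$(cat /proc/cmdline)")
# if [ -b "$dev" ]; then echo "$dev exists"; else echo "$dev is missing"; fi
#
# Illustrative input:
# root_from_cmdline 'console=ttyS2 root=/dev/mapper/vg0-root ro quiet'
```

The -b test follows symlinks, so it should also work where
/dev/mapper/vg0-root is a link to a /dev/dm-N node.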
I'm stumped - I cannot see why the initramfs environment fails to
mount the rootfs and execute init.
I have run an additional apt update / upgrade / dist-upgrade whilst
in the chroot, in the hope that it would magically fix everything, but
to no avail.
I was using a custom dtb that enabled PCIe x4 on the board, but have
removed that and reverted to the Debian-supplied .dtb file just in case.
Any ideas? I have several machines using this configuration, both
arm64 and amd64, and I'm now a little uneasy about rebooting any of
them, in case there has been a breaking change somewhere which they,
too, are likely to fall afoul of.
I've never really had to debug the init process (i.e. PID 1) and am
not sure how to proceed.
Thanks,
-Ian