Bug#621137: Random exec failures on ARM; breaks boot -- /init: exec: line 306: run-init: Unknown error 2372692
Short version: debian/patches/applets-fallback.patch causes a
regression on ARM in the Debian busybox 1.18 packages.
Multiple users reported that their ARM devices (specifically NSLU2
hardware -- "slugs") no longer boot after upgrading to sid; the serial
console shows something like:
[ 7.779891] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
Begin: Running /scripts/local-bottom ... done.
Begin: Running /scripts/init-bottom ... done.
/init: exec: line 306: run-init: Unknown error 2372692
[ 8.450448] Kernel panic - not syncing: Attempted to kill init!
[ 8.452811] [<c003d100>] (unwind_backtrace+0x0/0xe0) from [<c03e93c0>] (panic+0x50/0x16c)
[ 8.453186] [<c03e93c0>] (panic+0x50/0x16c) from [<c0054050>] (forget_original_parent+0xb4/0x1e4)
Arnaud Patard and I could reproduce this in QEMU, with both versatile
(ARMv5) and Versatile Express (ARMv7A) kernels. We considered that it
could be a run-init issue in klibc, but it turns out that the error
comes from the interpreter of the initrd's /init, i.e. /bin/sh in the
initrd, which is provided by busybox (hence the line number in the
output: it is the line of /init where run-init gets exec-ed).
As this looked like either a toolchain issue or a busybox issue, we
tried rebuilding busybox in multiple ways; this is the table of
results I have:
current package: Debian sid gcc (4.5)              + Debian sid busybox (1.18)     => fail
rebuild:         Debian sid gcc (4.5)              + Debian sid busybox (1.18)     => fail
                 Debian sid gcc-4.4                + Debian sid busybox (1.18)     => fail
                 Emdebian squeeze cross (4.4)      + Ubuntu natty busybox (1.17)   => pass
                 Ubuntu natty cross (4.5 + linaro) + Ubuntu natty busybox (1.17)   => pass
                                                     Debian squeeze busybox (1.17) => pass
                                                     Debian stable busybox (1.17)  => pass
                 Debian-ports sid gcc (4.5)        + Debian sid busybox (1.18)     => pass (not same ABI!)
                 Debian sid gcc 4.4                + Debian sid busybox (1.18)     => fail
                 Ubuntu natty cross (4.5 + linaro) + Busybox git tip               => pass
                 Ubuntu natty cross (4.5 + linaro) + Busybox git 1_18_4            => pass
However, I once got it to boot with no changes, so it seems there are
conditions where exec does not fail. I managed to boot multiple times
by running exec a first time from an interactive shell and then
Upstream busybox and 1.17 were never affected, but Debian 1.18 was
almost always affected.
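Since the failure is not fully deterministic, it may help to count how
often exec fails over several attempts rather than rely on a single
boot. A minimal sketch, assuming a POSIX shell that exits with status
126 or 127 when exec itself fails; count_exec_failures is a
hypothetical helper, not something shipped in the initrd or in busybox:

```sh
# Hypothetical helper: run "exec <command>" in a subshell several times
# and print how many attempts failed at the exec step itself.
# Assumption: the shell reports an exec failure as status 126 or 127.
count_exec_failures() {
    cef_runs=$1
    shift
    cef_failures=0
    cef_i=0
    while [ "$cef_i" -lt "$cef_runs" ]; do
        cef_status=0
        # The subshell keeps exec from replacing the calling shell.
        ( exec "$@" ) >/dev/null 2>&1 || cef_status=$?
        case "$cef_status" in
            126|127) cef_failures=$((cef_failures + 1)) ;;
        esac
        cef_i=$((cef_i + 1))
    done
    echo "$cef_failures"
}
```

From a shell in the initrd this could be called as, for example,
"count_exec_failures 20 /bin/true". A command that runs but returns a
non-zero status of its own is not counted as an exec failure.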
I rebuilt the Debian source package verbatim, and it was still failing
consistently; I rebuilt the Debian source package without
debian/patches/applets-fallback.patch and it booted.
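For anyone who wants to repeat that last rebuild, here is a minimal
sketch of dropping the patch before building. It assumes the patches
are listed one per line, without options, in a quilt-style
debian/patches/series file; disable_patch is a hypothetical helper and
not part of any Debian tool:

```sh
# Hypothetical helper: remove one patch from a quilt-style series file.
# Assumption: each line of the series file is exactly one patch name.
disable_patch() {
    dp_series=$1
    dp_patch=$2
    dp_tmp="$dp_series.tmp"
    # Copy every line except the one naming the patch to drop.
    while IFS= read -r dp_line || [ -n "$dp_line" ]; do
        if [ "$dp_line" != "$dp_patch" ]; then
            printf '%s\n' "$dp_line"
        fi
    done < "$dp_series" > "$dp_tmp" && mv "$dp_tmp" "$dp_series"
}
```

In the unpacked source tree this would be run as
"disable_patch debian/patches/series applets-fallback.patch", followed
by a normal package build such as "dpkg-buildpackage -us -uc".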
This patch was refreshed in the latest upload, so:
- either some issues were introduced during the refresh
- or the patch was always broken
I haven't looked into why the patch breaks things yet, and I don't have
a smaller test case than the above, which is quite painful.