
Bug#962545: Invalid serial console blocks system boot



I'm also attaching the patch to this bug so that a wider audience has
access to it, given that the BTS is the more generic way to report fixes
in Debian. There is currently a merge request on Salsa with this fix [0].

Cheers,

Guilherme


[0] https://salsa.debian.org/kernel-team/initramfs-tools/-/merge_requests/30
From c3cbf35505d1767189e6ae4b9d906bcf12275bb4 Mon Sep 17 00:00:00 2001
From: "Guilherme G. Piccoli" <gpiccoli@canonical.com>
Date: Mon, 8 Jun 2020 18:38:52 -0300
Subject: [PATCH] scripts/functions: Prevents printf error carry over if wrong
 console is set

Currently the _log_msg() function is "void" typed, with no explicit return,
which in shell terms means it returns whatever its last command
returns. This function is the basic building block for all error/warning
messages in initramfs-tools.

It was noticed [0] that when a bad console is passed to the kernel on the
command line, printf (and apparently every write()-related function) returns
an error, and _log_msg() carries that error over to its caller. It happens
that the checkfs() function has a loop that runs forever in this scenario
(*if* fsck is not present in the initramfs and, obviously, if "quiet" is not
provided on the command line). The situation is easily reproducible, and
various reports dating back some years can be found. They are usually of the
form "machine can't boot if wrong console is provided" or slight variations
of that, almost always relating serial consoles to boot issues.

This patch proposes a pretty simple fix: return zero from _log_msg().
We should definitely not break the boot because of error log functions.
One could argue that checkfs() should be fixed instead, and that's true,
until we eventually find (after some debugging) another subtle corner case
of "misuse" of the _log_msg() return value, fix that too, and so on...
One could also argue that printf shouldn't return an error in this case;
although that is a valid discussion, it's not worth leaving users waiting
on it while the boot is this easy to break, just by passing a wrong kernel
parameter (or by having the underlying serial console device changed to
output to a different port than the one previously set on the kernel
cmdline).

[0] bugs.launchpad.net/cloud-images/+bug/1573095/comments/46
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
---
 scripts/functions | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/functions b/scripts/functions
index db7c833a8356..d8fb0ca7be2b 100644
--- a/scripts/functions
+++ b/scripts/functions
@@ -5,6 +5,7 @@ _log_msg()
 	if [ "${quiet?}" = "y" ]; then return; fi
 	# shellcheck disable=SC2059
 	printf "$@"
+	return 0 # Prevents error carry over in case of unavailable console
 }
 
 log_success_msg()
-- 
2.27.0

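For readers who want to see the mechanism in isolation, here is a minimal
sketch. It is not the initramfs-tools code: failing_printf, log_unpatched,
log_patched, check_once and count_retries are hypothetical names, and the
failing printf is simulated with a function that always returns 1 rather
than a real broken console. The retry loop is capped at 3 iterations so
the script terminates.

```shell
#!/bin/sh
# Hypothetical stand-in for printf failing because the console is unavailable.
failing_printf() { return 1; }

quiet=n

# Same shape as _log_msg() before the patch: no explicit return, so the
# function's status is that of its last command.
log_unpatched()
{
	if [ "${quiet?}" = "y" ]; then return; fi
	failing_printf "$@"
}

# Same shape with the one-line fix applied.
log_patched()
{
	if [ "${quiet?}" = "y" ]; then return; fi
	failing_printf "$@"
	return 0
}

# Hypothetical caller: logs a warning, then does a bare "return", which
# passes on the status of the last command executed (here, the logger).
check_once()
{
	"$1" "fsck not present, skipping\n"
	return
}

# Count how often a caller that retries on failure would spin.
# Capped at 3 so the demonstration terminates.
count_retries()
{
	retries=0
	while ! check_once "$1"; do
		retries=$((retries + 1))
		if [ "$retries" -ge 3 ]; then break; fi
	done
	echo "$retries"
}

unpatched_retries=$(count_retries log_unpatched)
patched_retries=$(count_retries log_patched)
echo "unpatched retries: $unpatched_retries, patched retries: $patched_retries"
```

With the stand-in always failing, the unpatched logger keeps the loop
spinning until the cap is reached, while the patched one lets the loop
exit on the first pass.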
