
Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty



Steve McIntyre <steve@einval.com> writes:

> On Wed, Jul 12, 2023 at 11:15:57AM +0200, Cyril Brulebois wrote:
>>Hi Michael,
>>
>>Cyril Brulebois <kibi@debian.org> (2023-06-28):
>>> Control: reassign -1 busybox-udeb 1:1.36.1-3
>>
>>[…]
>>
>>> With a local build, confirmed -3 is buggy, and that reverting only
>>> busybox-udeb to -1 is sufficient to restore syslog support in the
>>> installer.
>>> 
>>> Reassigning there; the GRUB thing can be filed separately once we have
>>> actual information via syslog.
>>
>>A fix would be appreciated, we've got reports piling up about things we
>>have no logs for.
>
> After weeks with this breakage, I've just uploaded a minimal NMU to
> fix it, reverting the syslog changes since -1. I've built and tested
> successfully locally.

Thanks -- and I agree, it works :-)

  https://openqa.debian.net/tests/178534/logfile?filename=DI_syslog.txt

As it happens, over the weekend it occurred to me that I might be able
to pave the way to a fix for this bug by coming up with a test for the
failure.

Awkwardly, syslogd wants to open /dev/log and bails out if it's already
in use, so I resorted to (the somewhat disgusting hack of) using podman:

   https://salsa.debian.org/philh/busybox/-/commit/2697f7cce81d1a70de202a30e7062dc9f64a94b1

At least it allows syslogd to run well enough to succeed or fail
similarly to the behaviour seen in the bug.

Here it is going wrong with the -3 code:

  https://salsa.debian.org/philh/busybox/-/jobs/4523822#L3963
  (lines 3969-3975, with the last line showing the entire syslog)

and here is an example of it going right:

  https://salsa.debian.org/philh/busybox/-/jobs/4523808#L3668

  Line 3668 there, which says "PASS: syslogd-works", indicates that
  grepping for the test string in /var/log/syslog succeeded.
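
For anyone wanting to reproduce the check outside the busybox test suite,
the pass/fail decision boils down to something like the sketch below. It is
only an illustration in Python, not the code from the commits linked above:
the function names and the marker string are made up, and it assumes the
log is a plain text file that syslogd has already had time to write.

```python
def log_contains(log_path, marker):
    """Return True if any line of the log file contains the marker string."""
    try:
        with open(log_path, "r", errors="replace") as fh:
            return any(marker in line for line in fh)
    except FileNotFoundError:
        # A missing log is treated the same as an empty one: nothing was logged.
        return False


def verdict(log_path, marker, name="syslogd-works"):
    """Format the result in the PASS:/FAIL: style seen in the job output."""
    status = "PASS" if log_contains(log_path, marker) else "FAIL"
    return f"{status}: {name}"


if __name__ == "__main__":
    # Hypothetical marker; a real test would first send it through the logger.
    print(verdict("/var/log/syslog", "syslogd-test-marker"))
```

An empty /var/log/syslog, as in the bug title, comes out as FAIL here.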

The difference between these two is simply disabling
CONFIG_FEATURE_REMOTE_LOG, as seen here:

  https://salsa.debian.org/philh/busybox/-/commit/89c253f75690dd41487b6fd6f9356a1e890a6ac2

I'm not proposing that as a fix, but it does seem to indicate where the
problem may be located. I'm afraid I've failed to work out what's
actually going wrong here (my C's pretty rusty).

BTW, at one point I thought I'd narrowed it down to the while loop here:

  https://salsa.debian.org/philh/busybox/-/commit/328fdfbe43cd8d9e4425c3ee1c68aadfa44ee434

but if that ever worked, it no longer does. Either I was mistaken about it
having worked earlier (I'm at least 80% sure I wasn't), or something
non-deterministic is going on, which makes me wonder whether the
underlying cause might involve the use of uninitialised data somewhere.

Hopefully this will be of some use to those more familiar with the code.

Cheers, Phil.
-- 
Philip Hands -- https://hands.com/~phil
