Your message dated Wed, 26 May 2021 18:07:07 +0200
with message-id <87v9755vis.fsf@hands.com>
and subject line Re: Bug#989124: grub-installer: occasional failure to install grub (when two DEs selected)
has caused the Debian Bug report #989124,
regarding grub-installer: occasional failure to install grub (when two DEs selected)
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)

--
989124: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=989124
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
- To: Debian Bug Tracking System <submit@bugs.debian.org>
- Subject: grub-installer: occasional failure to install grub (when two DEs selected)
- From: Philip Hands <phil@hands.com>
- Date: Wed, 26 May 2021 11:41:42 +0200
- Message-id: <162202210278.1370811.1043745577649459795.reportbug@rummy.hk.hands.com>
Package: grub-installer
Version: 1.178
Severity: minor

Dear Maintainer,

While testing under openQA (so in qemu/kvm), if more than one DE is
selected, something like one in ten installs will fail to install grub,
resulting in an unbootable system.

Given that this only happens in the unusual circumstance of selecting
multiple desktops, and even then is only an intermittent bug, I've tagged
it as minor.

An example of this can be found here:

  https://openqa.debian.net/tests/4457

which one can see hanging at the initial boot screen, rather than booting
to a login prompt.

One of the assets being collected is a dump of the start of the target
block device, which in the failing case looks like this:

  https://openqa.debian.net/tests/4457/file/complete_install-dev_vda_dump.txt

whereas when things are working it looks like this:

  https://openqa.debian.net/tests/4439/file/complete_install-dev_vda_dump.txt

I have tried making it collect data earlier during the install, but doing
so resulted in the bug going away.

[I had it flip to the console when mandb is being installed, as that sits
on the screen for quite a while so provides a good trigger for the action,
and run a few commands to collect state, then flip back to the graphical
screen.]

BTW The syslog from that failing run is here:

  https://openqa.debian.net/tests/4457/file/complete_install-syslog.txt

If there's more information that could usefully be collected, please
mention what you think might help and I'll add it to the openQA scripts.

Cheers, Phil.
--- End Message ---
--- Begin Message ---
- To: 989124-done@bugs.debian.org
- Subject: Re: Bug#989124: grub-installer: occasional failure to install grub (when two DEs selected)
- From: Philip Hands <phil@hands.com>
- Date: Wed, 26 May 2021 18:07:07 +0200
- Message-id: <87v9755vis.fsf@hands.com>
- In-reply-to: <20210526103614.4z5vppknuyw2aysh@mraw.org>
- References: <162202210278.1370811.1043745577649459795.reportbug@rummy.hk.hands.com> <162202210278.1370811.1043745577649459795.reportbug@rummy.hk.hands.com> <20210526103614.4z5vppknuyw2aysh@mraw.org>
Hi Cyril,

I'm going to close the bug for now, and reopen it if I manage to come up
with evidence this isn't just an artefact of the way I wrote the test.

Thanks for the helpful feedback -- in particular noticing the missing
boolean screenshot, which I'd missed (no need to read the rest of this
unless you're interested in the details)

Cyril Brulebois <kibi@debian.org> writes:

> Philip Hands <phil@hands.com> (2021-05-26):
>> Dear Maintainer,
>
> Dear Bug Reporter,
>
> (:D)

:-)

> I'm not sure I really trust the screenshots that show /dev/vda selected
> in both cases. After all, looking one step before, the boolean regarding
> installing GRUB wasn't captured at all in the failing case, compare the
> screenshots starting here:
>
>  - https://openqa.debian.net/tests/4457#step/grub/45 (ko)
>  - https://openqa.debian.net/tests/4439#step/grub/45 (ok)

Ah, well spotted -- I'd not noticed the missing boolean shot there.

Looking at the video, that would seem to be a real difference:

  https://openqa.debian.net/tests/4457/file/video.ogv#t=41.55,41.60
vs.
  https://openqa.debian.net/tests/4439/file/video.ogv#t=41.55,41.60

The openQA code assumes that "Yes" will be selected there, and hits
<TAB> <RET>, so if the prompt comes up with "No" selected, that is what
gets confirmed, and you get no bootloader.

It really ought to take a screenshot there though, because of the assert,
so I suspect that it's somehow getting past that prompt without needing
to look for that screen. If it had hit one return too many earlier,
perhaps that's buffered and getting used to jump past that prompt without
a screenshot.

It seems a bit odd that d-i might occasionally present the alternative
default, so I suspect that's not what's happening at all.

It strikes me as rather more likely that the openQA worker might be
running unusually slowly on occasion, and that may provoke one of the:

  wait_screen_change { send_key 'ret'; };

bits to send unneeded ret's, which may be messing things up later.

BTW That usage was inspired by some of Fedora's tests, but always struck
me as a bit suspect -- I'll probably eliminate that from our tests if it
turns out to be an issue.

> but maybe that's just a side effect of the console switching gymnastics
> you mentioned? (Sending left Ctrl or the like every few minutes avoids
> running into DPMS/blanking issues, I'm using that trick.)
>
> Anyway, any chance you could add `DEBCONF_DEBUG=developer` on the kernel
> command line, so that we have a chance of understanding what's happening
> on the debconf level? Otherwise, we might try and hotpatch
> grub-installer to add some more logging but if we could avoid that…

It's really easy to add the debug stuff on the kernel command line, so if
it turns out not to be an openQA issue, I'll try adding a job with
debugging turned on -- I'm not certain, but I have a feeling that I tried
that before without it being very informative, though I'm afraid I forget
why (probably I just never saw it fail in this way).

Cheers, Phil.
--
|)| Philip Hands [+44 (0)20 8530 9560] HANDS.COM Ltd.
|-| http://www.hands.com/ http://ftp.uk.debian.org/
|(| Hugo-Klemm-Strasse 34, 21075 Hamburg, GERMANY

Attachment: signature.asc
Description: PGP signature
--- End Message ---