
Re: Bug#993821: After upgrading libc, some services are unable to restart (including systemd-resolved)





On Tue, 7 Sept 2021 at 17:49, Michael Biebl <biebl@debian.org> wrote:
Control: reassign -1 libc6
Control: found -1 2.32-1
Control: severity -1 serious
Control: affects -1 + systemd

Hi Michael

Am 07.09.21 um 00:39 schrieb Michael Hudson-Doyle:
> On Tue, 7 Sept 2021 at 10:21, Michael Biebl <biebl@debian.org> wrote:
>
>     Am 06.09.21 um 23:45 schrieb Vincent Bernat:
>       > Package: systemd
>       > Version: 247.9-1
>       > Severity: normal
>       >
>      > Hey!
>      >
>      > After upgrading to libc6 2.32-1, some services are unable to restart.
>      > In my case, systemd-resolved, systemd-timesyncd and colord. Using
>      > "systemctl daemon-reexec" fixes the issue. Unsure if there is really
>      > something to be fixed but as I didn't find anything about that, a bug
>      > report may help others. I suppose the problem is related to NSS.
>      >
>      > Sep 06 23:06:43 chocobo systemd[1]: Starting Network Time Synchronization...
>      > Sep 06 23:06:43 chocobo systemd[236983]: systemd-timesyncd.service: Failed to determine user credentials: No such process
>      > Sep 06 23:06:43 chocobo systemd[236983]: systemd-timesyncd.service: Failed at step USER spawning /lib/systemd/systemd-timesyncd: No such process
>      >
>      >
>
>
>     @libc maintainers: any ideas what could be causing this? If this is
>     triggered by a libc6 update, should this be reassigned to glibc?
>
>
> We went through this in Ubuntu recently and decided that restarting
> systemd in glibc's postinst was the safest option:
> https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1942276
>
> What's happening is that systemd is running with the old glibc, forks
> and then does NSS things that cause the new glibc's NSS modules to load
> and they don't necessarily work, leading to failures in any unit that
> specifies User=. At least for Ubuntu's builds the NSS modules seem to be
> ABI compatible between 2.32 and 2.33 (I didn't try 2.31 vs 2.32) but
> they are definitely not between 2.33 and 2.34.

Thanks for this information. This is indeed an icky issue and I feel
like we are between a rock and a hard place.

Yeah. I guess one could say that having a long running process that forks and then does NSS stuff is skating on thin ice a bit. At least the changes in glibc 2.34 to move nss_files functionality into glibc itself will reduce the fallout of this considerably.
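
To make the "forks and then does NSS stuff" pattern concrete, here is a minimal Python sketch. It does not reproduce the bug; it only shows the shape of the operation that fails: a long-running parent forks, and the child resolves a user name through NSS (here via pwd.getpwnam, which calls glibc's getpwnam). The helper name lookup_user_in_child is made up for illustration and is not taken from systemd.

```python
import os
import pwd


def lookup_user_in_child(name):
    """Fork, then resolve `name` via NSS in the child.

    Returns True if the child resolved the user, False otherwise.
    Hypothetical helper: it mirrors the fork-then-lookup pattern only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: the lookup goes through glibc, which loads NSS modules
        # on demand. Modules from a newer glibc loaded into a process
        # still running the old glibc are where things can go wrong.
        try:
            pwd.getpwnam(name)
        except KeyError:
            os._exit(1)  # user not found
        except BaseException:
            os._exit(2)  # any other failure
        os._exit(0)
    # Parent: wait for the child and report whether the lookup worked.
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

A unit with User= depends on the equivalent of this lookup succeeding in the forked child, which is why the failure shows up at "step USER".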
 
I'm not a huge fan of going back to re-exec systemd again directly in
libc6.postinst, but your proposed patch to check that the systemd
binary can be successfully executed should at least deal sufficiently
with the situation where a library is (temporarily) missing.
I do wonder, though, whether this will mean that the daemon-reexec is
skipped on dist-upgrades.
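
For readers following along, the guard discussed here (only re-exec if the systemd binary can actually be executed) could be sketched roughly as follows. This is an illustration in Python, not the actual postinst patch: the function names, the "--version" probe and the /lib/systemd/systemd path are assumptions made for the example.

```python
import subprocess


def can_execute(binary, probe_args=("--version",), timeout=10):
    """Return True if `binary` starts and exits with status 0.

    A binary whose shared libraries are temporarily missing fails in the
    dynamic loader with a non-zero status, so it is reported as False.
    """
    try:
        result = subprocess.run(
            [binary, *probe_args],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Missing file, no execute permission, or a hung probe.
        return False
    return result.returncode == 0


def reexec_command(binary="/lib/systemd/systemd"):
    """Return the re-exec command if the probe passes, else None.

    The path is an assumption; the command is only returned, never run.
    """
    if can_execute(binary):
        return ["systemctl", "daemon-reexec"]
    return None
```

Returning None when the probe fails corresponds to skipping the daemon-reexec, which is the dist-upgrade case being asked about.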

FWIW I had a long chat with Julian (the apt maintainer) about this and he thought there were three potential situations that could be a problem:

1) a new systemd is unpacked before its Depends
2) one of systemd's dependencies has a Breaks: systemd (<< new)
3) in some cases a cycle has to be broken by removing a package with --force-deps

I think 1) is by some margin the most likely to actually happen, and at least in that situation systemd will be restarted shortly by its own postinst.

Cheers,
Michael

Anyway, I think it's best to reassign this to libc6 for now and mark it as
RC so the package doesn't migrate to testing.

Regards,
Michael

