
Re: Bug#985617: glibc: flaky autopkgtest on most architectures



On 2021-04-25 10:39, Simon McVittie wrote:
> On Sun, 25 Apr 2021 at 10:14:51 +0100, Simon McVittie wrote:
> > On Sun, 25 Apr 2021 at 08:11:48 +0200, Paul Gevers wrote:
> > > On 25-04-2021 01:55, Aurelien Jarno wrote:
> > > > It appears that all the failures are related to containers. I have been
> > > > able to reproduce the issue with a bullseye kernel, which defaults to
> > > > kernel.unprivileged_userns_clone=1. It seems the autopkgtest runners
> > > > still use a buster kernel (at least in the case of this build log).
> 
> Looking at support/test-container.c, it seems that these tests will
> automatically be skipped (FAIL_UNSUPPORTED) on a kernel that restricts
> userns creation (like buster), and will be run (and perhaps fail)
> on a kernel that does not (like bullseye). So it is not necessarily
> a *regression* that they fail - they might just never have been tried
> before we started using bullseye kernels.
> 
> The brute-force approach to making the autopkgtest not be flaky would be
> to make these tests FAIL_UNSUPPORTED unconditionally, which will result
> in the same coverage we would have had on buster kernels. Obviously it
> would be better if they could be made to pass, but some reliable testing
> is better than none.
> 
> These tests seem to be failing here (support/test-container.c:1095):
> 
>   execvp (new_child_proc[0], new_child_proc);
> 
>   /* Or don't run the child?  */
>   FAIL_EXIT1 ("Unable to exec %s\n", new_child_proc[0]);
> 
> It would be useful if this printed strerror(errno) at least, so that we
> can see whether it's ENOENT or EACCES or something else.
> 
> Perhaps the test support code is not copying/mounting everything that needs
> to be copied/mounted into the container's filesystem? More debug logging in
> support/test-container.c would probably be helpful here - perhaps even
> running 'find . -ls' in the new_root_path before chrooting into it?

Yes, this is exactly the problem. It is caused by the patch
any/local-rtlddir-cross.diff, which removes the snippet of code that
installs the ld.so symlink. The symlink is instead created, in an ugly
way, in debian/rules.d/build.mk. Dropping both makes things work fine.
However, I am not sure what the consequences are for cross builds,
which also rely on the same code in build.mk. I am currently
investigating.
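
Regarding the strerror(errno) suggestion quoted above, here is a minimal
standalone sketch of the idea. It is not the actual code in
support/test-container.c: the helper name exec_or_report is hypothetical,
and it writes to stderr with fprintf instead of going through FAIL_EXIT1.
In the real test the same idea would presumably amount to adding
strerror (errno) to the existing FAIL_EXIT1 message.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of glibc's support/ code: try to exec
   argv[0] and, if that fails, report why.  Returns the saved errno;
   on success execvp does not return at all.  */
static int
exec_or_report (char *const argv[])
{
  assert (argv != NULL && argv[0] != NULL);

  execvp (argv[0], argv);

  /* Only reached when execvp failed.  Save errno first, since
     fprintf may overwrite it.  */
  int saved_errno = errno;
  fprintf (stderr, "Unable to exec %s: %s (errno %d)\n",
           argv[0], strerror (saved_errno), saved_errno);
  return saved_errno;
}
```

With a message like this in the log, ENOENT (for example a missing ld.so
symlink inside the container root) could be told apart from EACCES
without rerunning the test by hand.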

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

