
Bug#1020500: glibc: flaky autopkgtest on armel: multiple different failures



Hi Aurelien,

Thanks for your thorough testing.

First off, we have recently changed our setup for armel and armhf testing. The physical host is the same, but instead of one VM for armel where we ran 10 debci workers in parallel, we now have smaller VMs with only 4 parallel debci workers per VM. Maybe this changes some of the metrics.

On 07-10-2022 20:55, Aurelien Jarno wrote:
https://ci.debian.net/data/autopkgtest/testing/armel/g/glibc/23501044/log.gz

----------
FAIL: elf/tst-debug1
original exit status 1
Didn't expect signal from child: got `Bus error'
----------

I have not been able to reproduce this bug after 1M tests on
amdahl.d.o, an RPI3 (running an arm64 kernel) and a STM32MP1 board
(armhf). Would it be possible to give more details, like any
corresponding dmesg entry, to have a better idea of the issue?

I'll try to have a look if I spot this again. The original dmesg is gone by now.

https://ci.debian.net/data/autopkgtest/testing/armel/g/glibc/26218800/log.gz
https://ci.debian.net/data/autopkgtest/testing/armel/g/glibc/26223226/log.gz
https://ci.debian.net/data/autopkgtest/testing/armel/g/glibc/26322746/log.gz

----------
FAIL: rt/tst-cpuclock2-time64
original exit status 1
live thread clock ffb6e90e resolution 0.000000001
live thread before sleep => 0.000254800
self thread before sleep => 0.000728320
live thread after sleep => 0.473986200
self thread after sleep => 0.001080840
clock_nanosleep on process slept 97739240 (outside reasonable range)
----------

I also can't reproduce this one after 100000 tests on amdahl.d.o, an
RPI3 (running an arm64 kernel) and a STM32MP1 board (armhf). According
to upstream, this test is known to fail on heavily loaded hosts
as it relies on wall time. Is that the case for the debci workers, or do
they have dedicated CPUs to run their tests? Are the armel workers
different from the others?

Yes, and as mentioned above we have changed that too. But as said, we ran a lot of parallel workers, so the VM could well have been heavily loaded. We also have an amd64 host that runs lots of parallel workers, and so does s390x, but maybe they are a bit better specced than the armel VM was.

Nevertheless, the part of the test that relies on wall time has been
removed upstream, so this should be considered fixed in glibc
2.35, which is now in testing:
https://sourceware.org/git/?p=glibc.git;a=commit;h=f3c6c190388bb445568cfbf190a0942fc3c28553

That's good to hear.

So, let's see over the coming weeks whether things have changed (hopefully for the better).

Paul
