Re: help needed to manage s390x host for ci.debian.net



Attempting to sum together what look, to me, like a pair of 2s:

  * The s390x Debian CI queue size[1] is growing again.

  * A recent bug report[2] by Dipak describes userspace processes
getting stuck on an s390 Linux kernel version that Debian's CI infra
has been using.

The bug does seem to have caused CI package build timeouts, as Paul
and others have discussed[3].  I was skeptical about the
kernel-as-cause theory, but now agree with it.

Perhaps the timeouts explain the queue backlog?


Also note: Sumanth has offered a fix as an s390 kernel patch[4], and
it is pending for distribution in Debian stable.  That is, the fix has
been uploaded and is awaiting general availability after a delay that
gives people time to review the relevant changes.


I'm puzzled by some conflicting data, though: the ppc64 queue _isn't_
growing currently.  Why did it follow the s390x trend so closely
during the previous queue buildup, and yet doesn't appear to be doing
so this time?


[1] - https://ci.debian.net/munin/debian.net/ci-master.debian.net/debci_queue_size.html

[2] - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1031753

[3] - https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1030545

[4] - https://lists.debian.org/debian-kernel/2023/02/msg00124.html

On Sat, 18 Feb 2023 at 14:23, James Addison <jay@jp-hosting.net> wrote:
>
> > James Addison suggested in [3] to increase a prefetch counter in amqp (although it's the same on all hosts); I have done so on the s390x host and at least initially it seems to help keep the host busier.
>
> Thanks for applying that - I was hoping that the change might also
> result in reductions in the debci queue size for s390x, but that
> doesn't appear to have happened, going by
> https://ci.debian.net/munin/debian.net/ci-master.debian.net/debci_queue_size.html
>
> [3] https://salsa.debian.org/ci-team/debci/-/issues/92#note_381306

