Re: s390x hardly processing packages
On 2021-03-04 13:51, Paul Gevers wrote:
> Since around week 8, the s390x worker for ci.debian.net is as busy
> as before week 6, but processing only 1/10 of the amount of packages. Do
> you have any idea what could be going on and how to debug this? I did
> several updates in between, including reboots. I'm at a loss.
While I am an SRE by profession, I am not sure how I can help you with
application-level debugging, given that I have zero involvement in and
knowledge of debci. I could do some reverse engineering of debci
from the worker's installation, but I don't think that's a good use of
my time. This is about work reaching the machine rather than lack of
You did not link to the AMQP queue length graph, but I managed to find
it on . This shows that there are presumably 250 items in the queue
that the worker polls (debci-tests-s390x-lxc). According to "ss" there
is an open connection to the AMQP broker, so it shouldn't be the
firewall either. The connection sometimes breaks with an SSL error (as
found in journalctl) but then the queue listener is promptly restarted.
When I try to spawn amqp-consume manually using sudo, it fails after
"creating SSL/TLS socket", which is confirmed by ltrace/strace. I don't
even see any OpenSSL calls in ltrace. It looks like the Munin graph is
using ISO weeks, so if my math is correct week 6 started on 2021-02-08.
It looks like there were no package upgrades between 2021-02-02 and
2021-02-19 according to history.log.1.gz. I can also successfully
establish a connection via openssl s_client with the client cert and
key. I don't know how to coerce amqp-consume into more debugging output,
though. I also can't speak AMQP over telnet, I'm afraid.
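For what it's worth, the week arithmetic and the history.log check above
can be reproduced with a short Python sketch. This is only an
illustration: the function names are mine, the sample log text is made up
(only the "Start-Date:" line layout follows what apt writes to
history.log), and nothing here is part of debci.

```python
import datetime
import re


def iso_week_start(year, week):
    """Return the Monday that begins the given ISO week (Python 3.8+)."""
    return datetime.date.fromisocalendar(year, week, 1)


# apt begins every history.log entry with "Start-Date: YYYY-MM-DD  HH:MM:SS".
START_RE = re.compile(r"^Start-Date: (\d{4}-\d{2}-\d{2})")


def apt_runs_between(history_text, first, last):
    """Return the dates of apt runs with first <= date <= last."""
    dates = []
    for line in history_text.splitlines():
        match = START_RE.match(line)
        if match:
            day = datetime.date.fromisoformat(match.group(1))
            if first <= day <= last:
                dates.append(day)
    return dates


# Made-up sample, NOT the worker's real log; times and commands are invented.
SAMPLE = """\
Start-Date: 2021-02-02  06:25:01
Commandline: apt-get upgrade
End-Date: 2021-02-02  06:25:40

Start-Date: 2021-02-19  06:25:02
Commandline: apt-get upgrade
End-Date: 2021-02-19  06:26:10
"""

if __name__ == "__main__":
    print(iso_week_start(2021, 6))  # 2021-02-08
    gap = apt_runs_between(
        SAMPLE, datetime.date(2021, 2, 3), datetime.date(2021, 2, 18)
    )
    print(gap)  # [] means no apt runs strictly inside the window
```

On the real machine one would feed it the decompressed contents of
history.log.1.gz (for example via gzip.open(..., "rt").read()) instead of
SAMPLE.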
I can't rule out the firewall, of course. It bans me after two SSH
connections or something. But I'd suggest you first dig into the
question of whether you can get anything from the queue yourself at all.