Re: s390x hardly processing packages

Hi Paul,

On 2021-03-04 13:51, Paul Gevers wrote:
> Since around week 8, the s390x worker for ci.debian.net is as busy [1]
> as before week 6, but processing only 1/10 of the amount of packages. Do
> you have any idea what could be going on and how to debug this? I did
> several updates in between, including reboots. I'm at a loss.

While I am an SRE by profession, I am not sure how I can help you with application-level debugging, given that I have no involvement in, or knowledge of, debci. I could reverse engineer debci from the worker's installation, but I don't think that's a good use of my time. This is about work reaching the machine rather than lack of resources.

You did not link to the AMQP queue length graph, but I managed to find it on [1]. This shows that there are presumably 250 items in the queue that the worker polls (debci-tests-s390x-lxc). According to "ss" there is an open connection to the AMQP broker, so it shouldn't be the firewall either. The connection sometimes breaks with an SSL error (as found in journalctl) but then the queue listener is promptly restarted.

When I spawn amqp-consume manually using sudo, it fails after "creating SSL/TLS socket", which is confirmed by ltrace/strace. I don't even see any OpenSSL calls in ltrace. I can, however, successfully establish a connection via openssl s_client with the client cert and key. I don't know how to coerce amqp-consume into more debugging output, though. I also can't speak AMQP over telnet, I'm afraid.

It looks like the Munin graph is using ISO weeks, so if my math is correct week 6 started on 2021-02-08. According to history.log.1.gz, there were no package upgrades between 2021-02-02 and 2021-02-19.
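The week arithmetic can be double-checked with a few lines of Python. This is only a sketch: it assumes the graph really does use ISO 8601 week numbering, and the helper name iso_week_start is made up for illustration.

```python
from datetime import date


def iso_week_start(year: int, week: int) -> date:
    """Return the Monday that starts the given ISO 8601 week."""
    # date.fromisocalendar needs Python 3.8 or newer; weekday 1 is Monday.
    return date.fromisocalendar(year, week, 1)


if __name__ == "__main__":
    for week in (6, 8):
        print(f"ISO week {week} of 2021 starts on {iso_week_start(2021, week)}")
```

Under that assumption week 6 starts on 2021-02-08 and week 8 on 2021-02-22, so the window without package upgrades ends just before week 8 begins.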

I can't rule out the firewall, of course. It bans me after two SSH connections or something. But I'd suggest you first look into whether you can get anything from the queue yourself at all.
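One way to check this below the level of amqp-consume is to open the TLS connection by hand and send only the AMQP 0-9-1 protocol header. This is a rough sketch rather than anything debci does: the function names, the host argument and the certificate paths are placeholders, port 5671 is merely the conventional AMQPS port, and it assumes the broker speaks AMQP 0-9-1. A broker that accepts the header answers with a Connection.Start method frame; one that rejects the version answers with its own protocol header and closes.

```python
import socket
import ssl

# Protocol header defined by AMQP 0-9-1: "AMQP", 0, then version 0-9-1.
AMQP_0_9_1_HEADER = b"AMQP\x00\x00\x09\x01"


def classify_reply(data: bytes) -> str:
    """Interpret the first bytes the broker sends after the header."""
    if not data:
        return "closed"  # peer closed the connection without answering
    if data.startswith(b"AMQP"):
        return "version-mismatch"  # broker replied with its own header
    # Method frame: type 1, 2-byte channel, 4-byte size, then class 10
    # (connection) and method 10 (start) as two 16-bit integers.
    if len(data) >= 11 and data[0] == 1 and data[7:11] == b"\x00\x0a\x00\x0a":
        return "connection-start"
    return "unexpected"


def probe_broker(host, port=5671, certfile=None, keyfile=None,
                 cafile=None, timeout=10.0):
    """TLS-connect to the broker, send the AMQP header, classify the reply.

    host and the certificate paths are placeholders supplied by the caller.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    if certfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(AMQP_0_9_1_HEADER)
            data = b""
            while len(data) < 11:
                chunk = tls.recv(11 - len(data))
                if not chunk:
                    break  # connection closed before a full frame header
                data += chunk
    return classify_reply(data)
```

Calling probe_broker with the broker's hostname and the worker's client cert and key should return "connection-start" if the path to the broker is fine up to the AMQP handshake. An exception or "closed" would point at the TLS or network layer instead.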

Kind regards
Philipp Kern

[1] https://ci.debian.net/munin/debian.net/ci-master.debian.net/debci_queue_size.html
