
Bug#1001001: linux-image-5.10.0-9-arm64: kernel BUG at include/linux/swapops.h:204!



Hi,

On Thu, 02 Dec 2021 13:44:15 +0100 Paul Gevers <elbrus@debian.org> wrote:
> The last couple of days, two of the ci.debian.net arm64 workers became
> unresponsive. The systems were rebooted and I found the message in
> the journal pasted below.
>
> Please let me know if you need more info about these systems.

As requested by carnil on IRC, let me try to add some things I checked.

In contrast to the previous kernel bug I reported, this time the two machines that hung were testing different packages (syslog-ng being one of them), and those packages usually succeed on arm64.

I noticed in the logs that *after* the reported kernel bug, but before the actual hang, there are multiple instances of:
watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [apt-get:2204621]
and
watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kcompactd0:40]
on ci-worker-arm64-07.

The other system (ci-worker-arm64-02) has
watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [khugepaged:42]
and
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [apt-get:4191233]

I found a third system that had to be rebooted recently (ci-worker-arm64-08 on 18 November):
watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [apt-get:3325970]
and
watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [python3:3275229]
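In case it helps anyone compare their own journals against the messages above, here is a minimal sketch that pulls the CPU, duration, task name and PID out of such soft lockup lines. The function names (`parse_soft_lockups`, `count_by_task`) are made up for illustration, and the pattern is only based on the lines quoted in this mail, so it may need adjusting for other kernels.

```python
import re
from collections import Counter

# Matches lines such as:
#   watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [apt-get:2204621]
# The task group is greedy so that task names containing ':' still
# split at the last colon before the PID.
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]]+):(?P<pid>\d+)\]"
)


def parse_soft_lockups(lines):
    """Return a (cpu, seconds, task, pid) tuple for every soft lockup line."""
    hits = []
    for line in lines:
        m = SOFT_LOCKUP.search(line)
        if m:
            hits.append(
                (int(m["cpu"]), int(m["secs"]), m["task"], int(m["pid"]))
            )
    return hits


def count_by_task(lines):
    """Count soft lockup reports per task name (hypothetical helper)."""
    return Counter(task for _, _, task, _ in parse_soft_lockups(lines))
```

The input can be any iterable of log lines, for example the saved output of `journalctl -k -b -1` from a worker after it has been rebooted.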

Although the journal is lost by now, more arm64 VMs have hung:
ci-worker-arm64-03 on 6 November 2021.

Probably worth mentioning, albeit hopefully unrelated: we had issues in the recent past (ci-worker-arm64-06 on 29 October 2021) with virtio_gpu, so we blocked that module from loading on all our workers, as we believe we don't need it. The message logged was:
[drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1202 (command 0x103)

Paul
