Bug#1107521: ath12k_pci errors and loss of connectivity in 6.12.y branch
- To: Matt Mower <mowerm@gmail.com>
- Cc: Vasant Hegde <vasant.hegde@amd.com>, Robin Murphy <robin.murphy@arm.com>, Jeff Johnson <jjohnson@kernel.org>, will@kernel.org, joro@8bytes.org, linux-wireless@vger.kernel.org, ath12k@lists.infradead.org, 1107521@bugs.debian.org, iommu@lists.linux.dev
- Subject: Bug#1107521: ath12k_pci errors and loss of connectivity in 6.12.y branch
- From: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
- Date: Thu, 3 Jul 2025 10:19:51 +0800
- Message-id: <979be2a9-9d0b-4382-8519-2f6fbcac5375@oss.qualcomm.com>
- Reply-to: Baochen Qiang <baochen.qiang@oss.qualcomm.com>, 1107521@bugs.debian.org
- In-reply-to: <CAPDiVH-xPDmx-KQx7YJY=7+kwJNbGY-rEu-w+cz18p=kjnKFsw@mail.gmail.com>
- References: <CAPDiVH8gaBH6o_OY-zUWYpDbj5mhiqmofKGb71gLgHOi4vA=Vw@mail.gmail.com> <0ba2176e-3339-4a8b-850a-ca5643939c8b@oss.qualcomm.com> <fd3bd8b1-4108-445a-b65f-4769d73e6e63@arm.com> <4a13d862-1bbb-4a98-bc1d-219bf78f7c0d@amd.com> <CAPDiVH-kVCUY8DKexT9OqAZsvkZ5_CGo8d8nENYA-kD=s_x8wA@mail.gmail.com> <e008afed-819d-43eb-8895-2c7aaf24ec13@oss.qualcomm.com> <CAPDiVH-xPDmx-KQx7YJY=7+kwJNbGY-rEu-w+cz18p=kjnKFsw@mail.gmail.com> <174939484316.7705.5967923154709480099.reportbug@AI360>
On 7/2/2025 10:53 PM, Matt Mower wrote:
>> Matt, could you help enable verbose ath12k log to verify my guess?
>
> Here are kernel logs with ath12k debugging enabled:
Thanks Matt.
In both logs I see a firmware crash (MHI_CB_EE_RDDM) before the IOMMU fault, which confirms my guess.
> 1. WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
> https://cmphys.com/ath12k/dmesg-6.12.35-ath12kdebug-fw0x100301e1-20250702.log
[ 91.625809] ath12k_pci 0000:c2:00.0: mhi notify status reason MHI_CB_EE_RDDM
[ 91.625916] ath12k_pci 0000:c2:00.0: reset starting
[ 91.674375] ath12k_pci 0000:c2:00.0: waiting recovery start...
[ 91.679445] ath12k_pci 0000:c2:00.0: setting mhi state: POWER_OFF(3)
[ 91.680721] ath12k_pci 0000:c2:00.0: qmi wifi fw del server
[ 91.680754] ath12k_pci 0000:c2:00.0: setting mhi state: DEINIT(1)
[ 91.681842] ath12k_pci 0000:c2:00.0: cookie:0x0
[ 91.681858] ath12k_pci 0000:c2:00.0: WLAON_WARM_SW_ENTRY 0x14c4e54
[ 91.687109] ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfe980000 flags=0x0020]
> 2. WLAN.HMT.1.1.c5-00284.1-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
> https://cmphys.com/ath12k/dmesg-6.12.35-ath12kdebug-fw0x1108811c-20250702.log
>
[ 113.621429] ath12k_pci 0000:c2:00.0: mhi notify status reason MHI_CB_EE_RDDM
[ 113.621794] ath12k_pci 0000:c2:00.0: reset starting
[ 113.670134] ath12k_pci 0000:c2:00.0: waiting recovery start...
[ 113.675177] ath12k_pci 0000:c2:00.0: setting mhi state: POWER_OFF(3)
[ 113.676331] ath12k_pci 0000:c2:00.0: setting mhi state: DEINIT(1)
[ 113.676581] ath12k_pci 0000:c2:00.0: qmi wifi fw del server
[ 113.676874] ath12k_pci 0000:c2:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0010 address=0xfea50000 flags=0x0020]
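For anyone who wants to check other logs for the same ordering, here is a minimal Python sketch. It assumes the default dmesg timestamp format seen in the excerpts above, treats MHI_CB_EE_RDDM as the firmware crash indicator and IO_PAGE_FAULT as the IOMMU fault, and uses function names that are my own for illustration (they are not part of ath12k or any tool).

```python
import re

# Matches dmesg lines such as "[   91.625809] ath12k_pci 0000:c2:00.0: message".
# Assumption: timestamps are in the default "[seconds.microseconds]" form.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def first_timestamp(log_text, marker):
    """Return the timestamp of the first log line containing marker, or None."""
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and marker in match.group(2):
            return float(match.group(1))
    return None


def crash_precedes_fault(log_text):
    """Return True if the RDDM notification is logged before the first
    IO_PAGE_FAULT, False if it is not, and None if either marker is missing."""
    rddm = first_timestamp(log_text, "MHI_CB_EE_RDDM")
    fault = first_timestamp(log_text, "IO_PAGE_FAULT")
    if rddm is None or fault is None:
        return None
    return rddm < fault
```

Reading a saved dmesg file and passing its contents to crash_precedes_fault() is enough to repeat the check on a new capture. A result of None means one of the two markers was not found, which is worth distinguishing from a fault that precedes the crash.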
> I captured these after setting CONFIG_ATH12K_DEBUG=y and running "echo
> 0xffffffff > /sys/module/ath12k/parameters/debug_mask" during boot
> (using @reboot in crontab).
Unfortunately, I cannot determine the root cause of the firmware crash from the host log.
I will try to reproduce this issue internally. In the meantime, Matt, could you help do some
more work to narrow down the problematic change?