
Bug#1061688: rtl8821: WARNING: CPU: 37 PID: 1366 at drivers/iommu/dma-iommu.c:1091 iommu_dma_unmap_page+0x7d/0x90



Control: tags -1 + moreinfo

Hi,

On Sun, Jan 28, 2024 at 06:02:44PM +0000, Breno Leitao wrote:
> Package: src:linux
> Version: 6.6.13-1
> Severity: critical
> X-Debbugs-Cc: leitao@debian.org
> 
> 
> System is crashing from time to time with the most recent kernel
> (6.6.13).
> 
> I was able to get the last kernel messages, and it is related to
> dma-iommu. I am not sure why the system is crashing, since I didn't have
> kdump, but, there is a clear warning in the wifi driver.
> 
> 
> 	Jan 28 17:05:21.414052 xeon kernel: Process accounting resumed
> 	Jan 28 17:05:21.530027 xeon kernel: warning: `atop' uses wireless extensions which will stop working for Wi-Fi 7 hardware; use nl80211
> 	Jan 28 17:05:21.550117 xeon kernel: espeakup[1527]: memfd_create() called without MFD_EXEC or MFD_NOEXEC_SEAL set
> 	Jan 28 17:05:21.586066 xeon kernel: NET: Registered PF_QIPCRTR protocol family
> 	Jan 28 17:05:21.606106 xeon kernel: block nvme3n1: No UUID available providing old NGUID
> 	Jan 28 17:05:25.354046 xeon kernel: rfkill: input handler disabled
> 	Jan 28 17:05:27.294079 xeon kernel: wlp134s0: authenticate with 80:72:15:b4:aa:6d
> 	Jan 28 17:05:27.694105 xeon kernel: wlp134s0: send auth to 80:72:15:b4:aa:6d (try 1/3)
> 	Jan 28 17:05:27.702030 xeon kernel: wlp134s0: authenticated
> 	Jan 28 17:05:27.710089 xeon kernel: wlp134s0: associate with 80:72:15:b4:aa:6d (try 1/3)
> 	Jan 28 17:05:27.718085 xeon kernel: wlp134s0: RX AssocResp from 80:72:15:b4:aa:6d (capab=0x1011 status=0 aid=6)
> 	Jan 28 17:05:27.718145 xeon kernel: wlp134s0: associated
> 	Jan 28 17:05:27.730080 xeon kernel: wlp134s0: Limiting TX power to 23 (23 - 0) dBm as advertised by 80:72:15:b4:aa:6d
> 	Jan 28 17:05:31.799817 xeon systemd-journald[764]: /var/log/journal/338f646113274ac1b9a4e000c0f8c95c/user-1000.journal: Journal file uses a different sequence number ID, rotating.
> 	Jan 28 17:05:32.074056 xeon kernel: rfkill: input handler enabled
> 	Jan 28 17:05:33.862039 xeon kernel: rfkill: input handler disabled
> 	Jan 28 17:05:45.370136 xeon kernel: logitech-hidpp-device 0003:046D:406D.0008: HID++ 4.5 device connected.
> 	Jan 28 17:40:26.302054 xeon kernel: rtlwifi: AP off, try to reconnect now
> 	Jan 28 17:40:26.302220 xeon kernel: wlp134s0: Connection to AP 80:72:15:b4:aa:6d lost
> 	Jan 28 17:40:30.661645 xeon kernel: wlp134s0: authenticate with 80:72:15:b4:aa:6a
> 	Jan 28 17:40:30.661756 xeon kernel: wlp134s0: 80 MHz not supported, disabling VHT
> 	Jan 28 17:40:30.680151 xeon kernel: wlp134s0: send auth to 80:72:15:b4:aa:6a (try 1/3)
> 	Jan 28 17:40:30.686063 xeon kernel: wlp134s0: authenticated
> 	Jan 28 17:40:30.686136 xeon kernel: wlp134s0: associate with 80:72:15:b4:aa:6a (try 1/3)
> 	Jan 28 17:40:30.696593 xeon kernel: wlp134s0: RX AssocResp from 80:72:15:b4:aa:6a (capab=0x1411 status=0 aid=2)
> 	Jan 28 17:40:30.702085 xeon kernel: wlp134s0: associated
> 	Jan 28 17:40:36.710058 xeon kernel: wlp134s0: deauthenticated from 80:72:15:b4:aa:6a (Reason: 2=PREV_AUTH_NOT_VALID)
> 	Jan 28 17:42:36.946218 xeon kernel: ------------[ cut here ]------------
> 	Jan 28 17:42:36.946357 xeon kernel: WARNING: CPU: 37 PID: 1366 at drivers/iommu/dma-iommu.c:1091 iommu_dma_unmap_page+0x7d/0x90
> 	Jan 28 17:42:36.946403 xeon kernel: Modules linked in: ccm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr binfmt_misc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm rtl8821ae btcoexist irqbypass rtl_pci ghash_clmulni_intel rtlwifi sha512_ssse3 sha256_ssse3 mac80211 sha1_ssse3 snd_hda_codec_realtek aesni_intel snd_hda_codec_generic crypto_simd cryptd snd_hda_codec_hdmi ledtrig_audio libarc4 snd_hda_intel rapl snd_intel_dspcfg snd_intel_sdw_acpi intel_cstate snd_hda_codec cfg80211 snd_hda_core snd_hwdep iTCO_wdt snd_pcm rfkill snd_timer intel_pmc_bxt mei_me intel_uncore iTCO_vendor_support snd pcspkr ioatdma mei watchdog soundcore intel_pch_thermal dca joydev acpi_pad acpi_power_meter sg evdev msr parport_pc ppdev lp parport loop nvme_fabrics dm_mod efi_pstore configfs nfnetlink ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic speakup_soft speakup hid_logitech_hidpp hid_logitech_dj
> 	Jan 28 17:42:36.946556 xeon kernel:  nouveau sr_mod hid_generic sd_mod cdrom usbhid drm_exec hid gpu_sched video nvme i2c_algo_bit drm_display_helper nvme_core cec t10_pi rc_core drm_ttm_helper ahci ttm xhci_pci crc64_rocksoft drm_kms_helper crc64 libahci xhci_hcd crc_t10dif libata crct10dif_generic drm mxm_wmi usbcore crc32_pclmul scsi_mod crct10dif_pclmul i2c_i801 crc32c_intel crct10dif_common lpc_ich vmd i2c_smbus usb_common scsi_common wmi button
> 	Jan 28 17:42:36.946611 xeon kernel: CPU: 37 PID: 1366 Comm: NetworkManager Not tainted 6.6.13-amd64 #1  Debian 6.6.13-1
> 	Jan 28 17:42:36.946648 xeon kernel: Hardware name: ASUSTeK COMPUTER INC. WS-C621E-SAGE Series/WS-C621E-SAGE Series, BIOS 6801 04/26/2022
> 	Jan 28 17:42:36.946685 xeon kernel: RIP: 0010:iommu_dma_unmap_page+0x7d/0x90
> 	Jan 28 17:42:36.946721 xeon kernel: Code: 2b 48 3b 28 72 26 48 3b 68 08 73 20 4d 89 f8 44 89 f1 4c 89 ea 48 89 ee 48 89 df 5b 5d 41 5c 41 5d 41 5e 41 5f e9 83 ed 8f ff <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 66 90 90 90 90
> 	Jan 28 17:42:36.946761 xeon kernel: RSP: 0018:ffffb034a5cc7440 EFLAGS: 00010046
> 	Jan 28 17:42:36.946796 xeon kernel: RAX: 0000000000000000 RBX: ffff90a2e01070c0 RCX: 0000000000000012
> 	Jan 28 17:42:36.946832 xeon kernel: RDX: 0000000000000000 RSI: ffff90ba5569b000 RDI: 0000000000000000
> 	Jan 28 17:42:36.946868 xeon kernel: RBP: ffff90ba50140900 R08: 0000000000000000 R09: 0000000000000003
> 	Jan 28 17:42:36.946903 xeon kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> 	Jan 28 17:42:36.947204 xeon kernel: R13: 00000000000009d8 R14: 0000000000000001 R15: 0000000000000000
> 	Jan 28 17:42:36.947377 xeon kernel: FS:  00007f5766591500(0000) GS:ffff90d1dfc40000(0000) knlGS:0000000000000000
> 	Jan 28 17:42:36.947416 xeon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 	Jan 28 17:42:36.947452 xeon kernel: CR2: 00007fa83488e000 CR3: 000000010f5e0001 CR4: 00000000007726e0
> 	Jan 28 17:42:36.947487 xeon kernel: PKRU: 55555554
> 	Jan 28 17:42:36.947522 xeon kernel: Call Trace:
> 	Jan 28 17:42:36.947557 xeon kernel:  <TASK>
> 	Jan 28 17:42:36.947592 xeon kernel:  ? iommu_dma_unmap_page+0x7d/0x90
> 	Jan 28 17:42:36.947627 xeon kernel:  ? __warn+0x81/0x130
> 	Jan 28 17:42:36.947658 xeon kernel:  ? iommu_dma_unmap_page+0x7d/0x90
> 	Jan 28 17:42:36.947688 xeon kernel:  ? report_bug+0x171/0x1a0
> 	Jan 28 17:42:36.947723 xeon kernel:  ? handle_bug+0x3c/0x80
> 	Jan 28 17:42:36.947758 xeon kernel:  ? exc_invalid_op+0x17/0x70
> 	Jan 28 17:42:36.947799 xeon kernel:  ? asm_exc_invalid_op+0x1a/0x20
> 	Jan 28 17:42:36.947836 xeon kernel:  ? iommu_dma_unmap_page+0x7d/0x90
> 	Jan 28 17:42:36.947866 xeon kernel:  rtl_pci_reset_trx_ring+0x195/0x390 [rtl_pci]
> 	Jan 28 17:42:36.947901 xeon kernel:  rtl_ps_enable_nic+0x29/0x120 [rtlwifi]
> 	Jan 28 17:42:36.947936 xeon kernel:  rtl8821ae_phy_set_rf_power_state+0x71/0x2d0 [rtl8821ae]
> 	Jan 28 17:42:36.947967 xeon kernel:  rtl_ps_set_rf_state.isra.0+0xbb/0xf0 [rtlwifi]
> 	Jan 28 17:42:36.948002 xeon kernel:  _rtl_ps_inactive_ps+0x36/0xd0 [rtlwifi]
> 	Jan 28 17:42:36.948032 xeon kernel:  rtl_ips_nic_on+0x7c/0xc0 [rtlwifi]
> 	Jan 28 17:42:36.948068 xeon kernel:  rtl_op_stop+0xfd/0x110 [rtlwifi]
> 	Jan 28 17:42:36.948108 xeon kernel:  drv_stop+0x34/0x100 [mac80211]
> 	Jan 28 17:42:36.948143 xeon kernel:  ieee80211_do_stop+0x5df/0x8a0 [mac80211]
> 	Jan 28 17:42:36.948177 xeon kernel:  ieee80211_stop+0x4d/0x180 [mac80211]
> 	Jan 28 17:42:36.948212 xeon kernel:  __dev_close_many+0x9b/0x110
> 	Jan 28 17:42:36.948242 xeon kernel:  __dev_change_flags+0x1a6/0x240
> 	Jan 28 17:42:36.948276 xeon kernel:  dev_change_flags+0x26/0x70
> 	Jan 28 17:42:36.948311 xeon kernel:  do_setlink+0x39c/0x12d0
> 	Jan 28 17:42:36.948341 xeon kernel:  ? intel_iommu_iotlb_sync_map+0x8d/0xe0
> 	Jan 28 17:42:36.948371 xeon kernel:  ? __nla_validate_parse+0x61/0xd10
> 	Jan 28 17:42:36.948401 xeon kernel:  ? update_load_avg+0x7e/0x780
> 	Jan 28 17:42:36.948436 xeon kernel:  __rtnl_newlink+0x651/0xa10
> 	Jan 28 17:42:36.948470 xeon kernel:  ? sched_clock+0x10/0x30
> 	Jan 28 17:42:36.948507 xeon kernel:  ? __kmem_cache_alloc_node+0x196/0x330
> 	Jan 28 17:42:36.948542 xeon kernel:  ? rtnl_newlink+0x2e/0x70
> 	Jan 28 17:42:36.948577 xeon kernel:  rtnl_newlink+0x47/0x70
> 	Jan 28 17:42:36.948611 xeon kernel:  rtnetlink_rcv_msg+0x14f/0x3c0
> 	Jan 28 17:42:36.948645 xeon kernel:  ? path_lookupat+0x96/0x1a0
> 	Jan 28 17:42:36.948680 xeon kernel:  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> 	Jan 28 17:42:36.948715 xeon kernel:  netlink_rcv_skb+0x58/0x110
> 	Jan 28 17:42:36.948745 xeon kernel:  netlink_unicast+0x1a3/0x290
> 	Jan 28 17:42:36.948776 xeon kernel:  netlink_sendmsg+0x254/0x4d0
> 	Jan 28 17:42:36.948806 xeon kernel:  ____sys_sendmsg+0x396/0x3d0
> 	Jan 28 17:42:36.948841 xeon kernel:  ? copy_msghdr_from_user+0x7d/0xc0
> 	Jan 28 17:42:36.948871 xeon kernel:  ___sys_sendmsg+0x9a/0xe0
> 	Jan 28 17:42:36.948905 xeon kernel:  __sys_sendmsg+0x7a/0xd0
> 	Jan 28 17:42:36.948941 xeon kernel:  do_syscall_64+0x5d/0xc0
> 	Jan 28 17:42:36.948975 xeon kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
> 	Jan 28 17:42:36.949009 xeon kernel:  ? do_syscall_64+0x6c/0xc0
> 	Jan 28 17:42:36.949043 xeon kernel:  ? do_syscall_64+0x6c/0xc0
> 	Jan 28 17:42:36.949073 xeon kernel:  ? __fget_light+0x99/0x100
> 	Jan 28 17:42:36.949107 xeon kernel:  ? ksys_write+0xd8/0xf0
> 	Jan 28 17:42:36.949137 xeon kernel:  ? exit_to_user_mode_prepare+0x40/0x1e0
> 	Jan 28 17:42:36.949171 xeon kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
> 	Jan 28 17:42:36.949201 xeon kernel:  ? do_syscall_64+0x6c/0xc0
> 	Jan 28 17:42:36.949230 xeon kernel:  ? exit_to_user_mode_prepare+0x40/0x1e0
> 	Jan 28 17:42:36.949260 xeon kernel:  ? syscall_exit_to_user_mode+0x2b/0x40
> 	Jan 28 17:42:36.949290 xeon kernel:  ? do_syscall_64+0x6c/0xc0
> 	Jan 28 17:42:36.949319 xeon kernel:  ? do_syscall_64+0x6c/0xc0
> 	Jan 28 17:42:36.949348 xeon kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> 	Jan 28 17:42:36.949384 xeon kernel: RIP: 0033:0x7f576788ba5d
> 	Jan 28 17:42:36.949420 xeon kernel: Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 1a a0 f7 ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 6e a0 f7 ff 48
> 	Jan 28 17:42:36.949448 xeon kernel: RSP: 002b:00007ffdd6e2ece0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
> 	Jan 28 17:42:36.949483 xeon kernel: RAX: ffffffffffffffda RBX: 000055d226a1c3f0 RCX: 00007f576788ba5d
> 	Jan 28 17:42:36.949517 xeon kernel: RDX: 0000000000000000 RSI: 00007ffdd6e2ed30 RDI: 000000000000000d
> 	Jan 28 17:42:36.949551 xeon kernel: RBP: 00007ffdd6e2ed30 R08: 0000000000000000 R09: 0000000000000000
> 	Jan 28 17:42:36.949586 xeon kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000040
> 	Jan 28 17:42:36.949616 xeon kernel: R13: 000055d226a9f3a0 R14: 0000000000000000 R15: 0000000000000000
> 	Jan 28 17:42:36.949650 xeon kernel:  </TASK>
> 	Jan 28 17:42:36.949684 xeon kernel: ---[ end trace 0000000000000000 ]---

Can you check if this happens as well with 6.7.1-1~exp1 in
experimental?

As I understand it, this is a regression from 6.6.11-1 to 6.6.13-1. Is
there any chance you could bisect the upstream versions between 6.6.11
and 6.6.13 to identify the culprit?
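
In case it helps, here is a minimal sketch of how such a bisect could be
started. It only prints the git commands rather than running them. It
assumes a checkout of the upstream linux-stable tree in which the tags
v6.6.11 and v6.6.13 exist, and that those tags correspond to the Debian
versions 6.6.11-1 and 6.6.13-1; the helper name bisect_plan is purely
illustrative.

```python
def bisect_plan(good: str, bad: str) -> list[str]:
    """Return the git commands that open a bisect between two tags.

    `good` is the last version known to work and `bad` is the first
    version known to show the problem (both assumed to be upstream tags).
    """
    return [
        "git bisect start",
        f"git bisect bad {bad}",
        f"git bisect good {good}",
    ]


if __name__ == "__main__":
    # Assumption: 6.6.11 was fine and 6.6.13 shows the warning.
    for cmd in bisect_plan(good="v6.6.11", bad="v6.6.13"):
        print(cmd)
```

After these commands git checks out a commit in between. Build and boot
that kernel, try to reproduce the warning, then mark the result with
"git bisect good" or "git bisect bad" and repeat until git names the
first bad commit. "git bisect reset" returns the tree to its original
state afterwards.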

Regards,
Salvatore

