[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#1034676: linux-image-5.10.0-20-rt-amd64: RT-Kernel freeze during IO load



Subject: linux-image-5.10.0-20-rt-amd64: RT-Kernel freeze during IO load
Package: linux-image-5.10.0-20-rt-amd64
X-Debbugs-Cc: C.Barth@goepel.com
Version: 5.10.158-2
Severity: normal

Dear Maintainer,

I have a problem with an application that leads to a
possible kernel freeze.
We tried to split the problem, but unfortunately we did not succeed.

We were able to recreate a system freeze after about 3-15h using simple echo
transfers over 6 PCIe cards. We ran our PCIe driver without the API part and
with a simple echo test several times with no problem for more than 60h.
On the other hand, we tested the API part by simulating the interface to the
driver using predefined responses.

As you can see in the logs the first error seems to be always the same and
has to do with messagequeues (our server application uses mq as ipc).
Sometimes seconds or hours later there are more errors like in the 2nd log.

Kind regards,
Christian


[42546.217248] WARNING: CPU: 5 PID: 1717 at kernel/sched/core.c:2500 set_task_cpu+0x1cd/0x1e0
[42546.217255] Modules linked in: udp_diag tcp_diag inet_diag Pxi62xx(OE) rfkill binfmt_misc intel_rapl_msr intel_rapl_common joydev snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp dell_smm_hwmon snd_hda_codec_realtek coretemp snd_hda_codec_generic ledtrig_audio kvm_intel kvm irqbypass dell_wmi ghash_clmulni_intel snd_hda_intel snd_intel_dspcfg aesni_intel soundwire_intel libaes soundwire_generic_allocation crypto_simd cryptd glue_helper snd_soc_core rapl snd_compress soundwire_cadence intel_cstate snd_hda_codec dell_smbios snd_hda_core dcdbas snd_hwdep intel_uncore sparse_keymap soundwire_bus dell_wmi_descriptor wmi_bmof intel_wmi_thunderbolt mei_hdcp mei_wdt iTCO_wdt snd_pcm intel_pmc_bxt ee1004 snd_timer iTCO_vendor_support serio_raw pcspkr sg watchdog snd soundcore mei_me mei intel_pch_thermal evdev intel_pmc_core acpi_pad parport_pc ppdev lp parport fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic raid10 raid456 async_raid6_recov
[42546.217314]  async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod hid_microsoft ff_memless hid_generic sd_mod t10_pi crc_t10dif crct10dif_generic usbhid hid i915 ahci i2c_algo_bit xhci_pci drm_kms_helper libahci xhci_hcd crct10dif_pclmul crct10dif_common cec e1000e libata crc32_pclmul crc32c_intel psmouse i2c_i801 ptp usbcore drm pps_core scsi_mod i2c_smbus usb_common wmi fan video button
[42546.217346] CPU: 5 PID: 1717 Comm: EchoTest Tainted: G           OE     5.10.0-21-rt-amd64 #1 Debian 5.10.162-1
[42546.217349] Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.18.0 06/10/2021
[42546.217350] RIP: 0010:set_task_cpu+0x1cd/0x1e0
[42546.217353] Code: c9 0f 84 a0 fe ff ff 8b 43 68 83 f8 02 0f 84 94 fe ff ff 0f 0b e9 8d fe ff ff 0f 0b 66 83 bb c0 07 00 00 00 0f 84 9c fe ff ff <0f> 0b e9 95 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44
[42546.217356] RSP: 0018:ffffb96002873d48 EFLAGS: 00010002
[42546.217358] RAX: 0000000000000000 RBX: ffff9245595d8000 RCX: 0000000000000000
[42546.217359] RDX: ffff924c0dcf1278 RSI: 0000000000000002 RDI: ffff9245595d8000
[42546.217361] RBP: 0000000000000002 R08: ffff924c0dcf1278 R09: 0000000000000000
[42546.217362] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000002
[42546.217363] R13: ffff9245595d8028 R14: 0000000000000000 R15: ffff924c0dcf09c0
[42546.217365] FS:  00007f97fa35b740(0000) GS:ffff924c0dd40000(0000) knlGS:0000000000000000
[42546.217367] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42546.217369] CR2: 00007fcbfc13d000 CR3: 0000000138674005 CR4: 00000000003706e0
[42546.217370] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42546.217371] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[42546.217373] Call Trace:
[42546.217377]  push_rt_task.part.0+0x1bf/0x420
[42546.217382]  task_woken_rt+0x5d/0x70
[42546.217385]  ttwu_do_wakeup+0x45/0x190
[42546.217388]  try_to_wake_up+0x19b/0x6b0
[42546.217391]  __wake_up_q+0x99/0xd0
[42546.217395]  do_mq_timedsend+0x23b/0x410
[42546.217400]  __x64_sys_mq_timedsend+0x70/0xb0
[42546.217403]  ? exit_to_user_mode_prepare+0x33/0x1a0
[42546.217407]  do_syscall_64+0x30/0x80
[42546.217410]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[42546.217414] RIP: 0033:0x7f97fa698040
[42546.217417] Code: 89 4c 24 14 e8 61 04 00 00 4c 8b 44 24 18 44 8b 54 24 14 41 89 c1 48 8b 54 24 08 48 8b 34 24 b8 f2 00 00 00 8b 7c 24 10 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 cf 89 44 24 10 e8 8c 04 00 00 8b 44
[42546.217418] RSP: 002b:00007ffc369eeb90 EFLAGS: 00000293 ORIG_RAX: 00000000000000f2
[42546.217421] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97fa698040
[42546.217422] RDX: 000000000000003f RSI: 000055b54a7a31c8 RDI: 0000000000000008
[42546.217424] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[42546.217425] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000007
[42546.217426] R13: 000055b54a7b4230 R14: 000000000000002b R15: 00007f97fa9167c0
[42546.217430] ---[ end trace 0000000000000002 ]---
[42547.207213] sched: RT throttling activated


[23686.539555] WARNING: CPU: 6 PID: 1811 at kernel/sched/core.c:2500 set_task_cpu+0x1cd/0x1e0
[23686.539563] Modules linked in: Pxi62xx(OE) rfkill binfmt_misc joydev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal dell_smm_hwmon snd_hda_codec_hdmi intel_powerclamp coretemp snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio kvm_intel kvm irqbypass dell_wmi ghash_clmulni_intel mei_hdcp mei_wdt aesni_intel snd_hda_intel libaes snd_intel_dspcfg crypto_simd cryptd soundwire_intel glue_helper soundwire_generic_allocation rapl snd_soc_core snd_compress intel_cstate soundwire_cadence dell_smbios dcdbas snd_hda_codec iTCO_wdt serio_raw sparse_keymap dell_wmi_descriptor wmi_bmof intel_wmi_thunderbolt intel_uncore snd_hda_core intel_pmc_bxt snd_hwdep iTCO_vendor_support ee1004 pcspkr sg watchdog soundwire_bus snd_pcm snd_timer snd mei_me soundcore mei intel_pch_thermal evdev acpi_pad intel_pmc_core parport_pc ppdev lp parport fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
[23686.539604]  async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod hid_microsoft ff_memless hid_generic sd_mod t10_pi crc_t10dif crct10dif_generic usbhid hid i915 i2c_algo_bit xhci_pci drm_kms_helper ahci xhci_hcd libahci cec libata crct10dif_pclmul crct10dif_common crc32_pclmul e1000e usbcore crc32c_intel i2c_i801 ptp drm psmouse scsi_mod pps_core i2c_smbus usb_common wmi fan video button
[23686.539625] CPU: 6 PID: 1811 Comm: EchoTest Tainted: G           OE     5.10.0-21-rt-amd64 #1 Debian 5.10.162-1
[23686.539627] Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.18.0 06/10/2021
[23686.539628] RIP: 0010:set_task_cpu+0x1cd/0x1e0
[23686.539630] Code: c9 0f 84 a0 fe ff ff 8b 43 68 83 f8 02 0f 84 94 fe ff ff 0f 0b e9 8d fe ff ff 0f 0b 66 83 bb c0 07 00 00 00 0f 84 9c fe ff ff <0f> 0b e9 95 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44
[23686.539632] RSP: 0018:ffffa0f3c2467d48 EFLAGS: 00010002
[23686.539634] RAX: 0000000000000000 RBX: ffff94b28ed6e400 RCX: 0000000000000000
[23686.539635] RDX: ffff94b68dcf1278 RSI: 0000000000000000 RDI: ffff94b28ed6e400
[23686.539636] RBP: 0000000000000000 R08: ffff94b68dcf1278 R09: 0000000000000000
[23686.539637] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[23686.539638] R13: ffff94b28ed6e428 R14: 0000000000000000 R15: ffff94b68dcf09c0
[23686.539639] FS:  00007f4d28db7740(0000) GS:ffff94b68dd80000(0000) knlGS:0000000000000000
[23686.539640] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[23686.539641] CR2: 00007f89d826c000 CR3: 0000000189bd2002 CR4: 00000000003706e0
[23686.539642] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[23686.539643] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[23686.539644] Call Trace:
[23686.539647]  push_rt_task.part.0+0x1bf/0x420
[23686.539651]  task_woken_rt+0x5d/0x70
[23686.539652]  ttwu_do_wakeup+0x45/0x190
[23686.539655]  try_to_wake_up+0x19b/0x6b0
[23686.539657]  __wake_up_q+0x99/0xd0
[23686.539659]  do_mq_timedsend+0x23b/0x410
[23686.539662]  ? pick_next_task_fair+0x182/0x3d0
[23686.539664]  ? _raw_spin_unlock_irq+0x1d/0x60
[23686.539667]  ? __schedule+0x824/0x980
[23686.539670]  __x64_sys_mq_timedsend+0x70/0xb0
[23686.539672]  do_syscall_64+0x30/0x80
[23686.539674]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[23686.539677] RIP: 0033:0x7f4d290f4040
[23686.539679] Code: 89 4c 24 14 e8 61 04 00 00 4c 8b 44 24 18 44 8b 54 24 14 41 89 c1 48 8b 54 24 08 48 8b 34 24 b8 f2 00 00 00 8b 7c 24 10 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 cf 89 44 24 10 e8 8c 04 00 00 8b 44
[23686.539680] RSP: 002b:00007fffbee9b040 EFLAGS: 00000293 ORIG_RAX: 00000000000000f2
[23686.539682] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4d290f4040
[23686.539683] RDX: 0000000000000021 RSI: 000055e8bffa81c8 RDI: 0000000000000008
[23686.539683] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[23686.539684] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000007
[23686.539685] R13: 000055e8bffb9230 R14: 000000000000000d R15: 00007f4d293727c0
[23686.539689] ---[ end trace 0000000000000002 ]---
[23687.529423] sched: RT throttling activated
[23687.529423] sched: RT throttling activated
[23687.529438] ------------[ cut here ]------------
[23687.529440] DEBUG_LOCKS_WARN_ON(l->nestcnt == 0)
[23687.529448] WARNING: CPU: 0 PID: 1825 at include/linux/local_lock_internal.h:67 refill_obj_stock+0x1ca/0x260
[23687.529452] Modules linked in: Pxi62xx(OE) rfkill binfmt_misc joydev intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal dell_smm_hwmon snd_hda_codec_hdmi intel_powerclamp coretemp snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio kvm_intel kvm irqbypass dell_wmi ghash_clmulni_intel mei_hdcp mei_wdt aesni_intel snd_hda_intel libaes snd_intel_dspcfg crypto_simd cryptd soundwire_intel glue_helper soundwire_generic_allocation rapl snd_soc_core snd_compress intel_cstate soundwire_cadence dell_smbios dcdbas snd_hda_codec iTCO_wdt serio_raw sparse_keymap dell_wmi_descriptor wmi_bmof intel_wmi_thunderbolt intel_uncore snd_hda_core intel_pmc_bxt snd_hwdep iTCO_vendor_support ee1004 pcspkr sg watchdog soundwire_bus snd_pcm snd_timer snd mei_me soundcore mei intel_pch_thermal evdev acpi_pad intel_pmc_core parport_pc ppdev lp parport fuse configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
[23687.529504]  async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod hid_microsoft ff_memless hid_generic sd_mod t10_pi crc_t10dif crct10dif_generic usbhid hid i915 i2c_algo_bit xhci_pci drm_kms_helper ahci xhci_hcd libahci cec libata crct10dif_pclmul crct10dif_common crc32_pclmul e1000e usbcore crc32c_intel i2c_i801 ptp drm psmouse scsi_mod pps_core i2c_smbus usb_common wmi fan video button
[23687.529538] CPU: 0 PID: 1825 Comm: Srv-RequestT Tainted: G        W  OE     5.10.0-21-rt-amd64 #1 Debian 5.10.162-1
[23687.529540] Hardware name: Dell Inc. Precision Tower 3620/0MWYPT, BIOS 2.18.0 06/10/2021
[23687.529541] RIP: 0010:refill_obj_stock+0x1ca/0x260
[23687.529544] Code: 07 ff ff ff e8 77 bd 1b 00 85 c0 74 1f 8b 0d b5 07 61 01 85 c9 75 15 48 c7 c6 06 42 ad 89 48 c7 c7 e2 f1 ab 89 e8 eb f0 60 00 <0f> 0b 8b 15 f6 57 cb 01 85 d2 0f 84 1a ff ff ff e9 28 ff ff ff e8
[23687.529547] RSP: 0018:ffffa0f3c255fdd0 EFLAGS: 00010286
[23687.529549] RAX: 0000000000000000 RBX: ffff94b68dcedd60 RCX: 0000000000000000
[23687.529550] RDX: 0000000000000001 RSI: ffffffff89b10c0c RDI: 00000000ffffffff
[23687.529552] RBP: ffff94b68dc2dd60 R08: ffffffff8a063d00 R09: 000000008a992eeb
[23687.529553] R10: ffffffffffffffff R11: 0000000000000001 R12: ffff94b68dc2dd60
[23687.529553] R13: 00000000000017d0 R14: ffff94afa89714c0 R15: ffff94af80043a00
[23687.529555] FS:  00007f2cba7fc700(0000) GS:ffff94b68dc00000(0000) knlGS:0000000000000000
[23687.529556] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[23687.529557] CR2: 00007fcbd7797000 CR3: 00000001aa754004 CR4: 00000000003706f0
[23687.529560] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[23687.529561] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[23687.529562] Call Trace:
[23687.529564]  ? free_msg+0x1b/0x40
[23687.529566]  kfree+0x382/0x500
[23687.529572]  free_msg+0x1b/0x40
[23687.529574]  do_mq_timedreceive+0x370/0x550
[23687.529578]  __x64_sys_mq_timedreceive+0x70/0xb0
[23687.529581]  do_syscall_64+0x30/0x80
[23687.529584]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[23687.529588] RIP: 0033:0x7f2cd31ce0f3
[23687.529592] Code: 4c 24 20 e8 af 03 00 00 4c 8b 44 24 28 4c 8b 54 24 20 41 89 c1 48 8b 54 24 18 48 8b 74 24 10 b8 f3 00 00 00 8b 7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 cf 48 89 44 24 08 e8 d8 03 00 00 48
[23687.529593] RSP: 002b:00007f2cba7fbdd0 EFLAGS: 00000293 ORIG_RAX: 00000000000000f3
[23687.529595] RAX: ffffffffffffffda RBX: 000000000000000a RCX: 00007f2cd31ce0f3
[23687.529596] RDX: 000000000000800c RSI: 00007f2cc400a530 RDI: 000000000000001c
[23687.529597] RBP: 00007f2cc400a530 R08: 00007f2cba7fbe10 R09: 0000000000000000
[23687.529598] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000800c
[23687.529599] R13: 000000000000001c R14: 0000000000000000 R15: 0000000000000000
[23687.529603] ---[ end trace 0000000000000003 ]---


-- System Information:
Debian Release: 11.6
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-21-rt-amd64 (SMP w/8 CPU threads; PREEMPT)
Kernel taint flags: TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages linux-image-5.10.0-20-rt-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.140
ii  kmod                                    28-1
ii  linux-base                              4.6

Versions of packages linux-image-5.10.0-20-rt-amd64 recommends:
ii  apparmor             2.13.6-10
ii  firmware-linux-free  20200122-1

Versions of packages linux-image-5.10.0-20-rt-amd64 suggests:
pn  debian-kernel-handbook  <none>
ii  grub-pc                 2.06-3~deb11u5
pn  linux-doc-5.10          <none>
Mit freundlichen Grüßen / Kind regards

Christian Barth

__________________________________________________
GÖPEL electronic GmbH
Göschwitzer Straße 58/60
07745 Jena / Germany

Tel.: +49-3641-6896-562
Fax: +49-3641-6896-944
E-Mail: C.Barth@goepel.com
www.goepel.com

25.-27.04. - Embedded JTAG Solutions Seminar für Fortgeschrittene
02.-03.05. - Embedded JTAG Solutions Seminar für Einsteiger
04.05. - Embedded JTAG Solutions - CASLAN-Seminar
09.-11.05. - SMT connect - Nürnberg - 4A-222
Sitz der Gesellschaft: GÖPEL electronic GmbH, Göschwitzer Straße 58/60, D-07745 Jena
Geschäftsführer: Alice Göpel, Jörg Schneider, Thomas Wenzel
Registergericht: Amtsgericht Jena HR B 201550

Attachment: smime.p7s
Description: S/MIME cryptographic signature


Reply to: