
Re: What is wrong with kernels 5.X on sparc64 ?





29.03.20 22:27 John Paul Adrian Glaubitz wrote:
Does anybody have some information or experience with this phenomenon?

I think I have seen instability issues on the one IIIi machine we have
as well as this machine is running 4.19 for that reason. Other machines
such as the T2 or later models don't show any issues.

So, I think this issue needs to be bisected and reported to the SPARC
Linux kernel mailing list [1]. I'm CC'ing Meelis Roos who has been doing
a lot of kernel regression testing on sparc64 in the past and might have
already found the problematic commit.

My test matrix is publicly available at
https://docs.google.com/spreadsheets/d/1nCMi_lQWA0q97VPh5m7zP3SDpw6NiVm6tRxQxVww7-o/

Here, my Ultra 45 has some stability problems with 5.2 and 5.3 and fails with 5.4.
The failures looked the same as on the Netra 240 (for some reason, 5.3 worked there but 5.2
and 5.4 did not) and frequently had something to do with RCU. I never got around to reporting
this fully since I could not pinpoint it exactly; then I started two office moves in
parallel, and the quarantine came after that, so I have not managed to set up my testbed
in the new office yet.

I looked through my notes (see below for some other crash notes) but found no successful bisect.
However, there was someone on IRC (hopkirk, I think) who had a working 5.4.2 kernel config for
his Ultra 45 with gcc-8.3.0 (I only tested with gcc-9).
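
For a future bisect attempt, a captured console log could be classified automatically. The following is only a minimal sketch: the bad-boot markers are strings that appear in the crash logs in this message, while the "login:" good-boot marker and the function name `classify` are assumptions made for illustration, not part of any existing test setup.

```python
import re

# Markers copied from the crash logs in this thread.
BAD_MARKERS = (
    re.compile(r"Oops \[#\d+\]"),
    re.compile(r"Kernel panic - not syncing"),
    re.compile(r"Unable to handle kernel"),
)
# Hypothetical marker: assumes a good test boot reaches a login prompt.
GOOD_MARKER = re.compile(r"login:")

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def classify(console_log: str) -> int:
    """Map a captured console log to a `git bisect run` exit code."""
    if any(p.search(console_log) for p in BAD_MARKERS):
        return BAD
    if GOOD_MARKER.search(console_log):
        return GOOD
    # Neither a crash nor a login prompt: the boot is inconclusive.
    return SKIP
```

`git bisect run` treats exit code 0 as good, 125 as "skip this commit", and any other code from 1 to 127 as bad, so a wrapper script could pass the result of `classify()` to `sys.exit()`. Since the crashes here are intermittent, a single clean boot is weak evidence for "good"; several boots per commit would probably be needed.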


Also, it appears I have some SPARC machines with me at home: Ultra 45, E420R, Fujitsu M3000
(does not run Linux) and a V445 with something broken (probably the power backplane).

I might want to install Linux on the E420R but do not have FC-AL disks with me.
If I get the disks, what is the state of the Debian ports installer? Is it worth trying on sparc64 now?

So I reconnected and started the Ultra 45. It boots up with 5.4 (with warnings in dmesg)
and whatever Debian I had in December 2019. It did a successful git fetch but died soon
after that with something that ended in the trace below. Will try the current kernel to see if I succeed.

[  828.422505] Kernel unaligned access at TPC[5a6b20] kmem_cache_alloc+0x60/0x280
[  828.511966] Unable to handle kernel paging request in mna handler
[  828.511968]  at virtual address 91d0200591d02005
[  828.643393] current->{active_,}mm->context = 000000000000021f
[  828.713824] current->{active_,}mm->pgd = fff0000279d00000
[  828.780076]               \|/ ____ \|/
[  828.780076]               "@'/ .. \`@"
[  828.780076]               /_| \__/ |_\
[  828.780076]                  \__U_/
[  828.962220] (journald)(398): Oops [#45]
[  829.009552] CPU: 0 PID: 398 Comm: (journald) Tainted: G      D           5.4.0 #26
[  829.103434] TSTATE: 0000000011001600 TPC: 00000000005a6b20 TNPC: 00000000005a6b24 Y: 03049a33    Tainted: G      D
[  829.241234] TPC: <kmem_cache_alloc+0x60/0x280>
[  829.296086] g0: fff0000279cf77d0 g1: 0000000000000000 g2: fff000027fda9fa8 g3: 0000000000000030
[  829.403697] g4: fff000027917c000 g5: fff00001000377d8 g6: fff0000279cf4000 g7: 00000000000012cd
[  829.511433] o0: fff00002787c3c80 o1: 0000000000000dc0 o2: 0000000000408001 o3: fff000027fda9fb0
[  829.619212] o4: 0000000000000bd0 o5: fff0000278b11000 sp: fff0000279cf7071 ret_pc: 00000000005a6c10
[  829.731189] RPC: <kmem_cache_alloc+0x150/0x280>
[  829.787146] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000000000 l3: 0000000000000000
[  829.894885] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: fff0000100484000
[  830.002727] i0: fff00002780c1e00 i1: 0000000000000dc0 i2: 00000000005be394 i3: 0000000000404000
[  830.110872] i4: fff00002787c3c80 i5: 91d0200591d02005 i6: fff0000279cf7121 i7: 00000000005be394
[  830.219104] I7: <__alloc_file+0x14/0xe0>
[  830.268042] Call Trace:
[  830.299131]  [00000000005be394] __alloc_file+0x14/0xe0
[  830.362645]  [00000000005be7ec] alloc_empty_file+0x4c/0xc0
[  830.430384]  [00000000005cd430] path_openat+0x10/0x1580
[  830.494997]  [00000000005cfbb0] do_filp_open+0x50/0xc0
[  830.558562]  [00000000005ba164] do_sys_open+0x144/0x240
[  830.623157]  [00000000005ba2c0] sys_openat+0x20/0x40
[  830.684571]  [0000000000406154] linux_sparc_syscall+0x34/0x44
[  830.755357] Caller[00000000005be394]: __alloc_file+0x14/0xe0
[  830.825019] Caller[00000000005be7ec]: alloc_empty_file+0x4c/0xc0
[  830.898870] Caller[00000000005cd430]: path_openat+0x10/0x1580
[  830.969623] Caller[00000000005cfbb0]: do_filp_open+0x50/0xc0
[  831.039277] Caller[00000000005ba164]: do_sys_open+0x144/0x240
[  831.109987] Caller[00000000005ba2c0]: sys_openat+0x20/0x40
[  831.177544] Caller[0000000000406154]: linux_sparc_syscall+0x34/0x44
[  831.254478] Caller[fff0000100303200]: 0xfff0000100303200
[  831.319880] Instruction DUMP:
[  831.319882]  02c74013
[  831.356956]  01000000
[  831.38668

I remember seeing these "Kernel unaligned access at TPC[..." messages on USIII-class machines for a long time now,
and I have mostly ignored them after reporting them once, but they now seem to be a pattern in these crashes.
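
To make such a pattern easier to see across many saved logs, the faulting symbols and fault addresses can be tallied. This is a rough sketch only: the regular expressions are modelled on the log lines quoted in this message, the sample lines are copied from those logs, and the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the oops lines quoted in this message.
MNA_RE = re.compile(r"Kernel unaligned access at TPC\[([0-9a-f]+)\] (\S+)")
ADDR_RE = re.compile(r"at virtual address ([0-9a-f]{16})")


def summarize(log: str):
    """Return (Counter of faulting functions, Counter of fault addresses)."""
    symbols = Counter()
    addresses = Counter()
    for line in log.splitlines():
        m = MNA_RE.search(line)
        if m:
            # Drop the "+offset/size" suffix so hits group by function name.
            symbols[m.group(2).split("+")[0]] += 1
        a = ADDR_RE.search(line)
        if a:
            addresses[a.group(1)] += 1
    return symbols, addresses


# Sample lines copied from the logs in this message.
SAMPLE = """\
[  828.422505] Kernel unaligned access at TPC[5a6b20] kmem_cache_alloc+0x60/0x280
[  828.511968]  at virtual address 91d0200591d02005
[   41.216971] Kernel unaligned access at TPC[4c5864] __cgroup_account_cputime+0x4/0x20
[   41.309707]  at virtual address 91d0200591d02015
[   48.573670] Kernel unaligned access at TPC[6f56d4] __list_add_valid+0x14/0xe0
[   48.667773]  at virtual address 91d0200591d02005
"""

symbols, addresses = summarize(SAMPLE)
for name, count in symbols.most_common():
    print(count, name)
for addr, count in addresses.most_common():
    print(count, addr)
```

On the sample, the faulting functions all differ while the fault addresses are nearly identical, which suggests looking at where the bad pointer value comes from rather than at the individual call sites.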


5.4.0-rc4:

         Starting Load AppArmor profiles...
         Starting udev Kernel Device Manager...
[   48.573670] Kernel unaligned access at TPC[6f56d4] __list_add_valid+0x14/0xe0
[   48.667771] Unable to handle kernel paging request in mna handler
[   48.667773]  at virtual address 91d0200591d02005
[   48.812970] current->{active_,}mm->context = 0000000000000064
[   48.890314] current->{active_,}mm->pgd = fff0000278c1c000
[   48.963307]               \|/ ____ \|/
[   48.963307]               "@'/ .. \`@"
[   48.963307]               /_| \__/ |_\
[   48.963307]                  \__U_/
[   49.168319] (md-udevd)(103): Oops [#1]
[   49.220890] CPU: 0 PID: 103 Comm: (md-udevd) Not tainted 5.4.0-rc4 #1
[   49.305872] TSTATE: 0000004411001602 TPC: 00000000006f56d4 TNPC: 00000000006f56d8 Y: 00000000    Not tainted
[   49.438100] TPC: <__list_add_valid+0x14/0xe0>
[   49.497926] g0: 0000000000000000 g1: 0000000000000058 g2: 0000000000000001 g3: 0000000000000001
[   49.616664] g4: fff0000278b80000 g5: 0000000000008000 g6: fff0000279104000 g7: fff0000278f2fbc0
[   49.735458] o0: 0000000000000003 o1: 000007feff952710 o2: 0000000000000006 o3: 0000000000b02800
[   49.854191] o4: 0000000000000000 o5: 91d0200591d02005 sp: fff00002791071c1 ret_pc: fff000010020caec
[   49.977655] RPC: <0xfff000010020caec>
[   50.029371] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000000000 l3: 0000000000000000
[   50.148585] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: fff0000100388000
[   50.268326] i0: fff000027638a200 i1: 91d0200591d02005 i2: fff0000278ae0840 i3: 0000000000000000
[   50.388331] i4: 00000100002698c0 i5: 0000000000000000 i6: fff0000279107281 i7: 000000000052fea8
[   50.508424] I7: <list_lru_add+0x48/0x140>
[   50.564794] Call Trace:
[   50.602276]  [000000000052fea8] list_lru_add+0x48/0x140
[   50.673238]  [000000000058c980] d_lru_add+0x80/0x100
[   50.741020]  [000000000058e340] dput+0x1a0/0x1c0
[   50.804743]  [0000000000575fc8] __fput+0xe8/0x240
[   50.869615]  [000000000047c270] task_work_run+0x70/0xa0
[   50.940767]  [000000000042e500] do_notify_resume+0xa0/0xc0
[   51.015052]  [0000000000404b08] __handle_signal+0xc/0x30
[   51.087274] Disabling lock debugging due to kernel taint
[   51.159412] Caller[000000000052fea8]: list_lru_add+0x48/0x140
[   51.236618] Caller[000000000058c980]: d_lru_add+0x80/0x100
[   51.310680] Caller[000000000058e340]: dput+0x1a0/0x1c0
[   51.380446] Caller[0000000000575fc8]: __fput+0xe8/0x240
[   51.451087] Caller[000000000047c270]: task_work_run+0x70/0xa0
[   51.527820] Caller[000000000042e500]: do_notify_resume+0xa0/0xc0
[   51.607596] Caller[0000000000404b08]: __handle_signal+0xc/0x30
[   51.685178] Caller[fff000010020caec]: 0xfff000010020caec
[   51.756284] Instruction DUMP:
[   51.756286]  80a34019
[   51.799093]  12600010
[   51.834408]  17002c0a
[   51.869687] <fa5b4000>
[   51.904761]  80a7401a
[   51.939492]  12600017
[   51.974044]  8e1b4018
[   52.008275]  84102000
[   52.042288]  861f4018
[   52.076018]
[   52.138193] Kernel unaligned access at TPC[6f56d4] __list_add_valid+0x14/0xe0
[   52.229458] Unable to handle kernel paging request in mna handler
[   52.229461]  at virtual address 91d0200591d02005
[   52.367609] current->{active_,}mm->context = 0000000000000063
[   52.440876] current->{active_,}mm->pgd = fff0000278bfc000
[   52.510030]               \|/ ____ \|/
[   52.510030]               "@'/ .. \`@"
[   52.510030]               /_| \__/ |_\
[   52.510030]                  \__U_/
[   52.703555] systemd-detect-(102): Oops [#2]
[   52.757709] CPU: 0 PID: 102 Comm: systemd-detect- Tainted: G      D           5.4.0-rc4 #1
[   52.865179] TSTATE: 0000004411001601 TPC: 00000000006f56d4 TNPC: 00000000006f56d8 Y: 00000000    Tainted: G      D
[   53.008752] TPC: <__list_add_valid+0x14/0xe0>
[   53.065492] g0: 0000000000000000 g1: 0000000000000038 g2: 0000000000000001 g3: 0000000000000001
[   53.179301] g4: fff0000278b82e60 g5: 0000000000008000 g6: fff000027900c000 g7: fff0000278f2fbc0
[   53.293625] o0: 0000000000000003 o1: 0000000000000000 o2: 0000000000000000 o3: 0000000000b02800
[   53.408842] o4: 0000000000000bd0 o5: 91d0200591d02005 sp: fff000027900f1c1 ret_pc: fff0000100344aec
[   53.529175] RPC: <0xfff0000100344aec>
[   53.579061] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000000000 l3: 0000000000000000
[   53.696172] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: fff00001004c0000
[   53.814127] i0: fff0000276374980 i1: 91d0200591d02005 i2: fff0000278ae07c0 i3: 0000000000000000
[   53.932836] i4: 0000000000000000 i5: fff00001004c2dc0 i6: fff000027900f281 i7: 000000000052fea8
[   54.051815] I7: <list_lru_add+0x48/0x140>
[   54.107194] Call Trace:
[   54.143703]  [000000000052fea8] list_lru_add+0x48/0x140
[   54.213649]  [000000000058c980] d_lru_add+0x80/0x100
[   54.280544]  [000000000058e340] dput+0x1a0/0x1c0
[   54.343349]  [0000000000575fc8] __fput+0xe8/0x240
[   54.407281]  [000000000047c270] task_work_run+0x70/0xa0
[   54.477524]  [000000000042e500] do_notify_resume+0xa0/0xc0
[   54.550837]  [0000000000404b08] __handle_signal+0xc/0x30
[   54.621952] Caller[000000000052fea8]: list_lru_add+0x48/0x140
[   54.698241] Caller[000000000058c980]: d_lru_add+0x80/0x100
[   54.771256] Caller[000000000058e340]: dput+0x1a0/0x1c0
[   54.840088] Caller[0000000000575fc8]: __fput+0xe8/0x240
[   54.909835] Caller[000000000047c270]: task_work_run+0x70/0xa0
[   54.985602] Caller[000000000042e500]: do_notify_resume+0xa0/0xc0
[   55.064423] Caller[0000000000404b08]: __handle_signal+0xc/0x30
[   55.140966] Caller[fff0000100344aec]: 0xfff0000100344aec
[   55.211149] Instruction DUMP:
[   55.211150]  80a34019
[   55.253010]  12600010
[   55.287243]  17002c0a
[   55.321307] <fa5b4000>
[   55.355337]  80a7401a
[   55.389193]  12600017
[   55.422801]  8e1b4018
[   55.456111]  84102000
[   55.489214]  861f4018
[   55.522107]
         Starting Create Volatile Files and Directories...
[   55.615046] ------------[ cut here ]------------
[   55.675818] WARNING: CPU: 0 PID: 101 at fs/dcache.c:414 select_collect+0x94/0xc0
[   55.774097] Modules linked in: ip_tables x_tables autofs4
[   55.844410] CPU: 0 PID: 101 Comm: apparmor.system Tainted: G      D           5.4.0-rc4 #1
[   55.953813] Call Trace:
[   55.988823]  [0000000000461dc4] warn_slowpath_fmt+0x30/0x8c
[   56.061451]  [000000000058d454] select_collect+0x94/0xc0
[   56.130910]  [000000000058d7bc] d_walk+0x7c/0x200
[   56.193059]  [000000000058ef90] shrink_dcache_parent+0x30/0x100
[   56.269726]  [000000000058f0c8] d_invalidate+0x28/0xc0
[   56.337023]  [00000000005ef36c] proc_flush_task+0x8c/0x160
[   56.408543]  [0000000000462948] release_task+0x28/0x3c0
[   56.477003]  [0000000000463338] wait_consider_task+0x658/0x760
[   56.552741]  [00000000004634fc] do_wait+0xbc/0x1c0
[   56.615915]  [0000000000464848] kernel_wait4+0x68/0x100
[   56.684216]  [000000000046492c] __do_sys_wait4+0x4c/0x60
[   56.753548]  [0000000000406154] linux_sparc_syscall+0x34/0x44
[   56.828134] ---[ end trace f4b38a93c287277c ]---
[   56.893096] Kernel unaligned access at TPC[6f56d4] __list_add_valid+0x14/0xe0
[   56.984921] Unable to handle kernel paging request in mna handler
[   56.984924]  at virtual address 91d0200591d02005
[   57.125559] current->{active_,}mm->context = 0000000000000024
[   57.200804] current->{active_,}mm->pgd = fff0000278c14000
[   57.271895]               \|/ ____ \|/
[   57.271895]               "@'/ .. \`@"
[   57.271895]               /_| \__/ |_\
[   57.271895]                  \__U_/
[   57.470656] systemd-journal(70): Oops [#3]
[   57.526019] CPU: 0 PID: 70 Comm: systemd-journal Tainted: G      D W         5.4.0-rc4 #1
[   57.635748] TSTATE: 0000004411001603 TPC: 00000000006f56d4 TNPC: 00000000006f56d8 Y: 00a883c9    Tainted: G      D W
[   57.782088] TPC: <__list_add_valid+0x14/0xe0>
[   57.840532] g0: 0000000000000000 g1: 0000000000000040 g2: 0000000000000001 g3: 0000000000000001
[   57.956824] g4: fff0000278b806a0 g5: 0000000000008000 g6: fff0000278c80000 g7: fff0000278f2fbc0
[   58.073675] o0: 0000000000000013 o1: 0000000000000000 o2: 0000000000000000 o3: 0000000000b02800
[   58.191188] o4: 0000000000000bd0 o5: 91d0200591d02005 sp: fff0000278c831c1 ret_pc: fff000010020caec
[   58.313404] RPC: <0xfff000010020caec>
[   58.364646] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000000000 l3: 0000000000000000
[   58.483280] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: fff0000100388000
[   58.602641] i0: fff000027622ec80 i1: 91d0200591d02005 i2: fff0000278ae07e0 i3: 0000000000000000
[   58.722618] i4: 0000000000000000 i5: fff000010038adc0 i6: fff0000278c83281 i7: 000000000052fea8
[   58.841888] I7: <list_lru_add+0x48/0x140>
[   58.897284] Call Trace:
[   58.933913]  [000000000052fea8] list_lru_add+0x48/0x140
[   59.003966]  [000000000058c980] d_lru_add+0x80/0x100
[   59.070981]  [000000000058e340] dput+0x1a0/0x1c0
[   59.133994]  [0000000000575fc8] __fput+0xe8/0x240
[   59.198136]  [000000000047c270] task_work_run+0x70/0xa0
[   59.268469]  [000000000042e500] do_notify_resume+0xa0/0xc0
[   59.341867]  [0000000000404b08] __handle_signal+0xc/0x30
[   59.413067] Caller[000000000052fea8]: list_lru_add+0x48/0x140
[   59.489353] Caller[000000000058c980]: d_lru_add+0x80/0x100
[   59.562455] Caller[000000000058e340]: dput+0x1a0/0x1c0
[   59.631291] Caller[0000000000575fc8]: __fput+0xe8/0x240
[   59.701038] Caller[000000000047c270]: task_work_run+0x70/0xa0
[   59.776920] Caller[000000000042e500]: do_notify_resume+0xa0/0xc0
[   59.855755] Caller[0000000000404b08]: __handle_signal+0xc/0x30
[   59.932383] Caller[fff000010020caec]: 0xfff000010020caec
[   60.002656] Instruction DUMP:
[   60.002658]  80a34019
[   60.044542]  12600010
[   60.078887]  17002c0a
[   60.113062] <fa5b4000>
[   60.147178]  80a7401a
[   60.181034]  12600017
[   60.214645]  8e1b4018
[   60.248044]  84102000
[   60.281251]  861f4018
[   60.314143]



[   41.216971] Kernel unaligned access at TPC[4c5864] __cgroup_account_cputime+0x4/0x20
[   41.309705] Unable to handle kernel paging request in mna handler
[   41.309707]  at virtual address 91d0200591d02015
[   41.437811] current->{active_,}mm->context = 000000000000001c
[   41.506565] current->{active_,}mm->pgd = fff0000278cf4000
[   41.571154]               \|/ ____ \|/
[   41.571154]               "@'/ .. \`@"
[   41.571154]               /_| \__/ |_\
[   41.571154]                  \__U_/
[   41.747216] systemd-journal(65): Oops [#1]
[   41.796171] CPU: 0 PID: 65 Comm: systemd-journal Not tainted 5.4.0-rc4 #2
[   41.877426] TSTATE: 0000000011e01600 TPC: 00000000004c5864 TNPC: 00000000004c5868 Y: 039aed10    Not tainted
[   41.995145] TPC: <__cgroup_account_cputime+0x4/0x20>
[   42.054518] g0: 0000000000000201 g1: 91d0200591d02005 g2: 91d0200591d02005 g3: 0000000000b8e0b0
[   42.158695] g4: fff0000278b2bba0 g5: 000000000071dd90 g6: fff000027899c000 g7: 0000000000000000
[   42.262869] o0: fff0000278ace000 o1: 000000000004e99e o2: 0000000000000011 o3: 0000000000000001
[   42.367042] o4: fff0000278d04200 o5: fff0000278d04200 sp: fff000027899efa1 ret_pc: 0000000000489710
[   42.475390] RPC: <update_curr+0xb0/0x120>
[   42.523299] l0: 000000000000000d l1: 0000000030000000 l2: 000001000012bb50 l3: 0000000030200000
[   42.627474] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: fff00001009fdd20
[   42.731649] i0: 0000000000b8cda0 i1: 000000000004e99e i2: 0000000000000011 i3: ffffffffffffffff
[   42.835822] i4: 00000000001bc400 i5: fff0000278b2bbd8 i6: fff000027899f051 i7: 00000000004899c4
[   42.939997] I7: <dequeue_entity+0x4/0x220>
[   42.988952] Call Trace:
[   43.018122]  [00000000004899c4] dequeue_entity+0x4/0x220
[   43.081669]  [0000000000489c10] dequeue_task_fair+0x30/0x1c0
[   43.149382]  [00000000004852e4] deactivate_task+0x64/0xc0
[   43.213972]  [00000000009b787c] __schedule+0x27c/0x388
[   43.275430]  [00000000009b7b4c] schedule+0x2c/0xe0
[   43.332728]  [00000000009bae3c] schedule_hrtimeout_range_clock+0xbc/0xe0
[   43.412948]  [00000000005bec70] do_epoll_wait+0x390/0x3e0
[   43.477528]  [00000000005bf870] sys_epoll_wait+0x10/0x20
[   43.541077]  [0000000000406154] linux_sparc_syscall+0x34/0x44
[   43.609826] Disabling lock debugging due to kernel taint
[   43.673374] Caller[00000000004899c4]: dequeue_entity+0x4/0x220
[   43.743170] Caller[0000000000489c10]: dequeue_task_fair+0x30/0x1c0
[   43.817132] Caller[00000000004852e4]: deactivate_task+0x64/0xc0
[   43.887970] Caller[00000000009b787c]: __schedule+0x27c/0x388
[   43.955683] Caller[00000000009b7b4c]: schedule+0x2c/0xe0
[   44.019228] Caller[00000000009bae3c]: schedule_hrtimeout_range_clock+0xbc/0xe0
[   44.105695] Caller[00000000005bec70]: do_epoll_wait+0x390/0x3e0
[   44.176530] Caller[00000000005bf870]: sys_epoll_wait+0x10/0x20
[   44.246326] Caller[0000000000406154]: linux_sparc_syscall+0x34/0x44
[   44.321331] Caller[fff0000100216508]: 0xfff0000100216508
[   44.384875] Instruction DUMP:
[   44.384876]  01000000
[   44.420293]  01000000
[   44.448419]  c45a22b8
[   44.476546] <c258a010>
[   44.504673]  82004009
[   44.532800]  c270a010
[   44.560926]  8213c000
[   44.589054]  106ffefb
[   44.617180]  01000000
[   44.645307]


[   44.691165] Kernel unaligned access at TPC[4c589c] __cgroup_account_cputime_field+0x1c/0x60
[   44.791157] Unable to handle kernel paging request in mna handler
[   44.791158]  at virtual address 91d0200591d02005
[   44.919284] current->{active_,}mm->context = 000000000000001c
[   44.988039] current->{active_,}mm->pgd = fff0000278cf4000
[   45.052626]               \|/ ____ \|/
[   45.052626]               "@'/ .. \`@"
[   45.052626]               /_| \__/ |_\
[   45.052626]                  \__U_/
[   45.228689] systemd-journal(65): Oops [#2]
[   45.277643] CPU: 0 PID: 65 Comm: systemd-journal Tainted: G      D           5.4.0-rc4 #2
[   45.375569] TSTATE: 0000009980e01604 TPC: 00000000004c589c TNPC: 00000000004c58a0 Y: 00000000    Tainted: G      D
[   45.509954] TPC: <__cgroup_account_cputime_field+0x1c/0x60>
[   45.576620] g0: fff000027899e581 g1: 91d0200591d02005 g2: 000000007270e000 g3: 0000000000000010
[   45.680796] g4: fff0000278b2bba0 g5: 000000000071dd90 g6: fff000027899c000 g7: 0000000000b8dfb0
[   45.784969] o0: fff0000278ace000 o1: 0000000000000000 o2: 00000000003d0900 o3: 0000000000000158
[   45.889142] o4: 0000000000c32388 o5: 0000000000000000 sp: fff000027899e5c1 ret_pc: 0000000000487bc8
[   45.997485] RPC: <account_system_index_time+0x88/0xa0>
[   46.058942] l0: 000000000000000e l1: ffffffffffffffff l2: 0000000000000000 l3: 0000000000000000
[   46.163119] l4: 000000000000b1e4 l5: 003b9aca00000000 l6: 0000000000000000 l7: 0000000000000008
[   46.267291] i0: fff0000278b2bba0 i1: 00000000003d0900 i2: 0000000000000002 i3: 0000000000000000
[   46.371464] i4: 0000000000c32000 i5: 00000000000f0000 i6: fff000027899e671 i7: 00000000004a32e8
[   46.475643] I7: <update_process_times+0x8/0x60>
[   46.529803] Call Trace:
[   46.558973]  [00000000004a32e8] update_process_times+0x8/0x60
[   46.627736]  [00000000004b06a8] tick_sched_timer+0x28/0x80
[   46.693360]  [00000000004a3c6c] __hrtimer_run_queues.constprop.0+0x10c/0x1e0
[   46.777739]  [00000000004a4304] hrtimer_interrupt+0xc4/0x200
[   46.845453]  [00000000009bb5ac] timer_interrupt+0x4c/0x80
[   46.910040]  [00000000004209d4] tl0_irq14+0x14/0x20
[   46.968381]  [0000000000463c74] do_exit+0x94/0xa80
[   47.025674]  [000000000042b44c] die_if_kernel+0x1e4/0x24c
[   47.090259]  [00000000004338f4] kernel_mna_trap_fault+0xd4/0x120
[   47.162138]  [000000000042a5e4] mem_address_unaligned+0xc4/0xe0
[   47.232976]  [0000000000405dfc] do_mna+0x3c/0x48
[   47.288188]  [00000000004c5864] __cgroup_account_cputime+0x4/0x20
[   47.361110]  [00000000004899c4] dequeue_entity+0x4/0x220
[   47.424655]  [0000000000489c10] dequeue_task_fair+0x30/0x1c0
[   47.492367]  [00000000004852e4] deactivate_task+0x64/0xc0
[   47.556955]  [00000000009b787c] __schedule+0x27c/0x388
[   47.618417] Caller[00000000004a32e8]: update_process_times+0x8/0x60
[   47.693423] Caller[00000000004b06a8]: tick_sched_timer+0x28/0x80
[   47.765302] Caller[00000000004a3c6c]: __hrtimer_run_queues.constprop.0+0x10c/0x1e0
[   47.855936] Caller[00000000004a4304]: hrtimer_interrupt+0xc4/0x200
[   47.929897] Caller[00000000009bb5ac]: timer_interrupt+0x4c/0x80
[   48.000734] Caller[00000000004209d4]: tl0_irq14+0x14/0x20
[   48.065321] Caller[0000000000463c64]: do_exit+0x84/0xa80
[   48.128866] Caller[000000000042b44c]: die_if_kernel+0x1e4/0x24c
[   48.199705] Caller[00000000004338f4]: kernel_mna_trap_fault+0xd4/0x120
[   48.277836] Caller[000000000042a5e4]: mem_address_unaligned+0xc4/0xe0
[   48.354923] Caller[0000000000405dfc]: do_mna+0x3c/0x48
[   48.416385] Caller[0000000000489710]: update_curr+0xb0/0x120
[   48.484098] Caller[00000000004899c4]: dequeue_entity+0x4/0x220
[   48.553895] Caller[0000000000489c10]: dequeue_task_fair+0x30/0x1c0
[   48.627857] Caller[00000000004852e4]: deactivate_task+0x64/0xc0
[   48.698696] Caller[00000000009b787c]: __schedule+0x27c/0x388
[   48.766408] Caller[00000000009b7b4c]: schedule+0x2c/0xe0
[   48.829953] Caller[00000000009bae3c]: schedule_hrtimeout_range_clock+0xbc/0xe0
[   48.916420] Caller[00000000005bec70]: do_epoll_wait+0x390/0x3e0
[   48.987254] Caller[00000000005bf870]: sys_epoll_wait+0x10/0x20
[   49.057052] Caller[0000000000406154]: linux_sparc_syscall+0x34/0x44
[   49.132056] Caller[fff0000100216508]: 0xfff0000100216508
[   49.195600] Instruction DUMP:
[   49.195601]  80a26002
[   49.231018]  18400005
[   49.259145]  01000000
[   49.287271] <c4584000>
[   49.315398]  9400800a
[   49.343524]  d4704000
[   49.371652]  8213c000
[   49.399779]  106ffeed
[   49.427905]  01000000
[   49.456031]
[   49.501869] Kernel panic - not syncing: Aiee, killing interrupt handler!




Now Linux version 5.4.0 (mroos@u45) (gcc version 8.3.0 (Debian 8.3.0-26)):

[   41.142331] Unable to handle kernel NULL pointer dereference
[   41.210139] tsk->{mm,active_mm}->context = 0000000000000001
[   41.276886] tsk->{mm,active_mm}->pgd = fff0000278868000
[   41.339382]               \|/ ____ \|/
[   41.339382]               "@'/ .. \`@"
[   41.339382]               /_| \__/ |_\
[   41.339382]                  \__U_/
[   41.515446] systemd(1): Oops [#1]
[   41.555024] CPU: 0 PID: 1 Comm: systemd Not tainted 5.4.0 #9
[   41.622735] TSTATE: 0000009911001606 TPC: 0000000000887148 TNPC: 000000000088714c Y: 00000000    Not tainted
[   41.740465] TPC: <__netlink_lookup+0x28/0x1a0>
[   41.793577] g0: fff0000278b57c00 g1: fff0000278097930 g2: 0000000000b13000 g3: fff0000278097d18
[   41.897754] g4: fff0000278098000 g5: 000000003d3f2074 g6: fff0000278094000 g7: 0000000000000000
[   42.001927] o0: fff0000278007080 o1: 0000000000082cc0 o2: 000000000000007c o3: fff000027fda4140
[   42.106101] o4: 000007feffa1a210 o5: 000000000000000c sp: fff0000278097071 ret_pc: 0000000000561b10
[   42.214446] RPC: <__kmalloc_track_caller+0x170/0x260>
[   42.274858] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000000000 l3: 0000000000000000
[   42.379035] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: fff00001005b8000
[   42.483208] i0: fff000027807e280 i1: 0000000000000000 i2: 91d0200591d02005 i3: 00000000deadbefb
[   42.587381] i4: 0000000000000000 i5: fff0000278007080 i6: fff0000278097151 i7: 00000000008872e8
[   42.691556] I7: <netlink_lookup+0x28/0x60>
[   42.740511] Call Trace:
[   42.769682]  [00000000008872e8] netlink_lookup+0x28/0x60
[   42.833229]  [00000000008885a4] netlink_unicast+0x44/0x1e0
[   42.898858]  [00000000008888fc] netlink_sendmsg+0x1bc/0x360
[   42.965529]  [000000000081c634] sock_sendmsg+0x34/0x60
[   43.026992]  [000000000081e9d0] __sys_sendto+0xb0/0x100
[   43.089494]  [000000000081ea38] sys_sendto+0x18/0x40
[   43.148877]  [0000000000406154] linux_sparc_syscall+0x34/0x44
[   43.217626] Disabling lock debugging due to kernel taint
[   43.281174] Caller[00000000008872e8]: netlink_lookup+0x28/0x60
[   43.350970] Caller[00000000008885a4]: netlink_unicast+0x44/0x1e0
[   43.422850] Caller[00000000008888fc]: netlink_sendmsg+0x1bc/0x360
[   43.495774] Caller[000000000081c634]: sock_sendmsg+0x34/0x60
[   43.563483] Caller[000000000081e9d0]: __sys_sendto+0xb0/0x100
[   43.632238] Caller[000000000081ea38]: sys_sendto+0x18/0x40
[   43.697867] Caller[0000000000406154]: linux_sparc_syscall+0x34/0x44
[   43.772873] Caller[fff0000100447d18]: 0xfff0000100447d18
[   43.836415] Instruction DUMP:
[   43.836418]  f077a7ef
[   43.871834]  b616e2fb
[   43.899960]  c277a7f7
[   43.928087] <c6072008>
[   43.956213]  fa07a7e3
[   43.984340]  8600c01b
[   44.012466]  c407a7e7
[   44.040594]  ba00c01d
[   44.068721]  d2070000
[   44.096847]
[   44.161184] Unable to handle kernel NULL pointer dereference
[   44.228994] tsk->{mm,active_mm}->context = 0000000000000041
[   44.295723] tsk->{mm,active_mm}->pgd = fff0000278cfc000
[   44.358220]               \|/ ____ \|/
[   44.358220]               "@'/ .. \`@"
[   44.358220]               /_| \__/ |_\
[   44.358220]                  \__U_/
[   44.534284] systemd(1): Oops [#2]
[   44.573862] CPU: 0 PID: 1 Comm: systemd Tainted: G      D           5.4.0 #9
[   44.658243] TSTATE: 0000004411001600 TPC: 0000000000887ac8 TNPC: 0000000000887acc Y: 02ea6093    Tainted: G      D
[   44.792639] TPC: <netlink_release+0x48/0x5c0>
[   44.844710] g0: 0000000000000100 g1: 000000000000a280 g2: 00000000000000d0 g3: fff0000278074000
[   44.948886] g4: fff0000278098000 g5: 000000000040dd90 g6: fff0000278094000 g7: fff0000278042c80
[   45.053059] o0: 0000000000000000 o1: 000000000000000c o2: fff00002780051c8 o3: 0000000000000000
[   45.157233] o4: 0000000000b0c850 o5: 00000000000070c0 sp: fff0000278096891 ret_pc: 000000000055fb60
[   45.265577] RPC: <__slab_free+0x1e0/0x240>
[   45.314531] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000000001 l3: ffffffffffffffff
[   45.418709] l4: 0000000000000100 l5: 0000000000000350 l6: 000c0000059c7d20 l7: fff0000278ae63d8
[   45.522881] i0: fff00002761b8d20 i1: 0000000000b13000 i2: 0000000000887b20 i3: fff000027807e280
[   45.627055] i4: fff0000278a9b3d0 i5: fff0000278a9b000 i6: fff0000278096951 i7: 000000000081ad6c
[   45.731236] I7: <__sock_release+0x2c/0xc0>
[   45.780184] Call Trace:
[   45.809356]  [000000000081ad6c] __sock_release+0x2c/0xc0
[   45.872901]  [000000000081ae0c] sock_close+0xc/0x20
[   45.931244]  [0000000000575494] __fput+0x94/0x260
[   45.987494]  [000000000047c6f4] task_work_run+0x74/0xa0
[   46.050003]  [0000000000464208] do_exit+0x268/0xa60
[   46.108337]  [000000000042b518] die_if_kernel+0x1ec/0x258
[   46.172921]  [0000000000450b64] unhandled_fault+0x84/0xa0
[   46.237508]  [0000000000450abc] do_sparc64_fault+0x79c/0x7c0
[   46.305227]  [0000000000407440] sparc64_realfault_common+0x10/0x20
[   46.379185]  [0000000000887148] __netlink_lookup+0x28/0x1a0
[   46.445856]  [00000000008872e8] netlink_lookup+0x28/0x60
[   46.509401]  [00000000008885a4] netlink_unicast+0x44/0x1e0
[   46.575031]  [00000000008888fc] netlink_sendmsg+0x1bc/0x360
[   46.641700]  [000000000081c634] sock_sendmsg+0x34/0x60
[   46.703165]  [000000000081e9d0] __sys_sendto+0xb0/0x100
[   46.765667]  [000000000081ea38] sys_sendto+0x18/0x40
[   46.825046] Caller[000000000081ad6c]: __sock_release+0x2c/0xc0
[   46.894842] Caller[000000000081ae0c]: sock_close+0xc/0x20
[   46.959431] Caller[0000000000575494]: __fput+0x94/0x260
[   47.021934] Caller[000000000047c6f4]: task_work_run+0x74/0xa0
[   47.090688] Caller[0000000000464208]: do_exit+0x268/0xa60
[   47.155276] Caller[000000000042b518]: die_if_kernel+0x1ec/0x258
[   47.226113] Caller[0000000000450b64]: unhandled_fault+0x84/0xa0
[   47.296951] Caller[0000000000450abc]: do_sparc64_fault+0x79c/0x7c0
[   47.370915] Caller[0000000000407440]: sparc64_realfault_common+0x10/0x20
[   47.451129] Caller[0000000000561b10]: __kmalloc_track_caller+0x170/0x260
[   47.531346] Caller[00000000008872e8]: netlink_lookup+0x28/0x60
[   47.601138] Caller[00000000008885a4]: netlink_unicast+0x44/0x1e0
[   47.673018] Caller[00000000008888fc]: netlink_sendmsg+0x1bc/0x360
[   47.745938] Caller[000000000081c634]: sock_sendmsg+0x34/0x60
[   47.813650] Caller[000000000081e9d0]: __sys_sendto+0xb0/0x100
[   47.882405] Caller[000000000081ea38]: sys_sendto+0x18/0x40
[   47.948035] Caller[0000000000406154]: linux_sparc_syscall+0x34/0x44
[   48.023040] Caller[fff0000100447d18]: 0xfff0000100447d18
[   48.086583] Instruction DUMP:
[   48.086584]  b416a320
[   48.122001]  d016e016
[   48.150128]  9210200c
[   48.178254] <d4042008>
[   48.206381]  7ffffd65
[   48.234508]  90270008
[   48.262635]  d2040000
[   48.290761]  c2042004
[   48.318888]  92027fff
[   48.347015]
[   48.392850] Fixing recursive fault but reboot is needed!
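
One observation on the value 91d0200591d02005 that shows up as a fault address or register content in the traces above: it consists of two identical 32-bit halves. The sketch below checks that and, purely as an assumption of this sketch, reads one half as a SPARC V9 Tcc (trap on condition codes) instruction word. The field layout follows the V9 encoding; the helper name `decode_tcc` is made up here, and nothing in this thread establishes where the value actually comes from.

```python
def decode_tcc(word: int):
    """Decode a 32-bit word as a SPARC V9 Tcc instruction, else return None."""
    op = (word >> 30) & 0x3        # bits 31:30, must be 0b10
    op3 = (word >> 19) & 0x3F      # bits 24:19, 0b111010 selects Tcc
    if op != 0b10 or op3 != 0b111010:
        return None
    cond = (word >> 25) & 0xF      # bits 28:25, 0b1000 means "always"
    rs1 = (word >> 14) & 0x1F      # bits 18:14, source register number
    imm_form = (word >> 13) & 0x1  # bit 13, 1 = immediate trap number
    trap = (word & 0x7F) if imm_form else None  # bits 6:0
    return {"cond": cond, "rs1": rs1, "imm": bool(imm_form), "trap": trap}


value = 0x91D0200591D02005
hi, lo = value >> 32, value & 0xFFFFFFFF
decoded = decode_tcc(lo)
print(hex(hi), hex(lo), decoded)
```

Under that reading each half is a "trap always" with immediate trap number 5 and %g0 as the source register. If the value really is instruction data, that would hint at pointers being loaded from memory that holds code rather than data, but that remains a guess.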


--
Meelis Roos <mroos@linux.ee>

