Re: [cfarm-admins] Trouble "Exception in kernel mode" on gcc203 test machine
Pierre,
I tried to reproduce this kernel bug (Oops) on the latest vanilla
kernel with my test powerpc64 LPAR and was not able to, so let me
manually update the kernel on gcc203 to the latest vanilla stable
(something like v6.12-rc4) until a more recent Debian sid/unstable
kernel package is available.
I'll send an update once the kernel is installed.
Thanks.
On Tue, Oct 22, 2024 at 1:43 PM Pierre Muller via cfarm-admins
<cfarm-admins@lists.tetaneutral.net> wrote:
>
> Hi,
>
> I tried to reinstall a git checkout on that machine, but
> got blocked with a 'git' executable that did not even react to a 'kill -SIGKILL $PID'
>
> muller@cfarm203:~$ ps xf
> PID TTY STAT TIME COMMAND
> 396017 ? S 0:00 sshd-session: muller@pts/0
> 396020 pts/0 Ss 0:00 \_ -bash
> 396640 pts/0 S 0:00 \_ bash bin/fpc-source-git.sh
> 397177 pts/0 D 0:00 | \_ [git]
> 398725 pts/0 R+ 0:00 \_ ps xf
> 395989 ? Ss 0:00 /usr/lib/systemd/systemd --user
> 395994 ? S 0:00 \_ (sd-pam)
>
> Using dmesg, I found a relevant part below,
> is there a real hardware problem on that machine?
>
> Can I do something to help to find the origin of the problem?
>
> Pierre Muller
>
>
> [237844.541314] ------------[ cut here ]------------
> [237844.541338] kernel BUG at arch/powerpc/include/asm/book3s/64/hash-4k.h:141!
> [237844.541345] Oops: Exception in kernel mode, sig: 5 [#1]
> [237844.541350] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [237844.541356] Modules linked in: tun binfmt_misc xfs xts ctr ibmveth pseries_rng vmx_crypto sg gf128mul nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c drm configfs drm_panel_orientation_quirks nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vsock ip_tables
> x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic dm_mod sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp crc32c_vpmsum
> [237844.541418] CPU: 9 PID: 397177 Comm: git Not tainted 6.10.12-powerpc64 #1 Debian 6.10.12-1
> [237844.541425] Hardware name: IBM,8284-22A POWER8 (architected) 0x4b0201 0xf000004 of:IBM,FW860.42 (SV860_138) hv:phyp pSeries
> [237844.541431] NIP: c000000000088cc0 LR: c0000000004c0bf4 CTR: c000000000451bf4
> [237844.541435] REGS: c00000029f1677c0 TRAP: 0700 Not tainted (6.10.12-powerpc64 Debian 6.10.12-1)
> [237844.541441] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 24222828 XER: 20000000
> [237844.541452] CFAR: c000000000088cb0 IRQMASK: 0
> GPR00: c0000000004c0bf4 c00000029f167a60 c000000001708100 80000001ed400104
> GPR04: 8000000000000104 0000000000000000 0000000000000013 fffffffffffe0000
> GPR08: 0000000000000000 00000001ed400000 00000000090f0180 0000000084222828
> GPR12: c000000000451bf4 c00000001dc09700 0000000000000000 c0003f0007b50000
> GPR16: 000000012dfc2a40 000000012dec8778 c00000029f167c08 0000000000000438
> GPR20: c000000214f04fb8 0000000000000000 60000000000000e0 0000000000000001
> GPR24: 0000000000001254 0000000000001254 c000000001e5bb18 c0003f00090f01a8
> GPR28: c00000002a47a210 00003fff7ca00000 c0003f0007b50000 c00000029f167c08
> [237844.541506] NIP [c000000000088cc0] mk_pmd+0x3c/0x40
> [237844.541516] LR [c0000000004c0bf4] do_set_pmd+0x17c/0x3d8
> [237844.541523] Call Trace:
> [237844.541525] [c00000029f167a60] [c0000000004515fc] next_uptodate_folio+0xe4/0x404 (unreliable)
> [237844.541534] [c00000029f167ac0] [c000000000451d08] filemap_map_pages+0x114/0x814
> [237844.541542] [c00000029f167be0] [c0000000004c7280] __handle_mm_fault+0x10e4/0x1f70
> [237844.541548] [c00000029f167cf0] [c0000000004c82f8] handle_mm_fault+0x1ec/0x374
> [237844.541555] [c00000029f167d40] [c000000000082ee0] ___do_page_fault+0x2ec/0xb50
> [237844.541561] [c00000029f167df0] [c000000000083978] hash__do_page_fault+0x30/0x74
> [237844.541567] [c00000029f167e20] [c00000000008b97c] do_hash_fault+0x1c4/0x2fc
> [237844.541574] [c00000029f167e50] [c000000000008918] data_access_common_virt+0x198/0x1f0
> [237844.541581] --- interrupt: 300 at 0x12dc58da0
> [237844.541588] NIP: 000000012dc58da0 LR: 000000012dc5b0b4 CTR: 00003fff80cd5a80
> [237844.541592] REGS: c00000029f167e80 TRAP: 0300 Not tainted (6.10.12-powerpc64 Debian 6.10.12-1)
> [237844.541597] MSR: 800000000280f032 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI> CR: 44222848 XER: 20000000
> [237844.541610] CFAR: 000000012dc5b0b0 DAR: 00003fff7ca00000 DSISR: 40000000 IRQMASK: 0
> GPR00: 000000012dc5d3c4 00003fffef920140 000000012dfa7900 00003fff7ca00000
> GPR04: 000000000055fe00 00003fffef9201b8 000000000055fe00 00003fffef9203b8
> GPR08: 0000000000000005 0000000000000000 0000000000000000 ffffffffffffffff
> GPR12: 0000000024222848 00003fff811187e0 0000000000000000 0000000000000087
> GPR16: 000000012dfc2a40 000000012dec8778 0000000000000087 0000000000000438
> GPR20: 0000000000000000 0000000000000087 0000000000000000 0000000000000000
> GPR24: 000000012dfd3e00 0000010036c5407c 00003fffef9203b8 00003fff7ca00000
> GPR28: 0000000000000000 00003fffef9203b8 0000000000000005 000000000055fe00
> [237844.541665] NIP [000000012dc58da0] 0x12dc58da0
> [237844.541668] LR [000000012dc5b0b4] 0x12dc5b0b4
> [237844.541672] --- interrupt: 300
> [237844.541674] Code: 60000000 3d220075 39293980 e9290000 7d291850 79298a24 7929aac2 7d232378 48000010 3920ffff 7923f04e 4e800020 <0fe00000> 7c0802a6 60000000 3d2051ff
> [237844.541694] ---[ end trace 0000000000000000 ]---
>
> [237844.543747] note: git[397177] exited with irqs disabled
>
> _______________________________________________
> cfarm-admins mailing list
> cfarm-admins@lists.tetaneutral.net
> https://lists.tetaneutral.net/listinfo/cfarm-admins