Bug#1053122: linux-image-6.5.0-1-amd64: using smp_processor_id() in preemptible



Control: retitle -1 linux-image-6.5.0-1-amd64: Kernel page fault in
process exit due to bit flip
Control: tag -1 moreinfo

On Wed, 2023-09-27 at 20:45 +0200, Gabriel Francisco wrote:
> Package: src:linux
> Version: 6.5.3-1
> Severity: important
> Tags: upstream
> X-Debbugs-Cc: frc.gabriel@gmail.com
> 
> Dear Maintainer,
> 
> First of all thanks for your hard work!
> 
> I noticed my computer started freezing for few seconds when entering/exiting
> full screen videos in youtube using firefox and while trying to check if the
> issue also afected chromium I saw the following message in dmesg:
> 
> [12569.564300] BUG: unable to handle page fault for address: ffff991989e936b8
> [12569.564304] #PF: supervisor write access in kernel mode
> [12569.564306] #PF: error_code(0x0002) - not-present page

The first BUG message should be more meaningful than what comes after.
This shows the kernel tried to write to an address that has no page
mapped.
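
As a rough illustration of how the error_code(0x0002) line maps onto the
two "#PF:" lines, here is a minimal sketch that decodes the low three bits
of an x86 page-fault error code. The helper name decode_pf_error is
hypothetical, and only the present/write/user bits are handled.

```python
# Low three bits of the x86 page-fault error code.
PF_PRESENT = 1 << 0  # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1    # 0: read access, 1: write access
PF_USER = 1 << 2     # 0: supervisor (kernel) mode, 1: user mode


def decode_pf_error(code):
    """Hypothetical helper: describe the low three bits of a #PF error code."""
    return {
        "cause": "protection violation" if code & PF_PRESENT else "not-present page",
        "access": "write" if code & PF_WRITE else "read",
        "mode": "user" if code & PF_USER else "supervisor",
    }


# Error code taken from the oops quoted above.
decoded = decode_pf_error(0x0002)
print(decoded)
```

For 0x0002 only the write bit is set, which agrees with "supervisor write
access" and "not-present page" in the log.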

> [12569.564308] PGD 0 P4D 0 
> [12569.564311] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [12569.564314] CPU: 10 PID: 328649 Comm: Chroot Helper Not tainted 6.5.0-1-amd64 #1  Debian 6.5.3-1
> [12569.564317] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING WIFI II, BIOS 3205 08/14/2023
> [12569.564318] RIP: 0010:down_write+0x23/0x70
> [12569.564324] Code: 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 53 48 89 fb e8 2e bc ff ff bf 01 00 00 00 e8 74 3a 53 ff 31 c0 ba 01 00 00 00 <f0> 48 0f b1 13 75 33 65 48 8b 04 25 80 29 03 00 48 89 43 08 bf 01
> [12569.564326] RSP: 0018:ffffa189d736fc70 EFLAGS: 00010246
> [12569.564328] RAX: 0000000000000000 RBX: ffff991989e936b8 RCX: ffff891797aaef00
> [12569.564330] RDX: 0000000000000001 RSI: ffff891989e645c0 RDI: ffffffff8e7c95dc
> [12569.564331] RBP: ffffffffffffffff R08: 0000000000000060 R09: 0000000080400014
> [12569.564333] R10: ffff8918cbfeb7f8 R11: 0000000000000006 R12: 00007f7e5fd00000
> [12569.564334] R13: 0000000000000001 R14: ffff891989e645c0 R15: ffff891989e64958

The CPU registers contain several addresses starting with ffff89, except
for RBX, which starts with ffff99 (and is the faulting address).  The two
prefixes differ in exactly one bit (0x99 vs 0x89), so that looks like a
single bit got flipped.
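
The comparison can be checked mechanically. The following is only a sketch:
it XORs the top 24 bits of RBX against the other pointer-like registers
from the oops and counts the differing bits. Treating ffff89... as the
intended prefix is an assumption based on the neighbouring registers, and
the helper names are made up for illustration.

```python
# Register values copied from the oops above.
FAULT_ADDR = 0xFFFF991989E936B8  # RBX, also the faulting address
NEIGHBOURS = {
    "RCX": 0xFFFF891797AAEF00,
    "RSI": 0xFFFF891989E645C0,
    "R10": 0xFFFF8918CBFEB7F8,
    "R14": 0xFFFF891989E645C0,
    "R15": 0xFFFF891989E64958,
}


def top_bits(addr, keep=24, width=64):
    """Return the `keep` most significant bits of a `width`-bit address."""
    return addr >> (width - keep)


def differing_bits(a, b):
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Compare the ffff99 prefix of RBX with the prefix of every other register.
prefix_diffs = {
    name: differing_bits(top_bits(FAULT_ADDR), top_bits(value))
    for name, value in NEIGHBOURS.items()
}

# Locate the single differing bit within the full 64-bit address.
prefix_xor = top_bits(FAULT_ADDR) ^ top_bits(NEIGHBOURS["RSI"])
flipped_bit = (prefix_xor << 40).bit_length() - 1

# Assumption: the intended address is RBX with that one bit cleared.
candidate = FAULT_ADDR ^ (1 << flipped_bit)

print(prefix_diffs)
print(flipped_bit, hex(candidate))
```

Every prefix differs from RBX's in exactly one bit, bit 44 of the address,
which is consistent with a single bit flip rather than a wild pointer.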

This could be due to a kernel bug, but is more likely a hardware
problem.  Please test the RAM with memtest86+.  Also if you've enabled
any overclocking options, turn those off.

[...]
> After that the computer can't shutdown and systemd keeps waiting on process PID 328649 (Chroot Helper).

This (and the other BUG messages) happened because that process crashed
in kernel mode and couldn't properly exit.

Ben.

-- 
Ben Hutchings
Beware of bugs in the above code;
I have only proved it correct, not tried it. - Donald Knuth
