
Bug#883413: src:linux: Still reproducible with linux-image-4.15.0-rc8-amd64



Package: src:linux
Followup-For: Bug #883413

Hi Ben,

Unfortunately I can still reproduce this problem on 4.15-rc8 from
experimental.

The cmdline for this boot was:

BOOT_IMAGE=/boot/vmlinuz-4.15.0-rc8-amd64
root=/dev/mapper/vg_tarquin-rootfs ro intel_iommu=on vsyscall=emulate
scsi_mod.use_blk_mq=Y dm_mod.use_blk_mq=Y intel_pstate=passive
i915.disable_display=Y i915.enable_gvt=Y apparmor=0
systemd.unified_cgroup_hierarchy=1 console=ttyS1,115200n8 console=tty0

This triggers with DefaultMemoryAccounting=yes enabled in
/etc/systemd/system.conf, and NUT seems to regularly be involved in the
crash on my system. Sadly the systemd unit is very simple indeed, and
because my UPS is network-connected I'm not even doing dodgy things like
USB from within NUT.

Quite how the kernel thinks that nut-server.service is using 16 ZiB of
memory is beyond me; presumably this is a slightly negative 64-bit int
being cast to unsigned. The following also feels like a smoking gun:

[ 2982.158622] percpu ref (css_release) <= 0 (-197) after switching to atomic
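
As a quick sanity check of the signed/unsigned guess, here is a minimal
Python sketch that reinterprets the "usage" figure from the OOM report in
the log as a two's-complement 64-bit value. The 4 KiB page size, and the
idea that the reported limit is simply the kernel's "no limit" maximum
(LONG_MAX / PAGE_SIZE pages), are my assumptions rather than something the
log states; the helper name as_signed64 is made up for illustration.

```python
# Values copied from the "memory: usage ...kB, limit ...kB" line of the OOM report.
USAGE_KB = 18446744073709550932
LIMIT_KB = 9007199254740988

# Assumption: 4 KiB pages, as is usual on amd64.
PAGE_SIZE = 4096
PAGE_KB = PAGE_SIZE // 1024


def as_signed64(value: int) -> int:
    """Interpret an unsigned 64-bit integer as two's-complement signed."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


usage_signed_kb = as_signed64(USAGE_KB)
usage_pages = usage_signed_kb // PAGE_KB

# Assumption: an unlimited cgroup reports (LONG_MAX // PAGE_SIZE) pages, shown in kB.
unlimited_kb = ((2**63 - 1) // PAGE_SIZE) * PAGE_KB

print(f"usage as signed 64-bit: {usage_signed_kb} kB ({usage_pages} pages)")
print(f"limit equals assumed 'no limit' value: {LIMIT_KB == unlimited_kb}")
```

Under those assumptions the usage works out to -684 kB, i.e. -171 pages,
and the limit matches the assumed "no limit" value. That would be
consistent with the counter having been uncharged below zero rather than
with any real memory pressure in the cgroup.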

The kernel log is:

[ 2611.549862] WARNING: CPU: 0 PID: 20830 at /build/linux-b8fmzT/linux-4.15~rc8/mm/page_counter.c:27 page_counter_cancel+0x17/0x20
[ 2611.561360] Modules linked in: binfmt_misc fuse vhost_net vhost tap tun devlink bridge 8021q garp mrp stp llc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm ast irqbypass crct10dif_pclmul crc32_pclmul ttm drm_kms_helper ghash_clmulni_intel intel_cstate sg efi_pstore mei_me intel_uncore iTCO_wdt evdev iTCO_vendor_support intel_rapl_perf efivars pcspkr drm mei cdc_acm intel_pch_thermal shpchp joydev ie31200_edac video acpi_power_meter button acpi_pad nfsd nfs_acl lockd grace auth_rpcgss ipmi_si ipmi_devintf sunrpc ipmi_msghandler efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb dm_mod ses enclosure scsi_transport_sas sd_mod hid_generic usbhid hid xhci_pci xhci_hcd ahci crc32c_intel ixgbe libahci igb i2c_algo_bit
[ 2611.633015]  aesni_intel aes_x86_64 dca ptp usbcore megaraid_sas crypto_simd libata cryptd glue_helper i2c_i801 pps_core usb_common mdio scsi_mod fan thermal
[ 2611.647163] CPU: 0 PID: 20830 Comm: check_ups Not tainted 4.15.0-rc8-amd64 #1 Debian 4.15~rc8-1~exp1
[ 2611.656338] Hardware name: Supermicro Super Server/X11SSH-F, BIOS 2.0c 10/06/2017
[ 2611.663857] RIP: 0010:page_counter_cancel+0x17/0x20
[ 2611.668765] RSP: 0018:ffffa74c8433fc70 EFLAGS: 00010097
[ 2611.674017] RAX: 0000000000000000 RBX: ffff8bc863c0b4c0 RCX: 0000000000000000
[ 2611.681186] RDX: 00003b83ba4109d0 RSI: 0000000000000001 RDI: ffff8bc863c0b4c0
[ 2611.688370] RBP: 0000000000000001 R08: ffff8bc8c50da8a0 R09: 0000000000000001
[ 2611.695556] R10: ffffa74c8433fd48 R11: 0000000001000000 R12: ffff8bc863c0b400
[ 2611.702740] R13: ffff8bc89c092800 R14: ffff8bc8a1270e10 R15: ffff8bc76955ec30
[ 2611.709924] FS:  00007f0669316fc0(0000) GS:ffff8bc8c5000000(0000) knlGS:0000000000000000
[ 2611.718063] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2611.723853] CR2: 00007f0668550930 CR3: 000000075ce30005 CR4: 00000000003626f0
[ 2611.731036] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2611.738218] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2611.745397] Call Trace:
[ 2611.747881]  page_counter_uncharge+0x1d/0x30
[ 2611.752195]  drain_stock.isra.37+0x32/0xa0
[ 2611.756327]  refill_stock+0x41/0x70
[ 2611.759855]  __sk_mem_reduce_allocated+0x83/0xd0
[ 2611.764508]  tcp_write_queue_purge+0x1a7/0x1d0
[ 2611.768990]  tcp_v4_destroy_sock+0x3f/0x180
[ 2611.773208]  tcp_v6_destroy_sock+0xe/0x20
[ 2611.777257]  inet_csk_destroy_sock+0x47/0x100
[ 2611.781650]  tcp_rcv_state_process+0x980/0xe20
[ 2611.786130]  ? tcp_v6_do_rcv+0x1a7/0x3e0
[ 2611.790090]  tcp_v6_do_rcv+0x1a7/0x3e0
[ 2611.793880]  __release_sock+0x76/0xc0
[ 2611.797581]  release_sock+0x2b/0x90
[ 2611.801107]  tcp_close+0x165/0x3f0
[ 2611.804547]  inet_release+0x36/0x60
[ 2611.808075]  sock_release+0x1a/0x70
[ 2611.811601]  sock_close+0xe/0x20
[ 2611.814861]  __fput+0xd5/0x210
[ 2611.819465]  task_work_run+0x84/0xa0
[ 2611.824577]  exit_to_usermode_loop+0xb9/0xc0
[ 2611.830383]  syscall_return_slowpath+0x88/0x90
[ 2611.836364]  system_call_fast_compare_end+0x73/0x75
[ 2611.842741] RIP: 0033:0x7f0668ac8d84
[ 2611.847774] RSP: 002b:00007ffe23f9c7b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[ 2611.856787] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f0668ac8d84
[ 2611.865332] RDX: 0000000000001fff RSI: 00007ffe23f9c800 RDI: 0000000000000000
[ 2611.873833] RBP: 0000000000000006 R08: 0000000000000000 R09: 0000000000000000
[ 2611.882405] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe23f9e800
[ 2611.890813] R13: 00007ffe23f9c800 R14: 0000000000002000 R15: 0000000000000000
[ 2611.899185] Code: e8 39 b5 eb ff e9 49 ff ff ff 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 89 f0 48 f7 d8 f0 48 0f c1 07 48 39 f0 78 02 f3 c3 <0f> ff c3 66 0f 1f 44 00 00 0f 1f 44 00 00 eb 19 48 89 f0 f0 48 
[ 2611.920537] ---[ end trace 306225c4342d4340 ]---
[ 2981.898837] upsd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
[ 2981.909192] upsd cpuset=/ mems_allowed=0
[ 2981.913519] CPU: 0 PID: 3295 Comm: upsd Tainted: G        W        4.15.0-rc8-amd64 #1 Debian 4.15~rc8-1~exp1
[ 2981.923783] Hardware name: Supermicro Super Server/X11SSH-F, BIOS 2.0c 10/06/2017
[ 2981.931647] Call Trace:
[ 2981.934647]  dump_stack+0x5c/0x85
[ 2981.938305]  dump_header+0x6b/0x289
[ 2981.942379]  oom_kill_process+0x228/0x430
[ 2981.947113]  out_of_memory+0x2ab/0x4b0
[ 2981.951949]  mem_cgroup_out_of_memory+0x49/0x80
[ 2981.957643]  mem_cgroup_oom_synchronize+0x2ed/0x320
[ 2981.963664]  ? get_mem_cgroup_from_mm+0x90/0x90
[ 2981.969334]  pagefault_out_of_memory+0x32/0x77
[ 2981.974906]  __do_page_fault+0x4a7/0x4e0
[ 2981.979879]  ? page_fault+0x36/0x60
[ 2981.984384]  page_fault+0x4c/0x60
[ 2981.988804] RIP: 0033:0x7f084e1fbca0
[ 2981.993483] RSP: 002b:00007ffd0a3bf0c8 EFLAGS: 00010202
[ 2981.993517] Task in /system.slice/nut-server.service killed as a result of limit of /system.slice/nut-server.service
[ 2982.011567] memory: usage 18446744073709550932kB, limit 9007199254740988kB, failcnt 33
[ 2982.020705] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 2982.028532] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[ 2982.035766] Memory cgroup stats for /system.slice/nut-server.service: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[ 2982.059152] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 2982.069091] [ 3295]   114  3295    14245        0   131072      106             0 upsd
[ 2982.078534] Memory cgroup out of memory: Kill process 3295 (upsd) score 0 or sacrifice child
[ 2982.088491] Killed process 3295 (upsd) total-vm:56980kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 2982.099113] oom_reaper: reaped process 3295 (upsd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[ 2982.152146] ------------[ cut here ]------------
[ 2982.158622] percpu ref (css_release) <= 0 (-197) after switching to atomic
[ 2982.158641] WARNING: CPU: 0 PID: 7 at /build/linux-b8fmzT/linux-4.15~rc8/lib/percpu-refcount.c:155 percpu_ref_switch_to_atomic_rcu+0xf6/0x100
[ 2982.183896] Modules linked in: binfmt_misc fuse vhost_net vhost tap tun devlink bridge 8021q garp mrp stp llc nls_ascii nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm ast irqbypass crct10dif_pclmul crc32_pclmul ttm drm_kms_helper ghash_clmulni_intel intel_cstate sg efi_pstore mei_me intel_uncore iTCO_wdt evdev iTCO_vendor_support intel_rapl_perf efivars pcspkr drm mei cdc_acm intel_pch_thermal shpchp joydev ie31200_edac video acpi_power_meter button acpi_pad nfsd nfs_acl lockd grace auth_rpcgss ipmi_si ipmi_devintf sunrpc ipmi_msghandler efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb dm_mod ses enclosure scsi_transport_sas sd_mod hid_generic usbhid hid xhci_pci xhci_hcd ahci crc32c_intel ixgbe libahci igb i2c_algo_bit
[ 2982.268964]  aesni_intel aes_x86_64 dca ptp usbcore megaraid_sas crypto_simd libata cryptd glue_helper i2c_i801 pps_core usb_common mdio scsi_mod fan thermal
[ 2982.287046] CPU: 0 PID: 7 Comm: ksoftirqd/0 Tainted: G        W        4.15.0-rc8-amd64 #1 Debian 4.15~rc8-1~exp1
[ 2982.299367] Hardware name: Supermicro Super Server/X11SSH-F, BIOS 2.0c 10/06/2017
[ 2982.308886] RIP: 0010:percpu_ref_switch_to_atomic_rcu+0xf6/0x100
[ 2982.316899] RSP: 0018:ffffa74c831a7df8 EFLAGS: 00010282
[ 2982.324118] RAX: 0000000000000000 RBX: 7fffffffffffff3e RCX: ffffffffa064d748
[ 2982.333350] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 0000000000000283
[ 2982.342431] RBP: ffff8bc863c0b438 R08: 0000000000000462 R09: ffffffffa0b98160
[ 2982.351496] R10: 0000000000000000 R11: 0000000000000000 R12: 00003b83b9e11040
[ 2982.360536] R13: ffffffffa071a5e0 R14: 7fffffffffffffff R15: 0000000000000202
[ 2982.369682] FS:  0000000000000000(0000) GS:ffff8bc8c5000000(0000) knlGS:0000000000000000
[ 2982.379801] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2982.387446] CR2: 00005592b84ecd50 CR3: 000000072c20a005 CR4: 00000000003626f0
[ 2982.396512] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2982.405282] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2982.414337] Call Trace:
[ 2982.418723]  rcu_process_callbacks+0x1af/0x4c0
[ 2982.425118]  ? sort_range+0x20/0x20
[ 2982.430531]  __do_softirq+0xd9/0x2a9
[ 2982.436043]  ? sort_range+0x20/0x20
[ 2982.441441]  run_ksoftirqd+0x25/0x40
[ 2982.446928]  smpboot_thread_fn+0xdf/0x150
[ 2982.452865]  kthread+0x111/0x130
[ 2982.458012]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 2982.464998]  ret_from_fork+0x32/0x40
[ 2982.470463] Code: 89 df ff 55 e8 eb c6 80 3d 86 15 d7 00 00 75 8a 48 8b 55 d8 48 8b 75 e8 48 c7 c7 28 18 45 a0 c6 05 6e 15 d7 00 01 e8 3a 62 cf ff <0f> ff e9 68 ff ff ff 0f 1f 00 41 54 55 49 89 f4 53 48 89 fb 48 
[ 2982.493194] ---[ end trace 306225c4342d4341 ]---

Best regards,
Chris

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'testing-debug'), (500, 'testing'), (100, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.15.0-rc8-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

