
Bug#794680: marked as done (drbd kernel module incompatible with drbd-utils -> kernel panics)



Your message dated Fri, 30 Apr 2021 21:35:55 +0200
with message-id <E1lcYvJ-001MIq-Ox@hullmann.westfalen.local>
and subject line Closing this bug
has caused the Debian Bug report #794680,
regarding drbd kernel module incompatible with drbd-utils -> kernel panics
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
794680: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=794680
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: linux
Version: 3.16.7-ckt11-1+deb8u2
Severity: critical

Hi,

TL;DR - please provide a kernel with a newer drbd module (e.g. 8.4.6),
as the current version is incompatible with stable's drbd-utils and
will result in kernel panics under load.

I have the following kernel:

Linux version 3.16.0-4-amd64 (debian-kernel@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 SMP Debian 3.16.7-ckt11-1+deb8u2 (2015-07-17)

This ships with version 8.4.3 of the drbd kernel module (which
advertises '(api:1/proto:86-101)').
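
For readers who want to pull the same fields out of a banner programmatically, here is a minimal sketch. It assumes the banner has the shape "version: 8.4.3 (api:1/proto:86-101)"; only the parenthesised part is quoted in this report, so the surrounding text is an assumption, and parse_banner is a hypothetical helper, not part of drbd-utils.

```python
import re

# Assumed banner shape: "<major>.<minor>.<patch>[suffix] (api:N/proto:LO-HI)".
# Only the "(api:1/proto:86-101)" part is quoted in the report itself.
BANNER_RE = re.compile(
    r"(\d+)\.(\d+)\.(\d+)\S*\s+\(api:(\d+)/proto:(\d+)-(\d+)\)"
)


def parse_banner(text):
    """Return module version, API level and protocol range, or None."""
    match = BANNER_RE.search(text)
    if match is None:
        return None
    major, minor, patch, api, lo, hi = (int(g) for g in match.groups())
    return {"version": (major, minor, patch), "api": api, "proto": (lo, hi)}


if __name__ == "__main__":
    print(parse_banner("version: 8.4.3 (api:1/proto:86-101)"))
```

Returning the version as a tuple of integers keeps later comparisons (for example against 8.4.4) simple and avoids string ordering pitfalls.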

Using that version of the module with stable's drbd-utils (8.9.2rc1)
results in kernel panics under heavy I/O load, fairly reliably. A
kernel log of a typical crash is attached to this report. I initially
reported this issue to Xen (since it happened in a dom0), and they
referred me to this blog post:

http://blog.chinewalking.com/drbd-kernel-oops-w-trim/

Notably, you will observe that drbd-module >=8.4.4 supports "trim", whereas
8.4.3 does not. Yet the userland tools arrange to use trim anyway:

Aug  4 14:28:24 ophon kernel: [2856757.049680] drbd mws-02474: Agreed to support TRIM on protocol level 

Following that suggestion, I installed the kernel module 8.4.6 from
upstream, and the kernel has stopped panicking.
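
The condition described above (a module older than 8.4.4 whose log nonetheless shows TRIM being agreed) can be sketched as a small check. This is only an illustration: the 8.4.4 threshold is taken from the blog post as summarised in this report, and trim_mismatch is a hypothetical helper, not an existing tool.

```python
# Marker text as it appears in the kernel log quoted in this report.
TRIM_MARKER = "Agreed to support TRIM on protocol level"

# Assumption from the report: modules >= 8.4.4 support trim, 8.4.3 does not.
FIRST_TRIM_CAPABLE = (8, 4, 4)


def trim_mismatch(module_version, log_lines):
    """True if TRIM was negotiated although the module predates 8.4.4.

    module_version is a tuple such as (8, 4, 3); log_lines is any
    iterable of kernel log lines.
    """
    negotiated = any(TRIM_MARKER in line for line in log_lines)
    return negotiated and module_version < FIRST_TRIM_CAPABLE


if __name__ == "__main__":
    sample = [
        "Aug  4 14:28:24 ophon kernel: [2856757.049680] "
        "drbd mws-02474: Agreed to support TRIM on protocol level"
    ]
    print(trim_mismatch((8, 4, 3), sample))
```

A check like this only flags the suspicious combination; it does not prove that a given panic was caused by it.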

You might argue that drbd upstream's api/proto discrimination is
inadequate (and perhaps a bug report should go there), but nonetheless
kernel panics are a serious flaw in the kernel (or the offending
module) IMAO.

Regards,

Matthew

-- System Information:
Debian Release: 8.1
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-backports'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
Aug  3 16:03:13 opus kernel: [ 1250.026811] drbd mws-priv-1: Starting worker thread (from drbdsetup-84 [12987])
Aug  3 16:03:13 opus kernel: [ 1250.027313] block drbd4: disk( Diskless -> Attaching ) 
Aug  3 16:03:13 opus kernel: [ 1250.027409] drbd mws-priv-1: Method to ensure write ordering: flush
Aug  3 16:03:13 opus kernel: [ 1250.027413] block drbd4: max BIO size = 4096
Aug  3 16:03:13 opus kernel: [ 1250.027418] block drbd4: drbd_bm_resize called with capacity == 41941688
Aug  3 16:03:13 opus kernel: [ 1250.027558] block drbd4: resync bitmap: bits=5242711 words=81918 pages=160
Aug  3 16:03:13 opus kernel: [ 1250.027561] block drbd4: size = 20 GB (20970844 KB)
Aug  3 16:03:13 opus kernel: [ 1250.032268] block drbd4: Writing the whole bitmap, size changed
Aug  3 16:03:13 opus kernel: [ 1250.047827] block drbd4: bitmap WRITE of 160 pages took 4 jiffies
Aug  3 16:03:13 opus kernel: [ 1250.061634] block drbd4: 20 GB (5242711 bits) marked out-of-sync by on disk bit-map.
Aug  3 16:03:13 opus kernel: [ 1250.180186] block drbd4: bitmap READ of 160 pages took 2 jiffies
Aug  3 16:03:13 opus kernel: [ 1250.180291] block drbd4: recounting of set bits took additional 0 jiffies
Aug  3 16:03:13 opus kernel: [ 1250.180293] block drbd4: 20 GB (5242711 bits) marked out-of-sync by on disk bit-map.
Aug  3 16:03:13 opus kernel: [ 1250.180304] block drbd4: Suspended AL updates
Aug  3 16:03:13 opus kernel: [ 1250.180307] block drbd4: disk( Attaching -> Inconsistent ) 
Aug  3 16:03:13 opus kernel: [ 1250.180310] block drbd4: attached to UUIDs 0000000000000004:0000000000000000:0000000000000000:0000000000000000
Aug  3 16:03:13 opus kernel: [ 1250.191161] drbd mws-priv-1: conn( StandAlone -> Unconnected ) 
Aug  3 16:03:13 opus kernel: [ 1250.191183] drbd mws-priv-1: Starting receiver thread (from drbd_w_mws-priv [12989])
Aug  3 16:03:13 opus kernel: [ 1250.191345] drbd mws-priv-1: receiver (re)started
Aug  3 16:03:13 opus kernel: [ 1250.191360] drbd mws-priv-1: conn( Unconnected -> WFConnection ) 
Aug  3 16:03:13 opus kernel: [ 1250.689576] drbd mws-priv-1: Handshake successful: Agreed network protocol version 101
Aug  3 16:03:13 opus kernel: [ 1250.689580] drbd mws-priv-1: Agreed to support TRIM on protocol level
Aug  3 16:03:13 opus kernel: [ 1250.689616] drbd mws-priv-1: conn( WFConnection -> WFReportParams ) 
Aug  3 16:03:13 opus kernel: [ 1250.689631] drbd mws-priv-1: Starting asender thread (from drbd_r_mws-priv [12992])
Aug  3 16:03:13 opus kernel: [ 1250.737084] block drbd4: max BIO size = 1048576
Aug  3 16:03:13 opus kernel: [ 1250.737091] block drbd4: drbd_sync_handshake:
Aug  3 16:03:13 opus kernel: [ 1250.737094] block drbd4: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:5242711 flags:0
Aug  3 16:03:13 opus kernel: [ 1250.737096] block drbd4: peer 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:5242711 flags:0
Aug  3 16:03:13 opus kernel: [ 1250.737098] block drbd4: uuid_compare()=0 by rule 10
Aug  3 16:03:13 opus kernel: [ 1250.737100] block drbd4: No resync, but 5242711 bits in bitmap!
Aug  3 16:03:13 opus kernel: [ 1250.737105] block drbd4: peer( Unknown -> Secondary ) conn( WFReportParams -> Connected ) pdsk( DUnknown -> Inconsistent ) 
Aug  3 16:03:13 opus kernel: [ 1250.737109] block drbd4: Resumed AL updates
Aug  3 16:03:14 opus kernel: [ 1250.773903] block drbd4: Accepted new current UUID, preparing to skip initial sync
Aug  3 16:03:14 opus kernel: [ 1250.777061] block drbd4: bitmap WRITE of 160 pages took 1 jiffies
Aug  3 16:03:14 opus kernel: [ 1250.788564] block drbd4: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Aug  3 16:03:14 opus kernel: [ 1250.788573] block drbd4: disk( Inconsistent -> UpToDate ) pdsk( Inconsistent -> UpToDate ) 
Aug  3 16:03:14 opus kernel: [ 1250.797104] block drbd4: receiver updated UUIDs to 14460554106EF79A:0000000000000000:0000000000000000:0000000000000000
Aug  3 16:03:14 opus kernel: [ 1250.797117] block drbd4: peer( Secondary -> Primary ) 
Aug  3 16:03:15 opus kernel: [ 1251.748952] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
Aug  3 16:03:15 opus kernel: [ 1251.748983] BUG: unable to handle kernel paging request at ffff8800022f3d88
Aug  3 16:03:15 opus kernel: [ 1251.749016] IP: [<ffff8800022f3d88>] 0xffff8800022f3d88
Aug  3 16:03:15 opus kernel: [ 1251.749041] PGD 1814067 PUD 1815067 PMD 2f81067 PTE 80100000022f3067
Aug  3 16:03:15 opus kernel: [ 1251.749082] Oops: 0011 [#1] SMP 
Aug  3 16:03:15 opus kernel: [ 1251.749106] Modules linked in: xt_physdev iptable_filter ip_tables x_tables xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc bridge stp llc intel_powerclamp coretemp crc32_pclmul ghash_clmulni_intel joydev hid_generic iTCO_wdt iTCO_vendor_support aesni_intel evdev aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd usbhid ttm hid drm_kms_helper pcspkr drm i2c_i801 lpc_ich mfd_core i7core_edac ioatdma edac_core tpm_tis tpm ipmi_si ipmi_msghandler button shpchp processor thermal_sys drbd lru_cache libcrc32c autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci libata ehci_pci uhci_hcd ehci_hcd scsi_mod usbcore usb_common igb i2c_algo_bit i2c_core dca ptp pps_core
Aug  3 16:03:15 opus kernel: [ 1251.749691] CPU: 0 PID: 12993 Comm: drbd_a_mws-priv Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u2
Aug  3 16:03:15 opus kernel: [ 1251.749726] Hardware name: Intel Corporation S5500WBV/S5500WB, BIOS S5500.86B.01.00.0061.030920121535 03/09/2012
Aug  3 16:03:15 opus kernel: [ 1251.749765] task: ffff8800171742d0 ti: ffff8800022f0000 task.ti: ffff8800022f0000
Aug  3 16:03:15 opus kernel: [ 1251.749847] RIP: e030:[<ffff8800022f3d88>]  [<ffff8800022f3d88>] 0xffff8800022f3d88
Aug  3 16:03:15 opus kernel: [ 1251.749934] RSP: e02b:ffff8800022f3d90  EFLAGS: 00010212
Aug  3 16:03:15 opus kernel: [ 1251.749984] RAX: 00000000fffffffc RBX: ffffffffffffffff RCX: 0000000000000113
Aug  3 16:03:15 opus kernel: [ 1251.750039] RDX: 0000000000000113 RSI: 00000000fffffe01 RDI: ffffffff81463f75
Aug  3 16:03:15 opus kernel: [ 1251.750094] RBP: ffff8800171742d0 R08: ffff8800022f0000 R09: 0000000000000000
Aug  3 16:03:15 opus kernel: [ 1251.750150] R10: ffff88001751b890 R11: 0000000000000000 R12: 0000000000000001
Aug  3 16:03:15 opus kernel: [ 1251.750205] R13: 0000000000000000 R14: 0000000000000010 R15: ffff880016c92000
Aug  3 16:03:15 opus kernel: [ 1251.750263] FS:  00007f90c43c0740(0000) GS:ffff88001fa00000(0000) knlGS:0000000000000000
Aug  3 16:03:15 opus kernel: [ 1251.750348] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug  3 16:03:15 opus kernel: [ 1251.750399] CR2: ffff8800022f3d88 CR3: 000000001073f000 CR4: 0000000000002660
Aug  3 16:03:15 opus kernel: [ 1251.750454] Stack:
Aug  3 16:03:15 opus kernel: [ 1251.750493]  ffff8800022f3d88 0000000000000010 0000000000000000 0000000000000000
Aug  3 16:03:15 opus kernel: [ 1251.750589]  ffff8800022f3d90 0000000000000001 0000000000000000 0000000000000000
Aug  3 16:03:15 opus kernel: [ 1251.750686]  0000000000004100 ffffffffa02577be ffff880016c92080 0000001000000000
Aug  3 16:03:15 opus kernel: [ 1251.750783] Call Trace:
Aug  3 16:03:15 opus kernel: [ 1251.750829]  [<ffffffffa02577be>] ? drbd_asender+0x27e/0x750 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.750887]  [<ffffffffa0260d00>] ? drbd_destroy_connection+0xc0/0xc0 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.750947]  [<ffffffffa0260d46>] ? drbd_thread_setup+0x46/0x130 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.751006]  [<ffffffffa0260d00>] ? drbd_destroy_connection+0xc0/0xc0 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.751065]  [<ffffffff81087fad>] ? kthread+0xbd/0xe0
Aug  3 16:03:15 opus kernel: [ 1251.751114]  [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180
Aug  3 16:03:15 opus kernel: [ 1251.751170]  [<ffffffff815114d8>] ? ret_from_fork+0x58/0x90
Aug  3 16:03:15 opus kernel: [ 1251.751221]  [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180
Aug  3 16:03:15 opus kernel: [ 1251.751274] Code: ff ff ff ff ff ff ff ff ff ff ff 88 3d 2f 02 00 88 ff ff 30 e0 00 00 00 00 00 00 12 02 01 00 00 00 00 00 90 3d 2f 02 00 88 ff ff <2b> e0 00 00 00 00 00 00 88 3d 2f 02 00 88 ff ff 10 00 00 00 00 
Aug  3 16:03:15 opus kernel: [ 1251.751694] RIP  [<ffff8800022f3d88>] 0xffff8800022f3d88
Aug  3 16:03:15 opus kernel: [ 1251.751747]  RSP <ffff8800022f3d90>
Aug  3 16:03:15 opus kernel: [ 1251.751790] CR2: ffff8800022f3d88
Aug  3 16:03:15 opus kernel: [ 1251.752128] ---[ end trace 975e04f66c2d9004 ]---
Aug  3 16:03:15 opus kernel: [ 1251.835012] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Aug  3 16:03:15 opus kernel: [ 1251.835235] IP: [<ffffffffa02453bd>] drbd_endio_write_sec_final+0x9d/0x480 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.835408] PGD 0 
Aug  3 16:03:15 opus kernel: [ 1251.835531] Oops: 0002 [#2] SMP 
Aug  3 16:03:15 opus kernel: [ 1251.835704] Modules linked in: xt_physdev iptable_filter ip_tables x_tables xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc bridge stp llc intel_powerclamp coretemp crc32_pclmul ghash_clmulni_intel joydev hid_generic iTCO_wdt iTCO_vendor_support aesni_intel evdev aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd usbhid ttm hid drm_kms_helper pcspkr drm i2c_i801 lpc_ich mfd_core i7core_edac ioatdma edac_core tpm_tis tpm ipmi_si ipmi_msghandler button shpchp processor thermal_sys drbd lru_cache libcrc32c autofs4 ext4 crc16 mbcache jbd2 dm_mod raid1 md_mod sg sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci libata ehci_pci uhci_hcd ehci_hcd scsi_mod usbcore usb_common igb i2c_algo_bit i2c_core dca ptp pps_core
Aug  3 16:03:15 opus kernel: [ 1251.840379] CPU: 0 PID: 12992 Comm: drbd_r_mws-priv Tainted: G      D       3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1+deb8u2
Aug  3 16:03:15 opus kernel: [ 1251.840515] Hardware name: Intel Corporation S5500WBV/S5500WB, BIOS S5500.86B.01.00.0061.030920121535 03/09/2012
Aug  3 16:03:15 opus kernel: [ 1251.840648] task: ffff880016c1a050 ti: ffff8800173e4000 task.ti: ffff8800173e4000
Aug  3 16:03:15 opus kernel: [ 1251.840771] RIP: e030:[<ffffffffa02453bd>]  [<ffffffffa02453bd>] drbd_endio_write_sec_final+0x9d/0x480 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.840951] RSP: e02b:ffff8800173e7ce0  EFLAGS: 00010097
Aug  3 16:03:15 opus kernel: [ 1251.841041] RAX: 0000000000000000 RBX: ffff880017439700 RCX: 000000000000009c
Aug  3 16:03:15 opus kernel: [ 1251.841137] RDX: 0000000000000000 RSI: ffff88000c850200 RDI: ffff88000c85bed0
Aug  3 16:03:15 opus kernel: [ 1251.841233] RBP: ffff88000cb22800 R08: 0000000000000cce R09: ffff88000c850200
Aug  3 16:03:15 opus kernel: [ 1251.841329] R10: 0000000000007ff0 R11: 0000000000000000 R12: ffff880002b676a0
Aug  3 16:03:15 opus kernel: [ 1251.841424] R13: ffff88001f8463b0 R14: ffff88000cb22bb0 R15: ffff88000cb22800
Aug  3 16:03:15 opus kernel: [ 1251.841523] FS:  00007f90c43c0740(0000) GS:ffff88001fa00000(0000) knlGS:0000000000000000
Aug  3 16:03:15 opus kernel: [ 1251.841648] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug  3 16:03:15 opus kernel: [ 1251.841739] CR2: 0000000000000008 CR3: 000000001073f000 CR4: 0000000000002660
Aug  3 16:03:15 opus kernel: [ 1251.841835] Stack:
Aug  3 16:03:15 opus kernel: [ 1251.841913]  0000000000000000 0000000000030003 ffff8800174397b8 0000000000000000
Aug  3 16:03:15 opus kernel: [ 1251.842216]  0000000000000000 0000000000102800 0000000000400000 0000000000104800
Aug  3 16:03:15 opus kernel: [ 1251.842518]  0000000000000000 0000000000102800 0000000000000000 0000000000000000
Aug  3 16:03:15 opus kernel: [ 1251.842820] Call Trace:
Aug  3 16:03:15 opus kernel: [ 1251.842906]  [<ffffffffa0254bb5>] ? drbd_submit_peer_request+0x85/0x330 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.843033]  [<ffffffffa02556ea>] ? receive_Data+0x36a/0xe40 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.843130]  [<ffffffffa0257407>] ? drbd_receiver+0x117/0x250 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.843228]  [<ffffffffa0260d00>] ? drbd_destroy_connection+0xc0/0xc0 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.843328]  [<ffffffffa0260d46>] ? drbd_thread_setup+0x46/0x130 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.843427]  [<ffffffffa0260d00>] ? drbd_destroy_connection+0xc0/0xc0 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.843524]  [<ffffffff81087fad>] ? kthread+0xbd/0xe0
Aug  3 16:03:15 opus kernel: [ 1251.843614]  [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180
Aug  3 16:03:15 opus kernel: [ 1251.843709]  [<ffffffff815114d8>] ? ret_from_fork+0x58/0x90
Aug  3 16:03:15 opus kernel: [ 1251.843801]  [<ffffffff81087ef0>] ? kthread_create_on_node+0x180/0x180
Aug  3 16:03:15 opus kernel: [ 1251.843894] Code: 04 48 8b 45 00 48 8d b8 d0 00 00 00 e8 dd bc 2c e1 8b 53 58 49 89 c1 c1 ea 09 01 95 54 02 00 00 49 83 fd ff 48 8b 13 48 8b 43 08 <48> 89 42 08 48 89 10 48 8d 85 c0 03 00 00 48 8b 95 c8 03 00 00 
Aug  3 16:03:15 opus kernel: [ 1251.846996] RIP  [<ffffffffa02453bd>] drbd_endio_write_sec_final+0x9d/0x480 [drbd]
Aug  3 16:03:15 opus kernel: [ 1251.847168]  RSP <ffff8800173e7ce0>
Aug  3 16:03:15 opus kernel: [ 1251.847252] CR2: 0000000000000008
Aug  3 16:03:15 opus kernel: [ 1251.847335] ---[ end trace 975e04f66c2d9005 ]---

--- End Message ---
--- Begin Message ---
This bug was filed for a very old kernel. If you can reproduce it with
- the current version in unstable/testing
- the latest kernel from buster-backports
please reopen the bug; see https://www.debian.org/Bugs/server-control

--- End Message ---
