
Bug#812415: kernel Oops (NULL pointer dereference) while mounting drbd device



Package: linux-image-4.3.0-0.bpo.1-amd64
Version: 4.3.3-5~bpo8+1

Kernel:
# uname -a
Linux bac240a15n 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-5~bpo8+1 (2016-01-07) x86_64 GNU/Linux

Libc6:
# dpkg -s libc6 | grep Version
Version: 2.19-18+deb8u1

I have seen this Oops a few times while mounting a DRBD block device:

/bin/mount -onoatime,barrier=0,nouser_xattr,noacl /dev/drbd1 /replicated

It is not easily reproducible, though. I have seen it in a VMware ESX environment as well as on Citrix XEN.
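
Since the Oops only shows up occasionally, one way to chase it could be to repeat the mount in a loop and stop at the first failure. The following is only a sketch: the helper names (run_cmd, mount_cycle) and the umount step are assumptions, it has to run as root on a node where /dev/drbd1 is Primary, and it is untested whether plain mount/umount cycling triggers the problem at all.

```python
import subprocess

# Mount command exactly as used in this report; the umount step is an assumption.
MOUNT_CMD = ["/bin/mount", "-onoatime,barrier=0,nouser_xattr,noacl",
             "/dev/drbd1", "/replicated"]
UMOUNT_CMD = ["/bin/umount", "/replicated"]


def run_cmd(argv):
    """Run a command and return its exit status (negative if killed by a signal)."""
    return subprocess.run(argv).returncode


def mount_cycle(attempts, run=run_cmd):
    """Mount and unmount repeatedly.

    Returns the 1-based number of the first attempt where either step
    failed, or None if every attempt succeeded.
    """
    for attempt in range(1, attempts + 1):
        if run(MOUNT_CMD) != 0:
            return attempt  # mount failed or was killed: stop and check dmesg
        if run(UMOUNT_CMD) != 0:
            return attempt  # device may be held open after an Oops
    return None
```

Calling mount_cycle(100) would try up to 100 cycles; a non-None result is the point at which to look at the kernel log.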

+++++++++++++++++++++++++++++++++++++
Jan 20 15:31:40 bac1f6065n kernel: [ 244.728196] EXT4-fs (drbd1): Mount option "noacl" will be removed by 3.5
Jan 20 15:31:40 bac1f6065n kernel: [ 244.728196] Contact linux-ext4@vger.kernel.org if you think we should keep it.
Jan 20 15:31:40 bac1f6065n kernel: [ 244.728196]
Jan 20 15:31:40 bac1f6065n kernel: [ 244.728200] EXT4-fs (drbd1): mounting ext3 file system using the ext4 subsystem
Jan 20 15:31:40 bac1f6065n kernel: [ 244.766336] EXT4-fs (drbd1): barriers disabled
Jan 20 15:31:40 bac1f6065n kernel: [ 244.768365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
Jan 20 15:31:40 bac1f6065n kernel: [ 244.768585] IP: [<ffffffff811fc822>] __mark_inode_dirty+0x1f2/0x300
Jan 20 15:31:40 bac1f6065n kernel: [ 244.768712] PGD 2a645067 PUD 3be85067 PMD 0
Jan 20 15:31:40 bac1f6065n kernel: [ 244.768824] Oops: 0000 1 SMP
Jan 20 15:31:40 bac1f6065n kernel: [ 244.768922] Modules linked in: cpuid vmw_vsock_vmci_transport iosf_mbi coretemp crct10dif_pclmul crc32_pclmul sha256_ssse3 sha256_generic hmac drbg ansi_cprng ppdev aesni_intel aes_x86_64 lrw gf128mul evdev glue_helper ablk_helper vmw_balloon cryptd pcspkr psmouse serio_raw battery shpchp vmwgfx ttm drm_kms_helper parport_pc drm i2c_piix4 8250_fintek parport acpi_cpufreq processor ac button nf_conntrack_tftp nf_conntrack bonding drbd lru_cache libcrc32c autofs4 ext4 crc16 mbcache jbd2 virtio_blk virtio_net virtio_pci virtio_ring virtio kvm vsock vmw_pvscsi sg sd_mod sr_mod cdrom ata_generic crc32c_intel vmxnet3 mptspi scsi_transport_spi mptscsih mptbase vmw_vmci ata_piix libata scsi_mod
Jan 20 15:31:40 bac1f6065n kernel: [ 244.770796] CPU: 0 PID: 9694 Comm: mount Not tainted 4.3.0-0.bpo.1-amd64 #1 Debian 4.3.3-5~bpo8+1
Jan 20 15:31:40 bac1f6065n kernel: [ 244.771808] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
Jan 20 15:31:40 bac1f6065n kernel: [ 244.773842] task: ffff88003bca1300 ti: ffff880039968000 task.ti: ffff880039968000
Jan 20 15:31:40 bac1f6065n kernel: [ 244.774928] RIP: 0010:[<ffffffff811fc822>] [<ffffffff811fc822>] __mark_inode_dirty+0x1f2/0x300
Jan 20 15:31:40 bac1f6065n kernel: [ 244.776064] RSP: 0018:ffff88003996bbc8 EFLAGS: 00010246
Jan 20 15:31:40 bac1f6065n kernel: [ 244.777223] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000019
Jan 20 15:31:40 bac1f6065n kernel: [ 244.778411] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8800362eab88
Jan 20 15:31:40 bac1f6065n kernel: [ 244.779607] RBP: ffff8800362eab30 R08: 0000000000000000 R09: 0000000000000000
Jan 20 15:31:40 bac1f6065n kernel: [ 244.780818] R10: ffff8800391eed68 R11: 0000000000000000 R12: ffff88003dba91f0
Jan 20 15:31:40 bac1f6065n kernel: [ 244.782051] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88003dba9278
Jan 20 15:31:40 bac1f6065n kernel: [ 244.783280] FS: 00007fd858fb9840(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
Jan 20 15:31:40 bac1f6065n kernel: [ 244.784540] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 20 15:31:40 bac1f6065n kernel: [ 244.785817] CR2: 0000000000000018 CR3: 000000003a095000 CR4: 00000000000006f0
Jan 20 15:31:40 bac1f6065n kernel: [ 244.787374] Stack:
Jan 20 15:31:40 bac1f6065n kernel: [ 244.788697] ffffffff81b0d6c0 ffff8800391285d8 ffff88003bf9a400 0000000000000001
Jan 20 15:31:40 bac1f6065n kernel: [ 244.790092] 0000000000000000 ffff88003c2f3000 ffffffffa023f014 0000000000000000
Jan 20 15:31:40 bac1f6065n kernel: [ 244.791506] ffff88003c2f3000 ffff88003bf9a400 0000000000000000 ffff88003bc26000
Jan 20 15:31:40 bac1f6065n kernel: [ 244.792936] Call Trace:
Jan 20 15:31:40 bac1f6065n kernel: [ 244.794427] [<ffffffffa023f014>] ? ext4_commit_super+0x1a4/0x270 [ext4]
Jan 20 15:31:40 bac1f6065n kernel: [ 244.795914] [<ffffffffa0241746>] ? ext4_setup_super+0x106/0x160 [ext4]
Jan 20 15:31:40 bac1f6065n kernel: [ 244.797402] [<ffffffffa024421f>] ? ext4_fill_super+0x1bef/0x3420 [ext4]
Jan 20 15:31:40 bac1f6065n kernel: [ 244.798869] [<ffffffff812ed649>] ? snprintf+0x49/0x60
Jan 20 15:31:40 bac1f6065n kernel: [ 244.800358] [<ffffffff81172c89>] ? register_shrinker+0x69/0x90
Jan 20 15:31:40 bac1f6065n kernel: [ 244.801886] [<ffffffff811d2e6c>] ? sget+0x38c/0x3c0
Jan 20 15:31:40 bac1f6065n kernel: [ 244.803372] [<ffffffffa0242630>] ? ext4_calculate_overhead+0x380/0x380 [ext4]
Jan 20 15:31:40 bac1f6065n kernel: [ 244.804849] [<ffffffff811d3191>] ? mount_bdev+0x1a1/0x1d0
Jan 20 15:31:40 bac1f6065n kernel: [ 244.806312] [<ffffffff811ad83f>] ? alloc_pages_current+0x8f/0x100
Jan 20 15:31:40 bac1f6065n kernel: [ 244.807762] [<ffffffff811d3a66>] ? mount_fs+0x36/0x170
Jan 20 15:31:40 bac1f6065n kernel: [ 244.809196] [<ffffffff811eed64>] ? vfs_kern_mount+0x64/0x100
Jan 20 15:31:40 bac1f6065n kernel: [ 244.810616] [<ffffffff811f11bf>] ? do_mount+0x21f/0xd30
Jan 20 15:31:40 bac1f6065n kernel: [ 244.811995] [<ffffffff811f1fc9>] ? SyS_mount+0x99/0xf0
Jan 20 15:31:40 bac1f6065n kernel: [ 244.813325] [<ffffffff81589376>] ? system_call_fast_compare_end+0xc/0x6b
Jan 20 15:31:40 bac1f6065n kernel: [ 244.814709] Code: 03 48 8b 7b 08 48 83 c3 10 89 ea 4c 89 e6 ff d0 48 8b 03 48 85 c0 75 e9 e9 45 fe ff ff 4c 89 e7 e8 24 ee ff ff 48 89 c5 48 8b 00 <f6> 40 18 02 75 27 48 8b 55 08 83 e2 01 75 1e 48 8b 48 30 48 c7 
Jan 20 15:31:40 bac1f6065n kernel: [ 244.818856] RIP [<ffffffff811fc822>] __mark_inode_dirty+0x1f2/0x300
Jan 20 15:31:40 bac1f6065n kernel: [ 244.820141] RSP <ffff88003996bbc8>
Jan 20 15:31:40 bac1f6065n kernel: [ 244.821396] CR2: 0000000000000018
Jan 20 15:31:40 bac1f6065n kernel: [ 244.822654] --[ end trace b9602dcbc1ee51b1 ]--
Jan 20 15:31:40 bac1f6065n kernel: [ 244.847512] block drbd1: State change failed: Device is held open by someone
Jan 20 15:31:40 bac1f6065n kernel: [ 244.848765] block drbd1: state = { cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate r----- }
+++++++++++++++++++++++++++++++++++++
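
For comparing occurrences across machines, the key facts of such an Oops (fault address, faulting symbol and offset) can be pulled out of the log mechanically. This is a minimal sketch; summarize_oops and the two regular expressions are illustrative names that only cover the line formats seen in the log above.

```python
import re

# Matches e.g. "IP: [<ffffffff811fc822>] __mark_inode_dirty+0x1f2/0x300".
# The \b keeps it from matching inside "RIP: ..." lines.
IP_RE = re.compile(r"\bIP: \[<([0-9a-f]+)>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")
# Matches e.g. "... kernel NULL pointer dereference at 0000000000000018".
FAULT_RE = re.compile(r"NULL pointer dereference at ([0-9a-f]+)")


def summarize_oops(log_text):
    """Return fault address and faulting symbol from Oops text, or None."""
    ip = IP_RE.search(log_text)
    fault = FAULT_RE.search(log_text)
    if not ip or not fault:
        return None
    return {
        "fault_address": int(fault.group(1), 16),
        "ip": int(ip.group(1), 16),
        "symbol": ip.group(2),
        "offset": int(ip.group(3), 16),  # offset of the instruction in the function
        "size": int(ip.group(4), 16),    # total size of the function
    }


# Two lines copied from the log above.
sample = """\
[ 244.768365] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[ 244.768585] IP: [<ffffffff811fc822>] __mark_inode_dirty+0x1f2/0x300
"""
info = summarize_oops(sample)
print(info["symbol"], hex(info["offset"]), hex(info["fault_address"]))
# prints: __mark_inode_dirty 0x1f2 0x18
```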

This seems to be coming from fs/fs-writeback.c (line 2063):

0x5ad2 is in __mark_inode_dirty (fs/fs-writeback.c:2063).
2058	struct list_head *dirty_list;
2059	bool wakeup_bdi = false;
2060	
2061	wb = locked_inode_to_wb_and_lock_list(inode);
2062	
2063	WARN(bdi_cap_writeback_dirty(wb->bdi) &&
2064	!test_bit(WB_registered, &wb->state),
2065	"bdi-%s not registered\n", wb->bdi->name);
2066	
2067	inode->dirtied_when = jiffies;
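
As a cross-check, the byte marked <f6> in the "Code:" line of the Oops can be decoded by hand. The sketch below handles only this one x86-64 encoding (opcode F6 /0, i.e. TEST r/m8, imm8, with an 8-bit displacement); decode_marked_test is a hypothetical helper, not a general disassembler. With RAX = 0 from the register dump, the computed address matches CR2, which is consistent with the first operand of the WARN condition reading a flag through a NULL pointer. Which pointer is NULL is not established by this sketch.

```python
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_marked_test(code_bytes):
    """Decode the <..>-marked instruction if it is `f6 /0 ib` with a disp8.

    Returns (base_register, displacement, immediate).
    """
    tokens = code_bytes.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    opcode = int(tokens[start].strip("<>"), 16)
    modrm, disp, imm = (int(t, 16) for t in tokens[start + 1:start + 4])
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # Supported form only: F6 /0, mod=01 (disp8), no SIB byte (rm != 4).
    if opcode != 0xF6 or reg != 0 or mod != 1 or rm == 4:
        raise ValueError("not an instruction form this sketch understands")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REGS64[rm], disp, imm


# Tail of the "Code:" line and register values copied from the Oops.
code_tail = "48 8b 00 <f6> 40 18 02 75 27"
rax = 0x0
cr2 = 0x18

base, disp, imm = decode_marked_test(code_tail)
print(f"testb ${imm:#x}, {disp:#x}(%{base})")  # prints: testb $0x2, 0x18(%rax)
print(rax + disp == cr2)                        # prints: True
```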

On another occasion, already with the newer 4.3.3-7~bpo8+1 kernel, the WARN was printed instead (no Oops):

+++++++++++++++++++++++++++++++++++++
Jan 22 14:12:10 bac1f70c8n kernel: [ 7031.790038] block drbd1: Suspended AL updates
Jan 22 14:12:10 bac1f70c8n kernel: [ 7031.790041] block drbd1: disk( Attaching -> Inconsistent )
Jan 22 14:12:10 bac1f70c8n kernel: [ 7031.790044] block drbd1: attached to UUIDs 0000000000000004:0000000000000000:0000000000000000:0000000000000000
Jan 22 14:12:10 bac1f70c8n kernel: [ 7031.793120] drbd r0: conn( StandAlone -> Unconnected )
Jan 22 14:12:10 bac1f70c8n kernel: [ 7031.793130] drbd r0: Starting receiver thread (from drbd_w_r0 [23936])
Jan 22 14:12:10 bac1f70c8n kernel: [ 7031.795370] drbd r0: receiver (re)started
Jan 22 14:12:10 bac1f70c8n kernel: [ 7031.795381] drbd r0: conn( Unconnected -> WFConnection )
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.531242] block drbd1: role( Secondary -> Primary ) disk( Inconsistent -> UpToDate )
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.531859] block drbd1: Forced to consider local data as UpToDate!
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.531865] block drbd1: new current UUID 7ED1C316EA7CAA49:0000000000000004:0000000000000000:0000000000000000
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.647785] EXT4-fs (drbd1): Mount option "nouser_xattr" will be removed by 3.5
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.647785] Contact linux-ext4@vger.kernel.org if you think we should keep it.
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.647785]
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.647790] EXT4-fs (drbd1): Mount option "noacl" will be removed by 3.5
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.647790] Contact linux-ext4@vger.kernel.org if you think we should keep it.
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.647790]
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.653295] EXT4-fs (drbd1): barriers disabled
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658219] -----------[ cut here ]-----------
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658226] WARNING: CPU: 0 PID: 24697 at /build/linux-kTc2b3/linux-4.3.3/fs/fs-writeback.c:2065 __mark_inode_dirty+0x21f/0x300()
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658227] bdi-block not registered
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658228] Modules linked in: cpuid xt_tcpudp ip6table_filter ip6_tables iptable_filter ip_tables x_tables vmw_vsock_vmci_transport iosf_mbi coretemp crct10dif_pclmul crc32_pclmul sha256_ssse3 sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 lrw gf128mul evdev glue_helper ablk_helper psmouse ppdev serio_raw cryptd vmw_balloon pcspkr nf_conntrack_tftp nf_conntrack acpi_cpufreq processor parport_pc parport vmwgfx ttm drm_kms_helper drm i2c_piix4 8250_fintek shpchp battery ac button bonding drbd lru_cache libcrc32c autofs4 ext4 crc16 mbcache jbd2 virtio_blk virtio_net virtio_pci virtio_ring virtio kvm vsock sg sd_mod sr_mod cdrom ata_generic crc32c_intel vmxnet3 vmw_pvscsi ata_piix libata vmw_vmci scsi_mod
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658335] CPU: 0 PID: 24697 Comm: mount Not tainted 4.3.0-0.bpo.1-amd64 #1 Debian 4.3.3-7~bpo8+1
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658336] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658338] 0000000000000000 00000000e6c78383 ffffffff812e1889 ffff88002a5f3b60
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658340] ffffffff81074451 0000000000000000 ffff88002a5f3bb8 ffff88003dba81f0
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658342] 0000000000000000 0000000000000000 ffffffff810744dc ffffffff818193eb
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658343] Call Trace:
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658351] [<ffffffff812e1889>] ? dump_stack+0x40/0x57
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658354] [<ffffffff81074451>] ? warn_slowpath_common+0x81/0xb0
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658356] [<ffffffff810744dc>] ? warn_slowpath_fmt+0x5c/0x80
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658358] [<ffffffff811fb68d>] ? locked_inode_to_wb_and_lock_list+0x4d/0xc0
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658359] [<ffffffff811fc84f>] ? __mark_inode_dirty+0x21f/0x300
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658376] [<ffffffffa01e9014>] ? ext4_commit_super+0x1a4/0x270 [ext4]
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658385] [<ffffffffa01eb746>] ? ext4_setup_super+0x106/0x160 [ext4]
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658393] [<ffffffffa01ee21f>] ? ext4_fill_super+0x1bef/0x3420 [ext4]
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658395] [<ffffffff812ed649>] ? snprintf+0x49/0x60
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658399] [<ffffffff81172c89>] ? register_shrinker+0x69/0x90
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658403] [<ffffffff811d2e6c>] ? sget+0x38c/0x3c0
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658411] [<ffffffffa01ec630>] ? ext4_calculate_overhead+0x380/0x380 [ext4]
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658413] [<ffffffff811d3191>] ? mount_bdev+0x1a1/0x1d0
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658416] [<ffffffff811ad83f>] ? alloc_pages_current+0x8f/0x100
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658421] [<ffffffff811d3a66>] ? mount_fs+0x36/0x170
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658424] [<ffffffff811eed64>] ? vfs_kern_mount+0x64/0x100
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658426] [<ffffffff811f11bf>] ? do_mount+0x21f/0xd30
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658428] [<ffffffff811f1fc9>] ? SyS_mount+0x99/0xf0
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658432] [<ffffffff8158a3f6>] ? system_call_fast_compare_end+0xc/0x6b
Jan 22 14:13:13 bac1f70c8n kernel: [ 7095.658434] --[ end trace e68b1f35b0d2408c ]--
+++++++++++++++++++++++++++++++++++++

