
Re: [Nbd] BUG: linux kernel panic after umount nbd device



Sorry, my previous dump hit a segfault in my debug printk just before the real fault.  Search for "NOELBK" below.  Note that _submit_bh is being passed a buffer_head with bh->b_bdev == NULL.


[   60.372845] nbd: registered device at major 43
[   61.500163] EXT4-fs (nbd1): mounted filesystem with ordered data mode. Opts: (null)
[   61.549895] block nbd1: NBD_DISCONNECT
[   61.562890] block nbd1: Receive control failed (result -32)
[   61.563773] block nbd1: queue cleared
[   62.581005] block nbd1: Attempted send on closed socket
[   62.581399] CPU: 3 PID: 2295 Comm: tail Tainted: G            E  3.18.0+ #4
[   62.581433] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   62.581436]  ffff8800db119800 ffff88021331b9b8 ffffffff81764bdc 0000000000000001
[   62.581440]  ffff8800db9ec0c0 ffff88021331b9e8 ffffffffa04dbb2a ffff88003787c3c8
[   62.581444]  0000000000000005 ffff8802111b9e30 ffff8800d9e85800 ffff88021331ba08
[   62.581472] Call Trace:
[   62.581482]  [<ffffffff81764bdc>] dump_stack+0x46/0x58
[   62.581512]  [<ffffffffa04dbb2a>] do_nbd_request+0x13a/0x185 [nbd]
[   62.581520]  [<ffffffff81351df7>] __blk_run_queue+0x37/0x50
[   62.581549]  [<ffffffff813569d3>] blk_queue_bio+0x323/0x380
[   62.581554]  [<ffffffff81351c40>] generic_make_request+0xc0/0x110
[   62.581560]  [<ffffffff81351cf9>] submit_bio+0x69/0x130
[   62.581589]  [<ffffffff8134bba6>] submit_bio_wait+0x56/0x70
[   62.581595]  [<ffffffff81357f6e>] blkdev_issue_flush+0x5e/0x90
[   62.581601]  [<ffffffff81272521>] ext4_sync_fs+0xc1/0x180
[   62.581630]  [<ffffffff8120ce42>] sync_filesystem+0x82/0xb0
[   62.581635]  [<ffffffff811dee64>] generic_shutdown_super+0x34/0x100
[   62.581639]  [<ffffffff811df277>] kill_block_super+0x27/0x70
[   62.581643]  [<ffffffff811df589>] deactivate_locked_super+0x49/0x60
[   62.581671]  [<ffffffff811dfb5e>] deactivate_super+0x4e/0x70
[   62.581675]  [<ffffffff811fc3a3>] cleanup_mnt+0x43/0x90
[   62.581680]  [<ffffffff811fc442>] __cleanup_mnt+0x12/0x20
[   62.581684]  [<ffffffff8108bd34>] task_work_run+0xc4/0xe0
[   62.581713]  [<ffffffff81071729>] do_exit+0x2d9/0xa80
[   62.581718]  [<ffffffff810a49ce>] ? dequeue_task_fair+0x44e/0x660
[   62.581723]  [<ffffffff8107afdf>] ? recalc_sigpending+0x1f/0x60
[   62.581751]  [<ffffffff81071f5f>] do_group_exit+0x3f/0xa0
[   62.581755]  [<ffffffff8107dd63>] get_signal+0x1e3/0x730
[   62.581761]  [<ffffffff81012508>] do_signal+0x28/0xaa0
[   62.581766]  [<ffffffff810ae580>] ? prepare_to_wait_event+0x110/0x110
[   62.581794]  [<ffffffff811dcc9c>] ? vfs_read+0x9c/0x180
[   62.581798]  [<ffffffff81012fe9>] do_notify_resume+0x69/0xb0
[   62.581803]  [<ffffffff8176d2cf>] int_signal+0x12/0x17
[   62.581831] blk_update_request: I/O error, dev nbd1, sector 0
[   62.582718] NOELBK: !buffer_mapped(bh) bh=ffff8800d201d5b0 bh->b_bdev=          (null)
[   62.583290] ------------[ cut here ]------------
[   62.583646] kernel BUG at fs/buffer.c:3013!
[   62.583848] invalid opcode: 0000 [#1] SMP 
[   62.584555] Modules linked in: nbd(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_CHECKSUM(E) iptable_mangle(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) ebtable_nat(E) ebtables(E) x_tables(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_sa(E) ib_mad(E) ib_core(E) ib_addr(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) dm_crypt(E) openvswitch(E) gre(E) vxlan(E) ip6_udp_tunnel(E) udp_tunnel(E) libcrc32c(E) ppdev(E) dm_multipath(E) scsi_dh(E) snd_intel8x0(E) i2c_piix4(E) serio_raw(E) joydev(E) snd_ac97_codec(E) ac97_bus(E) parport_pc(E) snd_pcm(E) parport(E) mac_hid(E) snd_timer(E) snd(E) soundcore(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) ahci(E) libahci(E) e1000(E) video(E)
[   62.586689] CPU: 3 PID: 2295 Comm: tail Tainted: G            E  3.18.0+ #4
[   62.586689] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   62.586689] task: ffff8800da7b0000 ti: ffff880213318000 task.ti: ffff880213318000
[   62.586689] RIP: 0010:[<ffffffff81210e3a>]  [<ffffffff81210e3a>] _submit_bh+0x16a/0x190
[   62.586689] RSP: 0018:ffff88021331ba78  EFLAGS: 00010246
[   62.586689] RAX: 0000000004000005 RBX: ffff8800d201d5b0 RCX: 0000000000000006
[   62.586689] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff88021fd8d310
[   62.586689] RBP: ffff88021331ba98 R08: 0000000000000096 R09: ffff8800000b8500
[   62.586689] R10: 000000000000021f R11: ffff88021331b75e R12: ffff880211f39000
[   62.586689] R13: 0000000000001411 R14: 0000000000000000 R15: ffff880212a75000
[   62.586689] FS:  00007f9fedd08740(0000) GS:ffff88021fd80000(0000) knlGS:0000000000000000
[   62.586689] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   62.586689] CR2: 00007f877f92a250 CR3: 00000000da54d000 CR4: 00000000000006e0
[   62.586689] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   62.586689] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   62.586689] Stack:
[   62.586689]  ffff880211f39000 ffff880211f39000 ffff8800d201d5b0 0000000000001411
[   62.586689]  ffff88021331baa8 ffffffff81210e70 ffff88021331bae8 ffffffff812add09
[   62.586689]  0000000000013640 ffff880211f39000 ffff880212a75000 ffff880211f39024
[   62.586689] Call Trace:
[   62.586689]  [<ffffffff81210e70>] submit_bh+0x10/0x20
[   62.586689]  [<ffffffff812add09>] jbd2_write_superblock+0x89/0x190
[   62.586689]  [<ffffffff812adeb4>] jbd2_mark_journal_empty+0x64/0xa0
[   62.586689]  [<ffffffff812ae0e1>] jbd2_journal_destroy+0x1f1/0x220
[   62.586689]  [<ffffffff810ae580>] ? prepare_to_wait_event+0x110/0x110
[   62.586689]  [<ffffffff8127bcc4>] ext4_put_super+0x64/0x350
[   62.586689]  [<ffffffff811deea6>] generic_shutdown_super+0x76/0x100
[   62.586689]  [<ffffffff811df277>] kill_block_super+0x27/0x70
[   62.586689]  [<ffffffff811df589>] deactivate_locked_super+0x49/0x60
[   62.586689]  [<ffffffff811dfb5e>] deactivate_super+0x4e/0x70
[   62.586689]  [<ffffffff811fc3a3>] cleanup_mnt+0x43/0x90
[   62.586689]  [<ffffffff811fc442>] __cleanup_mnt+0x12/0x20
[   62.586689]  [<ffffffff8108bd34>] task_work_run+0xc4/0xe0
[   62.586689]  [<ffffffff81071729>] do_exit+0x2d9/0xa80
[   62.586689]  [<ffffffff810a49ce>] ? dequeue_task_fair+0x44e/0x660
[   62.586689]  [<ffffffff8107afdf>] ? recalc_sigpending+0x1f/0x60
[   62.586689]  [<ffffffff81071f5f>] do_group_exit+0x3f/0xa0
[   62.586689]  [<ffffffff8107dd63>] get_signal+0x1e3/0x730
[   62.586689]  [<ffffffff81012508>] do_signal+0x28/0xaa0
[   62.586689]  [<ffffffff810ae580>] ? prepare_to_wait_event+0x110/0x110
[   62.586689]  [<ffffffff811dcc9c>] ? vfs_read+0x9c/0x180
[   62.586689]  [<ffffffff81012fe9>] do_notify_resume+0x69/0xb0
[   62.586689]  [<ffffffff8176d2cf>] int_signal+0x12/0x17
[   62.586689] Code: a1 e8 bb c5 13 00 89 d8 5b 41 5c 41 5d 41 5e 5d c3 41 f6 c5 01 0f 84 0e ff ff ff f0 80 63 01 f7 e9 04 ff ff ff 0f 0b 0f 0b 0f 0b <0f> 0b 48 8b 56 30 48 c7 c7 c8 d4 aa 81 31 c0 e8 51 f8 54 00 e9 
[   62.586689] RIP  [<ffffffff81210e3a>] _submit_bh+0x16a/0x190
[   62.586689]  RSP <ffff88021331ba78>
[   62.651459] ---[ end trace 9a5bbaaa46349cc9 ]---
[   62.652389] Fixing recursive fault but reboot is needed!


On Tue, Apr 14, 2015 at 10:12 AM, Noel Burton-Krahn <noel@...1998.....> wrote:
I'm tracking a kernel panic after umounting an nbd device.  Have you seen this before?


Description
-----------

Mount an nbd device, start a process reading a file in it, umount, call
qemu-nbd -d, kill the process, kernel panic.

This can also happen after the reading process exits, maybe when a
buffer gets flushed?  We see it when a [jbd2/nbdXXX] process remains
running after a umount and the [nbdXXX] process has quit with no other
processes using the nbd device.
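
To check for the leftover-thread condition described above, a small filter over a process listing can help. This is only a sketch: the function name leftover_nbd_threads is made up for illustration, and it assumes the kernel threads appear in the `comm` column as `nbdN` and `jbd2/nbdN-<something>`, which may differ between kernel versions.

```shell
# Hypothetical helper: print process-list lines for kernel threads that
# still belong to one nbd device.
# stdin: lines in "PID COMM" form (e.g. from `ps -eo pid,comm`)
# $1:    device name without /dev/, e.g. nbd1
leftover_nbd_threads() {
    awk -v dev="$1" '
        {
            name = $2
            # the nbd worker thread itself (assumed to be named after the device)
            if (name == dev) { print; next }
            # the ext4 journal thread, assumed to be named jbd2/<dev>-<suffix>
            if (index(name, "jbd2/" dev "-") == 1) { print; next }
            if (name == ("jbd2/" dev)) { print }
        }
    '
}
```

On a live system this might be run as `ps -eo pid,comm | leftover_nbd_threads nbd1` after the umount. A surviving jbd2 line with no matching nbd line would correspond to the state described above.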


Tested Versions
---------------
3.13, 3.18


Steps to Reproduce
------------------

    qemu-img create f.img 1G
    yes | mkfs.ext4 f.img
    modprobe nbd
    qemu-nbd -c /dev/nbd1 f.img
    sleep 1
    mkdir -p /mnt/1
    mount /dev/nbd1 /mnt/1
    date > /mnt/1/date
    ip netns add ns
    ip netns exec ns tail -f /mnt/1/date >/dev/null 2>&1 &   # NOTE1
    #mount -o remount,ro /mnt/1    # NOTE2
    umount /mnt/1
    qemu-nbd -d /dev/nbd1
    sleep 1
    kill %-

NOTE1: umount will fail with "device busy" if the process is not in a netns
NOTE2: remounting readonly avoids the bug
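
One possible reading of NOTE1, not confirmed in this report, is that `ip netns exec` gives the process its own mount namespace, so the filesystem stays mounted there after the umount above succeeds. Under that assumption, the sketch below scans mountinfo files for a device that is still listed as a mount source before `qemu-nbd -d` is called. The function name mounts_still_using is hypothetical, and this is a diagnostic aid, not a fix for the kernel bug.

```shell
# Hypothetical helper: print each mountinfo file that still lists the
# given device as a mount source.
# $1:   device path, e.g. /dev/nbd1
# $2..: mountinfo files to scan (on a live system: /proc/[0-9]*/mountinfo)
mounts_still_using() {
    dev="$1"
    shift
    for f in "$@"; do
        [ -r "$f" ] || continue
        # In mountinfo, a lone "-" field separates the optional fields from
        # "fstype source superopts", so the source is two fields after it.
        if awk -v dev="$dev" '
            {
                for (i = 1; i < NF; i++) {
                    if ($i == "-") {
                        if ($(i + 2) == dev) found = 1
                        break
                    }
                }
            }
            END { exit !found }
        ' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `mounts_still_using /dev/nbd1 /proc/[0-9]*/mountinfo` (probably as root, to read other processes' entries) run between the umount and `qemu-nbd -d` steps would list any process whose namespace still has the device mounted.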


Dump
----

[  441.026251] nbd: registered device at major 43
[  442.168335] EXT4-fs (nbd1): mounted filesystem with ordered data mode. Opts: (null)
[  442.269904] block nbd1: NBD_DISCONNECT
[  442.288192] block nbd1: Receive control failed (result -32)
[  442.288938] block nbd1: queue cleared
[  443.308744] block nbd1: Attempted send on closed socket

(note: I inserted dump_stack() in nbd_handle_req here)

[  443.309307] CPU: 1 PID: 10506 Comm: tail Tainted: G            E  3.18.0+ #3
[  443.309310] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  443.309313]  ffff880212d1ba00 ffff880212cc39b8 ffffffff81764c1c 0000000000000001
[  443.309328]  ffff88021327a0c0 ffff880212cc39e8 ffffffffa04dbb2a ffff88021124cb50
[  443.309332]  0000000000000005 ffff8802038705c0 ffff8800da483000 ffff880212cc3a08
[  443.309336] Call Trace:
[  443.309355]  [<ffffffff81764c1c>] dump_stack+0x46/0x58
[  443.309362]  [<ffffffffa04dbb2a>] do_nbd_request+0x13a/0x185 [nbd]
[  443.309393]  [<ffffffff81351e37>] __blk_run_queue+0x37/0x50
[  443.309409]  [<ffffffff81356a13>] blk_queue_bio+0x323/0x380
[  443.309415]  [<ffffffff81351c80>] generic_make_request+0xc0/0x110
[  443.309421]  [<ffffffff81351d39>] submit_bio+0x69/0x130
[  443.309441]  [<ffffffff8134bbe6>] submit_bio_wait+0x56/0x70
[  443.309446]  [<ffffffff81357fae>] blkdev_issue_flush+0x5e/0x90
[  443.309463]  [<ffffffff81272561>] ext4_sync_fs+0xc1/0x180
[  443.309480]  [<ffffffff8120ce42>] sync_filesystem+0x82/0xb0
[  443.309486]  [<ffffffff811dee64>] generic_shutdown_super+0x34/0x100
[  443.309491]  [<ffffffff811df277>] kill_block_super+0x27/0x70
[  443.309507]  [<ffffffff811df589>] deactivate_locked_super+0x49/0x60
[  443.309512]  [<ffffffff811dfb5e>] deactivate_super+0x4e/0x70
[  443.309532]  [<ffffffff811fc3a3>] cleanup_mnt+0x43/0x90
[  443.309537]  [<ffffffff811fc442>] __cleanup_mnt+0x12/0x20
[  443.309542]  [<ffffffff8108bd34>] task_work_run+0xc4/0xe0
[  443.309558]  [<ffffffff81071729>] do_exit+0x2d9/0xa80
[  443.309593]  [<ffffffff810a49ce>] ? dequeue_task_fair+0x44e/0x660
[  443.309601]  [<ffffffff8107afdf>] ? recalc_sigpending+0x1f/0x60
[  443.309630]  [<ffffffff81071f5f>] do_group_exit+0x3f/0xa0
[  443.309636]  [<ffffffff8107dd63>] get_signal+0x1e3/0x730
[  443.309644]  [<ffffffff81012508>] do_signal+0x28/0xaa0
[  443.309674]  [<ffffffff810ae580>] ? prepare_to_wait_event+0x110/0x110
[  443.309681]  [<ffffffff811dcc9c>] ? vfs_read+0x9c/0x180
[  443.309710]  [<ffffffff81012fe9>] do_notify_resume+0x69/0xb0
[  443.309718]  [<ffffffff8176d30f>] int_signal+0x12/0x17

(this is the kernel panic stack dump immediately after)

[  443.309723] blk_update_request: I/O error, dev nbd1, sector 0
[  443.310773] BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
[  443.311501] IP: [<ffffffff81367276>] bdevname+0x6/0x30
[  443.312018] PGD 210f26067 PUD 210fa6067 PMD 0 
[  443.312982] Oops: 0000 [#1] SMP 
[  443.313454] Modules linked in: nbd(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_CHECKSUM(E) iptable_mangle(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) ebtable_nat(E) ebtables(E) x_tables(E) ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_sa(E) ib_mad(E) ib_core(E) ib_addr(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) openvswitch(E) gre(E) vxlan(E) ip6_udp_tunnel(E) udp_tunnel(E) libcrc32c(E) dm_crypt(E) ppdev(E) dm_multipath(E) scsi_dh(E) serio_raw(E) snd_intel8x0(E) snd_ac97_codec(E) ac97_bus(E) joydev(E) snd_pcm(E) snd_timer(E) snd(E) i2c_piix4(E) soundcore(E) parport_pc(E) parport(E) mac_hid(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) ahci(E) libahci(E) e1000(E) video(E)
[  443.314743] CPU: 2 PID: 10506 Comm: tail Tainted: G            E  3.18.0+ #3
[  443.314743] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  443.314743] task: ffff8800d9f10000 ti: ffff880212cc0000 task.ti: ffff880212cc0000
[  443.314743] RIP: 0010:[<ffffffff81367276>]  [<ffffffff81367276>] bdevname+0x6/0x30
[  443.314743] RSP: 0018:ffff880212cc3a48  EFLAGS: 00010202
[  443.314743] RAX: 0000000000000001 RBX: ffff880072605f70 RCX: 0000000100008bf3
[  443.314743] RDX: 0000000000000001 RSI: ffff880212cc3a58 RDI: 0000000000000000
[  443.314743] RBP: ffff880212cc3a98 R08: 0000000000000206 R09: 0000000000000001
[  443.314743] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000020000
[  443.314743] R13: 0000000000001411 R14: 0000000000000000 R15: ffff880213714000
[  443.314743] FS:  00007fe1a45cd740(0000) GS:ffff88021fd00000(0000) knlGS:0000000000000000
[  443.314743] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  443.314743] CR2: 0000000000000088 CR3: 0000000210f12000 CR4: 00000000000006e0
[  443.314743] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  443.314743] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  443.314743] Stack:
[  443.314743]  ffff880212cc3a98 ffffffff81210e74 ffff880212cc3a88 0000000000000000
[  443.314743]  ffff88021fd13640 0000000000000000 ffff8800da482800 ffff8800da482800
[  443.314743]  ffff880072605f70 0000000000001411 ffff880212cc3aa8 ffffffff81210eb0
[  443.314743] Call Trace:
[  443.314743]  [<ffffffff81210e74>] ? _submit_bh+0x1a4/0x1d0
[  443.314743]  [<ffffffff81210eb0>] submit_bh+0x10/0x20
[  443.314743]  [<ffffffff812add49>] jbd2_write_superblock+0x89/0x190
[  443.314743]  [<ffffffff812adef4>] jbd2_mark_journal_empty+0x64/0xa0
[  443.314743]  [<ffffffff812ae121>] jbd2_journal_destroy+0x1f1/0x220
[  443.314743]  [<ffffffff810ae580>] ? prepare_to_wait_event+0x110/0x110
[  443.314743]  [<ffffffff8127bd04>] ext4_put_super+0x64/0x350
[  443.314743]  [<ffffffff811deea6>] generic_shutdown_super+0x76/0x100
[  443.314743]  [<ffffffff811df277>] kill_block_super+0x27/0x70
[  443.314743]  [<ffffffff811df589>] deactivate_locked_super+0x49/0x60
[  443.314743]  [<ffffffff811dfb5e>] deactivate_super+0x4e/0x70
[  443.314743]  [<ffffffff811fc3a3>] cleanup_mnt+0x43/0x90
[  443.314743]  [<ffffffff811fc442>] __cleanup_mnt+0x12/0x20
[  443.314743]  [<ffffffff8108bd34>] task_work_run+0xc4/0xe0
[  443.314743]  [<ffffffff81071729>] do_exit+0x2d9/0xa80
[  443.314743]  [<ffffffff810a49ce>] ? dequeue_task_fair+0x44e/0x660
[  443.314743]  [<ffffffff8107afdf>] ? recalc_sigpending+0x1f/0x60
[  443.314743]  [<ffffffff81071f5f>] do_group_exit+0x3f/0xa0
[  443.314743]  [<ffffffff8107dd63>] get_signal+0x1e3/0x730
[  443.314743]  [<ffffffff81012508>] do_signal+0x28/0xaa0
[  443.314743]  [<ffffffff810ae580>] ? prepare_to_wait_event+0x110/0x110
[  443.314743]  [<ffffffff811dcc9c>] ? vfs_read+0x9c/0x180
[  443.314743]  [<ffffffff81012fe9>] do_notify_resume+0x69/0xb0
[  443.314743]  [<ffffffff8176d30f>] int_signal+0x12/0x17
[  443.314743] Code: 0c 48 c7 c2 ef 58 ae 81 48 89 df be 20 00 00 00 31 c0 e8 1e 71 02 00 48 89 d8 5b 41 5c 41 5d 41 5e 5d c3 66 90 0f 1f 44 00 00 55 <48> 8b 87 88 00 00 00 48 89 f2 48 8b bf 98 00 00 00 48 89 e5 8b 
[  443.314743] RIP  [<ffffffff81367276>] bdevname+0x6/0x30
[  443.314743]  RSP <ffff880212cc3a48>
[  443.314743] CR2: 0000000000000088
[  443.314743] ---[ end trace d56c02889646ab2e ]---
[  443.314743] Fixing recursive fault but reboot is needed!

See Also
--------

[1] [SOLVED] Kernel Panic with NBD RootFS on reboot and shutdown

[2] Kernel oops when nbd device is removed before it is unmounted


Cheers,
Noel Burton-Krahn
Piston Cloud Computing





