
Bug#772050: linux-image-3.16-0.bpo.3-amd64-dbg: vmlinux points to wrong source code



Hello,

On Thu, Dec 04, 2014 at 12:42:02PM -0500, Sebastian Parschauer wrote:
> we've been analyzing a kernel bug in blk-mq with the same kernel version
> where we triggered an Oops by hot-unplugging a qcow2 Qemu/KVM virtio-blk
> storage device during active I/O to that device within the virtual machine
> running this kernel.
> So we've installed linux-image-3.16-0.bpo.3-amd64-dbg (version 3.16.5-1~bpo70+1),
> gdb (version 7.4.1+dfsg-0.1) and installed the related source code. But when
> trying to list the functions from the call trace, wrong code locations are displayed.
> 
> # apt-get update
> # apt-get install gdb ctags vim apt-src linux-image-3.16-0.bpo.3-amd64-dbg
> # cd /usr/src
> # apt-src update
> # apt-src install linux-image-3.16-0.bpo.3-amd64
> 
> # dpkg -l | grep linux-image
> ii  linux-image-3.16-0.bpo.3-amd64        3.16.5-1~bpo70+1                   amd64        Linux 3.16 for 64-bit PCs
> ii  linux-image-3.16-0.bpo.3-amd64-dbg    3.16.5-1~bpo70+1                   amd64        Debugging symbols for Linux 3.16-0.bpo.3-amd64
> ii  linux-image-3.2.0-4-amd64             3.2.63-2+deb7u1                    amd64        Linux 3.2 for 64-bit PCs
> ii  linux-image-amd64                     3.2+46                             amd64        Linux for 64-bit PCs (meta-package)
> 
> # apt-src list linux-image-3.16-0.bpo.3-amd64
> i   linux          3.16.5-1~bpo70 /usr/src/linux-3.16.5
> 
> Oops call trace:
> [   81.248004] Call Trace:
> [   81.248004]  [<ffffffff81545f7b>] ? mutex_lock+0x1b/0x2a
> [   81.248004]  [<ffffffff812a75c4>] ? blk_mq_free_queue+0x24/0x150
> [   81.248004]  [<ffffffff8129e7c8>] ? blk_release_queue+0x88/0xd0
> [   81.248004]  [<ffffffff812ca160>] ? kobject_cleanup+0x80/0x1d0
> [   81.248004]  [<ffffffff812abba2>] ? disk_release+0x92/0xd0
> [   81.248004]  [<ffffffff813c4f3b>] ? device_release+0x3b/0xb0
> [   81.248004]  [<ffffffff812ca160>] ? kobject_cleanup+0x80/0x1d0
> [   81.248004]  [<ffffffff811f2095>] ? __blkdev_put+0x115/0x1a0
> [   81.248004]  [<ffffffff811f2285>] ? blkdev_close+0x25/0x30
> [   81.248004]  [<ffffffff811bd323>] ? __fput+0xb3/0x210
> [   81.257437]  [<ffffffff8108c164>] ? task_work_run+0xc4/0xe0
> [   81.257437]  [<ffffffff8106f310>] ? do_exit+0x2c0/0xa80
> [   81.257437]  [<ffffffff8106fb56>] ? do_group_exit+0x46/0xb0
> [   81.257437]  [<ffffffff8106fbd7>] ? SyS_exit_group+0x17/0x20
> [   81.257437]  [<ffffffff8154792d>] ? system_call_fast_compare_end+0x10/0x15
> [   81.257437] Code: 55 53 48 89 fb 48 83 ec 20 65 48 8b 04 25 48 c8 00 00 48 8b 80 38 c0 ff ff a8 08 75 29 48 8b 57 18 b8 01 00 00 00 48 85 d2 74 03 <8b> 42 28 85 c0 74 14 4c 8d 6b 20 4c 89 ef e8 0e eb b$
> [   81.258715] RIP  [<ffffffff81545ddf>] __mutex_lock_slowpath+0x3f/0x1c0
> 
> Let's run gdb:
> 
> # gdb /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64
> (gdb) list *blk_mq_free_queue+0x24
> 
> 96      /build/linux-LrLd2z/linux-3.16.5/include/linux/list.h: No such file
> or directory.
> 
> (gdb) quit
> # mkdir -p /build/linux-LrLd2z
> # ln -sT /usr/src/linux-3.16.5/ /build/linux-LrLd2z/linux-3.16.5
> # gdb /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64
> (gdb) list *blk_mq_free_queue+0x24
> 
> 0xffffffff812a75c4 is in blk_mq_free_queue
> (/build/linux-LrLd2z/linux-3.16.5/include/linux/list.h:101).
> 96       * in an undefined state.
> 97       */
> 98      #ifndef CONFIG_DEBUG_LIST
> 99      static inline void __list_del_entry(struct list_head *entry)
> 100     {
> 101             __list_del(entry->prev, entry->next);
> 102     }
> 103
> 104     static inline void list_del(struct list_head *entry)
> 105     {
> 
> Can't be possible! There is no mutex_lock() here!
I don't know x86, but on ARM the stack trace contains the return
addresses, so each entry points to the instruction after the branch,
not to the branch itself.

Looking at blk_mq_free_queue (in v3.18-rc6 because I have that lying
around here):

	static void blk_mq_del_queue_tag_set(struct request_queue *q)
	{
		struct blk_mq_tag_set *set = q->tag_set;

		mutex_lock(&set->tag_list_lock);
		list_del_init(&q->tag_set_list);
	[...]

	void blk_mq_free_queue(struct request_queue *q)
	{
		struct blk_mq_tag_set   *set = q->tag_set;

		blk_mq_del_queue_tag_set(q);

		blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);

So I bet you have to look at blk_mq_free_queue+0x1f.

> * (gdb) list *blk_release_queue+0x88
> 
> 0xffffffff8129e7c8 is in blk_release_queue
> (/build/linux-LrLd2z/linux-3.16.5/block/blk-sysfs.c:523).
> 518                     __blk_queue_free_tags(q);
> 519
> 520             if (q->mq_ops)
> 521                     blk_mq_free_queue(q);
> 522
> 523             kfree(q->flush_rq);
> 524
> 525             blk_trace_shutdown(q);
> 526
> 527             bdi_destroy(&q->backing_dev_info);
> 
> This points to kfree() - also wrong!
> 
> Let's check the disassembly!
> 
> # objdump -D /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64 | less
> (less) /<blk_mq_free_queue>:
> ffffffff812a75a0 <blk_mq_free_queue>:
> ffffffff812a75a0:       e8 1b 28 2a 00          callq  ffffffff81549dc0 <__fentry__>
> ffffffff812a75a5:       41 54                   push   %r12
> ffffffff812a75a7:       55                      push   %rbp
> ffffffff812a75a8:       53                      push   %rbx
> ffffffff812a75a9:       48 8b af a8 07 00 00    mov    0x7a8(%rdi),%rbp
> ffffffff812a75b0:       48 89 fb                mov    %rdi,%rbx
> ffffffff812a75b3:       e8 78 f1 ff ff          callq  ffffffff812a6730 <blk_mq_freeze_queue>
> ffffffff812a75b8:       4c 8d 65 38             lea    0x38(%rbp),%r12
> ffffffff812a75bc:       4c 89 e7                mov    %r12,%rdi
> ffffffff812a75bf:       e8 9c e9 29 00          callq  ffffffff81545f60 <mutex_lock>
> ffffffff812a75c4:       48 8b 8b b0 07 00 00    mov    0x7b0(%rbx),%rcx
> 
> 0xa0 + 0x24 = 0xc4
> 
> Here is definitely a call to mutex_lock() at blk_mq_free_queue+0x24 !!!
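s/at/before/ ! The callq itself sits at blk_mq_free_queue+0x1f
(ffffffff812a75bf); +0x24 is the instruction it returns to.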

> No call to any list stuff! So just the source code information in vmlinux is wrong!
> It's the same with the other code locations.
> 
> Please fix that in your package build!
So I don't think there is anything to fix. Do you concur?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
