Bug#772050: linux-image-3.16-0.bpo.3-amd64-dbg: vmlinux points to wrong source code
Hello,
On Thu, Dec 04, 2014 at 12:42:02PM -0500, Sebastian Parschauer wrote:
> we've been analyzing a kernel bug in blk-mq with the same kernel version
> where we triggered an Oops by hot-unplugging a qcow2 Qemu/KVM virtio-blk
> storage device during active I/O to that device within the virtual machine
> running this kernel.
> So we've installed linux-image-3.16-0.bpo.3-amd64-dbg (version 3.16.5-1~bpo70+1),
> gdb (version 7.4.1+dfsg-0.1) and installed the related source code. But when
> trying to list the functions from the call trace, wrong code locations are displayed.
>
> # apt-get update
> # apt-get install gdb ctags vim apt-src linux-image-3.16-0.bpo.3-amd64-dbg
> # cd /usr/src
> # apt-src update
> # apt-src install linux-image-3.16-0.bpo.3-amd64
>
> # dpkg -l | grep linux-image
> ii linux-image-3.16-0.bpo.3-amd64 3.16.5-1~bpo70+1 amd64 Linux 3.16 for 64-bit PCs
> ii linux-image-3.16-0.bpo.3-amd64-dbg 3.16.5-1~bpo70+1 amd64 Debugging symbols for Linux 3.16-0.bpo.3-amd64
> ii linux-image-3.2.0-4-amd64 3.2.63-2+deb7u1 amd64 Linux 3.2 for 64-bit PCs
> ii linux-image-amd64 3.2+46 amd64 Linux for 64-bit PCs (meta-package)
>
> # apt-src list linux-image-3.16-0.bpo.3-amd64
> i linux 3.16.5-1~bpo70 /usr/src/linux-3.16.5
>
> Oops call trace:
> [ 81.248004] Call Trace:
> [ 81.248004] [<ffffffff81545f7b>] ? mutex_lock+0x1b/0x2a
> [ 81.248004] [<ffffffff812a75c4>] ? blk_mq_free_queue+0x24/0x150
> [ 81.248004] [<ffffffff8129e7c8>] ? blk_release_queue+0x88/0xd0
> [ 81.248004] [<ffffffff812ca160>] ? kobject_cleanup+0x80/0x1d0
> [ 81.248004] [<ffffffff812abba2>] ? disk_release+0x92/0xd0
> [ 81.248004] [<ffffffff813c4f3b>] ? device_release+0x3b/0xb0
> [ 81.248004] [<ffffffff812ca160>] ? kobject_cleanup+0x80/0x1d0
> [ 81.248004] [<ffffffff811f2095>] ? __blkdev_put+0x115/0x1a0
> [ 81.248004] [<ffffffff811f2285>] ? blkdev_close+0x25/0x30
> [ 81.248004] [<ffffffff811bd323>] ? __fput+0xb3/0x210
> [ 81.257437] [<ffffffff8108c164>] ? task_work_run+0xc4/0xe0
> [ 81.257437] [<ffffffff8106f310>] ? do_exit+0x2c0/0xa80
> [ 81.257437] [<ffffffff8106fb56>] ? do_group_exit+0x46/0xb0
> [ 81.257437] [<ffffffff8106fbd7>] ? SyS_exit_group+0x17/0x20
> [ 81.257437] [<ffffffff8154792d>] ? system_call_fast_compare_end+0x10/0x15
> [ 81.257437] Code: 55 53 48 89 fb 48 83 ec 20 65 48 8b 04 25 48 c8 00 00 48 8b 80 38 c0 ff ff a8 08 75 29 48 8b 57 18 b8 01 00 00 00 48 85 d2 74 03 <8b> 42 28 85 c0 74 14 4c 8d 6b 20 4c 89 ef e8 0e eb b$
> [ 81.258715] RIP [<ffffffff81545ddf>] __mutex_lock_slowpath+0x3f/0x1c0
>
> Let's run gdb:
>
> # gdb /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64
> (gdb) list *blk_mq_free_queue+0x24
>
> 96 /build/linux-LrLd2z/linux-3.16.5/include/linux/list.h: No such file
> or directory.
>
> (gdb) quit
> # mkdir -p /build/linux-LrLd2z
> # ln -sT /usr/src/linux-3.16.5/ /build/linux-LrLd2z/linux-3.16.5
> # gdb /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64
> (gdb) list *blk_mq_free_queue+0x24
>
> 0xffffffff812a75c4 is in blk_mq_free_queue
> (/build/linux-LrLd2z/linux-3.16.5/include/linux/list.h:101).
> 96 * in an undefined state.
> 97 */
> 98 #ifndef CONFIG_DEBUG_LIST
> 99 static inline void __list_del_entry(struct list_head *entry)
> 100 {
> 101 __list_del(entry->prev, entry->next);
> 102 }
> 103
> 104 static inline void list_del(struct list_head *entry)
> 105 {
>
> Can't be possible! There is no mutex_lock() here!
I don't know x86 well, but on ARM the addresses in a stack trace are
return addresses, so they point to the instruction after the call, not
to the call itself. Looking at blk_mq_free_queue (in v3.18-rc6, because
I have that tree lying around here):
static void blk_mq_del_queue_tag_set(struct request_queue *q)
{
	struct blk_mq_tag_set *set = q->tag_set;

	mutex_lock(&set->tag_list_lock);
	list_del_init(&q->tag_set_list);
	[...]

void blk_mq_free_queue(struct request_queue *q)
{
	struct blk_mq_tag_set *set = q->tag_set;

	blk_mq_del_queue_tag_set(q);
	blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);
So I bet you have to look at blk_mq_free_queue+0x1f, i.e. the call
instruction itself, rather than at the return address +0x24.
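The arithmetic behind that offset can be sketched as follows. This is only an illustration, assuming the call in question is a 5-byte x86-64 near call (opcode e8 plus a 32-bit displacement), which is what the disassembly quoted further down shows. The helper names call_site_offset and lookup_offset are made up for this example.

```python
# Sketch: map a return address from an Oops call trace back to its call site.
# Assumption: the call is a 5-byte x86-64 near call (opcode e8 + rel32).

CALL_REL32_LEN = 5  # bytes in an "e8 xx xx xx xx" call instruction


def call_site_offset(return_offset: int, call_len: int = CALL_REL32_LEN) -> int:
    """Return the function-relative offset where the call instruction starts."""
    return return_offset - call_len


def lookup_offset(return_offset: int) -> int:
    """Return an offset that lies inside the call, whatever its length."""
    return return_offset - 1


trace_offset = 0x24  # from "blk_mq_free_queue+0x24/0x150" in the call trace

print(hex(call_site_offset(trace_offset)))  # 0x1f
print(hex(lookup_offset(trace_offset)))     # 0x23
```

Subtracting one from the return address is the more general trick, because it does not depend on knowing the length of the call instruction; any address inside the call should resolve to the source line of the call.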
> (gdb) list *blk_release_queue+0x88
>
> 0xffffffff8129e7c8 is in blk_release_queue
> (/build/linux-LrLd2z/linux-3.16.5/block/blk-sysfs.c:523).
> 518 __blk_queue_free_tags(q);
> 519
> 520 if (q->mq_ops)
> 521 blk_mq_free_queue(q);
> 522
> 523 kfree(q->flush_rq);
> 524
> 525 blk_trace_shutdown(q);
> 526
> 527 bdi_destroy(&q->backing_dev_info);
>
> This points to kfree() - also wrong!
>
> Let's check the disassembly!
>
> # objdump -D /usr/lib/debug/vmlinux-3.16-0.bpo.3-amd64 | less
> (less) /<blk_mq_free_queue>:
> ffffffff812a75a0 <blk_mq_free_queue>:
> ffffffff812a75a0: e8 1b 28 2a 00 callq ffffffff81549dc0 <__fentry__>
> ffffffff812a75a5: 41 54 push %r12
> ffffffff812a75a7: 55 push %rbp
> ffffffff812a75a8: 53 push %rbx
> ffffffff812a75a9: 48 8b af a8 07 00 00 mov 0x7a8(%rdi),%rbp
> ffffffff812a75b0: 48 89 fb mov %rdi,%rbx
> ffffffff812a75b3: e8 78 f1 ff ff callq ffffffff812a6730 <blk_mq_freeze_queue>
> ffffffff812a75b8: 4c 8d 65 38 lea 0x38(%rbp),%r12
> ffffffff812a75bc: 4c 89 e7 mov %r12,%rdi
> ffffffff812a75bf: e8 9c e9 29 00 callq ffffffff81545f60 <mutex_lock>
> ffffffff812a75c4: 48 8b 8b b0 07 00 00 mov 0x7b0(%rbx),%rcx
>
> 0xa0 + 0x24 = 0xc4
>
> Here is definitely a call to mutex_lock() at blk_mq_free_queue+0x24 !!!
s/at/before/ !
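To double-check this against the objdump output quoted above, here is a small sketch that decodes the bytes of the call at ffffffff812a75bf. It assumes the usual x86-64 encoding for this form of call (opcode e8 followed by a little-endian signed 32-bit displacement, relative to the next instruction); the function name decode_call_rel32 is invented for this example.

```python
import struct


def decode_call_rel32(insn_addr: int, insn_bytes: bytes):
    """Decode an x86-64 'e8 rel32' call; return (target, return_address)."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a 5-byte e8 rel32 call")
    # The displacement is a little-endian signed 32-bit value.
    (rel32,) = struct.unpack("<i", insn_bytes[1:])
    # The return address is the address of the instruction after the call.
    return_address = insn_addr + 5
    # The call target is relative to that return address (64-bit wraparound).
    target = (return_address + rel32) & 0xFFFFFFFFFFFFFFFF
    return target, return_address


# Bytes and address taken from the objdump line for the mutex_lock call.
target, ret = decode_call_rel32(0xFFFFFFFF812A75BF, bytes.fromhex("e89ce92900"))

print(hex(target))  # 0xffffffff81545f60, listed as <mutex_lock>
print(hex(ret))     # 0xffffffff812a75c4, i.e. blk_mq_free_queue+0x24
```

So the address recorded in the trace is where mutex_lock returns to, and the call itself starts five bytes earlier, at blk_mq_free_queue+0x1f.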
> No call to any list stuff! So just the source code information in vmlinux is wrong!
> It's the same with the other code locations.
>
> Please fix that in your package build!
So I don't think there is anything to fix. Do you concur?
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |