Kernel hang in Xen (4.9)

To: debian-user@lists.debian.org
Subject: Kernel hang in Xen (4.9)
From: Bastien Durel <bastien@durel.org>
Date: Mon, 20 Aug 2018 10:59:59 +0200
Message-id: <737d272612dffd52290f677629ecca1b29c4c7e5.camel@durel.org>

Hello,

I have a Xen guest that hangs after a while when running the 4.9 kernel
(since the upgrade to stretch).
I get these messages on the console, and then I cannot do anything. I get a
login prompt on the console, but no shell prompt; an SSH connection is
established and I get my motd (even with an updated last login time), but no
shell.

[260514.780118] INFO: task jbd2/xvda5-8:192 blocked for more than 120 seconds.
[260514.780132]       Not tainted 4.9.0-7-amd64 #1 Debian 4.9.110-3+deb9u2
[260514.780137] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[260514.780145] jbd2/xvda5-8    D    0   192      2 0x00000000
[260514.780156]  ffff8800f1caec00 0000000000000000 ffff8800f0beb140 ffff8800f5c18980
[260514.780171]  ffffffff81c11500 ffffc900410bba70 ffffffff8160fed9 ffffffff81090d5a
[260514.780183]  00ffe8ffffc0b000 ffff8800f5c18980 ffffffff8130f759 ffff8800f0beb140
[260514.780196] Call Trace:
[260514.780239]  [<ffffffff8160fed9>] ? __schedule+0x239/0x6f0
[260514.780249]  [<ffffffff81090d5a>] ? queue_work_on+0x2a/0x40
[260514.780263]  [<ffffffff8130f759>] ? blk_mq_flush_plug_list+0x139/0x160
[260514.780271]  [<ffffffff81610b80>] ? bit_wait+0x50/0x50
[260514.780278]  [<ffffffff816103c2>] ? schedule+0x32/0x80
[260514.780286]  [<ffffffff8161374d>] ? schedule_timeout+0x1dd/0x380
[260514.780295]  [<ffffffff8101bc81>] ? xen_clocksource_get_cycles+0x11/0x20
[260514.780303]  [<ffffffff81610b80>] ? bit_wait+0x50/0x50
[260514.780310]  [<ffffffff8160fc3d>] ? io_schedule_timeout+0x9d/0x100
[260514.780319]  [<ffffffff810bb1a7>] ? prepare_to_wait+0x57/0x80
[260514.780326]  [<ffffffff81610b97>] ? bit_wait_io+0x17/0x60
[260514.780333]  [<ffffffff81610755>] ? __wait_on_bit+0x55/0x80
[260514.780344]  [<ffffffff81180628>] ? find_get_pages_tag+0x158/0x2e0
[260514.780353]  [<ffffffff8117f88f>] ? wait_on_page_bit+0x7f/0xa0
[260514.780360]  [<ffffffff810bb610>] ? wake_atomic_t_function+0x60/0x60
[260514.780370]  [<ffffffff8117f990>] ? __filemap_fdatawait_range+0xe0/0x140
[260514.780378]  [<ffffffff8117f9ff>] ? filemap_fdatawait_range+0xf/0x30
[260514.780397]  [<ffffffffc00907cd>] ? jbd2_journal_commit_transaction+0x73d/0x17b0 [jbd2]
[260514.780408]  [<ffffffff81614d64>] ? __switch_to_asm+0x34/0x70
[260514.780416]  [<ffffffff81614d70>] ? __switch_to_asm+0x40/0x70
[260514.780424]  [<ffffffff810156e4>] ? xen_load_sp0+0x84/0x170
[260514.780432]  [<ffffffff8109fcad>] ? finish_task_switch+0x7d/0x200
[260514.780440]  [<ffffffff816147f6>] ? _raw_spin_unlock_irqrestore+0x16/0x20
[260514.780452]  [<ffffffffc0095c62>] ? kjournald2+0xc2/0x260 [jbd2]
[260514.780463]  [<ffffffff810bb570>] ? prepare_to_wait_event+0xf0/0xf0
[260514.780474]  [<ffffffffc0095ba0>] ? commit_timeout+0x10/0x10 [jbd2]
[260514.780482]  [<ffffffff810988e9>] ? kthread+0xd9/0xf0
[260514.780489]  [<ffffffff81098810>] ? kthread_park+0x60/0x60
[260514.780496]  [<ffffffff81614df7>] ? ret_from_fork+0x57/0x70
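
For anyone who wants to pull the blocked task names out of a longer log like
the one above, here is a minimal sketch in Python. The function name
find_hung_tasks and the regular expression are assumptions based only on the
message format shown in this one sample; they are not part of any kernel or
Xen tooling.

```python
import re

# Pattern for the hung-task line as it appears in the log above.
# Assumption: the format is inferred from this single sample.
HUNG_TASK_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

def find_hung_tasks(log_text):
    """Return a list of (timestamp, task name, pid, seconds) tuples."""
    results = []
    for line in log_text.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            results.append((
                float(m.group("ts")),   # seconds since boot
                m.group("comm"),        # task name, e.g. a jbd2 journal thread
                int(m.group("pid")),
                int(m.group("secs")),   # hung-task threshold that was exceeded
            ))
    return results

sample = (
    "[260514.780118] INFO: task jbd2/xvda5-8:192 "
    "blocked for more than 120 seconds."
)
print(find_hung_tasks(sample))
```

Feeding it the saved console output would list every task the kernel reported
as blocked, which may help show whether only the jbd2 journal thread is
affected or other tasks as well.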

With jessie's 3.16 kernel, the guest runs fine.
I have other VMs on this host running the 4.9 kernel without problems.

Does anyone have an idea about this bug?

Thanks,

-- 
Bastien

Follow-Ups:
- Re: Kernel hang in Xen (4.9)
  - From: Steve Kemp <skx@debian.org>
