
Bug#671860: Disk freeze on 2.6.32 and 3.x kernels: blkback blocked for more than 120 seconds



On Mon, 2012-05-07 at 17:33 +0200, Andreas Pflug wrote:
> Package: linux-image-3.2.0-0.bpo.1-amd64
> Version: 3.2.1-2~bpo60+1
> Severity: grave
> 
> Actually, all kernels from 2.6.32 to (at least) 3.2.1 seem to be affected.
> 
> I run two identical machines:
> Debian 6 amd64, with a RAID1 software mirror for the data disks; LVM2
> is configured on top of /dev/mdX. The /dev/mapper devices are synced
> to the other machine using drbd.
> These drbd devices are used for virtual machines.
> 
> Sometimes, one machine degrades. Apparently the mdX_resync process and
> blkback collide. So far, I have only observed this on disk devices
> used by Windows HVM guests.
[...]
> May  6 01:06:24 lady kernel: [4979042.044157] INFO: task blkback.12.hdb:16636 blocked for more than 120 seconds.
> May  6 01:06:24 lady kernel: [4979042.044255] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> May  6 01:06:24 lady kernel: [4979042.044345] blkback.12.hdb  D ffff880002c817e0     0 16636      2 0x00000000
> May  6 01:06:24 lady kernel: [4979042.044355]  ffff880002c817e0 0000000000000246 ffff880000000000 ffffffff8160d020
> May  6 01:06:24 lady kernel: [4979042.044366]  0000000000013540 ffff880007dbffd8 ffff880007dbffd8 0000000000013540
> May  6 01:06:24 lady kernel: [4979042.044375]  ffff880002c817e0 ffff880007dbe010 ffffffff81013949 0000000107cacc78
> May  6 01:06:24 lady kernel: [4979042.044385] Call Trace:
> May  6 01:06:24 lady kernel: [4979042.044398]  [<ffffffff81013949>] ? sched_clock+0x5/0x8
> May  6 01:06:24 lady kernel: [4979042.044423]  [<ffffffffa0103673>] ? wait_barrier+0x94/0xcd [raid1]
> May  6 01:06:24 lady kernel: [4979042.044432]  [<ffffffff81045e84>] ? try_to_wake_up+0x190/0x190
> May  6 01:06:24 lady kernel: [4979042.044441]  [<ffffffffa0104a94>] ? make_request+0x11d/0x1689 [raid1]
> May  6 01:06:24 lady kernel: [4979042.044454]  [<ffffffffa010cc7b>] ? __split_and_process_bio+0x520/0x532 [dm_mod]
[...]

This looks like the same bug as #584881, which is probably fixed in 3.2.14-1.
See <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=584881;msg=113>.

The current kernel version in squeeze-backports includes that bug fix;
please test whether that works for you.  If you would like to test my
backported bug fix for 2.6.32 (attached to the message linked above),
then please do so following the instructions at
<http://kernel-handbook.alioth.debian.org/ch-common-tasks.html#s-common-official>.
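
When retesting, it may help to pull the hung-task reports out of the kernel
log so that runs on different kernels can be compared. The following is only
a minimal sketch, assuming the log format quoted above; the function name
find_hung_tasks and the sample list are made up for illustration and are not
part of any Debian tool.

```python
import re

# Matches the hung-task header line as it appears in the quoted log, e.g.
# "INFO: task blkback.12.hdb:16636 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return one dict per hung-task report found in `lines`.

    Each dict also records whether a raid1 wait_barrier frame was seen
    in the lines following that report (hypothetical helper).
    """
    reports = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            reports.append({
                "task": match.group("task"),
                "pid": int(match.group("pid")),
                "seconds": int(match.group("secs")),
                "raid1_wait_barrier": False,
            })
        elif reports and "wait_barrier" in line and "[raid1]" in line:
            # Attribute the frame to the most recent report.
            reports[-1]["raid1_wait_barrier"] = True
    return reports


# Sample lines copied from the log quoted in this bug report.
sample = [
    "May  6 01:06:24 lady kernel: [4979042.044157] INFO: task "
    "blkback.12.hdb:16636 blocked for more than 120 seconds.",
    "May  6 01:06:24 lady kernel: [4979042.044398]  "
    "[<ffffffff81013949>] ? sched_clock+0x5/0x8",
    "May  6 01:06:24 lady kernel: [4979042.044423]  "
    "[<ffffffffa0103673>] ? wait_barrier+0x94/0xcd [raid1]",
]

reports = find_hung_tasks(sample)
```

To scan a real log, pass an open file instead of the sample list, for
example find_hung_tasks(open("/var/log/kern.log")); that path is an
assumption about where the kernel log is written on your system.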

Ben.

-- 
Ben Hutchings
All extremists should be taken out and shot.


