
Bug#913138: linux: I/O on md RAID 6 hangs completely



On Thu, 08 Nov 2018 23:28:16 +0100 Stanisław <staszek3@wp.pl> wrote:
I suffer from the same problem while running RAID1 with kernel 4.18.10-2.

Me too.
For me this has been happening since the switch from 4.16 to 4.17.x, on two different PCs, both with LVM-based RAID1. I had already opened bug #913119; then I found this bug report, and the reply from Stanisław was really helpful for me. To me, this bug, mine, and the already-closed bug #904822 share the same root cause: the stack traces reported by dmesg are very similar, and the common denominators are some form of LVM RAID and the range of kernels used.

"...Someone else suggested this might be related to using "blk-mq", so
could you try with these parameters:

dm_mod.use_blk_mq=0 scsi_mod.use_blk_mq=0

This seems to have solved the problem for me.
I've tested these boot parameters on one of the affected PCs and it has now been running for more than three days. Before, with kernels from 4.17.x up to the current Debian 4.18.10-2+b1, the system would show an oops within half a day to a day.
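For anyone wanting to make the workaround survive reboots on a Debian system, the parameters can be added to the kernel command line via GRUB. This is a sketch of the usual GRUB workflow, not something spelled out in this thread:

```shell
# /etc/default/grub (fragment) -- sketch for Debian systems booting via GRUB.
# Append the blk-mq switches to the existing kernel command line;
# keep whatever options are already present on this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet dm_mod.use_blk_mq=0 scsi_mod.use_blk_mq=0"
```

After editing the file, run update-grub as root and reboot; the new parameters should then appear in /proc/cmdline.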

That disabling blk-mq via these parameters helps is plausible, since Debian's kernel enabled SCSI_MQ_DEFAULT and DM_MQ_DEFAULT starting with 4.17~rc7-1~exp1.

Also, do you have laptop-mode-tools installed?

No, not installed here.

I've checked two other distributions I have here, to see what they did with the SCSI_MQ_DEFAULT and DM_MQ_DEFAULT options:

- Arch Linux (kernel 4.18.16-arch1-1-ARCH): both disabled.
- Arch Linux (kernel 4.19.2-arch1-1-ARCH): both enabled.

- Fedora server 29 (kernel 4.18.17-300.fc29.x86_64): both disabled.
- Fedora server 29 (kernel 4.19.2-301.fc29.x86_64): both disabled.
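The per-distribution checks above can be repeated locally with something like the following (the /boot and sysfs paths are the usual locations on these distributions; adjust if yours differ):

```shell
# Build-time defaults of the running kernel; skip silently if the
# config file isn't shipped under /boot on this system:
cfg="/boot/config-$(uname -r)"
[ -r "$cfg" ] && grep -E 'CONFIG_(SCSI|DM)_MQ_DEFAULT' "$cfg"

# Runtime value of the module parameters, if the modules are loaded:
for p in /sys/module/scsi_mod/parameters/use_blk_mq \
         /sys/module/dm_mod/parameters/use_blk_mq; do
    [ -r "$p" ] && echo "$p = $(cat "$p")"
done
true  # don't let a missing file make the snippet exit non-zero
```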

But I was unable to find out whether upstream is aware of this problem and whether it's already resolved in 4.19.

Cesare.

