
Bug#913138: linux: I/O on md RAID 6 hangs completely



On Thu, 08 Nov 2018 23:28:16 +0100 Stanisław <staszek3@wp.pl> wrote:
I suffer from the same problem while running RAID1 with kernel 4.18.10-2.

Me too.
For me this has been happening since the switch from 4.16 to 4.17.x, on two different PCs, both with LVM-based RAID1. I had already opened bug #913119; then I found this bug report, and the reply from Stanisław was really helpful for me. To me, this bug, mine, and the already-closed bug #904822 share the same root cause: the stack traces reported by dmesg are very similar, and the common denominators are some form of LVM RAID and the range of kernels used.

"...Someone else suggested this might be related to using "blk-mq", so
could you try with these parameters:

dm_mod.use_blk_mq=0 scsi_mod.use_blk_mq=0

This seems to have solved the problem for me.
I've tested these boot parameters on one of the affected PCs and it has now been running for more than three days. Before, with kernels from 4.17.x up to the current Debian 4.18.10-2+b1, the system would show an oops within half a day to a day.
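For anyone wanting to make the workaround survive reboots on a Debian system, the parameters can be added to the kernel command line via GRUB. This is a sketch of the usual GRUB workflow, not something spelled out in this thread:

```shell
# /etc/default/grub (fragment) -- sketch for Debian systems booting via GRUB.
# Append the blk-mq switches to the existing kernel command line;
# keep whatever options are already present on this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet dm_mod.use_blk_mq=0 scsi_mod.use_blk_mq=0"
```

After editing the file, run update-grub as root and reboot; the new parameters should then appear in /proc/cmdline.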

That disabling blk-mq via these parameters helps is plausible, since Debian's kernel enabled SCSI_MQ_DEFAULT and DM_MQ_DEFAULT starting with 4.17~rc7-1~exp1.

Also, do you have laptop-mode-tools installed?

No, not installed here.

I've checked two other distributions I have here, to see what they did with the SCSI_MQ_DEFAULT and DM_MQ_DEFAULT options:

- Arch Linux (kernel 4.18.16-arch1-1-ARCH): both disabled.
- Arch Linux (kernel 4.19.2-arch1-1-ARCH): both enabled.

- Fedora server 29 (kernel 4.18.17-300.fc29.x86_64): both disabled.
- Fedora server 29 (kernel 4.19.2-301.fc29.x86_64): both disabled.
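The per-distribution checks above can be repeated locally with something like the following (the /boot and sysfs paths are the usual locations on these distributions; adjust if yours differ):

```shell
# Build-time defaults of the running kernel; skip silently if the
# config file isn't shipped under /boot on this system:
cfg="/boot/config-$(uname -r)"
[ -r "$cfg" ] && grep -E 'CONFIG_(SCSI|DM)_MQ_DEFAULT' "$cfg"

# Runtime value of the module parameters, if the modules are loaded:
for p in /sys/module/scsi_mod/parameters/use_blk_mq \
         /sys/module/dm_mod/parameters/use_blk_mq; do
    [ -r "$p" ] && echo "$p = $(cat "$p")"
done
true  # don't let a missing file make the snippet exit non-zero
```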

But I was unable to find out whether upstream is aware of this problem and whether it's already resolved in 4.19.

Cesare.

