Bug#996951: linux-image-5.14.0-3-amd64: iouring looses requests
Control: tags -1 + moreinfo
Hi Daniel,
On Thu, Oct 21, 2021 at 09:38:00AM +0000, Daniel Black wrote:
> Package: src:linux
> Version: 5.14.12-1
> Severity: grave
> Justification: causes non-serious data loss
> X-Debbugs-Cc: daniel@mariadb.org
>
> Dear Maintainer,
>
> MariaDB has been investigating a 10.6+ related problem for a while
> https://jira.mariadb.org/browse/MDEV-26674
> https://jira.mariadb.org/browse/MDEV-26555
>
> The result of this investigation is that kernels from 5.11 onward, until
> the fix in 5.15, have an io_uring-related fault that results in a write
> request getting lost.
>
> The result of this is that MariaDB-10.6 users, and perhaps other
> applications using the io_uring kernel interface, will lose either
> availability or data.
>
> The good news is I've validated that the linux mainline 5.14.14 build
> from https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.14.14/ has
> actually fixed this problem.
>
> As such this currently affects bullseye-backports, bookworm and sid.
>
> This can be validated by installing mariadb-test-10.6 from MariaDB's
> repo.
> https://mariadb.org/download/#mariadb-repositories
>
> To test run:
>
> cd /usr/share/mysql/mysql-test
> ./mtr --vardir=/tmp/var --parallel=4 encryption.innochecksum{,,,,,}
> ./mtr --vardir=/tmp/var --parallel=4 stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb stress.ddl_innodb
>
> A test failure (after a long timeout, 10 minutes I think) results in the
> MariaDB error:
>
> 2021-10-21 9:08:43 0 [ERROR] [FATAL] InnoDB: innodb_fatal_semaphore_wait_threshold was exceeded for dict_sys.latch. Please refer to https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/
>
> Marko (MariaDB innodb maintainer) in https://jira.mariadb.org/browse/MDEV-26674?focusedCommentId=202674&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-202674
> already validated the problem on sid.
>
> This is reported on an ubuntu impish machine with the Debian kernel
> installed to eliminate any other userspace effects that may have caused
> this.
Were you able to isolate the upstream change which fixes the issue?
Regards,
Salvatore