Bug#1121006: linux: reported optimal_io_size from mpt3sas devices results in 4GB raid10 optimal_io_size
Hello Salvatore,
Thank you for the quick reply.
On Wed, Nov 19, 2025 at 05:59:48PM +0100, Salvatore Bonaccorso wrote:
[...]
> > Capabilities: [348] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
> > Capabilities: [380] Data Link Feature <?>
> > Kernel driver in use: mpt3sas
>
> This sounds like quite an interesting finding but probably hard to
> reproduce without the hardware if it comes to be specific to the
> controller type and driver.
That's a great point re: reproducibility, and it got me curious about
something I hadn't thought of testing, namely whether there's another angle
to this: does any block device with the same block I/O hints exhibit the
same problem? The answer turns out to be "yes".
I used qemu's 'scsi-hd' device to set the same values so I could test
locally. On an already-installed VM I added the following to present four
new devices:
-device virtio-scsi-pci,id=scsi0
-drive file=./workdir/disks/disk3.qcow2,format=qcow2,if=none,id=drive3
-device scsi-hd,bus=scsi0.0,drive=drive3,physical_block_size=4096,logical_block_size=512,min_io_size=4096,opt_io_size=16773120
-drive file=./workdir/disks/disk4.qcow2,format=qcow2,if=none,id=drive4
-device scsi-hd,bus=scsi0.0,drive=drive4,physical_block_size=4096,logical_block_size=512,min_io_size=4096,opt_io_size=16773120
-drive file=./workdir/disks/disk5.qcow2,format=qcow2,if=none,id=drive5
-device scsi-hd,bus=scsi0.0,drive=drive5,physical_block_size=4096,logical_block_size=512,min_io_size=4096,opt_io_size=16773120
-drive file=./workdir/disks/disk6.qcow2,format=qcow2,if=none,id=drive6
-device scsi-hd,bus=scsi0.0,drive=drive6,physical_block_size=4096,logical_block_size=512,min_io_size=4096,opt_io_size=16773120
I used 10G files created with 'qemu-img create -f qcow2 <file> 10G', though
the size doesn't affect anything in my testing.
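For completeness, something along these lines creates the four backing
files (matching the drive paths above; the loop is just how I scripted it,
not a requirement):
$ for i in 3 4 5 6; do qemu-img create -f qcow2 ./workdir/disks/disk$i.qcow2 10G; done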
Then in the VM:
# cat /sys/block/sd[cdef]/queue/optimal_io_size
16773120
16773120
16773120
16773120
# mdadm --create /dev/md1 --level 10 --bitmap none --raid-devices 4 /dev/sdc /dev/sdd /dev/sde /dev/sdf
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md1 started.
# cat /sys/block/md1/queue/optimal_io_size
4293918720
I was able to reproduce the problem with src:linux 6.18~rc6-1~exp1 as well as 6.12.57-1.
Since it is easy to test this way, I tried a few different opt_io_size
values and was able to reproduce the problem only with 16773120 (i.e. 0xFFF000).
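In case it helps with triage: 4293918720 happens to be the least common
multiple of 1048576 and 16773120, so my (unverified) guess is that the md
layer advertises a 1 MiB optimal size of its own (two 512 KiB chunks for a
4-device near-2 raid10) and the limit stacking then takes the lcm with the
members' 16773120:
$ python3 -c 'import math; print(math.lcm(1048576, 16773120))'
4293918720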
> I would like to ask: Do you have the possibility to make an OS
> installation such that you can freely experiment with various kernels
> and then under them assemble the arrays? If so it would be great if you
> could start bisecting the changes to find where the behaviour changed.
>
> I.e. install the OS independently of the controller, find by bisecting
> Debian versions manually the kernels between bookworm and trixie
> (6.1.y -> 6.12.y to narrow down the upstream range).
Yes, I'm able to perform testing on this host; in fact I worked around the
problem for now by disabling LVM's md alignment auto-detection, so we have
an installed system.
For reference, that's "devices { data_alignment_detection = 0 }" in lvm's
config.
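i.e. roughly this stanza in /etc/lvm/lvm.conf (the placement shown here is
just illustrative):
devices {
    data_alignment_detection = 0
}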
> Then bisect the upstream changes to find the offending commits. Let me
> know if you need more specific instructions on the idea.
Pointers on the recommended way to build Debian kernels would be of great
help, thank you!
> Additionally it would be interesting to know if the issue persists in
> 6.17.8 or even 6.18~rc6-1~exp1 to be able to clearly indicate upstream
> that the issue persists in newer kernels.
>
> Ideally actually this goes asap to upstream once we are more confident
> on the subsystem to where to report the issue. If we are reasonably
> confident it is mpt3sas specific already then I would say to go
> already to:
Given the qemu-based reproducer above, maybe this issue is actually two
bugs: the raid10 behaviour shown above, and mpt3sas presenting 0xFFF000 as
optimal_io_size. While the latter might be suspicious, perhaps it is not
wrong per se?
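If it helps to separate the two, something like sg_vpd from sg3_utils
should show whether that value really is what the drives report in their
Block Limits VPD page (the device name below is just a placeholder):
# sg_vpd --page=bl /dev/sdX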
best,
Filippo