Re: Kernel driver I/O block size hinting
On Tue, Jun 14, 2022 at 08:30:15PM +0100, Nikolaus Rath wrote:
> On Jun 14 2022, "Richard W.M. Jones" <rjones@redhat.com> wrote:
> > I think we should set logical_block_size == physical_block_size ==
> > MAX (512, NBD minimum block size constraint).
>
> Why the lower bound of 512?
I suspect the kernel can't handle sector sizes smaller than 512 bytes.
The NBD protocol defaults to a minimum block size of 1 byte (and the
spec suggests servers may advertise that), and I'm almost certain
setting logical_block_size == 1 would break everything.
> > What should happen to the nbd-client -b option?
>
> Perhaps it should become the lower-bound (instead of the hardcoded 512)?
> That's assuming there is a reason for having a client-specified lower
> bound.
Right, I don't think there's a reason to continue with the -b option.
I only use it to set -b 512 to work around the annoying default in
older versions (which was 1024).
> > (4) Kernel blk_queue_max_hw_sectors: This is documented as: "set max
> > sectors for a request ... Enables a low level driver to set a hard
> > upper limit, max_hw_sectors, on the size of requests."
> >
> > Current behaviour of nbd.ko is that we set this to 65536 (sectors?
> > blocks?), which for 512b sectors is 32M.
>
> FWIW, on my 5.16 kernel, the default is 65 kB (according to
> /sys/block/nbdX/queue/max_sectors_kb x 512b).
I have:
$ cat /sys/devices/virtual/block/nbd0/queue/max_hw_sectors_kb
32768
(ie. 32 MB) which I think comes from the nbd module setting:
blk_queue_max_hw_sectors(disk->queue, 65536);
multiplied by 512b sectors.
> > I think we could set this to MIN (32M, NBD maximum block size constraint),
> > converting the result to sectors.
>
> I don't think that's right. Rather, it should be NBD's preferred block
> size.
>
> Setting this to the preferred block size means that NBD requests will be
> this large whenever there are enough sequential dirty pages, and that no
> requests will ever be larger than this. I think this is exactly what the
> NBD server would like to have.
This kernel setting limits the maximum request size on the queue.
In my testing, reading and writing files with the default [above], the
kernel never got anywhere near sending multi-megabyte requests. In
fact the largest request it sent was 128K, even when I did stuff like:
# dd if=/dev/zero of=/tmp/mnt/zero bs=100M count=10
128K happens to be 2 x blk_queue_io_opt, but I need to do more testing
to see if that relationship always holds.
> Setting this to the maximum block size would mean that NBD requests
> will exceed the preferred size whenever there are enough sequential
> dirty pages (while still obeying the maximum). This seems strictly
> worse.
>
> Unrelated to the proposed changes (all of which I think are technically
> correct), I am wondering if this will have much practical benefits. As
> far as I can tell, the kernel currently aligns NBD requests to the
> logical/physical block size rather than the size of the NBD request. Are
> there NBD servers that would benefit from the kernel honoring the
> preferred blocksize if the data is not also aligned to this blocksize?
I'm not sure I parsed this. Can you give an example?
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit