
Re: Simplified protocol?



On Tue, Nov 27, 2018 at 08:00:57AM +0000, Richard W.M. Jones wrote:
[...]
> The kernel doesn't even find the partition in the GPT case, and nor
> does ‘gdisk -l /dev/nbd0’.
> 
> In both cases, this is fixed by using the ‘nbd-client -b 512’ option.
> 
> It's also reproducible by using OS images, so it's nothing to do with
> bugs in nbdkit's partitioning plugin.

So, investigating this a bit further (I am 100% certain this used to
work, I used it at some point in the past with nbd-server and an image
of something), I found this:

commit e544541b0765c341174613b416d4b074fa7571c2
Author: Josef Bacik <jbacik@fb.com>
Date:   Mon Feb 13 10:39:47 2017 -0500

    nbd: set the logical and physical blocksize properly
    
    We noticed when trying to do O_DIRECT to an export on the server side
    that we were getting requests smaller than the 4k sectorsize of the
    device.  This is because the client isn't setting the logical and
    physical blocksizes properly for the underlying device.  Fix this up by
    setting the queue blocksizes and then calling bd_set_size.
    
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>

I suspect that before this commit, the kernel completely ignored the
passed block size and just used 512 or something equally sane.

> > > The Linux kernel assumes a 1K sector size and computes all partition
> > > offsets wrongly.  Can we change this?
> > 
> > I could. But my plan was to implement block size negotiations when next
> > I have time, and then use whatever's negotiated at that level as the
> > block size that I send to the kernel. For now, it seems better to keep
> > the defaults that have been around for a long time, rather than change
> > things twice.
> 
> I would disagree since the out of the box case obviously doesn't work.

Well, it does, for some value of "work". I was mostly thinking of
systems that had been installed and partitioned *on* an nbd device (which
is possible out of the box on Debian, although you need to use some
extra options on the kernel command line to load the necessary modules),
rather than being an NBD representation of something that was originally
not on NBD; if we change block size there, then suddenly those machines
wouldn't work anymore.

However, the above kernel commit invalidates that point, so I've just
changed the default to 512 bytes. This will be part of nbd 3.19, when it
is released (not sure when that will be, though).
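
For anyone wondering why the block size matters so much here, the
arithmetic can be sketched roughly as follows. Partition tables store
start positions as logical block numbers, so the same table read with a
different logical block size points at different bytes. The start LBA
of 2048 and the 512-byte image block size below are assumptions chosen
for illustration, not values taken from the reported images.

```python
# Minimal sketch: how a partition's byte offset depends on the logical
# block size the kernel was told to use for the NBD device.

IMAGE_BLOCK_SIZE = 512  # assumption: block size the image was partitioned with
START_LBA = 2048        # hypothetical partition start (a common 1 MiB alignment)


def byte_offset(start_lba, logical_block_size):
    """Byte position of a partition whose table entry says start_lba."""
    return start_lba * logical_block_size


def gpt_header_offset(logical_block_size):
    """The primary GPT header lives at LBA 1, so its byte position moves too."""
    return 1 * logical_block_size


if __name__ == "__main__":
    intended = byte_offset(START_LBA, IMAGE_BLOCK_SIZE)
    for client_block_size in (512, 1024):
        seen = byte_offset(START_LBA, client_block_size)
        print(
            f"block size {client_block_size}: partition looked up at byte {seen}, "
            f"data is at byte {intended}, "
            f"GPT header expected at byte {gpt_header_offset(client_block_size)}"
        )
```

With a 1024-byte block size the header lookup lands on byte 1024
instead of byte 512, which would be consistent with the kernel not
finding the GPT at all.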

> I don't think there is any common system that uses a 1K sector size.

According to the help message, 1024 bytes was chosen for fitting nicely
inside the 1500-byte frame size of 100Mbit Ethernet. With gigabit
Ethernet, though, that is even less relevant, since now there are jumbo
frames...
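
As a rough sanity check of that reasoning, here is the arithmetic as a
small sketch. It assumes IPv4 and TCP headers without options (20 bytes
each) and the 16-byte NBD simple reply header, and it pretends that one
reply maps onto one TCP segment, which TCP does not actually guarantee.

```python
# Minimal sketch: does one NBD read reply of a given block size fit in a
# single standard Ethernet frame? Header sizes assume no IP or TCP options.

MTU = 1500                    # standard Ethernet payload size in bytes
IPV4_HEADER = 20              # assumption: no IP options
TCP_HEADER = 20               # assumption: no TCP options (e.g. no timestamps)
NBD_SIMPLE_REPLY_HEADER = 16  # magic (4) + error (4) + handle (8)


def fits_in_one_frame(block_size):
    """True if reply header plus data fit in the TCP payload of one frame."""
    tcp_payload = MTU - IPV4_HEADER - TCP_HEADER
    return NBD_SIMPLE_REPLY_HEADER + block_size <= tcp_payload


if __name__ == "__main__":
    for size in (512, 1024, 2048, 4096):
        print(size, fits_in_one_frame(size))
```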

-- 
To the thief who stole my anti-depressants: I hope you're happy

  -- seen somewhere on the Internet on a photo of a billboard

