[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] [PATCH/RFC] Remove NBD_OPT_BLOCK_SIZE



On 04/25/2016 07:50 AM, Alex Bligh wrote:
> A server can determine whether a client supports block sizes by
> the fact it uses NBD_OPT_INFO or NBD_OPT_GO, as these are part
> of the same extension. Therefore there is no need for
> NBD_OPT_BLOCK_SIZE, or for NBD_REP_ERR_BLOCK_SIZE_REQD. This
> substantially simplifies the extension.
> 
> Signed-off-by: Alex Bligh <alex@...872...>
> ---
>  doc/proto.md | 76 ++++++++++++++++--------------------------------------------
>  1 file changed, 20 insertions(+), 56 deletions(-)
> 
> Let's see what people think before I merge this.
> 

> +If a client can honor server block size constraints (as set out below
> +and under `NBD_INFO_BLOCK_SIZE`), it SHOULD announce this during the
> +handshake phase by using `NBD_OPT_INFO`, and SHOULD use `NBD_OPT_GO`
> +rather than `NBD_OPT_EXPORT_NAME`.

In my qemu patches posted to date, I didn't bother to send NBD_OPT_INFO,
but went straight to NBD_OPT_GO.  I still think that a single NBD_OPT_GO
is sufficient for the server to know that the client is new enough to
understand and support block sizes, without having to require the client
to send NBD_OPT_INFO.

>  A server with block size
> +constraints other than the default SHOULD advertise the block size
> +contraints during handshake phase via `NBD_INFO_BLOCK_SIZE` in response
> +to `NBD_OPT_INFO` or `NBD_OPT_GO`.  A server and client MAY agree on
> +block sizes via out of band means. A server is entitled to assume
> +that a client that issues `NBD_OPT_INFO` or `NBD_OPT_GO` will either
> +support the blocksize constraints it has supplied using
> +`NBD_INFO_BLOCK_SIZE`, or will not continue the session.

The rest reads okay. It covers the case of a client that issues
NBD_OPT_INFO to learn block sizes, but then falls back to
NBD_OPT_EXPORT_NAME to start transmission phase (I don't know why anyone
would write such a client, but it doesn't hurt for the server to support
it).

In my qemu implementation, one of the things I was up against was a
question on what block sizes I should advertise to the client.  I fixed
a bug where qemu was mis-behaving on requests not aligned to 512 bytes,
and in so doing, found out that with that fixed, qemu can always support
a minimum alignment of 1.  But the ability to advertise a larger minimum
alignment (of 512, or of the size mandated by O_DIRECT, or whatever
other reason) seems like a rather compelling optimization to be able to
make, and I was debating about the presence or absence of
NBD_OPT_BLOOCK_SIZE as the flag for whether to stick to an advertisement
of a minimum size of 1 vs. 512.

Under the simplification here, the result would be that qemu always
advertises a minimum of 512 or larger for a client new enough to use
NBD_OPT_GO, but always supports a minimum of 1 for clients that used
NBD_OPT_EXPORT_NAME; which means I have to defer the choice of O_DIRECT
during open() until after the client has requested to go into
transmission phase.  Qemu is able to probe the minimum alignment of an
O_DIRECT file, but that requires the file to already be open()d.  So
that makes it a bit of a chicken-and-egg - how do you learn what sizes
to advertise during NBD_OPT_INFO (which requires an open fd) if you
don't want to open the fd until after knowing whether the client will
use NBD_OPT_GO or NBD_OPT_EXPORT_NAME (since the decision on whether to
use O_DIRECT during open depends on the client's choice of going into
transmission phase).

I'm not opposed to the simplification, just making sure we think through
the ramifications.

> -- `NBD_OPT_BLOCK_SIZE` (9)
> -
> -    The client wishes to inform the server of its intent to obey block
> -    sizes.  The option request has no additional data.
> -
> -    Some servers are able to make optimizations, such as opening files
> -    with O_DIRECT, if they know that the client will obey a particular
> -    minimum block size, where it must fall back to safer but slower code
> -    if the client might send unaligned requests.

This was the only mention of O_DIRECT; we probably ought to mention
O_DIRECT in the "Block Sizes" section as a justification for why a
server might want to advertise non-default sizes, if we get rid of the
mention here.

Otherwise, I like where this is headed in terms of simplification, and
it's only a few minor tweaks to my existing qemu patches to update to
the changes here.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


Reply to: