On 29 Apr 2016, at 15:53, Eric Blake <eblake@...696...> wrote:

> The server can always do a minimum block size of 1 (it already has code
> to do a read-modify-write, when needed), so it will never disconnect a
> client that uses NBD_OPT_EXPORT_NAME, nor will such a client ever get an
> EINVAL for an unaligned read. However, there are some files that are
> more efficient with a block size of 1 (anything on the file system) than
> others (an actual block device, where 512 or 4096 is more typical). So
> on a per-export basis, the server will prefer to advertise a minimum
> block size that matches the type of file it is serving, where possible.
> It works out to these four cases:
>
> If the client calls NBD_OPT_INFO without NBD_INFO_BLOCK_SIZE, the server
> will reply with all information it has, and advertise the block size of
> the actual file. If the block size is 1, the server will conclude with
> NBD_REP_ACK; if the block size is > 1, the server will conclude with
> NBD_REP_ERR_BLOCK_SIZE_REQD.
>
> If the client calls NBD_OPT_INFO with NBD_INFO_BLOCK_SIZE, the server
> will reply with all information it has, and advertise the block size of
> the actual file. It will then conclude with NBD_REP_ACK.
>
> If the client calls NBD_OPT_GO without NBD_INFO_BLOCK_SIZE, the server
> will reply with all information it has, except that the minimum block
> size will be 1, then conclude with NBD_REP_ACK. That way, the client
> cannot send an unaligned request, and the server doesn't have to worry
> about reporting EINVAL. Note that for a file with native minimum block
> size > 1, this is a success reply even when the corresponding
> NBD_OPT_INFO with the same parameters would have failed, and with
> different information - but the way we worded this, _it is okay_. We
> have no requirement on NBD_OPT_GO due to a failed NBD_OPT_INFO, only due
> to a successful one.
>
> If the client calls NBD_OPT_GO with NBD_INFO_BLOCK_SIZE, the server will
> reply with all information it has, including correct minimum size, and
> conclude with NBD_REP_ACK. If the client then sends an unaligned
> request, it was the client that violated the protocol, so all bets are
> now off (the server will happen to honor the request, rather than
> disconnect or fail with the suggested EINVAL, but the client was
> out-of-spec so it shouldn't be relying on any particular server
> behavior, whether success or a particular error).
>
> Let me know if any of the above reasoning is wrong, or if we should try
> harder to document this in the spec to make it clear to server
> implementors how they can be portable to the largest number of clients.

That seems perfectly legal to me, and indeed was what I was first
planning on doing.

Then I thought of an alternative strategy. This is in essence to always
reply with a minimum block size of 1 (after all, you *can* support it),
and return a preferred block size of the block size of the back-end.
This strategy assumes the client will respect the preferred block size
at least 'most' of the time, and also that you don't have to open the
back-end in a different manner to support block sizes smaller than the
native blocksize (e.g. you are not doing O_DIRECT type things).

FWIW gonbdserver just has a 'GetGeometry' call which returns the block
sizes the backend would like. These are mildly sanitized (and
overrideable in the config file), and then passed through.

-- 
Alex Bligh
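
For reference, the four cases quoted above and the alternative strategy can be written down as a small decision function. This is only an illustrative sketch in Python, not code from any NBD server: the names `Reply`, `negotiate` and `advertise_alternative` are made up for this example, and the protocol constants are carried as plain strings rather than their wire values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    """Hypothetical summary of what the server tells the client."""
    min_block_size: int  # minimum block size advertised
    final: str           # name of the reply that concludes the option


def negotiate(option: str, client_sent_block_size: bool, native_min: int) -> Reply:
    """Apply the four cases from the quoted mail.

    option                 -- "NBD_OPT_INFO" or "NBD_OPT_GO"
    client_sent_block_size -- True if the client included NBD_INFO_BLOCK_SIZE
    native_min             -- minimum block size of the file being served
    """
    if option == "NBD_OPT_INFO":
        if client_sent_block_size or native_min == 1:
            # Cases 1 (block size 1) and 2: advertise the real size, succeed.
            return Reply(native_min, "NBD_REP_ACK")
        # Case 1 with block size > 1: advertise it, but ask the client
        # to request block size information.
        return Reply(native_min, "NBD_REP_ERR_BLOCK_SIZE_REQD")
    if option == "NBD_OPT_GO":
        if client_sent_block_size:
            # Case 4: the client asked, so report the real minimum.
            return Reply(native_min, "NBD_REP_ACK")
        # Case 3: fall back to 1 so no request can be unaligned.
        return Reply(1, "NBD_REP_ACK")
    raise ValueError(f"unsupported option: {option}")


def advertise_alternative(backend_block_size: int) -> dict:
    """Alternative strategy: minimum is always 1, and the back-end's
    block size is only reported as the preferred size."""
    return {"minimum": 1, "preferred": backend_block_size}
```

With a 512-byte back-end, `negotiate("NBD_OPT_INFO", False, 512)` ends in NBD_REP_ERR_BLOCK_SIZE_REQD while `negotiate("NBD_OPT_GO", False, 512)` succeeds with a minimum of 1, which is the asymmetry the quoted text calls out as acceptable.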