Eric, On 25 Apr 2016, at 15:32, Eric Blake <eblake@...696...> wrote: > On 04/25/2016 07:50 AM, Alex Bligh wrote: >> A server can determine whether a client supports block sizes by >> the fact it uses NBD_OPT_INFO or NBD_OPT_GO, as these are part >> of the same extension. Therefore there is no need for >> NBD_OPT_BLOCK_SIZE, or for NBD_REP_ERR_BLOCK_SIZE_REQD. This >> substantially simplifies the extension. >> >> Signed-off-by: Alex Bligh <alex@...872...> >> --- >> doc/proto.md | 76 ++++++++++++++++-------------------------------------------- >> 1 file changed, 20 insertions(+), 56 deletions(-) >> >> Let's see what people think before I merge this. >> > >> +If a client can honor server block size constraints (as set out below >> +and under `NBD_INFO_BLOCK_SIZE`), it SHOULD announce this during the >> +handshake phase by using `NBD_OPT_INFO`, and SHOULD use `NBD_OPT_GO` >> +rather than `NBD_OPT_EXPORT_NAME`. > > In my qemu patches posted to date, I didn't bother to send NBD_OPT_INFO, > but went straight to NBD_OPT_GO. I still think that a single NBD_OPT_GO > is sufficient for the server to know that the client is new enough to > understand and support block sizes, without having to require the client > to send NBD_OPT_INFO. Yes OK, if the client knows it can support any block size constraint provided it knows what it is, it must be. So 'during the handshake phase by using `NBD_OPT_INFO` or `NBD_OPT_GO`, and SHOULD ...' ? > >> A server with block size >> +constraints other than the default SHOULD advertise the block size >> +contraints during handshake phase via `NBD_INFO_BLOCK_SIZE` in response >> +to `NBD_OPT_INFO` or `NBD_OPT_GO`. A server and client MAY agree on >> +block sizes via out of band means. A server is entitled to assume >> +that a client that issues `NBD_OPT_INFO` or `NBD_OPT_GO` will either >> +support the blocksize constraints it has supplied using >> +`NBD_INFO_BLOCK_SIZE`, or will not continue the session. > > The rest reads okay. It covers the case of a client that issues > NBD_OPT_INFO to learn block sizes, but then falls back to > NBD_OPT_EXPORT_NAME to start transmission phase (I don't know why anyone > would write such a client, but it doesn't hurt for the server to support > it). A lazy upgrader I guess. > In my qemu implementation, one of the things I was up against was a > question on what block sizes I should advertise to the client. I fixed > a bug where qemu was mis-behaving on requests not aligned to 512 bytes, > and in so doing, found out that with that fixed, qemu can always support > a minimum alignment of 1. Yeah I saw that. > But the ability to advertise a larger minimum > alignment (of 512, or of the size mandated by O_DIRECT, or whatever > other reason) seems like a rather compelling optimization to be able to > make, and I was debating about the presence or absence of > NBD_OPT_BLOOCK_SIZE as the flag for whether to stick to an advertisement > of a minimum size of 1 vs. 512. I was actually going to suggest 512 then remembered my thought re this. > Under the simplification here, the result would be that qemu always > advertises a minimum of 512 or larger for a client new enough to use > NBD_OPT_GO, but always supports a minimum of 1 for clients that used > NBD_OPT_EXPORT_NAME; which means I have to defer the choice of O_DIRECT > during open() until after the client has requested to go into > transmission phase. Yup. > Qemu is able to probe the minimum alignment of an > O_DIRECT file, but that requires the file to already be open()d. So > that makes it a bit of a chicken-and-egg - how do you learn what sizes > to advertise during NBD_OPT_INFO (which requires an open fd) if you > don't want to open the fd until after knowing whether the client will > use NBD_OPT_GO or NBD_OPT_EXPORT_NAME (since the decision on whether to > use O_DIRECT during open depends on the client's choice of going into > transmission phase). Well, in general don't you need to open the file to get (e.g.) size? That's precisely the issue we have upgrading the main nbdserver. > I'm not opposed to the simplification, just making sure we think through > the ramifications. > >> -- `NBD_OPT_BLOCK_SIZE` (9) >> - >> - The client wishes to inform the server of its intent to obey block >> - sizes. The option request has no additional data. >> - >> - Some servers are able to make optimizations, such as opening files >> - with O_DIRECT, if they know that the client will obey a particular >> - minimum block size, where it must fall back to safer but slower code >> - if the client might send unaligned requests. > > This was the only mention of O_DIRECT; we probably ought to mention > O_DIRECT in the "Block Sizes" section as a justification for why a > server might want to advertise non-default sizes, if we get rid of the > mention here. Yes, I should put that back. > Otherwise, I like where this is headed in terms of simplification, and > it's only a few minor tweaks to my existing qemu patches to update to > the changes here. Great. Thanks. -- Alex Bligh
Attachment:
signature.asc
Description: Message signed with OpenPGP using GPGMail