
Re: [Nbd] [PATCH] Further tidy-up on block status

On 02/16/2018 07:53 AM, Vladimir Sementsov-Ogievskiy wrote:

Good idea. But it would be a tricky thing to maintain backward compatibility with published versions of the Virtuozzo product. And in the end our implementation would be more complex because of this simplification.

Hm. In the end, you suggested several changes (including the already merged 56c77720 :( ). The suggestions are logical. But if they are accepted, we (Virtuozzo) will have to invent tricky, hard-to-maintain code to distinguish our already published versions by indirect factors. So I suggest adding a negotiation flag (BLOCK_STATUS_FIXED?) to negotiate your changes (including 56c77720). Upstream can skip supporting BLOCK_STATUS without that special flag, but it will help us distinguish our previous versions reliably.

So if I understand correctly, you already have an implementation of NBD servers in the wild that support a preliminary version of NBD_OPT_SET_META_CONTEXT, but which do not necessarily obey the semantics in the current extension branch (and which may not be the final semantics by the time we promote the extension branch into the NBD spec overall).

Also, the Virtuozzo product hasn't been mentioned on the NBD list, and while you are preparing patches to the public qemu based on the Virtuozzo product, I'm not sure if there is a public repository for the existing Virtuozzo product out in the wild. Is your product using NBD_OPT_LIST_META_CONTEXT, or _just_ NBD_OPT_SET_META_CONTEXT?

Now, let's assess the ramifications for changing the current NBD proposal, before finalizing it into mainline. Once it is in mainline, all other implementations have to obey the spec; but we want to leave ourselves the freedom to tweak the extension prior to that point according to things learned while implementing it. With that said, what combinations are you hoping to still support? And how long do you expect to have mismatched versions in the field, or is it something where it is relatively easy to upgrade both client and server within a short timeframe?

Here's what things look like if we do nothing and the NBD_OPT side of the handshake stays compatible. (As I understand it, commit 56c77720, which you complained about, affects only the transmission side. I'm leaving aside for the moment that I have opened the door for even more changes that may affect the handshake side.)

old client, old server (what is in the field now): The Virtuozzo client is able to request status from the Virtuozzo server; no other NBD client or server cares, because they predate block status.

new client, old server: Any new client that implements block_status as currently documented, but talks to an existing Virtuozzo server, will get garbage replies to NBD_CMD_BLOCK_STATUS (status 5 instead of the expected status 3). Virtuozzo's client could be taught to special-case things and accept BOTH values for status replies (as long as nothing else changes between the old implementation and when NBD finally accepts the extension into mainline).

old client, new server: The Virtuozzo client talking to a non-Virtuozzo server that understands block_status as currently documented will get garbage replies to NBD_CMD_BLOCK_STATUS (status 3 instead of the expected status 5). The Virtuozzo server cannot be patched to work around this, because it has no way to distinguish a Virtuozzo client from any other client.

new client, new server: Because everyone follows the same documentation, it should be interoperable.
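
To make the "new client, old server" workaround concrete, here is a rough sketch of a client-side check that accepts either reply-type value. The values 3 and 5 come from the scenarios above; the function name and the accept_legacy switch are made up for illustration and are not part of any spec or existing client.

```python
# Reply-type values as described in the scenarios above: the extension
# branch currently documents 3, while deployed Virtuozzo servers send 5.
BLOCK_STATUS_DOCUMENTED = 3
BLOCK_STATUS_LEGACY = 5


def is_block_status_reply(reply_type: int, accept_legacy: bool = False) -> bool:
    """Return True if reply_type should be parsed as a block status chunk.

    accept_legacy is a hypothetical client-side switch (an assumption of
    this sketch) that also admits the value used by the old servers.
    """
    if reply_type == BLOCK_STATUS_DOCUMENTED:
        return True
    return accept_legacy and reply_type == BLOCK_STATUS_LEGACY
```

A strict client rejects the old value 5, while a client with accept_legacy=True parses both. Nothing comparable is available on the server side, since the server has to pick one value to send before it knows which kind of client it is talking to.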

It's ALWAYS risky to put an experimental item into production use. Some precautions, such as using intentionally higher numbers than normal, may make it easier to identify the experimental versions: if the proposal is to implement item '10', the experimental code implements item '10010'; then any changes to what actually gets finalized into the semantics for item '10' will not affect continued use of the '10010' semantics. But since you didn't do that, my immediate reaction is that adding a 'BLOCK_STATUS_FIXED' to the NBD protocol sounds crazy (fixed compared to what? An unpublished early implementation?). What we CAN do is specify that NBD_OPT_ value 10 is reserved for a (withdrawn) experimental extension (similar to how NBD_OPT_ value 4 for PEEK_EXPORT is skipped), and make NBD_OPT_SET_META_CONTEXT be 11. Then you have:

old client, old server: client sends NBD_OPT 10, Virtuozzo server replies, and both sides use whatever you implemented even if it differs from the final NBD spec

new client, old server: any compliant client sends NBD_OPT 11, Virtuozzo server doesn't recognize it, and you lose the ability to query block status. Additionally, you can teach the Virtuozzo client to try 11 first, and if it fails, fall back to trying 10 - the client then has to understand both old and new flavors, but can now talk to any version of the Virtuozzo server.

old client, new server: the Virtuozzo client sends NBD_OPT 10, a compliant server doesn't recognize it, and you don't get to query block status, but you are no worse off than for any other server that doesn't understand block status. Additionally, you can teach the Virtuozzo server to accept opt 10 in addition to 11; the server now has to understand both old and new flavors, but can now talk to any version of the Virtuozzo client.

new client, new server: the client sends NBD_OPT 11, the server replies, and they communicate according to the spec.
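
The client-side fallback in the four cases above could look roughly like the sketch below. The option numbers 10 and 11 are the ones proposed in this mail; send_option and make_server are hypothetical stand-ins for a real handshake, and the actual option payload and reply parsing are omitted.

```python
from typing import Callable, Iterable, Optional

OPT_EXPERIMENTAL = 10      # would be reserved for the withdrawn experimental extension
OPT_SET_META_CONTEXT = 11  # proposed new value for NBD_OPT_SET_META_CONTEXT


def negotiate_block_status(send_option: Callable[[int], bool]) -> Optional[int]:
    """Try the spec option first, then fall back to the experimental one.

    send_option is a hypothetical callback that sends one option number and
    returns True if the server accepted it. The return value is the option
    that worked, or None if block status is unavailable.
    """
    for opt in (OPT_SET_META_CONTEXT, OPT_EXPERIMENTAL):
        if send_option(opt):
            return opt
    return None


def make_server(supported: Iterable[int]) -> Callable[[int], bool]:
    """Hypothetical stand-in for a server that accepts only the listed options."""
    accepted = set(supported)
    return lambda opt: opt in accepted
```

With this, negotiate_block_status(make_server({10})) settles on 10 (old Virtuozzo server), make_server({11}) or make_server({10, 11}) settles on 11, and a server that knows neither option yields None, which is the same outcome as talking to any server without block status support.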

Or, we can revert the change in commit 56c77720 and keep NBD_REPLY_TYPE_BLOCK_STATUS at 5 (this leaves a hole in the NBD_REPLY_TYPE numbering, where 3 and 4 might be filled in by other future extensions, or permanently skipped). This works IF there are no OTHER incompatible changes made to the rest of the block status extension as part of promoting it into the current spec (and we still haven't finished that debate, given my question on whether 32-bit lengths and a colon-separated namespace:leaf in a single string are the best representation).

So, I'd like some feedback from Alex or Wouter on which alternatives seem nicest at this point.

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
