
Cross-project NBD extension proposal: NBD_INFO_INIT_STATE

I will be following up to this email with four separate threads each addressed to the appropriate single list, with proposed changes to:
- the NBD protocol
- qemu: both server and client
- libnbd: client
- nbdkit: server

The feature in question adds a new optional NBD_INFO_ packet to the NBD_OPT_GO portion of the handshake, carrying up to 16 bits of information that the server can advertise to the client at connection time about any known initial state of the export. (Review of this series may propose slight changes, such as using 32 bits; by posting all four series in tandem, I hope it becomes easier to see whether any such tweaks are warranted, and to keep them interoperable before any of the projects land the series upstream.) For now, only 2 of those 16 bits are defined: NBD_INIT_SPARSE (the image has at least one hole) and NBD_INIT_ZERO (the image reads completely as zero). The two bits are orthogonal and can be set independently, although a completely sparse file will typically have both bits set. Advertising the bits is also orthogonal to whether the base:allocation metacontext is used, although a server with all possible extensions is likely to have the two concepts match one another.
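As a rough illustration of how a client might interpret such a field, here is a minimal Python sketch. The bit positions assigned to NBD_INIT_SPARSE and NBD_INIT_ZERO below are assumptions made for the example (this mail names the flags but does not give their values), and decode_init_state is a hypothetical helper, not part of any of the four series. It assumes the flags arrive as a 16-bit value in network byte order, as with other NBD integers.

```python
import struct

# Assumed bit positions, for illustration only.
NBD_INIT_SPARSE = 1 << 0  # image has at least one hole
NBD_INIT_ZERO = 1 << 1    # image reads completely as zero


def decode_init_state(payload: bytes) -> dict:
    """Decode a 16-bit big-endian init-state flags field (hypothetical helper)."""
    if len(payload) != 2:
        raise ValueError("expected exactly 2 bytes of init-state flags")
    (flags,) = struct.unpack(">H", payload)
    known = NBD_INIT_SPARSE | NBD_INIT_ZERO
    return {
        "sparse": bool(flags & NBD_INIT_SPARSE),
        "zero": bool(flags & NBD_INIT_ZERO),
        # Bits this client does not understand are reported, not rejected.
        "unknown_bits": flags & ~known,
    }
```

Reporting unknown bits separately rather than failing lets an older client keep working if later revisions define more of the 16 bits.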

The new bits are added as an information chunk rather than as runtime flags; this is because the intended client of this information is an operation like copying a sparse image into an NBD server destination. Such a client only cares at initialization whether it needs to perform a pre-zeroing pass or whether it can rely on the destination already reading as zero. Once the client starts making modifications, burdening the server with the ability to do a live runtime probe of the current reads-as-zero state does not help the client, and burning per-export flags on something that goes stale on the first edit did not seem wise. Similarly, adding a new NBD_CMD did not seem worthwhile.
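The copy use case can be sketched as follows. This is only an illustration of the decision a copying client makes once at startup, not how qemu-img is implemented; plan_copy and its tuple layout are hypothetical names chosen for the example.

```python
def plan_copy(source_extents, dest_reads_zero):
    """Plan operations for copying a sparse source to a destination.

    source_extents: list of (offset, length, is_zero) tuples for the source.
    dest_reads_zero: True if the destination is known to read as all zero
                     at connection time.
    Returns a list of (op, offset, length) tuples, op being "write" or "zero".
    """
    ops = []
    for offset, length, is_zero in source_extents:
        if not is_zero:
            # Data must always be copied.
            ops.append(("write", offset, length))
        elif not dest_reads_zero:
            # Destination contents unknown: holes must be zeroed explicitly.
            ops.append(("zero", offset, length))
        # Otherwise the destination already reads as zero here; skip it.
    return ops
```

With a destination known to read as zero, the zeroing work disappears entirely, which is the saving the new information is meant to enable.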

The existing 'qemu-img convert source... nbd://...' is the first command line example that can benefit from the new information; the goal of adding a protocol extension was to make this benefit automatic without the user having to specify the proposed --target-is-zero when possible. I have a similar thread pending for qemu which adds similar known-reads-zero information to qcow2 files:

That qemu series is at v1, and based on the review it has received so far, it will need some interface changes for v2. That in turn means my qemu series here will need a slight rebase, but I'm posting this series to all lists now to at least demonstrate what is possible when we have better startup information.

Note that with this new bit, it is possible to learn whether a destination is sparse as part of NBD_OPT_GO rather than having to use block-status commands. With existing block-status commands, learning whether an image reads as all zeroes takes an O(n) scan of block-status (which can short-circuit in O(1) time if the first offset is reported as probable data rather than reading as zero); with this new bit, the answer is O(1). So even with Vladimir's recent change to make the spec permit 4G block-status even when max block size is 32M, or the proposed work to add 64-bit block-status, you still end up with more on-the-wire traffic for block-status to learn whether an image is all zeroes than if the server just advertises this bit. By keeping the two extensions orthogonal, a server can implement either or both reporting methods, whichever it finds easiest, and a client can work with whatever a server supplies, with sane fallbacks when the server lacks either extension. On the other hand, block-status tracks live changes to the image, while this bit is only valid at connection time.
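The difference between the two approaches, and one possible fallback order, can be sketched in Python. NBD_STATE_HOLE and NBD_STATE_ZERO are the base:allocation bits from the NBD specification; the value used for NBD_INIT_ZERO is an assumption for the example, and both functions are hypothetical helpers operating on already-collected extents rather than a real client API.

```python
NBD_STATE_HOLE = 1 << 0  # base:allocation: extent is a hole
NBD_STATE_ZERO = 1 << 1  # base:allocation: extent reads as zero
NBD_INIT_ZERO = 1 << 1   # assumed bit position, for illustration only


def reads_as_zero_by_scan(extents):
    """O(n) check over (length, flags) block-status extents.

    Stops at the first extent not marked as reading zero, so an image
    that starts with data is rejected after a single extent.
    """
    for _length, flags in extents:
        if not flags & NBD_STATE_ZERO:
            return False
    return True


def destination_reads_zero(init_flags, extents):
    """Prefer the O(1) init-state bit, then fall back to block status.

    init_flags: int, or None if the server did not send init-state info.
    extents: iterable of (length, flags), or None if block status is
             unavailable.
    """
    if init_flags is not None and init_flags & NBD_INIT_ZERO:
        return True
    if extents is not None:
        return reads_as_zero_by_scan(extents)
    # Neither extension available: assume the worst and pre-zero.
    return False
```

Because the scan takes any iterable, a caller could pass a generator that issues block-status requests lazily, so the short-circuit also saves network round trips.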

My repo for each of the four projects contains a tag 'nbd-init-v1':

For doing interoperability testing, I find it handy to use:

/path/to/libnbd/run your command here

to pick up just-built qemu-nbd, nbdsh, and nbdkit that all support the feature.

For quickly setting flags:
nbdkit eval init_sparse='exit 0' init_zero='exit 0' ...

For quickly checking flags:
qemu-nbd --list ... | grep init
nbdsh -u uri... -c 'print(h.get_init_flags())'

Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
