On 03/31/2016 01:41 PM, Alex Bligh wrote:
>
> On 31 Mar 2016, at 20:33, Eric Blake <eblake@...696...> wrote:
>
>> Qemu's nbd-client is setting NBD_CMD_FLAG_FUA during a flush command,
>> but the official NBD protocol documentation doesn't describe this as
>> valid (it merely states that flush must not have a reply until all
>> acknowledged writes have hit permanent storage). Does this flag make
>> sense (what semantics would the flag add, and we need to fix the NBD
>> docs as well as relax the reference implementation to allow the flag),
>> or is it a bug in qemu (and the recent tightening of NBD to throw EINVAL
>> on unsupported flags will trip up qemu)?
>
> As the original author of that particular mess, the intent was that
> they should reflect exactly the Linux kernel's semantics for FLUSH
> and FUA, not only in terms of whether they can be used together,
> but also exactly what they mean.

Oh, and I also just found that qemu's nbd-server tries to honor FUA on
read, even though the protocol doesn't document that as valid either.

>
> This turned out to be an easier way of describing the operations
> than describing them semantically (in particular FLUSH, where I
> couldn't get an entirely consistent answer of what it required
> of inflight requests, specifically whether it required all
> requests inflight at the time of making the request to be written
> to disk prior to answering, or all requests inflight prior to the
> time of replying to be written to disk prior to answering, though
> I believe the former).
>
> FUA just requires that particular request to be persisted to
> disk, and does not require other requests to be persisted to disk

As written, NBD says that FUA requires the current write operation to
land on disk (but says nothing about any other writes, whether those
writes had an early reply).

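To make the wording as currently documented concrete, here is a toy
Python model. It is only a sketch, not qemu or nbd-server code:
ToyServer, its fields, and the "replied" argument are made up for
illustration, and the flag value assumes FUA is bit 0 of the command
flags as I read the protocol document. It shows that a write with FUA
lands by itself, while a flush is only required to cover writes whose
reply has already been sent.

```python
from dataclasses import dataclass, field

# Assumption: FUA is bit 0 of the NBD command flags.
NBD_CMD_FLAG_FUA = 1 << 0


@dataclass
class ToyServer:
    """Hypothetical model: a volatile cache in front of permanent storage."""
    disk: dict = field(default_factory=dict)    # offset -> data that has landed
    cache: dict = field(default_factory=dict)   # offset -> (data, replied)

    def write(self, offset, data, flags=0, replied=True):
        if flags & NBD_CMD_FLAG_FUA:
            # FUA: this one request must land before its reply is sent;
            # nothing is promised about other cached writes.
            self.disk[offset] = data
            self.cache.pop(offset, None)
        else:
            # 'replied' records whether the client already got an answer.
            self.cache[offset] = (data, replied)

    def flush(self):
        # Documented minimum: every acknowledged write must land before
        # the flush reply; writes with no reply yet may stay cached.
        for offset, (data, replied) in list(self.cache.items()):
            if replied:
                self.disk[offset] = data
                del self.cache[offset]


srv = ToyServer()
srv.write(0, b"a")                              # replied early, still cached
srv.write(512, b"b", replied=False)             # in flight, no reply yet
srv.write(1024, b"c", flags=NBD_CMD_FLAG_FUA)   # lands immediately
after_fua = dict(srv.disk)
srv.flush()
after_flush = dict(srv.disk)
print(sorted(after_fua), sorted(after_flush))
```

Running it prints "[1024] [0, 1024]": the FUA write is the only thing on
disk before the flush, and after the flush the acknowledged write at
offset 0 has landed while the unacknowledged one at 512 is still cached.
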
And for flush, NBD only requires that all writes that have _sent_ their
reply to the client must land on disk, but this can certainly be a
smaller set of write requests than _all_ writes issued prior to that
point in time. So maybe flush+FUA is a valid thing to support, and
means that ALL in-flight writes must land, whether or not a reply has
been sent to the client, for an even stronger barrier?

> So in answer to your question, my understanding is that FLUSH requires
> (some subset) of otherwise potentially non-persisted requests to
> be persisted to disk. In that sense it implies FUA. It is permitted
> to set FUA (as it is permitted, I believe, in the linux block layer)
> but it will make no difference.
>
> I once thought FUA on read should bypass any local read cache, though
> that is not part of the spec currently.

In qemu, read+FUA just triggers blk_co_flush() prior to reading; but
that's the same function it calls for write+FUA. And for flush (whether
or not FUA was specified), qemu still calls blk_co_flush(). So from
qemu's perspective, FUA is synonymous with "finish ALL pending
transactions", which is stronger than what the NBD protocol requires.
(Nothing wrong with an implementation doing more work than required,
although it may be less efficient). Alas, that means I can't use qemu's
behavior as a good reference for how to improve the NBD spec.

Meanwhile, it sounds like FUA is valid on read, write, AND flush
(because the kernel supports all three), even if we aren't quite sure
what to document of those flags. And that means qemu is correct, and
the NBD protocol has a bug. Since you contributed the FUA flag, is that
something you can try to improve?

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org