
[PATCH v2] proto: add xNBD command NBD_CMD_CACHE to the spec



From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

The ability to request an NBD server to cache things locally,
without also spending time in the client reading what was just
cached, has proved useful to both the xNBD implementation
(public), and to a VM restore operation performed by Virtuozzo
atop qemu (not yet public).  Time to document this functionality,
rather than calling the command a failed experiment.

The documentation is rather fuzzy, in that it makes no
requirements on what constitutes a valid cache (a server can
implement this command as a no-op) and no guarantees that a
cache will benefit the client (although for a specific server
implementation, the client can indeed benefit).  But as the
command has a use in existing implementations, it's better
to document what we can to prevent future incompatible use
of the command.

Note that this proposal adds NBD_FLAG_SEND_CACHE which xNBD
did not use; meanwhile, the command NBD_CMD_CACHE is allowed
to fail with no visible side effects.  The documentation tries
to point out interoperability constraints resulting from
existing servers that understand the command but don't advertise
it, as well as existing clients that send the command even when
it is not advertised.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180326161049.15981-1-vsementsov@virtuozzo.com>
[eblake: add commit message, add more documentation]
Signed-off-by: Eric Blake <eblake@redhat.com>
---

This is an updated version of the NBD_CMD_CACHE patch that
Vladimir proposed [1], trying to incorporate feedback raised
in that thread.

Vladimir has also posted a proof-of-concept implementation for
qemu [2].

[1] https://lists.debian.org/nbd/2018/03/msg00040.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg01937.html
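
For anyone wanting to poke at a server by hand, here is a rough
client-side sketch of the request this patch documents.  It is not
taken from xNBD or from the qemu proof of concept; the helper name
build_cache_request is hypothetical, and the 28-byte layout (magic,
command flags, type, handle, offset, length, all big-endian) is
assumed from the existing simple request header in doc/proto.md.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513   # request magic from the existing spec
NBD_CMD_CACHE = 5                # command type documented by this patch
NBD_FLAG_SEND_CACHE = 1 << 10    # transmission flag bit 10 from this patch


def build_cache_request(handle, offset, length,
                        transmission_flags, command_flags=0):
    """Hypothetical helper: pack one NBD_CMD_CACHE request header.

    A client may send the command even when NBD_FLAG_SEND_CACHE was
    not advertised, but in that case it must not set command flags.
    """
    advertised = bool(transmission_flags & NBD_FLAG_SEND_CACHE)
    if command_flags and not advertised:
        raise ValueError("command flags require NBD_FLAG_SEND_CACHE")
    # magic (32), command flags (16), type (16), handle (64),
    # offset (64), length (32); network byte order
    return struct.pack(">IHHQQI", NBD_REQUEST_MAGIC, command_flags,
                       NBD_CMD_CACHE, handle, offset, length)


# Server did not advertise the bit: still allowed, with no command flags.
req = build_cache_request(handle=1, offset=0, length=65536,
                          transmission_flags=0)
```

An error reply to such a request should be treated as advisory only,
since a failed cache request says nothing about whether later reads or
writes to the same area will succeed.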

 doc/proto.md | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 32a36ba..09bfc92 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -1060,6 +1060,10 @@ The field has the following format:
   the export.
 - bit 9, `NBD_FLAG_SEND_RESIZE`: defined by the experimental `RESIZE`
   [extension](https://github.com/NetworkBlockDevice/nbd/blob/extension-resize/doc/proto.md).
+- bit 10, `NBD_FLAG_SEND_CACHE`: documents that the server understands
+  `NBD_CMD_CACHE`; however, note that a server MAY support the command
+  even without advertising this bit, and conversely that this bit does
+  not guarantee that the command will succeed or have an impact.

 Clients SHOULD ignore unknown flags.

@@ -2005,6 +2009,39 @@ The following request types exist:
     A client MUST NOT send a trim request unless `NBD_FLAG_SEND_TRIM`
     was set in the transmission flags field.

+* `NBD_CMD_CACHE` (5)
+
+    A cache request.  The client is informing the server that it plans
+    to access the area specified by *offset* and *length*.  The server
+    MAY use this information to speed up further access to that area
+    (for example, by performing the actions of `NBD_CMD_READ` but
+    replying with just status instead of a payload, by using
+    posix_fadvise(), or by retrieving remote data into a local cache
+    so that future reads and unaligned writes to that region are
+    faster).  However, it is unspecified what the server's actual
+    caching mechanism is (if any), whether there is a limit on how
+    much can be cached at once, and whether writes to a cached region
+    have write-through or write-back semantics.  Thus, even when this
+    command reports success, there is no guarantee of an actual
+    performance gain.  A future version of this standard may add
+    command flags to request particular caching behaviors, where a
+    server would reply with an error if that behavior cannot be
+    accomplished.
+
+    If an error occurs, the server MUST set the appropriate error code
+    in the error field.  However, failure on this operation does not
+    imply that further read and write requests on this area will fail.
+    When no command flags are in use, the server MAY send a reply
+    prior to the requested area being fully cached.
+
+    A client MAY attempt to send a cache request even when
+    `NBD_FLAG_SEND_CACHE` was not set in the transmission flags field;
+    however, in that case, it MUST NOT use any command flags.  A
+    server MAY advertise `NBD_FLAG_SEND_CACHE` even if the command has
+    no effect or even if the command always fails with `EINVAL`;
+    however, if it advertised the command, the server MUST reject any
+    command flags it does not recognize.
+
 * `NBD_CMD_WRITE_ZEROES` (6)

     A write request with no payload. *Offset* and *length* define the
@@ -2107,9 +2144,6 @@ The following request types exist:
     maintainer of this document, so that these messages can be listed here
     to avoid conflicting implementations.

-    Currently one such message is known: `NBD_CMD_CACHE`, with type set to
-    5, implemented by xnbd.
-
 #### Error values

 The error values are used for the error field in the reply message.
-- 
2.14.3

