[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH v2 3/3] doc: Add alternate trim/zero limits



On 03/16/2018 11:56 AM, Vladimir Sementsov-Ogievskiy wrote:

---
v2: wording tweaks, don't allow EINVAL failure on requests that
meet primary alignment but not alternate alignment, more text
added for the sake of cross-references
[1]


+The server MAY support individual commands that support a larger
+length than the stated maximum block length (such as `NBD_CMD_TRIM`
+and `NBD_CMD_WRITE_ZEROES`); in such cases, the server SHOULD
+advertise the alternative limits as part of the information exchanged
+during `NBD_OPT_GO`.

and NBD_OPT_INFO ?

Sure.


+
  Where a transmission request can have a nonzero *offset* and/or
  *length* (such as `NBD_CMD_READ`, `NBD_CMD_WRITE`, or `NBD_CMD_TRIM`),
  the client MUST ensure that *offset* and *length* are integer
@@ -818,8 +824,8 @@ be considered a denial of service attack (even if the advertised
  maximum block size is smaller).  For all other commands, where the
  *length* is not reflected in the payload (such as `NBD_CMD_TRIM` or
  `NBD_CMD_WRITE_ZEROES`), a server SHOULD merely fail the command with
-an `EINVAL` error for a client that exceeds the maximum block size,
-rather than initiating a hard disconnect.
+an `EINVAL` error for a client that exceeds the maximum size for that
+command, rather than initiating a hard disconnect.

looks like don't correspond to [1]

Perhaps my versioning notation was confusing. My actual wording changes compared to v1 are not here, but at [2].



  ## Values

@@ -1254,6 +1260,58 @@ during option haggling in the fixed newstyle negotiation.
        - 32 bits, preferred block size
        - 32 bits, maximum block size

+    * `NBD_INFO_TRIM_SIZE` (4)
+
+      Represents alternative limits that the server will honour during
+      `NBD_CMD_TRIM`.  The server MUST NOT send this info unless it
+      also sends `NBD_INFO_BLOCK_SIZE`, and SHOULD NOT send this info
+      unless it will also be advertising the transmission flag
+      `NBD_CMD_SEND_TRIM`.  The minimum trim size SHOULD be a power of
+      2, and MUST be at least as large as the preferred block size
+      advertised in `NBD_INFO_BLOCK_SIZE`; it represents the alignment
+      and minimum granularity that can be discarded.  When a client
+      request

TRIM request

Okay.


is aligned to the minimum block size but not the minimum
+      trim size, the server MUST NOT fail that request with `EINVAL`,

[2]

The intent is that if the server advertises:

min_block = 4k, max_block = 32M
min_zero = 32k, max_zero = 2G

then a client sending NBD_CMD_WRITE_ZEROES with a length of 4k is compliant and the server MUST honor that request (the server must assume the client predated the addition of the new NBD_INFO_ZERO_SIZE), rather than failing with EINVAL), even though a client that honors min_zero would take the new limit into account and only send WRITE_ZEROES with a length of 32k. But on the opposite side of the limits, a client that sends a WRITE_ZEROES with a length of 64M is in violation of a server that did not send NBD_INFO_BLOCK_SIZE, however, the server may still permit a per-command allowance for write zeroes to request larger lengths and thus not fail the command with EINVAL but rather act on the command. So my wording change at [1] was pointing out that rather than treating max_block as the limit for ALL commands, that some commands have a larger limit (whether or not that larger limit is advertised), and the server replies with EINVAL on a per-command basis rather than clamping all commands to max_block (however, it should NEVER reject a request smaller than max_block as being too large).


+      but SHOULD ignore the unaligned portion of the request (and trim
+      only the aligned subset, if any).
don't it violate severs, which you mention in first patch commit message,
"(for specific implementations where the client knows via external means that
the server will guarantee a reads-as-zero after the trim
completes)" ?

Hmm, you're right. It's only a SHOULD, not a MUST, so a server could instead manually write zeroes into that region; either way, the client still can't rely on the contents of that region. I'll see if I can improve the wording for v3.

We may also want to consider an extension where a server can advertise whether a successful TRIM guarantees a reads-as-zero effect (this behavior is not required of a generic server, but may be useful to exploit when a client knows that a specific server supports it). Perhaps such an extension would be advertising a tri-state value (no guarantees, only the aligned portions are guaranteed, or the entire trim request is guaranteed). But I don't want to tackle that in my v3.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Reply to: