Re: [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize

To: Alex Bligh <alex@...872...>
Cc: "nbd-general@lists.sourceforge.net" <nbd-general@lists.sourceforge.net>
Subject: Re: [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize
From: Eric Blake <eblake@...696...>
Date: Fri, 8 Apr 2016 14:33:46 -0600
Message-id: <570815AA.6040201@...696...>
In-reply-to: <B76616D0-B2F3-4313-9B94-B178C4A72828@...872...>
References: <1460134137-3432-1-git-send-email-eblake@...696...> <B76616D0-B2F3-4313-9B94-B178C4A72828@...872...>

On 04/08/2016 11:17 AM, Alex Bligh wrote:
> 
> On 8 Apr 2016, at 17:48, Eric Blake <eblake@...696...> wrote:
>> We may add future structured error replies; making it easy
>> for older clients to properly treat such new reply types as
>> an error gives us a bit more flexibility on introducing new
>> errors to existing commands.  Of course, good design practice
>> says we should try hard to prevent the server from sending
>> something the client isn't expecting, but covering the
>> situation from both sides never hurts.  Also, marking error
>> types resembles how NBD_REP_ERR_* has a common layout.
> 
> Having the error chunks with bit 15 set is a good plan.
> 
> Not 100% sure about the other bits ...

I wasn't either - which is why this is still RFC (and I don't have qemu
code anywhere close to posting for STRUCTURED_REPLY yet, so we still
have some wiggle room).

>> +    - `NBD_REPLY_TYPE_ERROR` (2^15 + 1)
> 
> Whilst you are playing with these:
> 
> NBD_REP_CHUNK_TYPE_ERROR ? (etc.)

Or even: NBD_REP_CHUNK_ERROR? (That is, TYPE_ adds nothing over CHUNK_).

But I don't like the collision between NBD_REP_* for handshake phase and
for transmission phase.  Having NBD_REPLY_* was at least a unique
namespace that said _which_ phase the reply type belongs to.  Beyond
that, I don't care what color we paint the bikeshed.

It's already bad enough that transmission requests are NBD_CMD_* (we
should either call it transmission commands, or consider NBD_REQ_*).

>> +      The payload is structured as:
>> +
>> +      32 bits: error (MUST be nonzero)  
>> +      32 bits: message length, optional, but MUST be the header
>> +        *length* - 8 if present, and MUST be present if a message is
>> +	included  
>> +      *message length* bytes: optional string suitable for
>> +        direct display to a human being  
> 
> This seems overly complicated which means someone will write
> something that doesn't parse it correct.
> 
> Why not make the length mandatory, or absent (i.e. just work
> it out from the chunk length)?

Tradeoffs are:

Mandatory: _every_ error packet has more bulk over the wire. But error
cases are rare, so who cares.  Clients don't have to special case.

Work it out from the chunk length: Not possible.  The whole intent is
that down the road, we may add:

NBD_REPLY_TYPE_ERROR_SHINY_NEW (2^15 + 5)

which wants to report LOTS of structured information.  The whole point
of this proposal is that we are guaranteeing that it will be laid out as:

32 bits: error (non-zero)
32 bits: message length (non-zero)
new fields... shiny :)
message length bytes: optional trailing string

Knowing the length of the chunk message tells you nothing about where
the string actually starts within the chunk - you HAVE to know somehow
that this particular error message includes X bytes of structured
components, so that if you get the SHINY_NEW error in reply to an
existing command, you: 1) know it is an error, even if you don't know
what the fields are telling you about it, and 2) know how to find where
the human-readable string is

> The 'optional' case is
> bizarre. you're optimising to save a single 32 bit word
> transmission *in an error* condition. It's far more likely no
> one ever tests one of the two ways of transmitting errors, than
> the 'speed of reporting disk errors' makes a difference. 

Fair point - I'm not at all upset if we make it mandatory. Also, making
it mandatory gives more encouragement for servers to supply a
human-readable string (making it easy to omit a zero-valued message
length encourages omitting a message; and once you HAVE a message, the 4
bytes [or even 2, since we've capped strings to 4096 bytes] is no longer
dominating the message).

> 
> Making it mandatory would equate it to NBD_REPLY_TYPE_ERROR_OFFSET
> in the new form; making it absent would equate it to what
> NBD_REPLY_TYPE_ERROR_OFFSET was before.
> 
>> +    - `NBD_REPLY_TYPE_ERROR_OFFSET` (2^15 + 2)
>> +
>> +      This chunk type is in the error chunk category.  *length* MUST
>> +      be at least 16.  This reply represents that an error occurred at
>> +      a given offset, which MUST lie within the original offset and
>> +      length of the request; the client can use this offset to
>> +      determine if request had any partial success.  This chunk type
>> +      MAY appear multiple times in a structured reply, although the
>> +      same offset SHOULD NOT be repeated.  Likewise, if content chunks
>> +      were sent earlier in the structured reply, the server SHOULD NOT
>> +      send multiple distinct offsets that lie within the bounds of a
>> +      single content chunk.  Valid as a reply to `NBD_CMD_READ`,
>> +      `NBD_CMD_WRITE`, `NBD_CMD_WRITE_ZEROES`, and `NBD_CMD_TRIM`.
>> +
>> +      The payload is structured as:
>> +
>> +      32 bits: error (MUST be non-zero)  
>> +      32 bits: message length, MUST be *length* - 16
>> +      64 bits: offset (unsigned)  
>> +      *message length* bytes: optional string suitable for
>> +         direct display to a human being  
> 
> Firstly, surely we can have the length of the message
> next to the message!

Seems reasonable enough to do.

So, let's use two examples, of two different NBD_REPLY_TYPE_ERROR_OFFSET
messages (one with string, the other without). As I originally wrote the
RFC, they would be sent over the wire as:

With my original RFC, the same two messages were:

4 bytes: error
4 bytes: message length (11)
8 bytes: offset
11 bytes: "hello world"

4 bytes: error
4 bytes: message length (0)
8 bytes: offset

Under your counter of keeping message length and contents next to one
another, plus my idea above of only 2 bytes for message length because
of our 4096 string length cap, those same to chunks become:

4 bytes: error
2 bytes: message length (11)
11 bytes: "hello world"
8 bytes: offset

4 bytes: error
2 bytes: message length (0)
8 bytes: offset

And looking at it, I see another viable alternative:

32 bits: error (non-zero)
arbitrary (up to 4096) number of bytes: message string, no NUL bytes
1-4 bytes: NUL padding (to be nice, server SHOULD round to
   boundary of 4, but client MUST be prepared for any alignment
   on remaining fields)
remaining bytes: additional structure, size determined by reply
   header length

That way, we've guaranteed that the error string has NUL termination
(unlike in other places), and no longer need to advertise the message
length over the wire. We've also guaranteed that the message string
starts at a fixed offset (the client can ALWAYS dump whatever string it
finds at the offset immediately following the error field, and dump
until it hits a NUL byte, even if the client has no idea how to parse
the tail of the message).  And clients that DO recognize the error type
can take header_length - lookup_table[reply_type] as the offset where
the important binary fields live in the remainder of the reply.

Under this alternate, two same two examples are sent over the wire as:

4 bytes: error
11 bytes: "hello world"
1 byte: pad: {'\0'}
8 bytes: offset

4 bytes: error
0 bytes: ""
4 bytes: pad: {'\0', '\0', '\0', '\0'}
8 bytes: offset

Preferences?

> 
> Secondly, I'm guessing the purpose of intoducing
> a new field here (message length) whose value we already
> know (message length -16) is for forward compatibility, i.e.
> so at a later stage we could add more structured error
> stuff and clients could skip the things after the message
> they didn't understand using the chunk's 'length' field.

Yes.

> That's great, fine in principle, but ...
> 
>> +    If the client receives an unknown or unexpected type with bit 15
>> +    set, it MUST consider the current reply as errored, but MAY
>> +    continue transmission unless it detects that *length* is
>> +    inconsistent with *message length*.
> 
> ... such a 'new' server which inserts something after the 'message'
> field is going to have exactly that effect, so you are now mandating
> it close the connection. I thus don't understand the purpose.

Under the layout in my original RFC, the message ALWAYS lives at the
end; the fact that a server is returning a new error type means that the
new fields are in the middle of the structure, and the point of the
message length field at a fixed offset is to give enough information
about how to find the start of the message. The only way that *length*
can be inconsistent with *message length* is if *message length* >
*length* - 8.  I guess I need to spell that out better :)

The rules under the alternative in this message are slightly different,
but again, a client can detect a bogus server [or MitM attacking the
server's stream] if it does not find any NUL bytes in the message (that
is, if strnlen(message, header_length) == header_length).

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature

Reply to:

Follow-Ups:
- Re: [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize
  - From: Alex Bligh <alex@...872...>

References:
- [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize
  - From: Eric Blake <eblake@...696...>
- Re: [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize
  - From: Alex Bligh <alex@...872...>

Prev by Date: Re: [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize
Next by Date: [Nbd] [RFC PATCH 00/18] NBD protocol additions
Previous by thread: Re: [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize
Next by thread: Re: [Nbd] [RFC PATCH] doc: In STRUCTURED_REPLY, make error types easy to recognize
Index(es):
- Date
- Thread