
Re: [Nbd] [RFC] Proposal: NBD_CMD_READ2



Wouter,

On 20 Aug 2015, at 12:05, Wouter Verhelst <w@...112...> wrote:
> 
> One of the problems with the NBD protocol is that the read command sends
> out the reply header before the data. As such, if handling of a read
> request encounters a problem after the header has been sent out, there
> is no way currently to communicate this fact to the client.
> 
> This is a problem, because it forces the server to choose between a
> number of equally unattractive options:
> - The server could ignore read errors. This would mean the client would
>  get incorrect data.
> - The server could drop the connection on receiving a read error. This
>  would mean the client would see a lost connection without really
>  knowing what's happening.
> - The server could be required to read all data into memory before
>  sending out the reply header. This is problematic for busy servers
>  and/or large read requests.
> 
> I would therefore want to add another message to the protocol,
> NBD_CMD_READ2. The semantics of this message would be similar to
> NBD_CMD_READ, except that an nbd_reply structure is sent both before and
> after the read data.
> 
> If the first reply has a nonzero error message, then no data is to be
> expected by the client (this is different from the current semantics of
> NBD_CMD_READ as described in the protocol document).
> 
> If the second reply has a nonzero error message, the client should
> consider the received data to be (possibly partially) invalid.
> 
> The server should send "invalid request" error replies in the first
> reply header; it should send "medium error" replies in the second.
> 
> Thoughts?

This is something I've been banging on about occasionally for a while.

I think it's a good idea.

However, I would suggest an amendment.

If you do a large read, and an error occurs many megabytes into the read but
many megabytes before the end, you still have much the same problem. Such
reads *do* happen, e.g. when using qemu-img convert with an nbd device.

Perhaps better would be to specify that the read would be produced in blocks
of a size defined by the server, and each block would be followed by a header
that could contain the error. This would add a few extra bytes of overhead
that the client could discard, and allow the server to break the reply up
conveniently. Each block header would specify the size of the next block OR
an error in respect of the previous block.
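
To make the framing concrete, here is a rough sketch in Python. The wire
layout is purely an assumption for illustration (a 32-bit error code and a
32-bit length, both big-endian, with a zero length and zero error marking the
end of the read); none of the names below come from the existing protocol
document.

```python
import io
import struct

# Hypothetical block header: 32-bit error code, then 32-bit length of the
# next block, both in network byte order. Assumed layout, not a spec.
HEADER = struct.Struct(">II")
EIO = 5  # "medium error" stand-in for this sketch


def encode_read(blocks, fail_after=None, error=EIO):
    """Server side: frame non-empty blocks, each preceded by a header.

    If fail_after is a block index, an error header is sent right after
    that block and the reply ends there.
    """
    out = bytearray()
    for i, block in enumerate(blocks):
        if not block:
            raise ValueError("blocks must be non-empty; length 0 ends the read")
        out += HEADER.pack(0, len(block))
        out += block
        if fail_after == i:
            # Error in respect of the block just sent.
            out += HEADER.pack(error, 0)
            return bytes(out)
    out += HEADER.pack(0, 0)  # no error, no further block: read complete
    return bytes(out)


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("connection closed mid-reply")
    return data


def decode_read(stream):
    """Client side: return (data, error).

    On error, the most recently received block is discarded as suspect;
    earlier blocks are kept.
    """
    good = []
    while True:
        err, length = HEADER.unpack(read_exact(stream, HEADER.size))
        if err:
            if good:
                good.pop()
            return b"".join(good), err
        if length == 0:
            return b"".join(good), 0
        good.append(read_exact(stream, length))
```

With this layout the overhead is 8 bytes per block plus 8 bytes for the final
header, and the server only ever needs to hold one block in memory before
committing to it on the wire.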

-- 
Alex Bligh






