Re: [Nbd] [RFC] Proposal: NBD_CMD_READ2

To: Markus Pargmann <mpa@...1897...>
Cc: "nbd-general@lists.sourceforge.net" <nbd-general@lists.sourceforge.net>
Subject: Re: [Nbd] [RFC] Proposal: NBD_CMD_READ2
From: Wouter Verhelst <w@...112...>
Date: Thu, 3 Sep 2015 21:55:37 +0200
Message-id: <20150903195537.GH8621@...3...>
In-reply-to: <20150820143541.GF18784@...1897...>
References: <20150820110519.GA17039@...3...> <AE44BFF5-25F7-495E-80AA-A91947526A2A@...872...> <20150820131754.GE18784@...1897...> <20150820134441.GB27381@...3...> <20150820143541.GF18784@...1897...>

Hi Markus,

On Thu, Aug 20, 2015 at 04:35:41PM +0200, Markus Pargmann wrote:
> On Thu, Aug 20, 2015 at 03:44:41PM +0200, Wouter Verhelst wrote:
> > On Thu, Aug 20, 2015 at 03:17:54PM +0200, Markus Pargmann wrote:
> > > I like the idea of blocks. However I think it would be better to have
> > > some kind of negotiation for the maximum block size for reads and
> > > writes.
> > > 
> > > As the server can choose the size of a block itself I think it wouldn't
> > > be a problem to have a header for each data block which has the
> > > error state and so on. Then we wouldn't have this strange semantic of
> > > reporting the error state of the data we already transmitted.
> > 
> > That would make doing things like using sendfile() to send out the
> > actual data impossible, as when you use sendfile(), you don't know that
> > an error occurred until you've put some of the data on the wire already.
> > 
> > (unless I'm missing things, which of course is possible)
> 
> Ah, I see.
> 
> I still like the idea of fragmented replies. May be useful for NBD on
> top of UDP at some point (no real plans right now, but who knows...
> would be perfect for bootloader nbd implementations ;) ).

We could of course use fragmented replies; I'm not opposed to that.
However, I think it's not very useful to have the client and the server
negotiate the fragment size. If the client doesn't want replies over a
given size, it should just not issue a read request that is too large.
The server should just send the data it wants to send; if a request is
too large to fit in memory, it should just not try to store it in memory
(which is the whole reason why I was proposing this change).
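
As a rough illustration of "don't try to store it in memory", here is a
minimal Python sketch, not code from nbd-server. It copies a requested
range from a backing file to an output stream through a bounded buffer,
so memory use stays constant however large the request is. The function
name stream_range and the 64 KiB default are assumptions made up for
this example.

```python
import io

def stream_range(src, dst, offset, length, bufsize=64 * 1024):
    """Copy `length` bytes starting at `offset` from src to dst.

    Hypothetical helper: at most `bufsize` bytes are held in memory
    at any time. Returns the number of bytes actually copied, which
    is smaller than `length` if src ends early.
    """
    src.seek(offset)
    remaining = length
    copied = 0
    while remaining > 0:
        buf = src.read(min(bufsize, remaining))
        if not buf:
            # Short read: the backing store ended before the request did.
            break
        dst.write(buf)
        copied += len(buf)
        remaining -= len(buf)
    return copied

# Usage with in-memory streams standing in for the export and the socket.
export = io.BytesIO(bytes(range(256)) * 10)
wire = io.BytesIO()
stream_range(export, wire, offset=100, length=1000, bufsize=64)
```

Note that with this approach a short read is only discovered after some
data has already gone out, which is the same situation as with
sendfile().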

The point about large requests causing wasted bandwidth in case of an
error somewhere halfway through does hold some merit, however, and so I
can imagine we might want to add fragmented replies to the protocol; but
if we're going to do so, we should just hardcode the maximum reply size
and be done with it. That reply size should not be too large (so that an
error halfway through doesn't waste much bandwidth), but also not too
small (so that we don't engage the CPU in too much context switching
between "send data" and "send next header").
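
To make the tradeoff concrete, here is a small Python sketch of
splitting one read request into fragments of a hardcoded maximum size.
The constant MAX_FRAGMENT, its 64 KiB value, and the function name
split_read are assumptions for illustration only; nothing here is part
of the NBD protocol or of this proposal.

```python
# Hypothetical hardcoded maximum payload per reply fragment.
MAX_FRAGMENT = 64 * 1024

def split_read(offset, length, max_fragment=MAX_FRAGMENT):
    """Yield (offset, length) pairs that together cover the request.

    Every fragment except possibly the last is exactly max_fragment
    bytes long; a zero-length request yields nothing.
    """
    end = offset + length
    while offset < end:
        chunk = min(max_fragment, end - offset)
        yield offset, chunk
        offset += chunk

# A 150000-byte read becomes two full fragments and one short one.
fragments = list(split_read(0, 150000))
```

A smaller maximum means less data is wasted when an error hits partway
through, but more headers (and more switches between sending data and
sending headers) per request; a larger maximum has the opposite effect.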

> > > Also when using blocks with block headers an offset would be good
> > > together with the length of the data.
> > > 
> > > However this whole block thing would probably not be backwards
> > > compatible.
> > 
> > None of this would be, hence the idea of adding a READ2 command, and
> > retaining READ for backwards compatibility.
> 
> Yes, just for naming I would prefer something like READ_END instead of
> READ2. I am still thinking about this and other possibilities.

Fair enough, I'll be the first to admit the name wasn't carefully chosen
;-)

-- 
It is easy to love a country that is famous for chocolate and beer

  -- Barack Obama, speaking in Brussels, Belgium, 2014-03-26
