Re: [Nbd] Question about the expected behaviour of nbd-server for async ops
- To: Alex Bligh <alex@...872...>
- Cc: nbd-general@lists.sourceforge.net
- Subject: Re: [Nbd] Question about the expected behaviour of nbd-server for async ops
- From: Wouter Verhelst <w@...112...>
- Date: Sun, 29 May 2011 08:50:38 +0200
- Message-id: <20110529065038.GA25100@...510...>
- In-reply-to: <6DBA2A6208847F844397DB62@...873...>
- References: <87oc2m28o7.fsf@...860...> <6DBA2A6208847F844397DB62@...873...>
On Sat, May 28, 2011 at 05:35:22PM +0100, Alex Bligh wrote:
> Goswin,
>
> --On 28 May 2011 16:37:12 +0200 Goswin von Brederlow <goswin-v-b@...186...>
> wrote:
>
> My view is that this is derived from the linux request layer, in
> which case (having asked much the same question on fsdevel
> a couple of days ago) the answers appear to be as follows:
>
> > 1) Order of replies
> >
> > Currently nbd-server works all requests in order and replies in
> > order. Since every request/reply has a handle to uniquely pair them I
> > assume replying to requests out of order is allowed and will (most
> > likely) be handled correctly by existing clients.
>
> Handles can be reused only once the command in question is completed.
>
> You may process commands out of order, and reply out of order,
> save that
> a) all write commands *completed* before you process a REQ_FLUSH
> must be written to non-volatile storage prior to completing
> that REQ_FLUSH (though apparently you should, if possible, make
> this true for all write commands *received*, which is a stronger
> condition) [Ignore this if you don't set SEND_REQ_FLUSH]
We already implement that stronger condition, because writes are handled
in the order in which they are received. It shouldn't be too hard to
implement once requests are handled out of order, either: stop handling
incoming requests when you receive a flush request; flag all outstanding
requests so you know when the flush can be done; handle the flush once
all flagged requests have been handled; and then start handling incoming
requests again.
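The flush scheme described above can be sketched roughly as follows. This is a toy model in Python, not nbd-server code: the names FlushBarrier, on_request and on_complete are hypothetical, and the actual flush (an fsync() in a real server) is only recorded in a log so the ordering is visible.

```python
from collections import deque


class FlushBarrier:
    """Toy model of flush ordering when requests complete out of order.

    All names here are hypothetical; this is not nbd-server code.
    """

    def __init__(self):
        self.in_flight = set()    # handles started but not yet completed
        self.flagged = None       # handles the pending flush still waits on
        self.flush_handle = None  # handle of the pending flush, if any
        self.held = deque()       # requests received while a flush is pending
        self.log = []             # sequence of events, for illustration only

    def on_request(self, handle, kind):
        if self.flush_handle is not None:
            # A flush is pending: stop handling incoming requests.
            self.held.append((handle, kind))
            return
        if kind == "flush":
            self.flush_handle = handle
            # Flag every outstanding request; the flush waits for these.
            self.flagged = set(self.in_flight)
            self._maybe_flush()
        else:
            self.in_flight.add(handle)
            self.log.append(("start", handle))

    def on_complete(self, handle):
        self.in_flight.discard(handle)
        self.log.append(("done", handle))
        if self.flagged is not None:
            self.flagged.discard(handle)
            self._maybe_flush()

    def _maybe_flush(self):
        if self.flagged:
            return  # some flagged requests are still outstanding
        # A real server would call fsync() here before replying.
        self.log.append(("flush", self.flush_handle))
        self.flush_handle = None
        self.flagged = None
        # Resume handling the requests that were held back.
        while self.held and self.flush_handle is None:
            self.on_request(*self.held.popleft())
```

In this model a write that arrives after the flush request is not started until the flush has been handled, which is stricter than required but keeps the bookkeeping simple.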
[...]
> > 2) Overlapping requests
> >
> > I assume that requests may overlap. For example a client may write a
> > block of data and read it again before the write was ACKed. This would
> > be unexpected behaviour from a proper client but not forbidden.
>
> Correct
>
> > As such
> > the server has to internally ensure the proper order of overlapping
> > requests.
>
> Slightly surprisingly, the fsdevel folk's answer to this is that you
> can disorder both reads and writes and do what is natural, i.e. do
> not maintain ordering. A file system which cares about the result
> should not issue reads of blocks for which the writes have not
> completed.
Interesting to know.
[...]
> > + not NBD_CMD_FLAG_FUA:
> > a) reply when the data has been received
> > b) reply when the data has been committed to cache (write() returned)
> > c) reply when the data has been committed to physical medium
>
> You may do any of those. Provided you will write the data "eventually"
> (i.e. when you receive a REQ_FLUSH or a disconnect).
>
> > For a+b how does one report write errors that only appear after
> > the reply? Report them in the next FLUSH request?
>
> You don't. To be safe, I'd error every write (i.e. turn the medium
> read only).
I don't think errors that appear after the reply are possible in the
case of b (they are in the case of a, obviously)? Or what am I missing?
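For illustration, options (b) and (c) differ only in whether the server syncs before replying. A minimal sketch in Python, assuming a POSIX system; handle_write and its sync_before_reply parameter are hypothetical names, not part of nbd-server. On such systems a write-back failure that happens after write() has returned is typically only reported by a later fsync().

```python
import os


def handle_write(fd, offset, data, sync_before_reply):
    """Write data at offset and return an errno-style code (0 = success).

    sync_before_reply=False corresponds to option (b): reply once write()
    has returned. sync_before_reply=True corresponds to option (c): reply
    only after fsync() has returned as well.
    """
    try:
        view = memoryview(data)
        while view:
            # pwrite() may write fewer bytes than requested, so loop.
            written = os.pwrite(fd, view, offset)
            view = view[written:]
            offset += written
        if sync_before_reply:
            os.fsync(fd)
    except OSError as exc:
        return exc.errno
    return 0
```

With sync_before_reply=False, any error raised during later write-back cannot be attached to this reply, since the reply has already been sent.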
[...]
> > * NBD_CMD_DISC: Wait for all pending requests to finish, close socket
>
> You should reply to all pending requests prior to closing the socket
> I believe, mostly as it's polite. I believe the current client doesn't
> send a disconnect until all replies are in,
I believe so too, yes.
[...]
> and I also think the server may behave a little badly here.
How so?
> > Should this flush data before closing the socket? And if so what if
> > there is an error on flush? I guess clients should send NBD_CMD_FLUSH
> > prior to NBD_CMD_DISC if they care.
>
> No, you should not rely on this happening. Even umount of an ext2 volume
> will not send NBD_CMD_FLUSH, even where kernel, client, and server support it.
> You don't need to write it then and there (in fact there is no 'then
> and there' as an NBD_CMD_DISC has no reply),
It does have one -- the FIN packet. But yeah, it's not an
application-layer reply, that much is true.
> but you cannot guarantee *at all* that you will have received any sort
> of flush under any circumstances.
Correct. All you know is that the server will close its file handles on
disconnect.
> > What if there are more requests after this while waiting for pending
> > requests to finish? Should they be ignored or return an error?
>
> I believe it is an, um, undocumented implicit assumption that no
> commands are sent after NBD_CMD_DISC is sent. The current server
> just closes the socket, which will probably result in an EPIPE
> upstream if the FIN packet gets back before these other commands
> are written.
The client will flush its outgoing queue before sending a disconnect
request. Indeed, if it didn't do that, badness would ensue.
[...]
--
The volume of a pizza of thickness a and radius z can be described by
the following formula:
pi zz a