
Re: [Nbd] Question about the expected behaviour of nbd-server for async ops



On Sun, May 29, 2011 at 12:08:03PM +0200, Goswin von Brederlow wrote:
> Alex Bligh <alex@...872...> writes:
> 
> > Goswin,
> >
> > --On 28 May 2011 16:37:12 +0200 Goswin von Brederlow
> > <goswin-v-b@...186...> wrote:
> >
> >> 2) Overlapping requests
> >>
> >> I assume that requests may overlap. For example a client may write a
> >> block of data and read it again before the write was ACKed. This would
> >> be unexpected behaviour from a proper client but not forbidden.
> >
> > Correct
> >
> >> As such
> >> the server has to internally ensure the proper order of overlapping
> >> requests.
> >
> > Slightly surprisingly, the fsdevel folk's answer to this is that you
> > can disorder both reads and writes and do what is natural, i.e. do
> > not maintain ordering. A file system which cares about the result
> > should not issue reads of blocks for which the writes have not
> > completed.
> 
> I guess this makes sense if you think of the behaviour with multiple
> cpus and threads. The threads might invoke read/write calls at the same
> time. Allowing disorder means that the requests can be processed in
> parallel through all the layers without having to synchronize between
> cpus.
> 
> Wouter: Could we make a decision here about the behaviour of a correct
> nbd-server in this? Must it logically preserve the order of read/write
> requests (i.e. return the value you would get if it had been done in
> order) or can it implement the disordered behaviour that linux seems to
> allow?

If the kernel currently allows it, then I'll allow it too: the result of
reading a block of data for which the write call has not been confirmed
yet is now officially undefined.

> The latter would be much simpler code-wise.

Quite.

> >> 3) Timing of replies and behaviour details
> >>
> >> Now this is the big one. When should the server reply to a request and
> >> how should it behave in detail? Most important is the barrier question
> >> on FUA/FLUSH.
> >>
> >> * NBD_CMD_READ: reply when it has the data, no choice there
> >
> > Technically you need not reply as soon as you have data, but you
> > can't reply before.
> 
> True. And a read reply takes time (lots of data to send). In case there
> are multiple replies pending it would make sense to order them so that
> FUA/FLUSH get priority I think. After that I think all read replies
> should go out in order of their request (oldest first) and write replies
> last. Reason being that something will be waiting for the read while the
> writes are likely cached. On the other hand write replies are tiny and
> sending them first gets them out of the way and clears up dirty pages on
> the client side faster. That might be beneficial too.
> 
> What do you think?

I've been working on a multithreaded scatter/gather implementation of
the backend; the current code is in a 'scatgat' branch (which needs
updating for post-2.9.21 patches). It currently fails 'make check'
though, so it's certainly not ready yet.

That implementation, once complete, will read a request from the socket,
check whether it is a read or write request and read the data in case of
a write request, submit the request to a GThreadPool to read from or
write to the backend storage, and use writev (which the man page tells
me is supposed to be atomic) to write the entire reply to the socket at
once (that is, the header plus the data in case of NBD_CMD_READ). So
reads from the socket are done in the main thread, and writes are done
from worker threads, and we don't need to use mutexes much, which is
good.

This would not care about ordering much, is a fairly simple
implementation (once I get rid of the bugs), and should improve
performance since reads and writes are no longer handled sequentially.
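
To make the reply path concrete, here is a rough sketch of how a worker
thread could send the header and the data with a single writev() call.
This is not the code in the 'scatgat' branch; the names nbd_reply_hdr
and send_read_reply are made up for illustration. The header layout
(magic, error, 8-byte handle) is that of the simple NBD reply. The
sketch just returns whatever writev() returns, so a short write is left
for the caller to deal with.

```c
#include <arpa/inet.h>   /* htonl */
#include <assert.h>
#include <stdint.h>
#include <string.h>      /* memcpy */
#include <sys/types.h>   /* ssize_t */
#include <sys/uio.h>     /* writev, struct iovec */
#include <unistd.h>      /* general POSIX I/O declarations */

#define NBD_REPLY_MAGIC 0x67446698u

/* Simple NBD reply header: 4 + 4 + 8 = 16 bytes, no padding needed.
 * (Struct name is hypothetical.) */
struct nbd_reply_hdr {
    uint32_t magic;     /* NBD_REPLY_MAGIC, network byte order */
    uint32_t error;     /* 0 on success, network byte order */
    char     handle[8]; /* copied verbatim from the request */
};

/*
 * Hypothetical helper: send one reply with a single writev() call.
 * For a successful read, the payload follows the header in the same
 * call; for an error (or a reply without payload) only the header is
 * sent. Returns the byte count reported by writev(), or -1 on error.
 */
ssize_t send_read_reply(int sock, const char handle[8], uint32_t error,
                        const void *data, size_t len)
{
    struct nbd_reply_hdr hdr;
    struct iovec iov[2];
    int iovcnt = 1;

    assert(handle != NULL);

    hdr.magic = htonl(NBD_REPLY_MAGIC);
    hdr.error = htonl(error);
    memcpy(hdr.handle, handle, sizeof hdr.handle);

    iov[0].iov_base = &hdr;
    iov[0].iov_len  = sizeof hdr;

    if (error == 0 && data != NULL && len > 0) {
        iov[1].iov_base = (void *)data; /* writev() does not modify it */
        iov[1].iov_len  = len;
        iovcnt = 2;
    }

    return writev(sock, iov, iovcnt);
}
```

A worker would call this once it has the data from the backend, e.g.
send_read_reply(sock, req_handle, 0, buf, req_len), where req_handle
and req_len come from the request it was handed. Since each reply goes
out in one call, the workers need no shared send buffer.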

[...]

-- 
The volume of a pizza of thickness a and radius z can be described by
the following formula:

pi zz a


