Re: [Nbd] Design concept for async/multithreaded nbd-server
- To: Goswin von Brederlow <goswin-v-b@...186...>
- Cc: email@example.com
- Subject: Re: [Nbd] Design concept for async/multithreaded nbd-server
- From: Wouter Verhelst <w@...112...>
- Date: Mon, 5 Mar 2012 09:56:01 +0100
- Message-id: <20120305085601.GD5472@...3...>
- In-reply-to: <87pqcutvko.fsf@...860...>
- References: <87pqcutvko.fsf@...860...>
On Sat, Mar 03, 2012 at 12:46:47AM +0100, Goswin von Brederlow wrote:
> I kind of want to just gather my thoughts on this and maybe use you as a
> sounding board. Hopefully this might also come in handy if you rewrite
> the nbd-server with threads.
> The story so far:
> The Linux kernel supports having multiple requests in the air and
> handles out-of-order replies to requests. There is also a patch for the
> nbd kernel module to support FLUSH/FUA/TRIM that would make true
> asynchronous request handling safe.
> The nbd-server is single threaded and uses synchronous IO. Each request
> is completed and replied before the next request is handled. That means
> for example that a small read of a cached block has to wait for a large
> write preceding it to complete. Also write requests wait for the write
> to actually complete before replying.
> Where do we go from here?
> There are multiple levels of async behaviour possible with more or less
> improvement in speed and increase in risk of data loss:
> 1) Handle requests in parallel but wait for each request to complete
> before replying. This would involve using fsync/sync_file_range/msync to
> ensure data reaches the physical disk (the disk's write cache actually)
> before replying. This would be perfectly safe.
> 2) Handle requests in parallel but wait for each request to complete
> before replying. But do not fsync/sync_file_range/msync unless required
> by FUA or FLUSH. This would still be safe as long as the system does not
> crash. An nbd-server crash would not result in data loss.
> 3) Handle requests in parallel and reply immediately on receiving write
> requests. This would be the fastest but also involve the most risk. The
> nbd-server would basically cache writes for a short while and an
> nbd-server crash would lose that data. Error detection would also be
> problematic since requests have already been acknowledged by the time a
> write error occurs. The error would have to be transmitted in reply to
> the next FLUSH request or as a new kind of packet. So this might go to
> Requests could also be handled out-of-order. A read request sent before
> an overlapping write request could reply with the data of the write. I
> do not believe that that would be correct behaviour. With multiple
> client connections the order in which requests from different clients
> are received is somewhat random. I would still serialize them in the
> order in which they are received. A write from client A should not cut
> in front of a read from client B.
> Handling multiple requests in parallel means that there could be
> overlaps between requests, especially if the server supports multiple
> client connections. So some synchronization feature should be used.
> So what do you think?
I don't think this is needed.
When you have multiple overlapping requests in-flight to a disk, the
result is undefined. So filesystems already need to deal with that. As
such, I don't think there's anything inherently wrong with doing the
same for NBD.
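To make "overlapping" concrete, here is a minimal Python sketch of the
range check that the synchronization proposed in the quoted text would
need. The Request type and its fields are hypothetical and are not taken
from nbd-server; the sketch only illustrates that two requests conflict
when their byte ranges intersect and at least one of them is a write.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical in-flight request: a byte range on the export."""
    offset: int
    length: int
    is_write: bool


def overlaps(a: Request, b: Request) -> bool:
    """True if the half-open ranges [offset, offset + length) intersect."""
    if a.length <= 0 or b.length <= 0:
        return False  # an empty range intersects nothing
    return a.offset < b.offset + b.length and b.offset < a.offset + a.length


def conflicts(a: Request, b: Request) -> bool:
    """Overlapping requests only conflict if at least one is a write."""
    return overlaps(a, b) and (a.is_write or b.is_write)
```

Two overlapping reads never conflict under this definition, so only
write/write and read/write pairs would need any ordering.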
Not dealing with multiple outstanding requests has the advantage that we
don't need to add incredible amounts of complexity to the server. This
is not to be underestimated.
So I'd recommend against implementing it. It doesn't help much, anyway.
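For reference, the difference between levels 1 and 2 in the quoted
proposal comes down to when the server syncs before replying. A minimal
Python sketch, assuming a POSIX system; handle_write, always_sync and the
FLAG_FUA value are hypothetical names chosen for illustration, not
nbd-server code or protocol constants:

```python
import os

# Hypothetical flag value for illustration only; not the NBD protocol constant.
FLAG_FUA = 1


def handle_write(fd: int, offset: int, data: bytes, flags: int,
                 always_sync: bool) -> int:
    """Write data at offset and return the number of bytes written.

    always_sync=True models level 1 (sync before every reply).
    always_sync=False models level 2 (sync only when FUA is requested).
    """
    written = 0
    while written < len(data):
        # os.pwrite may write fewer bytes than requested, so loop until done.
        written += os.pwrite(fd, data[written:], offset + written)
    if always_sync or (flags & FLAG_FUA):
        # Ask the kernel to flush this file's data before the reply is sent.
        os.fsync(fd)
    return written
```

In both modes the reply is sent only after the write call has returned,
which is why an nbd-server crash alone would not lose acknowledged data.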
The volume of a pizza of thickness a and radius z can be described by
the following formula:
pi zz a