Re: [Nbd] Question about the expected behaviour of nbd-server for async ops

To: Wouter Verhelst <w@...112...>
Cc: nbd-general@lists.sourceforge.net
Subject: Re: [Nbd] Question about the expected behaviour of nbd-server for async ops
From: Alex Bligh <alex@...872...>
Date: Sun, 29 May 2011 19:37:35 +0100
Message-id: <B42ED1042D1F1F692F0D590C@...873...>
Reply-to: Alex Bligh <alex@...872...>
In-reply-to: <20110529181151.GH31747@...510...>
References: <87oc2m28o7.fsf@...860...> <6DBA2A6208847F844397DB62@...873...> <87wrh9bz0c.fsf@...860...> <0F0AB0F66B196FBBE86999A5@...874...> <87fwnxy8gi.fsf@...860...> <4691046225438F6EB4A447F6@...873...> <20110529181151.GH31747@...510...>



--On 29 May 2011 20:11:51 +0200 Wouter Verhelst <w@...112...> wrote:

> No, we need to define how nbd behaves. This may or may not be the same
> thing as how the Linux block layer behaves, and it may change at some
> point in the future if we add additional messages to the protocol.
> I'm *not* going to be adding negotiation messages of the likes of "use
> 2.6.42-style semantics".


Your challenge, should you choose to accept it, is to define the
current requirements /without/ reference to behaviour of a specific
group of kernels...

See also my message to Goswin about how we already have problems
here, at least in theory.

> I'm also not going to care about marking the particular write that
> failed. If you're writing to a broken disk, your filesystem is going to
> lock up anyway, and then all writes will fail. If write() returns an
> error condition, I'd return that to the client. If fsync() returns an
> error condition, I'd return that to the client too. Beyond that, I'm not
> going to care much about assigning failures to the correct write.


I agree 100%. There is no need to care where the error occurs. Indeed
I think the kernel does a good job of losing the info anyway.

All I was saying is that if you are reordering writes, caching stuff
etc., and you get an underlying error, you might consider erroring all
subsequent writes/flushes - I'd rather tell the block layer too much
failed than too little. Writes do not have to fail atomically anyway.

A good example is as follows: suppose you operate a huge writeback cache.
The client runs ext3 without -o barrier=1, so it issues requests without
FUA and issues no flushes. It then goes idle. Your watchdog timer
expires and you decide to flush some data out to disk of your
own volition. As it happens, the hard disk has died, and the fsync()
errors. You have no command to report this to the client. Are you
really going to let future writes to your huge RAM-based cache
succeed, and never report the fsync() error? I'd be erroring every
write after the fsync() error because I'd be saying "this nbd device is
broken - there is a high probability your writes will never reach
a disk". Given that in normal operation (mount, unmount, disconnect)
you NEVER receive a REQ_FLUSH from an ext3 device with no barriers,
and you can't error the disconnect, not erroring subsequent writes
would lead the user to think his device had been successfully unmounted
and disconnected. I'd consider that bad.
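
To make the "error everything after a failed flush" idea concrete, here
is a minimal sketch in Python. It is not nbd-server code: the
WritebackCache class, the backend_flush callable and the dead_disk
helper are all hypothetical names invented for illustration, and the
backend is assumed to raise OSError when the underlying write()/fsync()
fails. The only point is the latch: the first flush failure is
remembered and returned for every later write or flush.

```python
import errno


class WritebackCache:
    """Toy writeback cache that latches the first flush failure."""

    def __init__(self, backend_flush):
        # backend_flush is a hypothetical callable standing in for
        # write() + fsync() on the backing file. It receives a dict of
        # {offset: bytes} and raises OSError on failure.
        self._backend_flush = backend_flush
        self._dirty = {}
        self._sticky_errno = 0  # 0 means the device is believed healthy

    def write(self, offset, data):
        """Handle a client write; return 0 or an errno for the reply."""
        if self._sticky_errno:
            # An earlier flush failed: refuse to pretend RAM is a disk.
            return self._sticky_errno
        self._dirty[offset] = bytes(data)
        return 0

    def flush(self):
        """Handle a client flush or a server-initiated watchdog flush."""
        if self._sticky_errno:
            return self._sticky_errno
        try:
            self._backend_flush(self._dirty)
        except OSError as exc:
            # Latch the error so every later request reports it.
            self._sticky_errno = exc.errno or errno.EIO
            return self._sticky_errno
        self._dirty.clear()
        return 0


def dead_disk(_dirty):
    """Simulated backend whose disk has died."""
    raise OSError(errno.EIO, "simulated fsync failure")


cache = WritebackCache(dead_disk)
cache.write(0, b"data")     # accepted into the RAM cache
cache.flush()               # watchdog flush fails and latches EIO
cache.write(4096, b"more")  # now rejected with EIO
```

With this shape the client learns about the failure on its next write
even though it never sends a flush itself, which is the case described
above for ext3 without barriers.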

--
Alex Bligh
