Re: [Nbd] [PATCH/RFCv2] Remove NBD_OPT_BLOCK_SIZE

To: Alex Bligh <alex@...872...>
Cc: "nbd-general@lists.sourceforge.net" <nbd-general@lists.sourceforge.net>
Subject: Re: [Nbd] [PATCH/RFCv2] Remove NBD_OPT_BLOCK_SIZE
From: Wouter Verhelst <w@...112...>
Date: Wed, 27 Apr 2016 14:04:59 +0200
Message-id: <20160427120459.GA17932@...3...>
In-reply-to: <97B91103-CE3B-4661-B528-85E45CCCC875@...872...>
References: <1461604203-63003-1-git-send-email-alex@...872...> <20160426083013.GA3624@...3...> <FC22BF52-F390-475C-A170-AA2B6599FAD5@...872...> <20160426103031.GA13582@...3...> <97B91103-CE3B-4661-B528-85E45CCCC875@...872...>

Hi,

On Tue, Apr 26, 2016 at 06:10:46PM +0100, Alex Bligh wrote:
[...]
> >> I'm arguing that we can legitimately make block size support part of the info
> >> spec, and say if you ask for info, you must be prepared to deal with it. The
> >> alternate view is that info and blocksizes are different things, in which
> >> case the server needs some way of knowing whether the client will respect
> >> block sizes if given.
> > 
> > Right. I think they are different things. They look similar today,
> > because we mostly add INFO because of the block sizes -- but we also add
> > it for STARTTLS, and a client which doesn't care about block sizes but
> > which does care about STARTTLS should not have to suddenly worry about
> > block-aligning requests, etc.
> 
> OK. I suppose my counterargument would be 'but blocksize constraints
> exist today, just like STARTTLS did before it was a formal option'.

Except that they don't, at least not entirely. Yes, there are
constraints on maximum block sizes. However, before I fixed that bug,
nbd-server would crash on requests >= 1MiB, and it took years to fix
because such large requests were (and still are) highly unlikely. So
that side of the question is boring and not what I'm talking about; and
a client which gets EOVERFLOW on a 1GB request gets what it deserves.

The minimum block size constraint, however, is an entirely different
matter. Current clients are well within their rights to send an
arbitrary-sized request and expect the server to handle it (modulo bugs
that apparently do exist). I'm not saying it's reasonable for clients to
send a request of 3 bytes, but really, it's legal today, and we
shouldn't make such clients suddenly buggy because we think it's weird
what they're doing.
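
To make the distinction concrete, here is a minimal sketch of the check
a server would apply if it enforced block size constraints. The function
name and signature are hypothetical and not taken from any existing
server; the defaults of 1 and 0xffffffff are the default constraints
mentioned further down in this thread.

```python
def obeys_block_size(offset, length, min_block=1, max_block=0xffffffff):
    """Return True if a request respects the given block size constraints.

    Hypothetical helper for illustration only.
    """
    # Too short or too long for this server.
    if length < min_block or length > max_block:
        return False
    # Both the start and the length must be multiples of the minimum size.
    return offset % min_block == 0 and length % min_block == 0
```

With the defaults, a 3-byte request at offset 7 passes; with
min_block=512 the same request fails, which is exactly the case of a
previously legal client suddenly becoming "buggy".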

[...]
> * Something in me feels the NBD_OPT stuff is asking the server
>   about what it does, rather than the client telling the server
>   about what it does. You almost want the server to be able to
>   ask the client things. This might be just in my head.

I think it is :-)

The option haggling was always intended to be a two-way conversation,
with the client setting options and the server (possibly) returning
information. This falls well within that.

[...]
> >> But we have more contentiousness in the following.
> > 
> > Not really.
> 
> I meant I disagreed more about what you wrote below this point.

and I meant to say that what I wrote below this point is less important
to me than the bit before, and that I'm more likely to just drop that
bit if needs be :-)

> >>> In addition, while I understand the reasoning behind announcing block
> >>> sizes, I do not want to make "abiding by announced block sizes" a
> >>> prerequisite of "using the more modern way of choosing an export". The
> >>> protocol currently has no requirement for a server to use a particular
> >>> block size, and I think this is a feature. Sure, most clients *do*
> >>> currently request data in block-sized units, but that doesn't mean we
> >>> need to make that a requirement. A simple user-space client which wants
> >>> to inspect (and possibly modify) some parts of an export might not care
> >>> about block sizes; we should make writing such clients not harder than
> >>> needs be.
> >> 
> >> Well, actually every server I know of ALREADY has block size constraints.
> >> nbd-server.c is weakest, in that it won't support operations >= 1<<31
> >> bytes. Qemu will fail stuff >32Mb. gonbdserver.c fails on stuff >128Mb
> >> (and I won't be removing that - the DoS opportunity is too great).
> >> Many servers perform substantially better if reads/writes are aligned.
> > 
> > Sure, but that's not my point.
> > 
> > A server is already allowed to reject 'too large' requests, even if
> > block sizes aren't negotiated. That's fine, and we should keep that.
> 
> They aren't allowed to per the standard, but they do.

Yes, that.

> > However, servers *aren't* currently allowed to reject requests that are
> > too small, or that are not block-aligned.
> 
> They aren't allowed to per the standard, but they do. I don't see
> the difference.

That the overflow is far less likely to happen than the underflow,
especially if the minimum size is set to something > 512.

> > I think that's the only case
> > in which NBD_REP_ERR_BLOCK_SIZE_REQD should be sent; it signals that the
> > server can't support a certain type of request, and that the client
> > should be prepared for that.
> 
> Right. So if you take the case of a server that can't (because it
> is impossible / impractical to do so, it carries other disadvantages,
> or the author is lazy, or whatever) support block writes < 512 bytes
> (say), that's no different from one that can't (because it is impossible
> / impractical to do so, or because the author is lazy or whatever)
> support >32Mb (cough, qemu) or >128Mb (cough, gonbdserver) or >allocatable
> memory (cough, nbdserver.c) requests.

Except that I believe the former to be far more likely than the latter.

> The way the block size stuff is currently written, there is a default
> set of block constraints - for these purposes the minimum size of 1
> byte and the maximum size of 0xffffffff are the only things that
> matter. If you support those, you shouldn't send
> NBD_REP_ERR_BLOCK_SIZE_REQD (currently the text is slightly wonky
> as it says you shouldn't send it if you can agree them externally,
> whereas obviously if they are the default you don't need to
> agree them externally - I will fix that). IE you should only
> be sending NBD_REP_ERR_BLOCK_SIZE_REQD if you can't tell the
> client that you don't support the block sizes it is going
> to expect (in the absence of asking you).
> 
> > Block-aligning requests on the client side may be a bit much work
> > though, and some clients may not have the ability to abide by that
> > request. Therefore, if you're going to require block-aligned requests,
> > you're effectively passing work to clients.
> 
> Well that's one way of looking at it. Another is you are otherwise
> requiring the server to do it.

Yes, but the server is far less likely to be in a minimal environment
where "doing a lot of work" is Hard(tm) than the client is.
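
As a rough illustration of the work in question, the sketch below shows
one way a server could emulate byte-granular writes on top of a backend
that only handles whole blocks (read-modify-write). Everything here is
an assumption made for the example: MemoryBackend, read_block,
write_block and unaligned_write are hypothetical names, not part of any
existing NBD implementation.

```python
class MemoryBackend:
    """Hypothetical in-memory backend that only does whole-block I/O."""

    def __init__(self, blocks, block_size=512):
        self.block_size = block_size
        self.data = bytearray(blocks * block_size)

    def read_block(self, n):
        start = n * self.block_size
        return bytes(self.data[start:start + self.block_size])

    def write_block(self, n, block):
        assert len(block) == self.block_size
        start = n * self.block_size
        self.data[start:start + self.block_size] = block


def unaligned_write(backend, offset, data):
    """Write `data` at an arbitrary byte offset using whole-block I/O."""
    size = backend.block_size
    pos = 0
    while pos < len(data):
        block_no, start = divmod(offset + pos, size)
        chunk = data[pos:pos + size - start]
        if len(chunk) == size:
            # Whole block: no need to read the old contents first.
            backend.write_block(block_no, chunk)
        else:
            # Partial block: read it, patch the bytes, write it back.
            block = bytearray(backend.read_block(block_no))
            block[start:start + len(chunk)] = chunk
            backend.write_block(block_no, bytes(block))
        pos += len(chunk)
```

For example, unaligned_write(MemoryBackend(4), 510, b"abcd") touches
blocks 0 and 1, each with a read followed by a write. That extra read
is the cost the server takes on so that the client does not have to.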

> In the general case I suspect it's easier on the client, because most
> client stacks (kernel, qemu being my 2 examples) are already capable
> of dealing with backends that have fixed block sizes (normal block
> devices, hard disks etc.) so will already do this.

Mmm. Possibly.

[...]
> >> Much better to fail at negotiation stage if you really can't
> >> support (or can't risk pretending to support) the client.
> >> 
> >> If what you are saying is 'it is unreasonable to use the nbd
> >> protocol unless you can always somehow support byte-aligned
> >> writes' I don't really agree.
> > 
> > I'm not saying that.
> > 
> > A small server could certainly use NBD_REP_ERR_BLOCK_SIZE_REQD and
> > refuse to talk to clients who don't (announce they) know how to
> > block-align requests. There's nothing wrong with that.
> > 
> > However, a server written for full interoperability and maximum
> > usefulness should not do so. It may issue a warning that it will be
> > slower, but it should be able to operate at a basic level.
> 
> .... you are characterising such a server as a server with
> less than full interoperability. I don't agree. It's a perfectly
> reasonable server choice. In fact I'd say 'not supporting block
> sizes is an issue with client interoperability, not a
> server issue'.

The problem is that current clients, which don't know about block sizes,
will be *completely* unable to speak with a server which *requires*
block size negotiation. A server which does not *require* such
negotiation (but prefers it) will be able to talk to such a client. As
such, a server which sends BLOCK_SIZE_REQD is less than fully
interoperable.

It's okay (in my book) for a server to perform less than optimally with
older clients, but it's *not* okay for a server to refuse to talk to
older clients (if we can avoid it).
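
The policy described above could be sketched roughly as follows. This is
illustrative only: the Action enum, choose_action and its parameters are
hypothetical names, and the only identifier taken from the actual
proposal is NBD_REP_ERR_BLOCK_SIZE_REQD.

```python
from enum import Enum


class Action(Enum):
    SERVE = "serve normally"
    SERVE_SLOWLY = "serve, emulating unaligned requests"
    REFUSE = "NBD_REP_ERR_BLOCK_SIZE_REQD"


def choose_action(client_negotiated, server_min_block, can_emulate):
    """Decide how a server treats a client during negotiation.

    client_negotiated: the client asked for (and will respect) block sizes.
    server_min_block:  smallest request the backend handles natively.
    can_emulate:       the server is able to emulate smaller requests.
    """
    if client_negotiated or server_min_block == 1:
        return Action.SERVE
    if can_emulate:
        # Older client: works, possibly slower; a warning may be logged.
        return Action.SERVE_SLOWLY
    # Last resort for servers that really cannot cope.
    return Action.REFUSE
```

A server aiming for full interoperability would never reach the last
branch; a small server might, and that remains permitted.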

> NBD is fundamentally to me a block device, not a file seeking
> device (the clue being in the name). If a hard disk vendor said
> "well, I'm sorry our disk only supports reads and writes by
> whole sectors", the response "well that's not very interoperable"
> would not be seen as sensible. Nor would a 'iSCSI vendors
> SHOULD support byte-wise writes to block devices'. However,
> if we found in the iSCSI protocol 'clients MUST respect block
> sizes between X and Y as returned by the iSCSI target' we'd
> be pretty unsurprised.
> 
> That's the bit I think you may have backwards.

No, I think you misunderstand what I want to see happen here.

> > For that reason, I think we should add some language to discourage the
> > use of that option, with the understanding that "discourage" does not
> > mean "forbid".
> 
> If it was me, I would make it the other way around, that clients
> SHOULD ask for blocksize info and respect it. And were it not for
> the fact we haven't had blocksize in there from day 1, I'd make
> it a 'MUST' (obviously that's impractical now).

Yes, clients clearly should ask for the information. My point is that a
server which sees a client that doesn't ask for it should be written so
that it can still talk to that client (which presumably is an older
client).

[...]
-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12
