[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] [PATCH v2] doc: Add new NBD_REP_INFO reply, for advertising block size



On Wed, Apr 13, 2016 at 08:50:22AM -0600, Eric Blake wrote:
> On 04/13/2016 01:27 AM, Wouter Verhelst wrote:
> > Eric,
> > 
> > On Tue, Apr 12, 2016 at 06:16:54PM -0600, Eric Blake wrote:
> >> Existing NBD servers often have limitations, such as requiring
> >> actions to be aligned to block sizes or limiting maximum
> >> transactions to avoid denial of service attacks; for example,
> >> qemu's NBD server refuses any transaction larger than 32M.  But
> >> to date, clients have to learn these limitations via out-of-band
> >> means.
> >>
> >> This alters NBD_OPT_INFO and NBD_OPT_GO to use a new reply type
> >> NBD_REP_INFO (rather than overloading NBD_REP_SERVER), so that
> >> we have a future-proof way of supplying as much additional
> >> structured information about an export as we want.
> >>
> >> Design decision: a client that wants to learn block sizes MUST
> >> use NBD_OPT_GO, rather than the old NBD_OPT_EXPORT_NAME, even
> >> though we could have repurposed some of the reserved zeroes when
> >> NBD_FLAG_C_NO_ZEROES is not in effect, because we don't want to
> >> encourage any further abuse of NBD_OPT_EXPORT_NAME.
> >>
> >> Design decision: no new global NBD_FLAG or NBD_OPT are required;
> >> there is nothing for the client to negotiate.  The server merely
> >> provides as much information as it can, and the client then
> >> interprets what information it understands.  The items are
> >> structured so that a client can ignore details from the server
> >> that the client does not know about, and that we can easily add
> >> future items of information.
> > 
> > General note (in-depth review may follow later):
> > 
> > Currently, there are no default minimum or maximum block sizes, and
> > therefore they are effectively limited to "1 byte" for the minimum block
> > size, and "the size of the device" for the maximum block size.
> 
> We have existing servers that don't comply with this; I'm trying to
> approach it from the angle of:
> 
> - if the server advertises, then that is the limit
> - if the client and server communicate out of band, then use those limits
> - if the server does not advertise, then the server MUST support a
> minimum block size of 1 (although not all existing servers do, they can
> be considered buggy), and the client SHOULD NOT send requests smaller
> than 512 unless the export size proves a smaller block size is in use
> (that is, if export size % 512 is non-zero, we know the server supports
> a smaller block size in order to access those tail bytes), but the
> client MAY still attempt to send smaller alignments and hope that the
> server is not buggy
> 
> > 
> > I do agree that it might be advantageous for the server to announce such
> > minimum and maximum sizes, but I don't think that defining defaults that
> > differ from what historically has been the effective default is the
> > right way to go.
> > 
> 
> But that's my argument - historically, there ARE servers which insist on
> 512 or even 4096-byte alignment, and which misbehave (close the
> connection, return an error, or possibly even corrupt data) rather than
> do read-modify-write on misaligned requests.

Haven't seen those servers, but if they exist, they're buggy. There is
nothing in the protocol *today* that allows them to do so.

Yes, in practice, they're probably used mostly by their own clients, and
yes, in practice most clients perform requests by the block rather than
by the byte, anyway, but that doesn't mean doing something like that is
legal.

> And qemu's server certainly kills connections on attempts to exceed 32M in a
> single transaction.

Right. I'm willing to cut them some slack on that, mostly because older
versions of nbd-server would choke much earlier than that.

> So while the theoretical server should be unlimited
> unless advertised, the practical client should stick within sane limits
> rather than attempting to exploit unlimited.

Yes. But that is something else than to claim that "the default is X Y
Z".

Something like this seems sane:

A client SHOULD assume (sane defaults), but MAY assume anything.

A server MAY announce (whatever it wants), but SHOULD allow (pretty much
anything) if it doesn't.

A server MUST be prepared for clients which ignore whatever they
announce by, at the very least, sending appropriate errors, and MUST NOT
corrupt data in that case.

> I'm open to suggestions on wording (such as using SHOULD rather than
> MUST when a value is not advertised, so that existing implementations
> are merely poorer quality of implementation rather than in violation of
> the spec).
> 
> > Therefore, I would like this to say that unless you announce
> > differently, the maximum block size is the size of the device, and the
> > minimum block size is 1 byte.
> > 
> > Having a default for preferred block size sounds sane, although it might
> > be better to switch it to 4096 (which is what most conversations seem to
> > use today) rather than 512.
> 
> Except that I suggested 65k, not 4096, as the default preferred block
> size.  Even though 4096 may be atomic, it is not as efficient; and in at
> least qemu's case, the default qcow2 file sizing uses 32k clusters as
> its preferred I/O size.

Still, I think 4096 makes more sense as a default preferred block size.
If it is not efficient in a particular situation, it is the server's
responsibility to announce the correct value. If there is no information
on what's most efficient, then using a value which is larger than most
filesystems' native block size promotes more bytes being shovelled
across the wire than is necessary.

So, using the most common block size in current transactions seems like
a good default. I say that it is "the most common", because nbd-client
has defaulted to 4k blocks for a very long time (there was a recent
change to the relevant code, but still).

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12



Reply to: