
Re: [PATCH] doc/proto: drop max-block restriction for data-less commands



On 2/4/20 6:02 AM, Vladimir Sementsov-Ogievskiy wrote:
> max-block restriction makes sense for NBD_CMD_WRITE and NBD_CMD_READ,
> to avoid big data transfers and long I/O operations on the server
> side. NBD_CMD_WRITE_ZEROES may still be implemented on the server by
> writing zeroes directly, which can lead to a long operation and a huge
> allocation, so it should stay restricted by max-block.
> Same for NBD_CMD_CACHE: long operation / big allocation.

> Still, NBD_CMD_TRIM, NBD_CMD_BLOCK_STATUS, and NBD_CMD_WRITE_ZEROES
> with NBD_CMD_FLAG_FAST_ZERO set are over-restricted by it. So, for
> better performance, drop these restrictions.

> Note that the qemu NBD server already behaves as this patch describes:
> it doesn't check the restriction for NBD_CMD_TRIM, NBD_CMD_BLOCK_STATUS,
> or NBD_CMD_WRITE_ZEROES.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>
> RFC question: hmm, the qemu NBD server doesn't check the restriction
> for WRITE_ZEROES at all, even without the NBD_CMD_FLAG_FAST_ZERO flag.

That's actually a good question. I looked today at relaxing qemu as a client to take advantage of this change as committed, and it's a bit easier to relax if we can guarantee that the server handles ALL oversized WRITE_ZEROES requests rather than just the FAST_ZERO ones (qemu automatically fragments large zero requests at the block layer; but if I make qemu's client code conditional on whether FAST_ZERO is being requested, then block/nbd.c has to report a larger max zero size in nbd_refresh_limits() and fragment non-fast requests itself, instead of getting pre-fragmented requests from the block layer).
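To make that concrete, here's a minimal C sketch of the limit selection involved. The names (NBDClientLimits, client_max_write_zeroes) are made up for illustration; this is not qemu's actual block/nbd.c code:

#include <stdint.h>

/* Simplified stand-in for the client's view of the server's limits;
 * not the real qemu structures. */
typedef struct NBDClientLimits {
    uint32_t min_block;     /* minimum I/O unit advertised by the server */
    uint32_t max_block;     /* per-request payload cap from the server */
    int server_relaxed;     /* nonzero if the server accepts oversized
                             * data-less requests per the relaxed spec */
} NBDClientLimits;

/* Choose the write-zeroes limit the NBD driver advertises upward to
 * the block layer.  Against a relaxed server we can report a much
 * larger limit and skip client-side fragmentation of zero requests. */
static uint64_t client_max_write_zeroes(const NBDClientLimits *lim)
{
    if (lim->server_relaxed) {
        /* Only the request's 32-bit length field caps us; round down
         * to the minimum block, a power of two per the spec. */
        return UINT32_MAX & ~((uint64_t)lim->min_block - 1);
    }
    /* Conservative default: stay within max_block so the block layer
     * pre-fragments large zero requests before they reach us. */
    return lim->max_block;
}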

But, as you say, with fast zeroes, the server already has to be aware of what qualifies as fast (only newer servers support the flag), while with older servers you don't get FAST_ZERO at all. At the same time, older servers don't know about the newer relaxed requirement, so clients STILL have to be prepared for EINVAL failures when an older server rejects what it considers an oversized request, regardless of whether a newer server would handle that larger size. So a client that wants to be portable across servers HAS to be prepared for oversized requests to fail, and should consider scaling back to the maximum block size rather than assuming the operation won't work at all.
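As a sketch, that portable fallback looks something like the following, where nbd_write_zeroes() is a hypothetical transport hook standing in for a real client's request/reply machinery:

#include <errno.h>
#include <stdint.h>

/* Hypothetical transport hook: issues one NBD_CMD_WRITE_ZEROES and
 * returns 0 on success or a negative errno on failure. */
int nbd_write_zeroes(uint64_t offset, uint64_t length);

/* Zero a region: first optimistically as a single oversized request,
 * then, if an older server rejects that with EINVAL, retry in
 * max_block-sized chunks within the server's advertised limit. */
int zero_region_portable(uint64_t offset, uint64_t length,
                         uint32_t max_block)
{
    int ret = nbd_write_zeroes(offset, length);
    if (ret != -EINVAL) {
        return ret;             /* success, or a genuine error */
    }
    while (length > 0) {
        uint64_t chunk = length < max_block ? length : max_block;
        ret = nbd_write_zeroes(offset, chunk);
        if (ret < 0) {
            return ret;
        }
        offset += chunk;
        length -= chunk;
    }
    return 0;
}

The point being that EINVAL on the big request means "scale back", not "give up".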


> So, we probably could go further, and allow big WRITE_ZEROES without
> this flag...
>
> Or maybe change s/MUST NOT/SHOULD NOT/ for this case...

Or I could follow through with my idea of letting clients and servers handshake actual limits with additional NBD_INFO_* replies during NBD_OPT_GO (now that I've posted my NBD_INFO_INIT_STATE patches, I'm getting closer to tackling that as my next project). The benefit of an advertised size is that clients won't have to guess whether an oversized request will fail with EINVAL on an older server.
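For illustration only, such a reply could mirror the existing NBD_INFO_BLOCK_SIZE layout (a 16-bit info type followed by big-endian 32-bit sizes), carried inside NBD_REP_INFO during NBD_OPT_GO. The info type value and field set below are entirely made up:

#include <stdint.h>

#define NBD_INFO_COMMAND_SIZES 0x1000   /* hypothetical info type */

/* Hypothetical per-command limits payload; all fields big-endian on
 * the wire, with 0 meaning "no cap beyond the 32-bit length field". */
struct nbd_info_command_sizes {
    uint16_t info_type;        /* NBD_INFO_COMMAND_SIZES */
    uint32_t max_read_write;   /* cap for NBD_CMD_READ / NBD_CMD_WRITE */
    uint32_t max_write_zeroes; /* cap for NBD_CMD_WRITE_ZEROES */
    uint32_t max_trim;         /* cap for NBD_CMD_TRIM */
} __attribute__((packed));

With an advertised per-command cap, a client's probe-and-retry fallback collapses to a simple size check before sending.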

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

