
Re: [Nbd] write_zeroes/trim on the whole disk



> On 24 Sep 2016, at 18:47, Vladimir Sementsov-Ogievskiy <vsementsov@...2723...9...> wrote:
> 
> I just wanted to say that if we want to be able to clear the whole disk in one request for qcow2, we have to take 512 as the granularity for such requests (with X = 9). And this is too small: 1 TB will be the upper bound for the request.

Sure. But I do not see the value in optimising these huge commands to run as single requests. If you want to do that, do it properly and have a negotiation-phase flag that supports 64-bit request lengths.
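To make the size bound concrete, here is a rough arithmetic sketch, assuming the request length stays a 32-bit field counted in units of 2^X bytes with X = 9. The names `SHIFT`, `UNIT` and `max_request_bytes` are purely illustrative and are not part of the protocol. Reading the quoted 1 TB figure as a cap of 2^31 units is my assumption; the full 32-bit range gives just under 2 TiB.

```python
# Illustrative arithmetic only; none of these names exist in the NBD protocol.
SHIFT = 9            # X = 9 from the quoted text: 512-byte granularity
UNIT = 1 << SHIFT    # 512 bytes per length unit
TIB = 1 << 40


def max_request_bytes(max_units: int, unit: int = UNIT) -> int:
    """Largest request, in bytes, when the length field counts `unit`-byte blocks."""
    return max_units * unit


# Assumption: capping the length at 2^31 units yields the quoted 1 TB bound.
capped_31bit = max_request_bytes(1 << 31)

# Using the whole unsigned 32-bit range instead.
full_32bit = max_request_bytes((1 << 32) - 1)

print(capped_31bit == TIB)           # True: exactly 1 TiB
print(full_32bit == 2 * TIB - UNIT)  # True: one unit short of 2 TiB
```

Either way the bound is fixed by the width of the length field, which is why a negotiated 64-bit length is the cleaner fix.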

> Full backup, for example:
> 
> 1. target can do fast write_zeroes: clear the whole disk (great if we can do it in one request, without splitting, etc.), then back up all data except zeroed or unallocated areas (skipping these saves a lot of time).
> 2. target cannot do fast write_zeroes: just back up all data. We need not clear the disk, as doing so would save no time.
> 
> So here we do not need splitting in general: either clear everything or clear nothing at all.

As I said, within the current protocol you cannot tell whether a target supports 'fast write zeroes', and indeed the support may be partial - for instance with a QCOW2 backend, a write that is not cluster aligned would likely only partially satisfy the command by deallocating bytes. There is no current flag for 'supports fast write zeroes' and (given the foregoing) it isn't evident to me exactly what it would mean.

However, it seems you could support your use case by simply iterating through the backup disk, using NBD_CMD_WRITE for the areas that are allocated and non-zero, and NBD_CMD_WRITE_ZEROES for the areas that are unallocated or zeroed. This technique requires no protocol change (beyond the existing NBD_CMD_WRITE_ZEROES extension), works irrespective of whether the target supports write zeroes or not, works irrespective of any difference in cluster allocation size between source and target, is far simpler, and has the added advantage of turning existing zeroes-but-not-holes areas into holes (this is optional if you can tell the difference between zeroes and holes on the source media). It also works in a single pass.

Yes, you need to split requests up, but you need to do that ANYWAY to cope with NBD_CMD_WRITE's 2^32-1 length limit (I strongly advise you not to use more than 2^31). You also probably want to parallelise reads and writes and have more than one write in flight, all of which suggests you are going to be breaking up requests regardless.
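As a minimal sketch of that iteration (not a real NBD client): assume the source side can hand you a list of extents flagged as data or as hole/zero. `Extent`, `plan_requests` and `MAX_REQUEST` are hypothetical names invented for illustration. The function only plans which command to issue for which range, splitting each extent so no request exceeds the cap. Actually sending the requests, and keeping several in flight, is left out.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

# Conservative per-request cap (2^31 bytes); hypothetical constant name.
MAX_REQUEST = 1 << 31


@dataclass(frozen=True)
class Extent:
    """Hypothetical description of one contiguous range on the source disk."""
    offset: int
    length: int
    is_data: bool  # True: allocated and non-zero; False: unallocated or zeroed


def plan_requests(
    extents: Iterable[Extent], max_len: int = MAX_REQUEST
) -> Iterator[Tuple[str, int, int]]:
    """Yield (command, offset, length) tuples covering every extent in order.

    Data extents map to NBD_CMD_WRITE, everything else to
    NBD_CMD_WRITE_ZEROES; each extent is split into pieces of at most
    max_len bytes.
    """
    for ext in extents:
        cmd = "NBD_CMD_WRITE" if ext.is_data else "NBD_CMD_WRITE_ZEROES"
        pos = ext.offset
        end = ext.offset + ext.length
        while pos < end:
            n = min(max_len, end - pos)
            yield (cmd, pos, n)
            pos += n


# Tiny usage example with an artificially small cap so the split is visible.
demo = [Extent(0, 10, True), Extent(10, 25, False)]
for request in plan_requests(demo, max_len=10):
    print(request)
```

With the small cap above, the data extent becomes one NBD_CMD_WRITE and the 25-byte zero extent becomes three NBD_CMD_WRITE_ZEROES requests of 10, 10 and 5 bytes. The planned requests are independent of each other, so they can be handed to however many workers you want in flight.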

-- 
Alex Bligh






