Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE

To: Max Reitz <mreitz@redhat.com>, "nbd@other.debian.org" <nbd@other.debian.org>, QEMU <qemu-devel@nongnu.org>, "qemu-block@nongnu.org" <qemu-block@nongnu.org>, "libguestfs@redhat.com" <libguestfs@redhat.com>
Cc: "Richard W.M. Jones" <rjones@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>, Alberto Garcia <berto@igalia.com>
Subject: Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
From: Eric Blake <eblake@redhat.com>
Date: Tue, 18 Feb 2020 14:55:19 -0600
Message-id: <c47d277e-35f8-9837-1f1d-eab4bb6d5840@redhat.com>
In-reply-to: <1b3741aa-7841-9062-ecca-73c38e599e05@redhat.com>
References: <a4394fde-f459-dcb5-1698-013e1e24c388@redhat.com> <1b3741aa-7841-9062-ecca-73c38e599e05@redhat.com>

On 2/17/20 9:13 AM, Max Reitz wrote:

Hi,

It’s my understanding that without some is_zero infrastructure for QEMU,
it’s impossible to implement this flag in qemu’s NBD server.

You're right that we may need some more infrastructure before being able
to decide when to report this bit in all cases. But for raw files, that
infrastructure already exists: does block_status at offset 0 and the
entire image as length return status that the entire file is a hole?
And for qcow2 files, it would not be that hard to teach a similar
block_status request to report the entire image as a hole based on my
proposed qcow2 autoclear bit tracking that the image still reads as zero.
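
As a rough illustration of that check (a sketch only, not qemu code): the
function name and the (offset, length, flags) tuples below are made up for
the example, while the two flag values are the ones the NBD protocol
defines for the "base:allocation" context. The idea is simply that the
extents must cover the whole image with no gaps and every one of them must
read as zero.

```python
# Status bits of the NBD "base:allocation" metadata context.
NBD_STATE_HOLE = 1 << 0
NBD_STATE_ZERO = 1 << 1


def export_reads_as_zero(extents, size):
    """Return True if the extents cover [0, size) and all read as zero.

    `extents` is a hypothetical list of (offset, length, flags) tuples,
    sorted by offset, standing in for a block status reply.
    """
    pos = 0
    for offset, length, flags in extents:
        if offset != pos or length <= 0:
            # Gap, overlap, or empty extent: nothing can be concluded.
            return False
        if not flags & NBD_STATE_ZERO:
            # At least one range may contain non-zero data.
            return False
        pos += length
    return pos == size


# A 1 MiB image reported as a single zero hole.
all_hole = [(0, 1 << 20, NBD_STATE_HOLE | NBD_STATE_ZERO)]
print(export_reads_as_zero(all_hole, 1 << 20))
```

A server that can answer this question cheaply at startup is in a position
to set the proposed bit; one that cannot simply leaves it unset.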


At the same time, I still haven’t understood what we need the flag for.

As far as I understood in our discussion on your qemu series, there is
no case where anyone would need to know whether an image is zero.  All
practical cases involve someone having to ensure that some image is
zero.  Knowing whether an image is zero can help with that, but that can
be an implementation detail.

For qcow2, the idea would be that there is some flag that remains true
as long as the image is guaranteed to be zero.  Then we’d have some
bdrv_make_zero function, and qcow2’s implementation would use this
information to gauge whether there’s something to do at all.

For NBD, we cannot use this idea directly because to implement such a
flag (as you’re describing in this mail), we’d need separate is_zero
infrastructure, and that kind of makes the point of “drivers’
bdrv_make_zero() implementations do the right thing by themselves” moot.

We don't necessarily need a separate is_zero infrastructure if we can
instead teach the existing block_status infrastructure to report that
the entire image reads as zero. You're right that clients that need to
force an entire image to be zero won't need to directly call
block_status (they can just call bdrv_make_zero, and let that worry
about whether a block status call makes sense among its list of steps to
try). But since block_status can report all-zero status for some cases,
it's not hard to use that for feeding the NBD bit.

However, there's a difference between qemu's block status (which is
already typed correctly to return a 64-bit answer, even if it may need a
few tweaks for clients that currently don't expect it to request more
than 32 bits) and NBD's block status (which can only report 32 bits
barring a new extension to the protocol), and where a single all-zero
bit at NBD_OPT_GO is just as easy of an extension as a way to report a
64-bit all-zero response to NBD_CMD_BLOCK_STATUS.
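
To put a number on the 32-bit limit, here is a small arithmetic sketch. The
helper name is invented for the example, and the result is only a lower
bound: it assumes every request may use the largest value a 32-bit length
field can hold, whereas a real server may impose a smaller maximum or
answer with less than what was asked for.

```python
MAX_NBD_LENGTH = 2**32 - 1  # largest value a 32-bit length field can hold


def min_block_status_requests(image_size, max_length=MAX_NBD_LENGTH):
    """Lower bound on the requests needed to cover image_size bytes."""
    if image_size <= 0:
        return 0
    return -(-image_size // max_length)  # ceiling division


# A 1 TiB export cannot be covered in fewer than 257 requests.
print(min_block_status_requests(2**40))
```

A single startup bit, by contrast, costs no requests at all.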


OTOH, we wouldn’t need such a flag for the implementation, because we
could just send a 64-bit discard/make_zero over the whole block device
length to the NBD server, and then the server internally does the right
thing(TM).  AFAIU discard and write_zeroes currently have only 32 bit
length fields, but there were plans for adding support for 64 bit
versions anyway.  From my naïve outsider perspective, doing that doesn’t
seem a more complicated protocol addition than adding some way to tell
whether an NBD export is zero.

Adding 64-bit commands to NBD is more invasive than adding a single
startup status bit. Both ideas can be done - doing one does not
preclude the other. But at the same time, not all servers will
implement both ideas - if one is easy to implement while the other is
hard, it is not unlikely that qemu will still encounter NBD servers that
advertise startup state but not support 64-bit make_zero (even if qemu
as NBD server starts supporting 64-bit make zero) or even 64-bit block
status results.

Another thing to think about here is timing. With the proposed NBD
addition, it is the server telling the client that "the image you are
connecting to started zero", prior to the point that the client even has
a chance to request "can you make the image all zero in a quick manner
(and if not, I'll fall back to writing zeroes as I go)". And even if
NBD gains a 64-bit block status and/or make zero command, it is still
less network traffic for the server to advertise up-front that the image
is all zero than it is for the client to have to issue command requests
of the server (network traffic is not always the bottleneck, but it can
be a consideration).


So I’m still wondering whether there are actually cases where we need to
tell whether some image or NBD export is zero that do not involve making
it zero if it isn’t.

Just because we don't think that qemu-img has such a case does not mean
that other NBD clients will not be able to come up with some use for
knowing if an image starts all zero.


(I keep asking because it seems to me that if all we ever really want to
do is to ensure that some images/exports are zero, we should implement
that.)

The problem is WHERE do you implement it. Is it more efficient to
implement make_zero in the NBD server (the client merely requests to
make zero, but lets the server do all the work) or in the NBD client
(the client learns whether the server is already zero, and not hearing
yes, the client proceeds to do all the work to write zeroes)? From the
qemu perspective, qemu-img convert needs the image to be zero, and
bdrv_make_zero will report back either "yes I quickly made it zero,
possibly by doing nothing" or "no, making it zero now is no more
efficient than you just writing zeroes as you go". But although the
code in qemu-img is the same whether bdrv_make_zero is able to request
the work be done in the server or whether the work has to be done in the
client, the code in the block layer that implements bdrv_make_zero may
itself care about knowing whether the NBD server started all zero.
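
A minimal sketch of that decision, to make the two paths concrete. This is
not qemu's bdrv_make_zero: the Export class, its fields, and the assumption
that a fast whole-image zero request takes exactly one command are all
hypothetical, chosen only to show where a startup bit would fit.

```python
from dataclasses import dataclass


@dataclass
class Export:
    """Hypothetical view of what an NBD client knows about an export."""
    init_zero: bool = False      # server advertised "started all zero"
    fast_make_zero: bool = False # server offers a fast whole-image zero
    written: bool = False        # client has written data since connecting


def make_zero(export):
    """Return (made_zero, requests_sent).

    made_zero=False means "no faster than writing zeroes as you go",
    so the caller falls back to doing the work itself.
    """
    if export.init_zero and not export.written:
        # Already zero: nothing to do and nothing to send.
        return True, 0
    if export.fast_make_zero:
        # Let the server do the work (assumed to take one request).
        return True, 1
    # Neither shortcut applies: the client writes zeroes as it goes.
    return False, 0


print(make_zero(Export(init_zero=True)))
print(make_zero(Export(fast_make_zero=True)))
print(make_zero(Export()))
```

The caller sees the same yes/no answer either way; only the block-layer
code behind it needs to know which of the server's capabilities it used.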


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
