
Re: [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension



On Wed, Mar 23, 2016 at 06:58:34PM +0100, Wouter Verhelst wrote:
> On Wed, Mar 23, 2016 at 05:16:02PM +0300, Denis V. Lunev wrote:
> > From: Pavel Borzenkov <pborzenkov@...2319...>
> > 
> > With the availability of sparse storage formats, it is often needed to
> > query status of a particular LBA range and read only those blocks of
> > data that are actually present on the block device.
> > 
> > To provide such information, the patch adds GET_LBA_STATUS extension
> > with one new NBD_CMD_GET_LBA_STATUS command.
> > 
> > There exists a concept of data dirtiness, which is required during, for
> > example, incremental block device backup. To express this concept via
> > NBD protocol, this patch also adds additional mode of operation to
> > NBD_CMD_GET_LBA_STATUS command.
> > 
> > Since NBD protocol has no notion of block size, and to mimic SCSI "GET
> > LBA STATUS" command more closely, it has been chosen to return a list of
> > extents in the response of NBD_CMD_GET_LBA_STATUS command, instead of a
> > bitmap.
> > 
> > Signed-off-by: Pavel Borzenkov <pborzenkov@...2319...>
> > Reviewed-by: Roman Kagan <rkagan@...2319...>
> > Signed-off-by: Denis V. Lunev <den@...2317...>
> > CC: Wouter Verhelst <w@...112...>
> > CC: Paolo Bonzini <pbonzini@...696...>
> > CC: Kevin Wolf <kwolf@...696...>
> > CC: Stefan Hajnoczi <stefanha@...696...>
> > ---
> >  doc/proto.md | 82 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 82 insertions(+)
> > 
> > diff --git a/doc/proto.md b/doc/proto.md
> > index cda213c..fff515d 100644
> > --- a/doc/proto.md
> > +++ b/doc/proto.md
> > @@ -243,6 +243,8 @@ immediately after the global flags field in oldstyle negotiation:
> >    `NBD_CMD_TRIM` commands
> >  - bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
> >    supports `NBD_CMD_WRITE_ZEROES` commands
> > +- bit 7, `NBD_FLAG_SEND_GET_LBA_STATUS`; should be set to 1 if the server
> > +  supports `NBD_CMD_GET_LBA_STATUS` commands
> >  
> >  ##### Client flags
> >  
> > @@ -477,6 +479,10 @@ The following request types exist:
> >  
> >      Defined by the experimental `WRITE_ZEROES` extension; see below.
> >  
> > +* `NBD_CMD_GET_LBA_STATUS` (7)
> > +
> > +    Defined by the experimental `GET_LBA_STATUS` extension; see below.
> > +
> >  * Other requests
> >  
> >      Some third-party implementations may require additional protocol
> > @@ -638,6 +644,82 @@ The server SHOULD return `ENOSPC` if it receives a write zeroes request
> >  including one or more sectors beyond the size of the device. It SHOULD
> >  return `EPERM` if it receives a write zeroes request on a read-only export.
> >  
> > +### `GET_LBA_STATUS` extension
> > +
> > +With the availability of sparse storage formats, it is often needed to query
> > +status of a particular LBA range and read only those blocks of data that are
> > +actually present on the block device.
> > +
> > +Some storage formats and operations over such formats express a concept of
> > +data dirtiness. Whether the operation is block device mirroring,
> > +incremental block device backup or any other operation with a concept of
> > +data dirtiness, they all share a need to provide a list of LBA ranges
> > +that this particular operation treats as dirty.
> > +
> > +To provide such class of information, `GET_LBA_STATUS` extension adds new
> > +`NBD_CMD_GET_LBA_STATUS` command which returns a list of LBA ranges with
> > +their respective states.
> > +
> > +* `NBD_CMD_GET_LBA_STATUS` (7)
> > +
> > +    An LBA range status query request. Length and offset define the range
> > +    of interest. The server MUST reply with a reply header, followed
> > +    immediately by the following data:
> 
> As Eric noted, please expand LBA at least once.
> 
> > +      - 32 bits, length of parameter data that follow (unsigned)
> > +      - zero or more LBA status descriptors, each having the following
> > +        structure:
> > +
> > +        * 64 bits, offset (unsigned)
> > +        * 32 bits, length (unsigned)
> > +        * 16 bits, status (unsigned)
> > +
> > +    unless an error condition has occurred.
> > +
> > +    If an error occurs, the server SHOULD set the appropriate error code
> > +    in the error field. The server MUST then either close the
> > +    connection, or send *length of parameter data* bytes of data
> > +    (which MAY be invalid).
> > +
> > +    The type of information required by the client is passed to server in the
> > +    command flags field. If the server does not implement requested type or
> > +    has no means to express it, it MUST NOT return an error, but instead MUST
> > +    return a single LBA status descriptor with *offset* and *length* equal to
> > +    the *offset* and *length* from request, and *status* set to `0`.
> > +
> > +    The following request types are currently defined for the command:
> > +
> > +    1. Block provisioning state
> > +
> > +    Upon receiving an `NBD_CMD_GET_LBA_STATUS` command with command flags
> > +    field set to `NBD_FLAG_GET_ALLOCATED` (0x0), the server MUST return
> 
> I prefer to have a non-zero flag value.
> 
> > +    the provisioning state of the device. The following provisioning states
> > +    are defined for the command:
> > +
> > +      - `NBD_STATE_ALLOCATED` (0x0), LBA extent is present on the block device;
> > +      - `NBD_STATE_ZEROED` (0x1), LBA extent is present on the block device
> > +        and contains zeroes;
> 
> Presumably this should be "contains only zeroes"?
> 
> Also, this may end up being a fairly expensive call for the server to
> process. Is it really useful?
> 
> > +      - `NBD_STATE_DEALLOCATED` (0x2), LBA extent is not present on the
> > +        block device. A client MUST NOT make any assumptions about the
> > +        contents of the extent.
> > +
> > +    2. Block dirtiness state
> > +
> > +    Upon receiving an `NBD_CMD_GET_LBA_STATUS` command with command flags
> > +    field set to `NBD_FLAG_GET_DIRTY` (0x1), the server MUST return
> > +    the dirtiness status of the device. The following dirtiness states
> > +    are defined for the command:
> > +
> > +      - `NBD_STATE_DIRTY` (0x0), LBA extent is dirty;
> > +      - `NBD_STATE_CLEAN` (0x1), LBA extent is clean.
> > +
> > +    Generic NBD client implementation without knowledge of a particular NBD
> > +    server operation MUST NOT make any assumption on the meaning of the
> > +    NBD_STATE_DIRTY or NBD_STATE_CLEAN states.
> 
> That makes it a useless call. A server can read /dev/random to decide
> whether to send STATE_DIRTY or STATE_CLEAN, and still be compliant with
> this spec.
> 
> Either the spec should define what it means for a block to be in a dirty
> state, or it should not talk about it.

What I was trying to say with this sentence is that, without knowing what
is currently happening on the server side, the DIRTY state has no
meaning. If we are doing an incremental backup, the DIRTY state might
represent blocks that have changed since the last backup. For mirroring,
it might represent blocks that are yet to be migrated. And for the same
block device, different sets of DIRTY ranges might be returned in
response to this command. Basically, the meaning of DIRTY depends on the
context.

Should I state in the spec that the meaning of the DIRTY state is
implementation-specific? I see that NBD_REP_SERVER already uses this
approach.
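
To make the context dependence concrete, here is a rough client-side
sketch of decoding the reply payload from the quoted patch: a 32-bit
length followed by 14-byte descriptors (64-bit offset, 32-bit length,
16-bit status). The function name parse_lba_status and the STATE_NAMES
table are made up for illustration. Big-endian byte order is an
assumption based on the rest of the NBD protocol, since the patch text
does not state it. The flag and state values are the ones proposed in
this patch and may well change.

```python
import struct

# Values as proposed in the quoted patch (not a final spec).
NBD_FLAG_GET_ALLOCATED = 0x0
NBD_FLAG_GET_DIRTY = 0x1

# Hypothetical lookup table: status values only make sense per request type.
STATE_NAMES = {
    NBD_FLAG_GET_ALLOCATED: {0x0: "ALLOCATED", 0x1: "ZEROED", 0x2: "DEALLOCATED"},
    NBD_FLAG_GET_DIRTY: {0x0: "DIRTY", 0x1: "CLEAN"},
}

# offset (64 bits), length (32 bits), status (16 bits); big-endian assumed.
DESCRIPTOR = struct.Struct(">QIH")


def parse_lba_status(payload, request_flags):
    """Decode the data that follows the reply header into a list of
    (offset, length, state_name) tuples."""
    (param_len,) = struct.unpack_from(">I", payload, 0)
    body = payload[4:4 + param_len]
    if len(body) != param_len or param_len % DESCRIPTOR.size:
        raise ValueError("truncated or misaligned descriptor list")
    names = STATE_NAMES.get(request_flags, {})
    # Unknown status values are reported as hex rather than guessed at.
    return [(offset, length, names.get(status, hex(status)))
            for offset, length, status in DESCRIPTOR.iter_unpack(body)]
```

Note that the same status value 0x0 decodes as ALLOCATED for one
request type and as DIRTY for the other, so a client has to remember
which flags it sent. Even then, the sketch can only name the state, it
cannot say what DIRTY means for a given server operation.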

> 
> -- 
> < ron> I mean, the main *practical* problem with C++, is there's like a dozen
>        people in the world who think they really understand all of its rules,
>        and pretty much all of them are just lying to themselves too.
>  -- #debian-devel, OFTC, 2016-02-12


