
Re: NBD prefetch read



22.03.2018 21:24, Wouter Verhelst wrote:
On Wed, Mar 21, 2018 at 02:05:43PM +0300, Vladimir Sementsov-Ogievskiy wrote:
21.03.2018 13:20, Wouter Verhelst wrote:
On Tue, Mar 20, 2018 at 08:22:31PM +0300, Vladimir Sementsov-Ogievskiy wrote:
20.03.2018 19:58, Wouter Verhelst wrote:
On Tue, Mar 20, 2018 at 11:57:46AM +0300, Vladimir Sementsov-Ogievskiy wrote:
19.03.2018 17:39, Eric Blake wrote:
Can you demonstrate an actual sequence of commands sent over the wire,
for how it would be useful?
- we initialize two drives, A and B, in qemu and set up copy-on-read for them.
- the client sends a sequence of READAHEAD commands, and data is copied from A
to B on each read from B, in the corresponding sequence.

So, here B is a cache in terms of the PRE-FETCH command.
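To make the wire sequence Eric asked about concrete: it would just be a stream
of ordinary transmission requests against B's export. A minimal sketch in C,
assuming the proposed NBD_CMD_READAHEAD keeps the standard request header (the
command value below is a placeholder, nothing has been assigned):

    #include <stdint.h>

    #define NBD_REQUEST_MAGIC  0x25609513
    #define NBD_CMD_READAHEAD  0xffff     /* placeholder; no value assigned */

    /* Standard 28-byte NBD transmission request; all fields travel in
     * network byte order, hence the packed layout. */
    struct nbd_request {
        uint32_t magic;    /* NBD_REQUEST_MAGIC */
        uint16_t flags;    /* command flags; none needed here */
        uint16_t type;     /* NBD_CMD_READAHEAD */
        uint64_t handle;   /* client cookie, echoed back in the reply */
        uint64_t offset;   /* byte offset into B's export */
        uint32_t length;   /* bytes to pull from A into B */
    } __attribute__((packed));

    /* The third tool would send these back to back, in whatever order it
     * wants the data copied; each one makes B read the range from A and
     * store it, then answer with a simple reply carrying no payload. */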
This sounds very similar to what xNBD does
(https://bitbucket.org/hirofuchi/xnbd/wiki/Home). Can you confirm?

If so, I suppose it makes sense to add the current behaviour of xNBD to
the spec, rather than inventing our own thing.

Hm, what do you mean on this page? "Scenario 2 (Simple proxy server,
distributed Copy-on-Write)" ?
Well, I don't necessarily mean the implementation details, so much as
the general concept :-)

You are talking about live migration of storage; that is what xNBD
implements and has in production. Does it not make sense to at least
look at what they're doing, so that we can possibly implement something
compatible?

It looks similar, but there is no control channel with READAHEAD there. The
idea is that no data is sent through the control channel.
Eh, NBD has no control channel? I'm not sure what you're talking about
here.

We are doing restore, not migration. We start qemu over an empty qcow2 image
on top of an nbd-client with copy-on-read enabled, so, when the guest reads
something, the data goes into its qcow2 (the nbd-client is connected to the
backup NBD server). That part is fine.
But we also need a way to force data movement from the backup to the new qcow2
image even if the guest doesn't read the data, so we want to simulate reads on
that qcow2 just to trigger copy-on-read. Moreover, we need a specific
sequence of these simulated reads, and only a third-party tool knows this
sequence. So we want to export the qcow2 image over NBD as well; that export
is the "control channel", and only READAHEAD commands will be sent through it
by the third-party tool. As I said, it's a managed copy-on-read process,
managed by that tool.
Okay, so do I understand you correctly if you're saying it's something
like this:

client               server               third party
    | -----NBD CS------- |                      |
    |                    | ------ NBD TS ------ |
    | ---------------- NBD TC ----------------- |

The client reads "stuff" over the NBD CS channel (where it is a client,
and the server is, well, a server). It copies blocks to its local cache
when necessary.

The third party connects to both the client and the server as a client
(i.e., the client in the above acts as a client as well as a server at
the same time), on the NBD TS and NBD TC channels. It uses BLOCK_STATUS
commands (?) to figure out what the status of the restore is. When,
after comparing the output of a BLOCK_STATUS command on the NBD TS
channel with one on the NBD TC channel, it finds a range that is still
missing, it would send the proposed new command over the NBD TC
channel, causing the client to then read that range from the server.

Is that understanding correct?
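In code terms, that reading amounts to something like the following loop in
the third party. A sketch only: struct nbd_conn and all three helpers are
invented stand-ins for real NBD client calls, not anything that exists:

    #include <stdbool.h>
    #include <stdint.h>

    struct nbd_conn;    /* opaque connection handle (hypothetical) */
    struct extent { uint64_t offset, length; bool allocated; };

    /* Invented helpers standing in for real NBD client operations. */
    bool next_extent(struct nbd_conn *c, struct extent *e);  /* BLOCK_STATUS */
    bool extent_present(struct nbd_conn *c, uint64_t off, uint64_t len);
    void send_readahead(struct nbd_conn *c, uint64_t off, uint64_t len);

    void drive_restore(struct nbd_conn *ts, struct nbd_conn *tc)
    {
        struct extent e;
        while (next_extent(ts, &e)) {       /* walk the server's map (TS) */
            if (!e.allocated)
                continue;                   /* hole: nothing to restore */
            if (extent_present(tc, e.offset, e.length))
                continue;                   /* already in the local cache (TC) */
            send_readahead(tc, e.offset, e.length);  /* proposed command */
        }
    }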

The same, but our scheme is even simpler: it doesn't use BLOCK_STATUS for now. It may also be shown like this:

+---------- VM --------------+
|                            |
| guest                      |
|  |                         |
| local disk -- (NBD server)<----READAHEAD---+(third party NBD client)
|  |                         |
| (NBD client)<-----------------(data)-------+(NBD server) -- [VM backup]
|                            |
+----------------------------+

- when the guest reads a block and that block is not yet in the local disk, it is read through the NBD client and saved in the local disk
- when the guest writes something, it is written into the local disk
- on READAHEAD, if the block is not yet in the local disk, it is read through the NBD client and saved in the local disk
- on READAHEAD, if the corresponding block is already in the local disk, do nothing
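The four rules boil down to one dispatch in the local (NBD server) side of the
VM. A rough sketch; the request layout, every helper, and the READAHEAD value
are all hypothetical names for illustration:

    #include <stdbool.h>
    #include <stdint.h>

    struct req { uint16_t type; uint64_t offset; uint32_t length; void *data; };
    #define NBD_CMD_READ      0
    #define NBD_CMD_WRITE     1
    #define NBD_CMD_READAHEAD 0xffff   /* placeholder; no value assigned */

    /* Hypothetical helpers for illustration only. */
    bool local_has(uint64_t off, uint32_t len);    /* block in local disk? */
    void local_read(void *buf, uint64_t off, uint32_t len);
    void local_write(const void *buf, uint64_t off, uint32_t len);
    void backup_read(void *buf, uint64_t off, uint32_t len); /* inner NBD client */
    void reply_data(struct req *r, const void *buf);
    void reply_ok(struct req *r);

    void handle(struct req *r, void *buf)
    {
        switch (r->type) {
        case NBD_CMD_READ:                   /* rule 1 */
            if (!local_has(r->offset, r->length)) {
                backup_read(buf, r->offset, r->length);
                local_write(buf, r->offset, r->length);
            }
            local_read(buf, r->offset, r->length);
            reply_data(r, buf);
            break;
        case NBD_CMD_WRITE:                  /* rule 2: backup never touched */
            local_write(r->data, r->offset, r->length);
            reply_ok(r);
            break;
        case NBD_CMD_READAHEAD:              /* rules 3 and 4 */
            if (!local_has(r->offset, r->length)) {
                backup_read(buf, r->offset, r->length);
                local_write(buf, r->offset, r->length);
            }
            reply_ok(r);                     /* never returns data */
            break;
        }
    }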


If so, then I think the semantics of that proposed new command are,
still, very similar to the NBD_CMD_BGCOPY that xNBD implemented, and we
should first look at that before inventing something new; that's not to
say that BGCOPY is exactly what we need, but it might be.

Hmm, from the xnbd Changelog:
Protocol changes
~~~~~~~~~~~~~~~~
 * NBD_CMD_BGCOPY (value 3) has been turned into NBD_CMD_CACHE (value 5)
   to get back in sync with the original NBD server and NBD in kernel


Is there any specification for NBD_CMD_CACHE, or only the code? At first sight it looks very close to what we need.
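For comparison, issuing NBD_CMD_CACHE (value 5, per the changelog above) would
presumably be just a READ-shaped request with no payload. A sketch under that
assumption; whether the reply carries data is exactly what the code would have
to confirm:

    #include <endian.h>     /* htobe16/32/64 (glibc) */
    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>

    #define NBD_REQUEST_MAGIC 0x25609513
    #define NBD_CMD_CACHE     5

    /* Send one NBD_CMD_CACHE request: the standard 28-byte transmission
     * request header, big-endian fields, no payload after it. */
    static void send_cache(int sock, uint64_t cookie,
                           uint64_t offset, uint32_t length)
    {
        unsigned char req[28];
        uint32_t magic = htobe32(NBD_REQUEST_MAGIC);
        uint16_t flags = 0, type = htobe16(NBD_CMD_CACHE);
        uint64_t be_cookie = htobe64(cookie), be_off = htobe64(offset);
        uint32_t be_len = htobe32(length);

        memcpy(req +  0, &magic, 4);
        memcpy(req +  4, &flags, 2);
        memcpy(req +  6, &type, 2);
        memcpy(req +  8, &be_cookie, 8);
        memcpy(req + 16, &be_off, 8);
        memcpy(req + 24, &be_len, 4);
        write(sock, req, sizeof(req));  /* then read the 16-byte simple reply */
    }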



If not, then please enlighten me, because I'm afraid that in that case
I'm lost :-)



--
Best regards,
Vladimir

