
Re: [PATCH 1/1] nbd: increase maximum size of the PWRITE_ZERO request

On 02/08/2018 06:55 PM, Eric Blake wrote:
On 02/08/2018 09:28 AM, Edgar Kaziakhmedov wrote:

We've got a potential problem.  Unless you have out-of-band communication of the maximum NBD_CMD_WRITE_ZEROES sizing (or if the NBD protocol is enhanced to advertise that as an additional piece of block size information during NBD_OPT_GO), then a client CANNOT assume that the server will accept a request this large. We MIGHT get lucky if all existing servers that accept WRITE_ZEROES requests either act on large requests or reply with EINVAL but do not outright drop the connection (which is different from servers that DO outright drop the connection for an NBD_CMD_WRITE larger than 32M).  But I don't know if that's how all servers behave, so sending a too-large WRITE_ZEROES request may have the unintended consequence of killing the connection.
Actually, I do not understand why current NBD servers wouldn't accept such large requests, since most servers should apply some optimization that avoids directly filling the range with zeroes.

Just because a server CAN optimize doesn't mean that it is REQUIRED to optimize.  You cannot make assumptions that a server will be happy with a larger request, merely because less data was sent over the wire, because the server may still have to allocate memory locally to perform the request.
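To make the server-side concern concrete, here is a minimal sketch of a length check that a server might apply before it commits local resources to a WRITE_ZEROES request. It answers with EINVAL instead of dropping the connection, which is the "lucky" behaviour described above. The function name, the cap constant and its 32M value are assumptions for illustration only (32M is merely the figure quoted in this thread for NBD_CMD_WRITE); none of this is taken from qemu or any other existing server.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical cap on a single zeroing request. The value is an assumption,
 * borrowed from the 32M NBD_CMD_WRITE figure mentioned in this thread. */
#define EXAMPLE_MAX_ZERO_LEN (32u * 1024u * 1024u)

/* Hypothetical helper: returns 0 if the server is willing to service a
 * zeroing request of this length, or EINVAL to report back to the client.
 * In both cases the caller keeps the connection open. */
int check_write_zeroes_len(uint32_t length, uint32_t max_len)
{
    assert(max_len > 0);
    return length > max_len ? EINVAL : 0;
}
```

A server that cannot optimize zeroing would call this with its own limit before allocating a buffer; a server that can punch holes cheaply might pass a much larger limit, but the client has no way to know which kind it is talking to.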

As for block-mirroring over NBD, it works fine with the QEMU server implementation, and isn't that the main application?

Yes, qemu-to-qemu interoperating as efficiently as possible is nice; but I'm worried about qemu-to-other interoperating as well. The point of a public specification is to avoid one-way silos, so that you CAN mix-and-match a server from one implementation with a client from another, rather than being forced to use qemu as the server when qemu is the client.  Note that portability can include hand-shaking to fall back to the least-common denominator, rather than requiring both sides to always understand all extensions; but the important part is that neither party should make assumptions about the other side without using the spec as their guide.
So, in that case the maximum write_zeroes chunk size has to be negotiated before communication starts, if such an option is featured in mainline NBD.
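As a rough sketch of what such a negotiation could look like on the client side: the client uses an advertised WRITE_ZEROES limit when the server provides one and otherwise falls back to a conservative default, then splits a large zeroing range into requests no bigger than that limit. The advertised limit is purely hypothetical here (the thread notes that the protocol would have to be enhanced to carry it), and the helper names and the 32M fallback are assumptions for illustration, not qemu code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed conservative fallback, taken from the 32M figure in this thread. */
#define FALLBACK_MAX_ZERO_LEN (32u * 1024u * 1024u)

/* Hypothetical: advertised_max == 0 means the server advertised nothing,
 * so the client falls back to the least-common-denominator limit. */
uint32_t effective_zero_limit(uint32_t advertised_max)
{
    return advertised_max ? advertised_max : FALLBACK_MAX_ZERO_LEN;
}

/* Length of the next request to send when `remaining` bytes are left. */
uint32_t next_zero_chunk(uint64_t remaining, uint32_t limit)
{
    assert(limit > 0);
    return remaining < limit ? (uint32_t)remaining : limit;
}

/* Number of requests needed to cover total_len bytes (rounds up). */
uint64_t count_zero_requests(uint64_t total_len, uint32_t limit)
{
    assert(limit > 0);
    return total_len / limit + (total_len % limit != 0);
}
```

With no advertised limit, zeroing a 1G range this way takes 32 requests of 32M each; a server that advertised a larger limit would let the same range go out in fewer requests without the client guessing.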
