
Allowing > 32 bit lengths for NBD_CMD_TRIM, NBD_CMD_WRITE_ZEROES

If you create a very large XFS filesystem, XFS issues a single discard
request over the whole filesystem first.  However NBD translates this
into many 4 GB NBD_CMD_TRIM requests over the wire.

I've been trying to create an 8 EB (2^63 - 1024 bytes) XFS filesystem to
explore the limits of the Linux kernel, XFS, NBD and nbdkit.  The discard
step issues 2 billion x 4 GB NBD_CMD_TRIM requests which takes ...
quite a while (actually too long, I had to turn this feature off in
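
As a rough check on that request count, here is a small Python sketch of
the arithmetic.  It assumes each NBD_CMD_TRIM covers at most 2^32 bytes
(the real per-request cap may be slightly lower, since the length field
is only 32 bits wide).  The names are illustrative only and not taken
from any NBD implementation.

```python
# Size of the test filesystem from this message, in bytes.
DEVICE_SIZE = 2**63 - 1024

# Assumption: one NBD_CMD_TRIM covers at most 4 GB (2^32 bytes).
MAX_TRIM_LEN = 2**32


def trim_request_count(size, max_len):
    """Number of requests needed to cover `size` bytes (ceiling division)."""
    return -(-size // max_len)


count = trim_request_count(DEVICE_SIZE, MAX_TRIM_LEN)
print(count)
```

Under that assumption the count comes out at 2^31, which matches the
"2 billion" figure above.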

Since NBD_CMD_TRIM & NBD_CMD_WRITE_ZEROES don't involve sending data
over the wire, it seems we should allow larger lengths for these
commands as an extension.

I could think of a few ways this might be implemented:

(1) At option negotiation time, negotiate a "64 bit lengths" extension
and change the format of the request packet.  I don't like this: it
would break a lot of packet analysis tools and also makes parsing
commands in the server more difficult.

(2) Add a command flag NBD_CMD_FLAG_64_BIT_LENGTH.  When this is
present, the high order 32 bits of the length are sent by the client
after the normal request packet.  This raises the question of whether
we should allow this for NBD_CMD_WRITE/NBD_CMD_READ too.

(3) Add a new set of commands (NBD_CMD_TRIM64 etc) which work exactly
like the existing commands except the high order 32 bits of the length
are sent after the normal request packet.

(4) Allow a different request message format, differentiated by using
a different magic (! NBD_REQUEST_MAGIC).  We could use the opportunity
to widen a few fields and reserve space for future expansion.
Similar to (1).
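
To make option (2) more concrete, here is a minimal Python sketch of how
the wire encoding might look.  The 28-byte request layout, the magic and
the NBD_CMD_TRIM number are the existing ones; the bit value chosen for
NBD_CMD_FLAG_64_BIT_LENGTH is made up for illustration, and
encode_request/decode_request are hypothetical helpers, not part of any
existing client or server.  Option (3) would look the same on the wire,
except that a new command number rather than a flag would signal the
trailing 32 bits.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513
NBD_CMD_TRIM = 4

# Hypothetical flag bit: no such flag is defined in the protocol today.
NBD_CMD_FLAG_64_BIT_LENGTH = 1 << 15

# magic(32) flags(16) type(16) handle(64) offset(64) length(32), big endian
REQUEST_FMT = ">IHHQQI"
REQUEST_SIZE = struct.calcsize(REQUEST_FMT)  # 28 bytes


def encode_request(cmd, handle, offset, length, flags=0):
    """Build a request; append the high 32 bits only when they are needed."""
    low = length & 0xFFFFFFFF
    high = length >> 32
    if high:
        flags |= NBD_CMD_FLAG_64_BIT_LENGTH
    packet = struct.pack(REQUEST_FMT, NBD_REQUEST_MAGIC, flags, cmd,
                         handle, offset, low)
    if high:
        packet += struct.pack(">I", high)
    return packet


def decode_request(buf):
    """Parse a request; return (cmd, handle, offset, length, bytes_consumed)."""
    magic, flags, cmd, handle, offset, length = struct.unpack_from(
        REQUEST_FMT, buf, 0)
    if magic != NBD_REQUEST_MAGIC:
        raise ValueError("bad request magic")
    consumed = REQUEST_SIZE
    if flags & NBD_CMD_FLAG_64_BIT_LENGTH:
        # The high-order 32 bits follow the normal request packet.
        (high,) = struct.unpack_from(">I", buf, REQUEST_SIZE)
        length |= high << 32
        consumed += 4
    return cmd, handle, offset, length, consumed


big = encode_request(NBD_CMD_TRIM, handle=1, offset=0, length=2**63 - 1024)
small = encode_request(NBD_CMD_TRIM, handle=2, offset=0, length=4096)
```

In this sketch a request whose length fits in 32 bits is byte-for-byte
identical to today's request, so existing packet analysis tools would
only need to learn about the extra 4 bytes when the flag is set.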



Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
