On 04/13/2016 01:27 AM, Wouter Verhelst wrote: > Eric, > > On Tue, Apr 12, 2016 at 06:16:54PM -0600, Eric Blake wrote: >> Existing NBD servers often have limitations, such as requiring >> actions to be aligned to block sizes or limiting maximum >> transactions to avoid denial of service attacks; for example, >> qemu's NBD server refuses any transaction larger than 32M. But >> to date, clients have to learn these limitations via out-of-band >> means. >> >> This alters NBD_OPT_INFO and NBD_OPT_GO to use a new reply type >> NBD_REP_INFO (rather than overloading NBD_REP_SERVER), so that >> we have a future-proof way of supplying as much additional >> structured information about an export as we want. >> >> Design decision: a client that wants to learn block sizes MUST >> use NBD_OPT_GO, rather than the old NBD_OPT_EXPORT_NAME, even >> though we could have repurposed some of the reserved zeroes when >> NBD_FLAG_C_NO_ZEROES is not in effect, because we don't want to >> encourage any further abuse of NBD_OPT_EXPORT_NAME. >> >> Design decision: no new global NBD_FLAG or NBD_OPT are required; >> there is nothing for the client to negotiate. The server merely >> provides as much information as it can, and the client then >> interprets what information it understands. The items are >> structured so that a client can ignore details from the server >> that the client does not know about, and that we can easily add >> future items of information. > > General note (in-depth review may follow later): > > Currently, there are no default minimum or maximum block sizes, and > therefore they are effectively limited to "1 byte" for the minimum block > size, and "the size of the device" for the maximum block size. We have existing servers that don't comply with this; I'm trying to approach it from the angle of: - if the server advertises, then that is the limit - if the client and server communicate out of band, then use those limits - if the server does not advertise, then the server MUST support a minimum block size of 1 (although not all existing servers do, they can be considered buggy), and the client SHOULD NOT send requests smaller than 512 unless the export size proves a smaller block size is in use (that is, if export size % 512 is non-zero, we know the server supports a smaller block size in order to access those tail bytes), but the client MAY still attempt to send smaller alignments and hope that the server is not buggy > > I do agree that it might be advantageous for the server to announce such > minimum and maximum sizes, but I don't think that defining defaults that > differ from what historically has been the effective default is the > right way to go. > But that's my argument - historically, there ARE servers which insist on 512 or even 4096-byte alignment, and which misbehave (close the connection, return an error, or possibly even corrupt data) rather than do read-modify-write on misaligned requests. And qemu's server certainly kills connections on attempts to exceed 32M in a single transaction. So while the theoretical server should be unlimited unless advertised, the practical client should stick within sane limits rather than attempting to exploit unlimited. I'm open to suggestions on wording (such as using SHOULD rather than MUST when a value is not advertised, so that existing implementations are merely poorer quality of implementation rather than in violation of the spec). > Therefore, I would like this to say that unless you announce > differently, the maximum block size is the size of the device, and the > minimum block size is 1 byte. > > Having a default for preferred block size sounds sane, although it might > be better to switch it to 4096 (which is what most conversations seem to > use today) rather than 512. Except that I suggested 65k, not 4096, as the default preferred block size. Even though 4096 may be atomic, it is not as efficient; and in at least qemu's case, the default qcow2 file sizing uses 32k clusters as its preferred I/O size. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature