NBD commit 39e20207 (2.9.22) introduced the use of the ioctl NBD_SET_FLAGS, and tries to pass the concatenation of the 16-bit global flags in the most significant half + the 16-bit transmission flags in the lower half (during newstyle negotiation). Of course, at that point in time, the global flags were ALWAYS 0, so this results in just giving the kernel the transmission flags. While oldstyle negotiation gives 32 bits for flags, it never defined any bits in the upper 16, and is now no longer under development, so no (working) oldstyle server will ever advertise more than 16 bits of information, so blindly using the 32 bit flags field to hand to the kernel via NBD_SET_FLAGS is just fine. But when commit 626c2a37 (3.1) introduced the first global flag, NBD_FLAG_FIXED_NEWSTYLE, it computes: if(read(sock, &tmp, sizeof(uint16_t)) < 0) { err("Failed reading flags: %m"); } *flags = ((u32)ntohs(tmp)); ... if(read(sock, &tmp, sizeof(tmp)) < 0) err("Failed/4: %m\n"); *flags |= (uint32_t)ntohs(tmp); Whoops - this computes an overlap of flags (it leaves the upper 16 bits of *flags clear, and sets the lower 16 bits to the bitwise OR of the global flags and transmission flags) - although no one cared in any release between there and 3.8, because it so happens that NBD_FLAG_FIXED_NEWSTYLE and NBD_FLAG_HAS_FLAGS happen to be the same bit. But in 3.9, the overlap bug was still present, and the set of global flags had grown to include NBD_FLAG_NO_ZEROS (commit 038e066), which overlaps with NBD_FLAG_READ_ONLY. Ouch. This means that a client talking to a server that advertises NO_ZEROES means that the client will mistakenly tell the kernel to treat EVERY export as read-only, even if the client doesn't respond with NBD_FLAG_C_NO_ZEROES. 3.10 fixed things; negotiate() now uses uint16_t *flags (instead of u32 *), and no longer tries to merge global flags with transmission flags; only the transmission flags are ever passed to the kernel via NBD_SET_FLAGS. Maybe it's good that there was only 2 weeks between 3.9 and 3.10, so hopefully few distros are trying to ship that broken version. Meanwhile, qemu 2.5 (and probably 2.6, because it's now hard freeze and too late to change things without good reason) will populate things as: if (read_sync(csock, &tmp, sizeof(tmp)) != sizeof(tmp)) { error_setg(errp, "Failed to read server flags"); goto fail; } *flags = be16_to_cpu(tmp) << 16; ... if (read_sync(csock, &tmp, sizeof(tmp)) != sizeof(tmp)) { error_setg(errp, "Failed to read export flags"); goto fail; } *flags |= be16_to_cpu(tmp); which means it has been calling the kernel with NBD_SET_FLAGS 0x10001 when talking to the same server where the upstream NBD client would use merely 0x0001. I plan on fixing qemu to no longer set the upper 16 bits with global flags (as I don't see how NBD_FLAG_FIXED_NEWSTYLE or NBD_FLAG_NO_ZEROES could ever affect what the kernel wants to do in transmission phase). But do we need to document in the kernel code that existing clients mistakenly pass too many bits to the NBD_SET_FLAGS ioctl, so that if we ever reach the future point where we need more than 16 transmission flags, AND where we have more than 2 global flags defined, existing qemu 2.5 clients don't confuse the kernel when calling NBD_SET_FLAGS? Or do we think that it is unlikely enough to worry about, where by the time there are more than 16 transmission flags, users are likely to already be using new-enough qemu that doesn't send global flags to the kernel? -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature