On 03/31/2016 11:57 AM, Alex Bligh wrote:

> I know that in practice *lengths* > 2^31 are hardly ever used by any
> client, so I'd like to say "the maximum length you can send me is
> 2^31 - 1".

>> Large-file support on 32 bit architectures has existed since several
>> decades now. We should not cater to that; people should just write
>> software that can handle LFS offsets.
>
> It's not offsets that are the issue, it's lengths.

Even in POSIX, write() takes a size_t length on input but returns an
ssize_t result, and on success the return value MUST be the number of
bytes actually written. So, on a 32-bit platform where ssize_t cannot
represent 2^31, you can request a write of that size or larger, but the
kernel MUST treat it as a short write, because the kernel CANNOT convert
your large size_t request into a negative ssize_t result and still
comply with POSIX. This is true whether or not off_t is 64 bits.

Given that our current NBD_CMD_WRITE is all-or-none, there's no way to
report the error that happens (under the hood, the server would HAVE to
split a 2^31+1 byte write request into two write() calls, but you still
have the risk of partial failure even if the first call succeeds). Of
course, with the addition of structured replies, error-with-offset is
great at reporting short writes.

> I'd also like to say "I'd like my offsets / lengths to be a multiple of 4k"
> so I can open O_DIRECT. Or at least say "can you cope with only offsets /
> lengths that are a multiple of 4k, as if so I can perform better". But
> this should probably be on a different wish-list!

Yep, several emails are now hinting that we want to add an extension for
length negotiation; I've started a strawman locally, but it's not
polished enough to post yet, with all the churn on structured replies
first.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
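
For illustration only, here is a minimal sketch of the splitting
described above: a server-side loop that never hands more than a fixed
cap to a single write() call and reports how far it got before any
failure. The names write_in_chunks and chunk_max are hypothetical and
not taken from any NBD implementation; a real server would presumably
use SSIZE_MAX (or something smaller) as the cap, and the handling of a
zero return is my own assumption rather than anything POSIX requires.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of any NBD implementation.
 * Writes len bytes from buf to fd, never passing more than chunk_max
 * bytes to a single write() call.  Returns the number of bytes written
 * before any failure; *err is 0 on full success, else an errno value. */
static size_t write_in_chunks(int fd, const char *buf, size_t len,
                              size_t chunk_max, int *err)
{
    size_t done = 0;

    assert(chunk_max > 0);
    *err = 0;
    while (done < len) {
        size_t want = len - done;
        if (want > chunk_max)
            want = chunk_max;   /* keep each request within the cap */

        ssize_t got = write(fd, buf + done, want);
        if (got < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data; retry */
            *err = errno;       /* partial failure after 'done' bytes */
            break;
        }
        if (got == 0) {
            *err = EIO;         /* assumption: no progress counts as failure */
            break;
        }
        done += (size_t)got;    /* a short write just advances the offset */
    }
    return done;
}
```

If the returned count is less than len, the caller knows the offset at
which things went wrong, which is the kind of information an
error-with-offset structured reply could carry and the current
all-or-none NBD_CMD_WRITE reply cannot.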