Control: tags -1 + fixed-upstream patch jessie
Control: found -1 3.16.7-ckt20-1+deb8u3
Control: fixed -1 4.1.1-1~exp1

Hi,

so I've found out that this is actually a bug in LIO (the target), not
the initiator.

The problem is: LIO in Linux up to 4.0 uses vfs_writev to write out
blocks when using the fileio backend, so it is limited to 1024 iovecs
(as per UIO_MAXIOV in include/uapi/linux/uio.h), each at most a page in
size. That gives us a maximum of 4 MiB of data that can be processed
per request (at a 4 kiB page size).

Older versions of the Linux software initiator had a hard limit of
512 kiB per request, which means that at most 128 iovec entries were
used, which fits perfectly. Newer versions of the Linux iSCSI initiator
don't have this hard-coded limit but rather use the value supplied by
the target. (This is correct behavior on the initiator's part, so
there's no bug there, contrary to what I initially assumed.)

The problem now is that LIO with the fileio backend assumes that 8 MiB
may be transferred at a time, because (according to comments in
drivers/target/target_core_file.h) it assumes, for some reason
unbeknownst to me, that the maximum number of iovecs is 2048. Note that
this also means that any non-Linux initiator that properly follows the
target-supplied value for the maximum I/O size will run into the same
problem and cause I/O errors, even though it behaves correctly.

This problem doesn't affect upstream anymore, because they have
rewritten LIO to use vfs_iter_write, which doesn't have such
limitations, but that was only introduced after 3.16. Backporting it
would be too much, and probably ABI-incompatible. Fortunately, there's
a much easier way to fix this: just lower the limit in
drivers/target/target_core_file.h to 4 MiB. I've tested that the limit
is then properly set by LIO, and newer initiators won't choke on it, so
that fixes the bug. See the attached patch.
CAVEAT: there is a slight problem with this change, and I don't know
what the best solution here is: the optimal_sectors setting for fileio
disks in people's existing setups is likely to be 16384, because that
corresponds to 8 MiB (the previous maximum, which is the default for
optimal_sectors if no other value is set) - but the kernel will now
refuse that setting, because it is larger than the maximum allowed
value.

If you use targetcli 3 (not part of Jessie, but you can install e.g.
the version from sid), it will fail to set up the target properly,
because it aborts as soon as it notices that it can't apply the
setting. (Leftover targetcli 2 from Wheezy upgrades should not be
affected as badly, as far as I can tell, because the startup scripts
seem to ignore errors. But I haven't tested that.)

So the situation is this: without the fix, 3.16 kernels produce I/O
errors when used with initiators that respect the kernel's setting;
with the fix, the target configuration needs to be updated. (Of course,
one could also patch the kernel to ignore the specific value of 16384
for optimal_sectors if fileio is used as a backend and print a
warning.) I don't know what you'd prefer here.

Also note that this likely affects the kernel in Wheezy as well,
although I haven't done any tests in that direction.

Regards,
Christian

PS: for reference, here is the upstream discussion on the initiator
mailing list that led to me finding out that the problem was not in the
initiator but in the target:
https://groups.google.com/forum/#!topic/open-iscsi/UE2JJfDmQ7w
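To find existing setups that carry the problematic value, one could
scan configfs for fileio backstores. This is only a sketch: it assumes
the standard LIO configfs layout under /sys/kernel/config/target/core
and a 512-byte block size (FD_BLOCKSIZE); adjust the path and the
threshold if your setup differs.

```shell
#!/bin/sh
# Flag fileio backstores whose optimal_sectors exceeds the new 4 MiB
# limit: 4 MiB / 512 bytes per sector = 8192 sectors.
limit_sectors=$((4 * 1024 * 1024 / 512))

for attr in /sys/kernel/config/target/core/fileio_*/*/attrib/optimal_sectors; do
    [ -e "$attr" ] || continue
    sectors=$(cat "$attr")
    if [ "$sectors" -gt "$limit_sectors" ]; then
        mib=$((sectors * 512 / 1048576))
        echo "$attr: $sectors sectors ($mib MiB) - needs lowering"
    fi
done
```

The old default of 16384 sectors works out to 16384 * 512 = 8 MiB,
which is exactly the value the patched kernel will reject.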
From: Christian Seiler <christian@iwakd.de>
Date: Sat, 30 Jan 2016 13:48:54 CET
Subject: LIO: assume a maximum of 1024 iovecs

Previously the code assumed that vfs_{read,write}v supported 2048
iovecs, which is incorrect, as UIO_MAXIOV is 1024 instead. This caused
the target to advertise a maximum I/O size that was too large, which in
turn caused conforming initiators (most notably recent Linux kernels,
which started to respect the maximum I/O size of the target instead of
using the hard-coded 512 kiB limit of previous kernel versions) to send
write requests that were too large, resulting in LIO rejecting them
(kernel: fd_do_rw() write returned -22) and thus in data loss.

This patch adjusts the limit to 1024 iovecs and also uses the PAGE_SIZE
macro instead of assuming 4 kiB pages.

Signed-off-by: Christian Seiler <christian@iwakd.de>
---
 drivers/target/target_core_file.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/target/target_core_file.h
+++ b/drivers/target/target_core_file.h
@@ -1,6 +1,8 @@
 #ifndef TARGET_CORE_FILE_H
 #define TARGET_CORE_FILE_H
 
+#include <asm/page.h>
+
 #define FD_VERSION		"4.0"
 
 #define FD_MAX_DEV_NAME		256
@@ -9,9 +11,9 @@
 #define FD_MAX_DEVICE_QUEUE_DEPTH 128
 #define FD_BLOCKSIZE		512
 /*
- * Limited by the number of iovecs (2048) per vfs_[writev,readv] call
+ * Limited by the number of iovecs (1024) per vfs_[writev,readv] call
 */
-#define FD_MAX_BYTES		8388608
+#define FD_MAX_BYTES		(1024*PAGE_SIZE)
 
 #define RRF_EMULATE_CDB	0x01
 #define RRF_GOT_LBA		0x02
--- a/drivers/target/target_core_file.c
+++ b/drivers/target/target_core_file.c
@@ -674,7 +674,7 @@ fd_execute_rw(struct se_cmd *cmd, struct
 	 */
 	if (cmd->data_length > FD_MAX_BYTES) {
 		pr_err("FILEIO: Not able to process I/O of %u bytes due to"
-			"FD_MAX_BYTES: %u iovec count limitiation\n",
+			"FD_MAX_BYTES: %lu iovec count limitiation\n",
 			cmd->data_length, FD_MAX_BYTES);
 		return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE;
 	}