
Bug#629994: sendfile returns early without user-visible reason



tags 629994 + upstream wontfix
quit

Hi Marc,

Marc Lehmann wrote:

> In 2.6.39 (and maybe some earlier versions) of Linux, sendfile supports
> file->file copies.
[...]
> Linux always seems to stop copying at 0x7FFFF000 bytes, without apparent
> reason (such as disk full or another error). This happens with a 64 bit
> kernel btw., so this is not a 32 bit issue either.
>
> This causes many programs to report a "short write", as this is an error
> condition with similar syscalls on files (such as write(2)).

Yes, I can reproduce it[*].

I believe this is from v2.6.16-rc1~169^2~16^2~37 (Relax the
rw_verify_area() error checking, 2006-01-04).  In other words, it seems
like it was always this way.  (Side note: recently the principle behind
that patch was reaffirmed --- see v2.6.37-rc1~40, readv/writev: do the
same MAX_RW_COUNT truncation that read/write does, 2010-10-29.)

Indeed, read(2) does the same thing (truncates to 7ffff000) and has done
so for five years, though it's a little harder to notice (I had to use
mmap to create a large file-backed buffer to read into.)
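For reference, the 7ffff000 figure looks like INT_MAX rounded down to a
page boundary, which as far as I can tell is how the kernel defines
MAX_RW_COUNT.  A minimal sketch of that arithmetic follows; the helper
name max_rw_count() is made up for illustration, and the page size is an
assumption passed in by the caller (4096 is typical on x86 but not
universal).

```c
#include <limits.h>

/*
 * Hypothetical helper, not a kernel or libc function: round INT_MAX
 * down to a multiple of page_size.  page_size is assumed to be a
 * power of two.
 */
static long max_rw_count(long page_size)
{
	/* Clearing the low bits rounds down to a page boundary. */
	return INT_MAX & ~(page_size - 1);
}
```

With a 4096-byte page this gives 0x7ffff000, matching the value in the
report.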

Background: even with read(2) and write(2), partial progress does not
necessarily represent an error.  For example, on "slow" devices like a
terminal, pipe, or socket, a partial success can indicate interruption
by a signal, and on a named or unnamed pipe it can indicate that fewer
than the requested number of bytes were immediately available.  So I am
somewhat curious about these many programs --- why are they expecting
this from sendfile?

The manpage is outdated and does not even indicate that sendfile can be
used to copy a file.  Has the size allowed for a single sendfile(2) call
changed over time?  Is this a regression or a request for a new feature?

> I think sendfile should either not attempt to copy files, or copy the
> requested number of bytes unless an error occurs

If an application wants to print a useful error message, it has to try
again until sendfile returns -1 so errno can be set.

Anyway, I agree that it would be better for sendfile to return partial
results less often, to make one-off programs easier to write and to
decrease the number of syscalls made, but that doesn't seem worth
exposing problems in low-level driver code that assumes it never has to
write more than fits in an "int" at a given moment.  So I'm marking this
wontfix for now.

All that said, an obvious possible improvement would be to update the
manpages to
include information about this.  Would you be interested in that, and if
so, can you suggest a wording?

Thanks and hope that helps,
Jonathan

[*]
#define _FILE_OFFSET_BITS 64
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <inttypes.h>
#include <limits.h>

static void die(const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	exit(1);
}

int main(int argc, const char *argv[])
{
	struct stat st;
	int in, out;

	if (argc != 3)
		die("wrong number of arguments");
	in = open(argv[1], O_RDONLY);
	/* O_CREAT requires a mode argument. */
	out = open(argv[2], O_WRONLY | O_CREAT, 0666);
	if (in < 0 || out < 0)
		die("cannot open file");
	if (fstat(in, &st))
		die("cannot get input file status");
	if (st.st_size > SSIZE_MAX)
		die("input file is too big");
	printf("sendfile returns %"PRIdMAX"\n",
		(intmax_t) sendfile(out, in, NULL, st.st_size));
	return 0;
}


