Bug#629994: sendfile returns early without user-visible reason



tags 629994 + lfs
quit

Hi,

Marc Lehmann wrote:
[out of order for convenience]

> Well, for read, the situation is a bit different, because that's a clear
> POSIX violation.
[...]
>    The value returned may be less than nbyte if the number of bytes left
>    in the file is less than nbyte, if the read() request was interrupted
>    by a signal, or if the file is a pipe or FIFO or special file and has
>    fewer than nbyte bytes immediately available for reading.

That would have been my guess, too.  I just stared at the POSIX "write"
manpage for a few minutes, and although it's pretty strongly implied
by the presence of a long list of exceptions:

 - if a write() runs out of room (for example, by filling the medium
   or hitting the per-process file size limit)
 - if the write() is interrupted by a signal
 - pipes and FIFOs
 - writes with the O_NONBLOCK flag set
 - when the number of bytes to be written exceeds SSIZE_MAX
   (implementation-defined behavior, truncation presumably possible)
 - sockets
 - STREAMS

(just like for "read"), I didn't find any text that comes out and says
that, barring such an exception, partial writes are not allowed.  The
missing text might be as small as a missing "shall", but the
requirement just didn't seem to be there anywhere obvious.
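
Portable code usually sidesteps the question by looping until
everything has been written.  A minimal sketch, assuming a blocking
descriptor; write_all is a hypothetical name for illustration, not
something from POSIX or any particular library:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all len bytes are
 * written.  Returns 0 on success, -1 on error with errno set by write().
 */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved: retry */
            return -1;          /* real error: errno says why */
        }
        p += n;                 /* partial write: advance and go around again */
        len -= (size_t)n;
    }
    return 0;
}
```

With a loop like this, a short write is only a detour, and an actual
failure still surfaces as -1 with errno set.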

I would be happy to see the standard clarified.  If you're interested,
the people at http://opengroup.org/austin and
http://austingroupbugs.net/ are generally a helpful bunch.

> in posix/unix/sus it has clearly defined
> and user-visible semantics for that, which require that the success case
> transfers as many bytes as can be transferred, and not stop a random
> amount earlier unless there is an error condition (signal => EINTR if not
> restarted, and easily controllable by applications - for example, not
> doing anything with signals makes it work):

Yes, that's mostly true.  The idiom "if (read(..., count) == count)"
is not uncommon and is safe when used carefully (for example, when
used with files on disk rather than terminals, tapes, or pipes).
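
When the descriptor might be a terminal, tape, or pipe, the usual
alternative to that idiom is a loop.  A sketch, with read_full as a
hypothetical name chosen for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read until count bytes have arrived or EOF.
 * Returns the number of bytes read (less than count only at EOF),
 * or -1 on error with errno set by read().
 */
static ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL || count == 0);
    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;      /* short read: keep going */
    }
    return (ssize_t)done;
}
```

The caller can then compare the result against count and treat a
smaller value as end of file rather than as an unexplained short read.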

> On Fri, Jun 10, 2011 at 03:21:38AM -0500, Jonathan Nieder <jrnieder@gmail.com> wrote:

>> If an application wants to print a useful error message, it has to try
>> again until sendfile returns -1 so errno can be set.
>
> That's clearly just an opinion.

Yes, I was clearly overstepping to suggest that an I/O error message
has to include strerror(errno) to be useful.  There are plenty of
cases where that isn't true.  But isn't it the common case?

> The authors of gnu tar and many existing
> applications apparently disagree, as do I.

GNU tar mostly uses the safe_read function from gnulib, except when
reading the first line of CACHEDIR.TAG files, if I read it correctly.

[...]
> Should I open a separate bug for read(2) then, or will posix compliance
> also be a wontfix (a valid position)?

If you'd like, you can retitle this bug. :)

Also keep in mind that if you can convince upstream (presumably by
writing a patch, e.g., one to let block devices and in turn
filesystems declare that they are confident about safely handling
large reads/writes) then that "wontfix" tag won't matter in the least.
It's just documentation to summarize the current state of things.

>    sendfile is not the same as a read+write combination - it may transfer
>    and return fewer bytes than requested for no user-visible reason.
>
> that would require read(2) to be fixed.

If read(2) and write(2) start handling larger chunks, there's
obviously no reason for sendfile(2) not to, too.  The whole point of
the exercise is to avoid buffer overflows and other logic errors in
sloppily written, obscure drivers.

Thanks again.

Good night,
Jonathan


