
Bug#629994: sendfile returns early without user-visible reason



On Fri, Jun 10, 2011 at 06:30:48AM -0500, Jonathan Nieder <jrnieder@gmail.com> wrote:
> > Well, for read, the situation is a bit different, because that's a clear
> > POSIX violation.
> [...]
> >    The value returned may be less than nbyte if the number of bytes left
> >    in the file is less than nbyte, if the read() request was interrupted
> >    by a signal, or if the file is a pipe or FIFO or special file and has
> >    fewer than nbyte bytes immediately available for reading.
> 
> That would have been my guess, too.  I just stared at the POSIX "write"

Well, the relevant POSIX manpage for read is the one for read, not the one
for write. The read page is clear.

> (just like for "read")

No, read cannot run out of room - the wording is altogether quite different,
and I quoted it above.

It might be less strongly worded, but it's still clear - the standard defines
how a function must behave and exceptions to it.

> I didn't find any text that comes out and says
> that barring such an exception partial writes are not allowed.

The standard specifies the behaviour of write - it doesn't allow
implementations to specify extra behaviour unless it explicitly says so
(for example, with a "may" as is done for read).

> missing text might be as small as a missing "shall" but the
> requirement just didn't seem to be there anywhere obvious.

I quoted them. You should really read the "read" manpage instead of the
write manpage; the two functions have quite different return values.

> I would be happy to see the standard clarified.  If you're interested,
> the people at http://opengroup.org/austin and
> http://austingroupbugs.net/ are generally a helpful bunch.

The standard is absolutely clear, really.

> Yes, that's mostly true.  The idiom "if (read(..., count) == count)"
> is not uncommon and is safe when used carefully (for example, when
> used with files on disk rather than terminals, tapes, or pipes).

Well, not on Linux, where a large enough read can come back short even on a
regular file.
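
For illustration, here is a minimal sketch of the loop that avoids relying on
that idiom. It keeps calling read until the requested count is reached, end of
file is hit, or a real error occurs. The name read_full is hypothetical; this
is not the gnulib safe_read code, just one way such a helper might look.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: keep reading until count bytes have arrived,
 * end of file is reached, or read() fails with something other than EINTR.
 * Returns the number of bytes read (short only at end of file), or -1 on
 * error with errno set by read(). */
static ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);

        if (n > 0)
            done += (size_t)n;      /* partial read: ask for the rest */
        else if (n == 0)
            break;                  /* end of file */
        else if (errno != EINTR)
            return -1;              /* real error */
        /* on EINTR, simply retry */
    }

    return (ssize_t)done;
}
```

With a helper like this, a result smaller than count means end of file, not a
short read that the caller has to interpret.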

> has to include strerror(errno) to be useful.  There are plenty of
> cases where that isn't true.  But isn't it the common case?

Both are common cases, as is no error checking.

> > The authors of gnu tar and many existing
> > applications apparently disagree, as do I.
> 
> GNU tar mostly uses the safe_read function from gnulib, except when
> reading the first line of CACHEDIR.TAG files, if I read it correctly.

It still gives a short read or equivalent error message on short reads,
depending on version and input device/file.

> If you'd like, you can retitle this bug. :)

I don't know how.

> Also keep in mind that if you can convince upstream (presumably by

I am really busy with writing and maintaining a LOT of free software. I
consider it my duty to report bugs, but if upstream doesn't feel like
fixing it, that's not really my problem. I already worked around the
sendfile problem in libeio, for example, and would have done so in any
case, as buggy kernel versions will be in use for a long time.
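
To make the idea of such a workaround concrete, here is a minimal sketch of
retrying sendfile until the whole range has been sent. This is not the actual
libeio code; it assumes the Linux sendfile(2) interface from <sys/sendfile.h>,
and sendfile_full is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/sendfile.h>
#include <sys/types.h>

/* Hypothetical helper: call sendfile() repeatedly until count bytes starting
 * at offset have been sent, the input file ends, or a real error occurs.
 * Returns the number of bytes sent (short only if in_fd hit end of file),
 * or -1 on error with errno set by sendfile(). Bytes sent before an error
 * are not reported in this sketch. */
static ssize_t sendfile_full(int out_fd, int in_fd, off_t offset, size_t count)
{
    size_t done = 0;

    while (done < count) {
        /* sendfile() advances offset by the number of bytes it sent */
        ssize_t n = sendfile(out_fd, in_fd, &offset, count - done);

        if (n > 0)
            done += (size_t)n;      /* early return: send the remainder */
        else if (n == 0)
            break;                  /* nothing left in in_fd */
        else if (errno != EINTR)
            return -1;              /* real error (including EAGAIN) */
    }

    return (ssize_t)done;
}
```

A caller using a non-blocking socket would still have to handle EAGAIN itself;
the sketch only covers the case of the call returning early without an error.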

> > that would require read(2) to be fixed.
> 
> If read(2) and write(2) start handling larger chunks, there's
> obviously no reason for sendfile(2) not to, too.  The whole point of
> the exercise is to avoid buffer overflows and other logic errors in
> sloppily written, obscure drivers.

Well, the result is some non-POSIX code that breaks real-world programs that
work with both POSIX semantics and historic Unix semantics.

I wouldn't be surprised if more breakage and data loss ensue in the
future - right now, there are not so many big files around (mostly DVD
images and the like), but files tend to grow.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp@schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\


