[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

libdpkg: buffered I/O API



Hi,

I finally found some time to work on moving the compression and
decompression code to the buffer library.  It seems like a good idea and
an interesting one, but first I ran into a couple of questions I’d like
to get your feedback on.

This patch series reorganizes the libdpkg buffer functions to make it a
little easier to add new buffer types.  I am sending it because it puts
the signatures for the functions a buffer type might provide in one place,
so even though the series does not do much, it makes API questions a
little clearer.

1. Currently, error reporting uses the usual convention of returning -1 and
   putting a more specific error code in errno.  Obviously, this is not a
   good way to report errors from backends that do not use the same
   convention.  For example, if the error flag is already set on a
   stream before a call to stream_md5(), then the result would be an
   error message something like the following:

	failed to read on buffer copy for <whatever>: Success

   It would be more friendly for write_stream() to check the error flag
   itself and report that with a more meaningful error result.

   Of course, dpkg itself never makes such a mistake AFAICT.  A more
   important problem is that it is hard to fit other backends (e.g., zlib)
   in to the “set errno, return -1” scheme.

   I suggest adding a pointer argument to return either:

    a. a universally valid error code like those used in libcom_err; or

    b. an error string.  In this  case, there is an ugly associated
       question: who is responsible for freeing the error strings?  The
       simplest answer: the caller, except for a special "Out of memory"
       string (ick).

   Despite the ugliness, I’m inclined towards b.

2. Incomplete I/O is hard to handle rationally.  The current code will
   busy-loop if a caller provides a non-blocking file descriptor to
   fd_fd_copy(); it might make sense to error out on EAGAIN instead
   (though this is also something dpkg will never run into).

   Complicating this is that Linux 2.2 used to return EAGAIN on 0-length
   reads from a nonempty pipe [1], but we already work around that in
   other ways.

   If read_stream() performs a partial read and then receives an I/O
   error, then the number of bytes successfully read is thrown away.
   Do we want the caller to be able to recover?  Practically speaking,
   this is only relevant if read_stream() is something callers should
   be able to use from a select() loop (i.e., with file handles
   fdopen()'d from non-blocking file descriptors, which can return
   EAGAIN).

   Options:
 
    a. busy-loop on EAGAIN (which punishes callers who try this);
    a. transparently poll() on EAGAIN (bad idea IMHO);
    b. consider EAGAIN an unrecoverable error; or
    c. report number of bytes consumed/produced on errors.

   I’m inclined towards c.

Thoughts?
Jonathan Nieder (4):
  libdpkg: Collect MD5-related functions in one place
  libdpkg: Refactor generic I/O code into backend-specific functions
  libdpkg: Make buffer_init() and buffer_done() return void
  libdpkg: Reorganize buffer module around new buffer_type struct

 lib/dpkg/buffer.c |  274 +++++++++++++++++++++++++++++++++++++++--------------
 lib/dpkg/buffer.h |   69 ++++++++-----
 2 files changed, 243 insertions(+), 100 deletions(-)

[1] <http://bugs.debian.org/81881>


Reply to: