
Why the .deb format?



"BC" == Billy Chow <billy.chow@eng.ox.ac.uk> writes:

  BC> Dear all,
  BC> This is the question I always wanted to ask.  Are there advantages to
  BC> the use of .deb format for debian packages?  Why is .tgz
  BC> inappropriate?

  BC> -Billy


    I think this is a good question.

    According to `deb(5)', Debian-0.93 packages have the following
logical structure:

       line 1:
              version number (0.93...), followed by newline.

       line 2:
              number of characters occupied by control area
              expressed in decimal, followed by newline.

       control area:
              compressed gzipped ustar formatted archive.  Must
              contain file named control.  May optionally contain
              files named: conffiles, preinst, prerm, postinst,
              postrm.

       files archive area:
              compressed gzipped ustar formatted archive.  [with
              file structures designed to be unpacked in the root
              directory].
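
    To make that layout concrete, here is a minimal Python sketch of
how a reader might split such a file into its parts.  The name
`split_old_deb' is my own invention, and I am assuming the length on
line 2 counts bytes; this is an illustration of the layout quoted
above, not how dpkg-deb itself is written.

```python
def split_old_deb(data: bytes):
    """Split a 0.93-style package image into its logical parts.

    Returns (version, control_area, files_area).  Hypothetical helper:
    it assumes the length on line 2 is a byte count.
    """
    # Line 1: version number, terminated by a newline.
    version, sep, rest = data.partition(b"\n")
    if not sep:
        raise ValueError("missing version line")

    # Line 2: size of the control area in decimal, terminated by a newline.
    length_line, sep, rest = rest.partition(b"\n")
    if not sep or not length_line.isdigit():
        raise ValueError("missing or malformed control-area length")

    control_len = int(length_line)
    if control_len > len(rest):
        raise ValueError("control area runs off the end of the file")

    # Control area is the next control_len bytes; the files archive
    # area is whatever remains.
    return version.decode("ascii"), rest[:control_len], rest[control_len:]
```

    Note that the version number comes back without touching either
compressed area, which is the cheap check mentioned below.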


   Given this logical structure, you seem to me to be asking whether
it would be at least as appropriate to gzip all four sections together
as a single stream, leaving the third and fourth sections (the two
archives) as plain, ungzipped tar.

   Setting file corruption aside, doing it that way would obviously
make some things more costly which can be done very easily under the
present scheme (checking the dpkg version number, for instance, would
mean decompressing first).  So I guess the question becomes: would
doing it that way be any better in terms of providing safeguards
against file corruption?

   As matters stand, once the first line has been read, if the second
line yields a positive integer NC, then the succeeding NC characters
(bytes?) are, I think, safeguarded by the fact that they have to pass
`gunzip -t' (assuming the read does not run off the end of the file);
the remainder of the file must also pass `gunzip -t'.
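
    As a rough illustration of that kind of check, here is a Python
sketch that does approximately what `gunzip -t' does for a single
gzip member: it decompresses the data and lets zlib verify the CRC in
the trailer.  The name `gzip_ok' is hypothetical, and the sketch is
deliberately stricter than `gunzip -t' in that it rejects any bytes
left over after the first member.

```python
import gzip
import zlib


def gzip_ok(blob: bytes) -> bool:
    """Return True if blob is exactly one intact gzip member."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer,
    # so the CRC-32 and length in the trailer are checked for us.
    d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    try:
        d.decompress(blob)
    except zlib.error:
        # Bad header, invalid compressed data, or CRC mismatch.
        return False
    # eof is False if the stream was cut short; unused_data is
    # non-empty if extra bytes follow the member.
    return d.eof and not d.unused_data


# Synthetic stand-in for one compressed area of a package.
sample = gzip.compress(b"Package: ae\n" * 50)

# Same data with one byte of the stored CRC flipped.
damaged = bytearray(sample)
damaged[-8] ^= 0xFF

print(gzip_ok(sample))
print(gzip_ok(bytes(damaged)))
print(gzip_ok(sample[:-4]))
```

    Under the present scheme one would run such a check twice, once
on the NC bytes of the control area and once on the remainder.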
 
    And AFAIK there is one and only one place where a sequence of
bytes consisting of two concatenated gzipped files can be split in
two such that both parts are valid gzip files.
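
    The following Python sketch shows how that boundary can be
located from the data alone, without the length on line 2.  It does
not prove that the split point is unique; it only finds where the
first gzip member ends.  The name `first_member_length' and the
sample data are made up for the purpose.

```python
import gzip
import zlib


def first_member_length(blob: bytes) -> int:
    """Return the number of bytes occupied by the first gzip member."""
    d = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    # Raises zlib.error if the first member is damaged.
    d.decompress(blob)
    if not d.eof:
        raise ValueError("first gzip member is truncated")
    # zlib stops at the end of the first member; everything after it
    # is handed back untouched in unused_data.
    return len(blob) - len(d.unused_data)


# Synthetic stand-ins for the control area and the files archive area.
control = gzip.compress(b"Package: ae\n")
files = gzip.compress(b"file contents\n" * 20)

cut = first_member_length(control + files)
print(cut == len(control))
```

    So the stored length NC and the gzip framing have to agree with
each other, which is one more consistency check the present format
gets for free.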
    
    For example, as an experiment, I just now copied the file
`ae-493-6.deb' to a file named `test.deb', loaded the latter into
emacs, and deleted a random handful of bytes from somewhere near the
middle of the files archive area section, just to see how `dpkg
--contents' would react.

# dpkg --contents test.deb
------------------------ quote -------------------------
drwxr-xr-x root/root         0 Oct  2 18:35 1995 ./
drwxr-xr-x root/root         0 Oct  2 18:35 1995 usr/
drwxr-xr-x root/root         0 Oct  2 18:35 1995 usr/doc/
drwxr-xr-x root/root         0 Oct  2 18:35 1995 usr/doc/copyright/
-rw-r--r-- root/root      5020 Oct  2 18:35 1995 usr/doc/copyright/ae
drwxr-xr-x root/root         0 Oct  2 18:35 1995 usr/doc/ae/
-rw-r--r-- root/root      1147 Oct  2 18:35 1995 usr/doc/ae/modeless.rc.gz
-rw-r--r-- root/root      5883 Oct  2 18:35 1995 usr/doc/ae/ae.man.gz
tar: Skipping to next file header

gzip: stdin: invalid compressed data--crc error
dpkg-deb: subprocess gzip -dc returned error exit status 1
-------------------------- unquote -------------------------


   So, unless an explicit `gzip -t' does any better than this, I don't
see how doing it the other way around would be any better in terms of
providing safeguards against file corruption; I suspect the effect
might in fact be the opposite.


 
-- 
<bhogan@rahul.net> |- "5. Improve constantly and forever the system of
production and service, to improve quality and productivity, and thus
constantly decrease costs." (W. Edwards Deming)






