
Bug#222779: [PROPOSAL] definition of deb binary files



Hi,

I have drafted an updated definition of the deb format. It is based on
the deb manpage (part of dpkg-dev) and on reverse engineering of
dpkg-deb/build.c. I hope the standard is written correctly and is easy
to understand. I deliberately did not add anything about signatures
etc.; I only wanted to document what we have at present.
Discussion of additions should (IMHO) be kept separate.

IMHO this definition should become part of policy; I propose
either a new chapter 12 or an addition to chapter 3, Binary packages,
whichever seems more appropriate. This also means that some parts of
Appendix B could be removed on this occasion.

I'm also Ccing one apt-utils bug, from which I got some of the
information, and debian-devel. Please restrict the crossposting
in replies where that is useful.


Cheers,
Andi


DESCRIPTION

The .deb format is the Debian binary package file format. It is understood
by dpkg 0.93.76 and later, and is generated by default by all versions
of dpkg since 1.2.0 and all i386/ELF versions since 1.1.1elf.

The format described here has been in use since Debian 0.93; details of
the old format are described in deb-old(5).


OVERALL FORMAT

The file is an ar archive in one particular version of the ar format, with
the magic number !<arch>. Following the robustness principle, extracting
tools should cope with as many of the different ar versions as possible; if
they do not, that is at most a wishlist bug. On the other hand, tools
producing .deb files MUST produce only strictly standard-conforming files.
Any other behaviour is a serious bug!

The first member of the archive is named debian-binary and contains a series
of lines, separated by newlines. Currently only one line is present, the
format version number. The current format is 2.0, and it is the format
described in this document. Programs which read .deb files should be
prepared for the minor number to be increased and for new lines to be
present, and should ignore these if this is the case. If the major number
has a value a program doesn't know, an incompatible change has happened, and
the program should abort with an error.
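
The version rule above could be sketched in Python roughly as follows. This
is only an illustration, not dpkg's code: the function name
parse_debian_binary and the constant KNOWN_MAJOR are made up for the example,
and it assumes the first line has the form MAJOR.MINOR.

```python
from typing import Tuple

KNOWN_MAJOR = 2  # hypothetical: the only major version this sketch understands


def parse_debian_binary(content: bytes) -> Tuple[int, int]:
    """Return (major, minor) read from the body of the debian-binary member."""
    lines = content.decode("ascii").split("\n")
    # Only the first line is interpreted; any further lines are ignored.
    major_text, _, minor_text = lines[0].partition(".")
    major = int(major_text)
    minor = int(minor_text or "0")
    if major != KNOWN_MAJOR:
        # Unknown major number: an incompatible change, so abort.
        raise ValueError("unsupported deb format major version %d" % major)
    # A higher minor number is accepted on purpose.
    return major, minor
```

A reader built this way accepts "2.1" followed by extra lines, but refuses
"3.0".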


OVERALL AR FORMAT

The ar format is (on purpose) one of the most ancient formats. The reason
is that it should be possible to unpack .deb files on as many different
computers as possible. It also makes the format easier for our code to
handle.

Any ar file can be written as AR-FILE := HEADER [MEMBER]*.
The header is the 8-byte string "!<arch>\n" (not null terminated).

Each member consists of the member head, the body and, if necessary, a
padding '\n'. All information in the member's head is printable ASCII, and
each value is padded with spaces on the right; at least one space must be
present, so each value must be shorter than the number of bytes available
for it. The head is composed of the name (16 bytes), the date in seconds
since the epoch (1970-01-01 00:00:00 UTC) in decimal notation (12 bytes),
the uid and gid of the owner in decimal notation (6 bytes each; usually
both 0), the member's file mode in octal notation, beginning with 1 (8
bytes; usually 100644), the size of the member body in decimal notation (10
bytes; measured without any padding after the body) and the two bytes
"`\n". That makes 60 bytes in total. After the member head, the member body
follows unquoted; if the member body has odd length, it is padded with a
single '\n', so every member starts on an even byte boundary.
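
As an illustration of this layout, a reader could walk through the members
like this. It is a minimal sketch under the field widths given above; the
names AR_MAGIC and iter_ar_members are invented for the example, and the
date, uid, gid and mode fields are skipped rather than checked.

```python
from typing import Iterator, Tuple

AR_MAGIC = b"!<arch>\n"  # global header of the ar file
HEAD_SIZE = 60           # 16 + 12 + 6 + 6 + 8 + 10 + 2 bytes


def iter_ar_members(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, body) for every member of an in-memory ar file."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("missing ar magic")
    offset = len(AR_MAGIC)
    while offset < len(data):
        head = data[offset:offset + HEAD_SIZE]
        if len(head) != HEAD_SIZE or head[58:60] != b"`\n":
            raise ValueError("malformed member head at offset %d" % offset)
        name = head[0:16].decode("ascii").rstrip(" ")
        size = int(head[48:58].decode("ascii"))  # decimal, space padded
        start = offset + HEAD_SIZE
        body = data[start:start + size]
        if len(body) != size:
            raise ValueError("truncated member %r" % name)
        yield name, body
        # Skip the single padding byte that follows a body of odd length.
        offset = start + size + (size % 2)
```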

So, the initial member looks like:
debian-binary   1070194109  0     0     100644  4         `
2.0
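
The member shown above can be reproduced with a few lines of Python. This is
a sketch only: build_ar_member is a hypothetical helper, not part of dpkg,
and the defaults for uid, gid and mode are simply the "usual" values named in
the text.

```python
def build_ar_member(name: str, body: bytes, mtime: int,
                    uid: int = 0, gid: int = 0, mode: int = 0o100644) -> bytes:
    """Return member head, body and padding as they appear in the archive."""
    fields = [
        (name, 16),               # member name
        (str(mtime), 12),         # seconds since the epoch, decimal
        (str(uid), 6),            # owner uid, decimal
        (str(gid), 6),            # owner gid, decimal
        (format(mode, "o"), 8),   # file mode, octal
        (str(len(body)), 10),     # body size without padding, decimal
    ]
    head = b""
    for value, width in fields:
        encoded = value.encode("ascii")
        # At least one padding space must remain in every field.
        if len(encoded) >= width:
            raise ValueError("value %r does not fit in %d bytes" % (value, width))
        head += encoded.ljust(width, b" ")
    head += b"`\n"
    padding = b"\n" if len(body) % 2 else b""
    return head + body + padding
```

Calling build_ar_member("debian-binary", b"2.0\n", 1070194109) gives the two
lines shown above; the body already has even length, so no padding byte is
added.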

Newer ar features (such as longer file names, file names with spaces, ...)
are a violation of this standard; however, extracting tools should try to
support them as well as possible. If they do not, that is at most a
wishlist bug.
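
A tool that produces .deb files could check its member names along these
lines. This is a rough sketch: is_plain_ar_name is a made-up name, and the
checks for a leading or trailing '/' (GNU ar) and a leading "#1/" (BSD ar)
reflect my understanding of how those ar variants encode long names; they
are not taken from dpkg.

```python
def is_plain_ar_name(field: bytes) -> bool:
    """Check the raw 16-byte name field of a member head for strictness."""
    if len(field) != 16:
        return False
    name = field.rstrip(b" ")
    # At least one padding space is required, so at most 15 name bytes.
    if not name or len(name) > 15:
        return False
    # No spaces inside the name.
    if b" " in name:
        return False
    # Assumed markers of the GNU and BSD long-name extensions.
    if name.startswith(b"#1/") or name.startswith(b"/") or name.endswith(b"/"):
        return False
    # Printable ASCII only.
    return all(0x21 <= byte <= 0x7E for byte in name)
```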


DEB 2 ARCHIVE MEMBERS

Archives with major number 2 must contain, after the initial member
debian-binary, the members control.tar.gz and data.tar.gz in exactly this
order. Optional members may follow, but their names must begin with '_'.
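
Expressed as a check over the list of member names, this rule might look
like the following. The function name check_deb2_members is hypothetical,
and the sketch only looks at names and order, not at member contents.

```python
from typing import List

REQUIRED_MEMBERS = ["debian-binary", "control.tar.gz", "data.tar.gz"]


def check_deb2_members(names: List[str]) -> None:
    """Raise ValueError unless names follow the deb 2 member order."""
    if names[:3] != REQUIRED_MEMBERS:
        raise ValueError("archive must start with %s" % ", ".join(REQUIRED_MEMBERS))
    for extra in names[3:]:
        # Optional trailing members must begin with an underscore.
        if not extra.startswith("_"):
            raise ValueError("unexpected member %r" % extra)
```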

control.tar.gz is a gzipped tar archive containing the package control
information, as a series of plain files, of which the file control is
mandatory and contains the core control information. Please see the Debian
Packaging Manual, section 2.2 for details of these files. The control
tarball may optionally contain an entry for `.', the current directory.

data.tar.gz contains the filesystem archive as a gzipped tar archive.
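
For illustration, the control file could be pulled out of the body of the
control.tar.gz member with Python's tarfile module, roughly as below. The
function name read_control_file is invented for this sketch. Accepting both
"control" and "./control" as the entry name, and decoding the result as
UTF-8, are assumptions of the example and not something this definition
states.

```python
import io
import tarfile


def read_control_file(control_tar_gz: bytes) -> str:
    """Return the text of the control file inside a control.tar.gz body."""
    with tarfile.open(fileobj=io.BytesIO(control_tar_gz), mode="r:gz") as archive:
        for entry in archive:
            name = entry.name
            # Assumption: entries may be stored relative to '.'.
            if name.startswith("./"):
                name = name[2:]
            if name == "control" and entry.isfile():
                extracted = archive.extractfile(entry)
                return extracted.read().decode("utf-8")
    raise ValueError("control.tar.gz contains no control file")
```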


DEB 1 ARCHIVE MEMBERS

See the man-page deb-old(5) for a definition.
-- 
   http://home.arcor.de/andreas-barth/
   PGP 1024/89FB5CE5  DC F1 85 6D A6 45 9C 0F  3B BE F1 D0 C5 D1 D9 0C


