
Subpackaging (Was: Potato now stable)

Drake Diedrich <Drake.Diedrich@anu.edu.au> wrote:
>    Under the Irix packaging system (quite nice UI except that it has to
> handle Irix packages..) packages exist in a hierarchy, with lowest level
> packages quite fine grained.  For example:
> I  fw_bzip2             02/28/2000  bzip2-0.9.0c Compress/decompress files
> I  fw_bzip2.man         02/28/2000  bzip2-0.9.0c man pages
> I  fw_bzip2.man.bzip2   02/28/2000  bzip2-0.9.0c man pages
> I  fw_bzip2.man.info    02/28/2000  bzip2-0.9.0c info pages
> I  fw_bzip2.man.relnotes  02/28/2000  bzip2-0.9.0c Release Notes
> I  fw_bzip2.sw          02/28/2000  bzip2-0.9.0c execution only env
> I  fw_bzip2.sw.bzip2    02/28/2000  bzip2-0.9.0c execution only env
> I  fw_bzip2.sw.hdr      02/28/2000  bzip2-0.9.0c header files
> I  fw_bzip2.sw.lib      02/28/2000  bzip2-0.9.0c shared libraries
> I  fw_bzip2.sw64        02/28/2000  bzip2-0.9.0c 64-bit execution only env
> I  fw_bzip2.sw64.lib    02/28/2000  bzip2-0.9.0c 64-bit shared libs

This looks cool. I like it.

How about a little brainstorming to pick some categories that could be used
in Debian?

Possible layout
control.tar.gz            package system stuff, depends, postinst, etc
signatures.tar.gz         signatures for each part of the package
binary/*.tar.gz           arch-dependent data and programs for each arch
data.tar.gz               arch-independent data and programs
doc.tar.gz                docs not in packages below (includes copyright)
doc/html.tar.gz           html format
doc/ps.tar.gz             postscript format
doc/dvi.tar.gz            dvi format
doc/text.tar.gz           text format
doc/man.tar.gz            man pages
doc/info.tar.gz           info pages
doc/examples.tar.gz       /usr/share/doc/examples/*
locale/*/gettext.tar.gz   gettext translations
locale/*/doc/html.tar.gz  html translations
locale/*/doc/ps.tar.gz    postscript translations
locale/*/doc/dvi.tar.gz   dvi translations
locale/*/doc/text.tar.gz  text translations
locale/*/doc/man.tar.gz   manpage translations
locale/*/doc/info.tar.gz  info translations
source/original.tar.gz    upstream source
source/debian.diff.gz     debian diff
copyright                 copy of copyright or symlink to common-licence

Packages could be made available for each $(ARCH), including packages
optimised for subarchs.

The locale support is sorted by locale, rather than by file format, so that
ftp users can more easily just download their locale, by downloading the
directory for their locale.

All the parts of the package would be optional apart from control.tar.gz.
That way it would be possible to build task packages with no filesystem
contents, just a copyright notice alongside the package on the mirror.

How would it be implemented?
My recommendation would be one directory per package. Each subpackage could
just be part of a .tar.gz file. Having the binary-dependent parts listed here
would imply that the package location could change from looking like this:


to looking like this:


apt, dselect and friends would be changed so that when a package is selected,
the default set of subpackages is installed to match the current behaviour.
That is, when a package is selected, all of the subpackages would be selected
except for the binaries and the source, and only the binary for the current
arch would be installed. The user could select a more detailed screen in
dselect, or use command line switches with apt-get, to choose which
subpackages are installed or not installed.
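
As a rough sketch of that default selection rule (the function name, the flat
list of subpackage paths and the binary/<arch>.tar.gz naming are assumptions
made for illustration, not part of apt or dselect):

```python
def default_selection(subpackages, current_arch):
    """Return the subpackages that would be installed by default.

    Everything is selected except source/* and the binary/* archives
    built for an architecture other than current_arch.
    """
    selected = []
    for name in subpackages:
        if name.startswith("source/"):
            continue  # source is never installed by default
        if name.startswith("binary/"):
            # Assumed naming: "binary/i386.tar.gz" -> "i386"
            arch = name[len("binary/"):].split(".", 1)[0]
            if arch != current_arch:
                continue  # skip binaries for other archs
        selected.append(name)
    return selected


# Hypothetical contents of one package directory on a mirror.
available = [
    "control.tar.gz",
    "binary/i386.tar.gz",
    "binary/m68k.tar.gz",
    "data.tar.gz",
    "doc.tar.gz",
    "doc/man.tar.gz",
    "locale/de/gettext.tar.gz",
    "source/original.tar.gz",
    "source/debian.diff.gz",
]

print(default_selection(available, "i386"))
```

A more detailed selection screen or command line switch would then only need
to add or remove entries from this default list.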

What use would this be?
 * Disk space

Machines with small hard drives could have packages installed without docs or

Single-user systems could save space by only installing translations for
languages that were needed.

Docs could be installed in preferred formats only.

A set of binaries could be built with -Os passed to gcc and with extra build
options turned off, to try to make them as small as possible. This would be
like the *-tiny packages now, but without needing to replicate the docs and
man pages.

Some space would be saved on the server because man pages and friends would
not be stored for each arch.

For an even more compact system each executable and library in the package
could be split into a separate subpackage, but this means we tend to lose
some of the benefits of the packaging system, and just have a load of files.

The Gnome applets could be packaged as one package, called gnome-applets, but
within that package there would be one .tar.gz for each applet. By default
all the applets would be installed, but the user could select in more detail
the exact binaries they want. If some applets were not selected, this would
save space on the installed system, save space during the install and save
bandwidth when downloading. An alternative would be to include all the
applets in one .tar.gz and selectively not install some binaries. This would
save space in the installed form, but would not save space during the install
or bandwidth during the download.

 * Bandwidth

On slow net connections only the parts of the package that have changed need
to be redownloaded. Subpackages could have different versions, so that
modifications in the man pages mean there is a new manpage archive released,
with a higher version, but the other subpackages stay as they are.
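
A minimal sketch of how a frontend might work out what to fetch, assuming each
subpackage carries its own version string (the function name, the dictionaries
and the version numbers are made up for illustration, and plain inequality
stands in for a real Debian version comparison):

```python
def parts_to_download(installed, on_mirror):
    """Return installed subpackages whose version on the mirror differs.

    Subpackages the user chose not to install are ignored, even if the
    mirror carries them.
    """
    return sorted(
        name
        for name, version in installed.items()
        if name in on_mirror and on_mirror[name] != version
    )


# Hypothetical state: only the man pages have been re-released.
installed = {
    "binary/i386.tar.gz": "1.0-1",
    "data.tar.gz": "1.0-1",
    "doc/man.tar.gz": "1.0-1",
}
on_mirror = {
    "binary/i386.tar.gz": "1.0-1",
    "data.tar.gz": "1.0-1",
    "doc/man.tar.gz": "1.0-2",
    "doc/html.tar.gz": "1.0-1",
}

print(parts_to_download(installed, on_mirror))
```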

Low bandwidth machines could avoid installing documentation to make downloads

 * Ease maintenance

Documentation auto-builders could be created. The maintainer supplies the
documentation in one format and an auto-builder converts it to another format.

Translations split into subpackages make it easier to have separate
maintainers for each locale. When a translation is updated there does not need
to be a new release of the whole package.

 * Legal

We are breaking the rules on copyright at the moment. We distribute binaries
licensed under the GPL without a copy of the GPL. Having a symlink to the GPL
in the directory of each package that is licensed under the GPL would fix
this. With the standard copyright file in doc.tar.gz, apt could avoid
downloading the separate copyright file.

 * Other

Optimisation could be included for subarchs.

Auto-builders would only have to rebuild if the arch-dependent subpackage

More binary subpackages could be added, to include statically linked
executables and libraries, or binaries built with debugging symbols. 

More thoughts and problems
A lot of this has been said before. I am not sure if this is the right way of
looking at it. Some of this e-mail is a bit rambling.

It would need somebody to implement it. Changes would be needed in the
archives, mirrors, policy, and the backend and frontend packaging tools.
That is probably too much work, so changes like this will probably not
happen. I suppose package pools would require the same things to be changed.

Build scripts might need changing to support parameters being passed to gcc to
build subarch optimised versions.

The availability of documentation in different formats could be implemented by
generating them on users' machines when they request documents in different
formats, rather than the maintainer having to provide them. This is partly
where the idea of a documentation autobuilder came from.

The idea of a document autobuilder is still a bit odd, and would require some
more thought on how it would be implemented, and how much use it would really
be. To be honest, I think I have become slightly side-tracked with this idea.

How would dependencies work? Would each subpackage have its own dependencies,
or would they share them? How about the dependencies for each arch?

Different options for building binaries could quickly lead to a lot of disk
space being wasted unnecessarily. Imagine 8 linux-archs + hurd-i386. Then
imagine statically linked binaries for each, add subpackages built with
debugging symbols, and add binaries that are both statically linked and have
debugging symbols. That is 9 archs and 4 compilation styles, giving a total
of 36 subpackages for each binary-dependent package. Then add all the
subarchs that could be optimised for and packages with different build
options, and there are a lot of packages.
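
The count can be checked by enumerating the combinations. This is only an
illustration: the arch names are placeholders, and "dynamic" and "stripped"
are my labels for the ordinary build:

```python
from itertools import product

# 8 Linux archs (placeholder names) plus hurd-i386.
archs = [f"linux-arch-{n}" for n in range(1, 9)] + ["hurd-i386"]

# Two independent build choices give the 4 compilation styles.
linkage = ["dynamic", "static"]
symbols = ["stripped", "debug"]

variants = list(product(archs, linkage, symbols))

print(len(archs), "archs x", len(linkage) * len(symbols), "styles =",
      len(variants), "binary subpackages")
```

Each extra build choice multiplies the total again, which is why subarch
optimisation and optional build flags make the number grow so quickly.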

Packages that provide the same documentation in different formats do not
always include the same documents in the different formats, but instead
different documents are included in different formats. An example might be:

Not: README, README.html and README.ps

But: README, manual.html and specification.ps

Which subpackages do these files go into? Do they all go into doc.tar.gz, or
are they split across text.tar.gz, html.tar.gz and ps.tar.gz? Do we include
README in doc.tar.gz, put specification.ps in ps.tar.gz, make a text version
of manual.html using links, and then put the html version in html.tar.gz and
the text version in text.tar.gz? As I say, this requires some thought.

The correct way to solve the problem of download size and low bandwidth
connections would be to provide deb-diffs or an rsync service.

What happens when a user selects both binary-i386 and binary-m68k packages
for installation? Is this disallowed, or is the non-current arch installed to
a different directory, so that if we are on an i386 and select a binary-m68k
subpackage the files going to /usr/bin get installed in /usr/bin/m68k?
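
If the second option were chosen, the remapping could be as simple as the
following sketch (the function name is hypothetical, and this is just one way
the question above could be answered, not a proposal for how dpkg should
behave):

```python
import posixpath


def install_dir(directory, package_arch, current_arch):
    """Return the directory a subpackage's files would be unpacked into.

    Files for the current arch go where they always did; files for any
    other arch go into a per-arch subdirectory.
    """
    if package_arch == current_arch:
        return directory
    return posixpath.join(directory, package_arch)


print(install_dir("/usr/bin", "i386", "i386"))
print(install_dir("/usr/bin", "m68k", "i386"))
```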

Don't worry  --  shop.
