
Results of the Bootstrap/Crossbuild Sprint



This is the report for the Bootstrap [SPRINT] which was recently held in
Paris, France from 2014-08-16T10:00:00+02 until 2014-08-19T18:00:00+02.
Six people attended the sprint:

 * Wookey (cross toolchain, bootstrapping, event organizer)
 * Guillem Jover (dpkg maintainer)
 * Matthias Klose (gcc, binutils, build-essential, python maintainer)
 * Aron Xu (mips64el porter)
 * Helmut Grohne (rebootstrap)
 * Johannes Schauer (botch)

We feel that our sprint was extremely productive. The next sections
give an overview of what was discussed, achieved and planned for the
future.

  [SPRINT] <https://wiki.debian.org/Sprints/2014/BootstrapSprint>

This report has ended up quite long. Here are the major sections:

1. Problems with early bootstrapping
2. Partial archives for different ISAs
3. Cross compilers in main
4. Cross compile support in source packages
5. Bootstrap and crossbuild quality assurance
6. Build Profiles
7. Rebootstrap
8. Botch
9. Miscellaneous

------------------------------------------


0. General notes
----------------

Throughout this email we will use the terms "build architecture" to mean
the architecture that a package is built on, "host architecture" to mean
the architecture a package is built for and "target architecture" to mean
the architecture a compiler produces code for.

If not otherwise specified, the term "package" means both "binary
package" and "source package". And if not otherwise specified, the term
"dependency" means both "build dependency" and "runtime dependency".

The term "Sources file" or "Sources" refers to the source package
control file that can be retrieved from mirrors or be found in
`/var/lib/apt/lists/`. The term "Packages file" or "Packages" refers
to the binary package control file from the same locations.

The term FTCBFS is analogous to FTBFS and expands to "fails to cross
build from source".

1. Problems with early bootstrapping
------------------------------------

1.1. Acquiring the list of `Essential:yes` binary packages
----------------------------------------------------------

When starting a new port, the host architecture binary packages are not
known. The only input is the data from a Sources file. Thus, the set of
binary packages which are `Essential:yes` are not known at this point,
because that information is encoded in the Packages file which does not
exist for the new architecture yet. It is important to know the set of
`Essential:yes` packages so that the right source packages can be cross
compiled for the new architecture to create a native build environment
that at the very least includes all `Essential:yes` packages and
`build-essential`. We decided that using the information of the
`Priority` field for binary packages is enough to solve this problem.
`Priority:required` packages are a small superset of `Essential:yes`
packages and their information is encoded in the `Packages-List` field
in the Sources file. This approach requires that the set of
`Priority:required` packages continues to be a very conservative and small
selection to not make the bootstrap problem harder.
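
As a rough illustration of the lookup described above, here is a minimal
Python sketch that extracts the `Priority:required` binary packages from
one Sources stanza. It assumes that the field referred to above appears
in Sources files under the name `Package-List` and that each entry has
the form "name type section priority". The stanza in `EXAMPLE` is
abbreviated and only there for illustration.

```python
def required_binaries(sources_stanza):
    """Return binary package names listed with priority 'required'."""
    names = []
    in_list = False
    for line in sources_stanza.splitlines():
        if line.startswith("Package-List:"):
            in_list = True
            continue
        if in_list:
            if not line.startswith((" ", "\t")):
                in_list = False  # a new field starts, the list is over
                continue
            fields = line.split()
            # Assumed entry layout: name, type, section, priority, [options]
            if len(fields) >= 4 and fields[3] == "required":
                names.append(fields[0])
    return names


# Abbreviated, illustrative stanza (not copied from a real mirror).
EXAMPLE = """\
Package: base-files
Binary: base-files
Package-List:
 base-files deb admin required
Section: admin
"""

print(required_binaries(EXAMPLE))
```

Running this over every stanza of a Sources file would give the set of
binary packages to aim for, without needing a Packages file.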

1.2. Dependencies on `build-essential` are not spelled out by base packages
---------------------------------------------------------------------------

All binary packages implicitly depend on `Essential:yes` packages. All
source packages implicitly depend on `build-essential`. This assumption
helps reduce the number of needed dependencies in settings where all
`Essential:yes` packages and `build-essential` are available. This is not
the case during the initial bootstrap of an architecture where no packages
of the host architecture exist yet. Some of the source and binary packages
involved in creating the `Essential:yes` and `build-essential` packages
must have additional dependencies on packages which are otherwise
`Essential:yes` or `build-essential` so that cross-build dependencies
in an initial bootstrap can be correctly resolved.

1.3. Source packages that build virtual packages are not known
--------------------------------------------------------------

Since the Packages file does not exist in an early bootstrap, it is not
possible to retrieve the information about which binary packages provide
virtual packages. If packages depend on virtual packages, it is
currently not possible to determine which source package must be built
to create the binary package that provides the virtual dependency.
Here is a small overview of binary packages that were found to expose
this problem in an early bootstrap:

 * [virtual] -> [real]
 * libz-dev -> zlib1g-dev
 * libpng-dev -> libpng12-dev
 * libgpmg1-dev -> libgpm-dev
 * libselinux-dev -> libselinux1-dev
 * libgmp10-dev -> libgmp-dev
 * libltdl3-dev -> libltdl-dev
 * libltdl7-dev -> libltdl-dev
 * libncurses-dev -> libncurses5-dev
 * libexpat-dev -> libexpat1-dev
 * libtiff-dev -> libtiff5-dev

We see four solutions to this problem which all carry their own flaws
but can be used together to help mitigate each other's disadvantages.

1.3.1. `Provides` information in the `Packages-List` field
----------------------------------------------------------

The information of which virtual packages are provided by a binary
package could be encoded in the `Packages-List` field in a Sources file.
This has some restrictions, though:

 * the provided package must not vary between architectures (ocaml and
   haskell violate this assumption)
 * the Provides relationship must not be versioned
 * the Provides field must not contain substitution variables

1.3.2. Packages should depend on non-virtual packages
-----------------------------------------------------

If packages depend on virtual packages, they must carry a non-virtual
alternative as well to facilitate bootstrapping. We recognize that this
solution makes transitions harder which would otherwise just be solved
by a binNMU. Thus, this is probably not a solution that can be applied
to all packages.

1.3.3. Switching virtual and real package names around
------------------------------------------------------

Consider a binary package of the form `libfooN-dev` which provides
`libfoo-dev`. If the binary package does not bump API on each bump of
`N` then the problem can be solved the following way. Switch the package
content to a real binary package `libfoo-dev` and let it provide
`libfooN-dev`. When there is a `N+1` then there can be a real package
`libfooN-dev` while the content of the `N+1` goes into `libfoo-dev` and
provides `libfooN+1-dev`. When the transition is over, the real package
`libfooN-dev` can be dropped.

An example of a package which handles transitions like this even though
it is not a shared library is `automake`.

1.3.4. Adding real meta packages
--------------------------------

Another way of avoiding dependencies on virtual packages is making them
real, empty meta packages. A package building `libfooN-dev` could also
build `libfoo-dev` depending on `libfooN-dev`. The packages
`src:db-defaults` and `tcltk-defaults` take this to an extreme, but it
is also applicable to general library packages in theory.

1.4. Source packages create binary packages of different version
----------------------------------------------------------------

The information for which source package version corresponds to a
given binary package resides in the `Source` field of a Packages file
which does not exist in an early bootstrap. Thus, for now, versioned
dependencies are ignored during a bootstrap. If a package depends on
`foo (>= 1.0)` then the bootstrapper can only hope that if they build
the source package generating `foo` then the produced binary package
does actually satisfy the versioned dependency. We have not yet
encountered an example in practice where this limitation posed a
problem.
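
To make "versioned dependencies are ignored" concrete, here is a small
Python sketch that reduces a dependency field to bare package names.
It is a simplification written for this report: it only strips version
constraints and does not handle architecture or profile restrictions.

```python
import re


def unversioned_names(dep_field):
    """Return the package names of a dependency field without versions.

    Each comma-separated clause becomes a list of its alternatives.
    """
    result = []
    for clause in dep_field.split(","):
        alternatives = []
        for alt in clause.split("|"):
            # Drop the "(>= 1.0)" style version constraint, if any.
            name = re.sub(r"\([^)]*\)", "", alt).strip()
            if name:
                alternatives.append(name)
        if alternatives:
            result.append(alternatives)
    return result


print(unversioned_names("foo (>= 1.0), bar | baz (<< 2)"))
```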

1.5. Rebuild for `Packages-List` field
--------------------------------------

Where solutions require more information in the `Packages-List` to be
present, the source packages in question must be rebuilt with a new
enough dpkg version once all features are implemented in dpkg. So at
some point in the future it might be necessary to do a rebuild of some
seldom uploaded source packages.

2. Partial archives for different ISAs
--------------------------------------

We discussed what would actually be needed in dpkg metadata to make
partial archives and generalised ISA optimisation mechanisms possible.
No-one has cared enough to implement this for a long time, but the MIPS
port is suffering enough that they would like to implement something.

Partial archives for different ISAs are particularly useful for having
optimizations for different hardware, rather than having everything run
the baseline. This is especially true for MIPS, since the baseline
ISA being used, MIPS III, is already a decade old, and there are quite
a few newer ISAs available, including MIPS64/MIPS64r2, and soon MIPS64r6.

A binary package field specifying the hwcaps (hardware capabilities) a
package is using/assuming could be added by dpkg or an external
helper.  However, that information should not be conflated with
dependency resolution, i.e. all such packages with different hwcaps
are equivalent in dependency terms. At the same time, dpkg could get a
new check in its --audit command to ensure that installed packages are
runnable on the current hardware. It's still to be decided whether the
baseline capability set is an empty or omitted field or a list of
capabilities.

Once we have a binary package field, apt pinning could use this to pin
to the appropriate sub-repository for the running hardware, though this
must allow a local override by the system administrator.
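
The audit check mentioned above could look roughly like the following
Python sketch. Everything in it is hypothetical: no field name or
capability vocabulary has been decided, so the capability strings are
placeholders, and the baseline is assumed here to be an empty set.

```python
def runnable(package_hwcaps, machine_hwcaps):
    """A package is runnable if the machine offers every capability it assumes."""
    return set(package_hwcaps) <= set(machine_hwcaps)


def audit(installed, machine_hwcaps):
    """Return the names of installed packages that cannot run on this machine.

    installed maps a package name to the capabilities it assumes; an empty
    collection stands for the baseline (an assumption of this sketch).
    """
    return sorted(name for name, caps in installed.items()
                  if not runnable(caps, machine_hwcaps))


# Placeholder capability names, purely for illustration.
installed = {"foo": [], "bar": ["cap-a"], "baz": ["cap-a", "cap-b"]}
print(audit(installed, machine_hwcaps=["cap-a"]))
```

Note that the check is independent of dependency resolution, in line
with the point above that packages with different hwcaps are equivalent
in dependency terms.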

Repository layout and binary package testing migration management are
separate issues. The simple version is a separate repository, and
fallback to the baseline version if the optimized one does not exist
or cannot migrate. Otherwise DAK needs to understand the difference in
database, and could have separate tree, or filename suffixes could be
used in order to go in a shared file hierarchy.

3. Cross compilers in main
--------------------------

We discussed the state of the effort to get cross-compilers in jessie.

3.1. Cross compiler naming patterns
-----------------------------------

Package and binary naming patterns have historically been somewhat
arbitrary/confusing in terms of usage of Debian arch vs. GNU triplet
and prefix vs. suffix. So we have gcc-4.9-<triplet> packages, but
<triplet>-gcc-4.9 binaries, and tools named with triplets, but
libraries named with arch-names.

We discussed the possibilities for making things more consistent, and
whether this provided significant user benefits.

But we decided that, given
 * upstream/external conventions of <triplet>-<tool> naming, and
 * the historic usage, which has been in place since at least 2001 and
   has resulted in a lot of documentation,

on balance it was probably best to leave things as they are. We could
use Provides for alternate package names.

3.2. What set of cross-toolchains do we build? Where is the list set?
---------------------------------------------------------------------

There should be a limited number of host/build cross compiler combinations,
as the set of all possible combinations in the archive would be very large
and mostly pointless:
 * all cross compilers should build on fast/popular architectures.
   Definitely amd64, probably i386, maybe ppc64el, arm64.
 * targeting all reasonably popular architectures:
   armel, armhf, mips, mipsel, mips64el, arm64, ppc64el
 * and some other selected ones (e.g. armhf -> arm64)
 * having one metadata file to control which combinations are built by default
   seems like a good idea so as to keep the various source packages in sync.
 * that could live in cross-support?

 * it must be easy to build non-default combinations locally.

Current experimental packages do not have a shared matrix config
 * binutils builds on:
     amd64 i386 arm64
   targeting:
     armhf armel arm64 mips mipsel mips64el ppc64el i386 amd64
   (missing out the HOST=TARGET combinations)
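
A shared matrix config could be as simple as two lists from which the
combinations are derived. The Python sketch below uses the binutils
lists given above and leaves out the HOST=TARGET pairs; the variable
and function names are invented for this example.

```python
from itertools import product

# Architectures the cross tools run on, and the ones they produce code for,
# taken from the experimental binutils lists above.
HOST_ARCHES = ["amd64", "i386", "arm64"]
TARGET_ARCHES = ["armhf", "armel", "arm64", "mips", "mipsel", "mips64el",
                 "ppc64el", "i386", "amd64"]


def cross_matrix(hosts, targets):
    """Return all (host, target) pairs except the native HOST=TARGET ones."""
    return [(h, t) for h, t in product(hosts, targets) if h != t]


matrix = cross_matrix(HOST_ARCHES, TARGET_ARCHES)
print(len(matrix))
```

With three hosts and nine targets there are 27 pairs, of which three
are native, leaving 24 cross combinations.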

3.3. Need for *-cross packages
------------------------------

These are needed for cross-gcc to be installable within an
architecture. The library packages such as libc-dev:<target-arch> and
libgcc1:<target-arch> are either built or downloaded and converted into
libc-dev-<target-arch>-cross packages with the libraries installed on
a different path.

They are needed for architectures that are not (or not yet) in the
archive as a multiarch build is not possible:
 * bootstrapping new architectures
 * mingw64
 * non-Debian cross-building

<debarch>-cross-toolchain-base packages exist in Ubuntu to do this. They
should perhaps be called <debarch>-cross-toolchain-bootstrap in Debian.

3.4. Crossbuild checks
----------------------

Many binary `Multi-Arch:no` packages still ship their headers in
`/usr/include`. This can lead to the wrong version being picked up
during cross or native building. To avoid this, two checks could be
implemented.

The first check could be Lintian warning if a non-`Multi-Arch:same`
package puts headers in `/usr/include`. If a binary package ships
architecture independent headers in `/usr/include` then it should be
marked `Multi-Arch:same`. If the headers differ between architectures,
then they should be moved to `/usr/include/<triplet>/`. This check
might give some false positives, though.

The second check could be added to a tool to be run locally, so that
developers using the current system (as opposed to a chroot) could check
if their system might taint their builds. That tool could be one such as
`adequate` or a new one from the devscripts toolset.

The multiarch wiki page should be changed to encourage multiarching of
*-dev packages, or at least stop discouraging it.

3.5. Conflict with gcc-multilib
-------------------------------

The gcc-multilib package provides the asm symlink for the native
architecture (e.g. /usr/include/asm -> x86_64-linux-gnu/asm). Cross
compilers should conflict with gcc-multilib as otherwise it creates
an unclean environment.

3.6. Source packages layout
---------------------------

We discussed which binaries should come from which packages.

The current state of affairs is:
 * cross-binutils provides the binutils-<triplet> packages (in unstable)
 * gcc-cross-support provides the gcc-for-host and gcc-for-build
   packages (in NEW, see 4.1)
 * gcc-4.9 provides the (v4.9) native toolchain packages (in stable-unstable)

And there is a set of
 * cross-4.9-gcc-<debarch> packages providing the (v4.9) cross-gcc (and
   cpp, g++, gfortran) compilers targeting each arch. The set is generated
   from a git repo on alioth [CROSS-GCC]. This enables us to have one
   source per target arch so that architectures can fail to build
   independently, but still meaning that there is only one core set of
   code to maintain. Not yet uploaded.
 * there isn't yet a gcc-cross-defaults package. We propose to add that
   functionality into the existing gcc-defaults package.
 * <debarch>-cross-toolchain-base packages exist in Ubuntu to do a
   full toolchain bootstrap. These currently need work to get them
   building in Debian.

All these binaries are in [CROSS-TOOLS] for testing.

  [CROSS-GCC] <http://anonscm.debian.org/cgit/crosstoolchain/cross-gcc.git>
  [CROSS-TOOLS] <https://people.debian.org/~wookey/tools/debian/>


3.7. Cross pkg-config
---------------------

Currently, a cross build environment has to provide a symbolic link to
pkg-config-crosswrapper. This special treatment means more work from
cross builders. We came up with a way for pkg-config to contain this
symbolic link itself and submitted it as #759556.

This involves splitting pkg-config into pkg-config-bin (contains
everything currently in pkg-config, and remains M-A:foreign), and
making pkg-config MA:same, depending on pkg-config-bin, containing just
the symlink to pkg-config-crosswrapper). This means that depending on
pkg-config will bring in pkg-config-bin for the native arch and
pkg-config:$DEB_HOST_ARCH so crossing will work, removing the need for
the somewhat hacky approach of building <triplet>-pkg-config in the
<debarch>-cross-toolchain-base packages, and depending on them in
crossbuild-essential-<debarch> (not least because pkg-config isn't
build-essential).

3.8. crossbuild-essential
-------------------------

Currently, crossbuild-essential binary packages with an architecture
suffix are used to set up a cross build environment. These packages can
be built from `src:build-essential` but are not built for the Debian
archive yet (they _are_ built in Ubuntu). These packages will become
obsolete by extending `bin:build-essential`.

 * It will be marked `Multi-Arch:same` and thus be able to target
   multiple architectures at the same time.
 * It will depend on `gcc-for-host` (from gcc-cross-support) and thereby
   ensure that a cross toolchain is installed (see 4.1).
 * Its dependency on libc-dev will pull in the host architecture libc.
 * The compatibility symlink shipped for pkg-config's cross wrapper will
   be moved to `src:pkg-config` (see 3.7).

Then crossbuild-essential will no longer be needed and can be turned into
a transitional dummy package depending on build-essential.

crossbuild-essential also currently pulls in dpkg-cross. That will be
replaced by cross-support but something needs to ensure that it is
present, at least for autotools and cmake-using packages. This seems
to be the only remaining purpose of crossbuild-essential, and in this
case only one arch:all package would be needed, which would preferably be
pulled in by cmake or autoconf. This issue was not fully explored.

3.9. Cross binutils
-------------------

 * cross-binutils was updated to use dpkg-vendor instead of lsb-release
   and complain if it's null so you either get nothing or the right thing.
   (Previously it gave you the 'Ubuntu config' if lsb-release was not
   installed at build time.)
 * ppc64el was added to the list of target architectures built.
 * Rebuilt with latest binutils source and uploaded as cross-binutils_0.6.

 * The current packaging matches binutils and cross-binutils binary
   package numbers, which is neat, but this doesn't allow for new
   uploads unless there is a new binutils source. This is restricting
   and we agreed to add a cross-version suffix to add flexibility (such
   as adding a new arch even if binutils hasn't changed).

3.10. config.site
-----------------

autotools are used to examine properties of the host system. While most
of these checks can be executed on build machines, some cannot. The
results of these checks need to be recorded in a config.site file prior
to cross-building. Currently, these results are included in dpkg-cross,
and you have to set CONFIG_SITE=/etc/dpkg-cross/config.<arch> during
cross-builds. It makes more sense to move these results to the packages
that are being checked. We worked out a scheme for an infrastructure
providing /usr/share/config.site (which is sourced by autotools by
default). It will source files in /etc (local pre-configuration), then
include /usr/lib/config.site/* and /usr/lib/<triplet>/config.site/* and
afterwards source more files in /etc (local overrides). Files in the
config.site directories are going to be shipped by binary packages and
are named after the binary packages. For instance, the config.site file
shipped by libc6-dev is named libc6-dev and should contain
`scanf_cv_type_modifier=ms`. We need to write a policy document with
details and provide patches to affected packages.
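
The intended sourcing order can be sketched in Python as follows. This
is only a model of the scheme, not the proof-of-concept itself. The
files in /etc are not named above, so the sketch takes them as
parameters, and it assumes that files within a directory are read in
sorted order, which has not been decided.

```python
import glob
import os


def config_site_order(triplet, pre_files, post_files, root="/"):
    """List config.site fragments in the order the scheme would source them.

    pre_files and post_files stand for the local pre-configuration and
    override files in /etc, whose names are not fixed yet.
    """
    ordered = list(pre_files)
    for directory in ("usr/lib/config.site",
                      os.path.join("usr/lib", triplet, "config.site")):
        # Sorted order within a directory is an assumption of this sketch.
        ordered.extend(sorted(glob.glob(os.path.join(root, directory, "*"))))
    ordered.extend(post_files)
    return ordered


print(config_site_order("arm-linux-gnueabihf", [], []))
```

On a system without any such fragments installed this simply prints an
empty list.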

This code should live in the cross-support package.

Helmut knocked up a proof-of-concept but it did not seem to be included
as expected. Further investigation needed.

3.11. Co-installable toolchains
-------------------------------

They are not going to happen for jessie. A patch for tcc (#695354)
demoing the concept has been around for a while, but implementing this
for gcc (#666743) requires more work and attention to not break
unrelated software. The first step towards co-installable toolchains
in the main toolchain is to always pass
`--program-prefix=$(DEB_TARGET_GNU_TYPE)-` to configure for binutils
and gcc as is done for cross toolchains.  Also binutils must be
converted before gcc, because gcc depends on binutils.

3.12. Multiarch builds
----------------------

These work nicely on a local machine, with:

  $ apt-get --only-source build-dep <package>
  $ apt-get --only-source --build -oDpkg::Build-options="-aarm64 -B" \
    source <package>

See <https://wiki.debian.org/MultiarchCrossToolchainBuild> for details.
But this is no use for in-archive cross-toolchains as network access is
not allowed and we don't have source build-deps (only binary build-deps).

Packages to do the build in an archive-compatible way exist at
<http://anonscm.debian.org/cgit/crosstoolchain/cross-gcc.git>, but to
use those in the archive requires tools to understand what to do with
cross-arch dependencies.

 * We initially thought that a dpkg field to request such builds would
   be needed, but in fact sbuild can just infer it from the fact that a
   package has cross-arch build-deps (e.g. libc6-dev:armhf).
 * Wookey is working on sbuild support for this, which is
   straightforward, but it is not done yet.
 * Other things in the archive will presumably get confused if they see
   such packages. wanna-build may think they are unbuildable (although
   if it uses dose it may just get it right?), britney may never migrate
   the resulting binary packages. Possibly other things could fail.
   It needs testing, which requires an upload to experimental, but we
   should check with ftp-masters first in order not to cause undue
   breakage.

 * We need to find out what is left to make these happen.
 * If any packages other than sbuild, DAK, wanna-build and britney are
   affected, this needs to be in before jessie if we are to use it in
   jessie. Needs ftp-master input to determine this.

dpkg has had support for explicit arch qualifiers (i.e. of the form
glibc:armhf), and the special foo:any and foo:native for some time.
So most things should already work, but it might need testing. Is
there anything that will break?

The crossbuild-essential package in Ubuntu has been in for some time
(and has Depends with arch-qualifiers).
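
The inference that a tool like sbuild would have to make can be
illustrated with a few lines of Python. This is a sketch of the idea
only and not taken from sbuild: it looks for build dependencies that
carry a concrete architecture qualifier, treating `:any` and `:native`
as not cross-arch.

```python
import re


def foreign_arch_build_deps(build_depends):
    """Return (package, arch) pairs for build-deps with a concrete arch qualifier."""
    found = []
    for clause in build_depends.split(","):
        for alt in clause.split("|"):
            match = re.match(r"\s*([a-z0-9][a-z0-9+.-]*):([a-z0-9-]+)", alt)
            if match and match.group(2) not in ("any", "native"):
                found.append((match.group(1), match.group(2)))
    return found


print(foreign_arch_build_deps("debhelper (>= 9), libc6-dev:armhf, python:any"))
```

A source package for which this returns a non-empty list would need the
named foreign architectures enabled in the build chroot.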

3.13. Multiarch cross-toolchains vs single-arch cross-toolchains
----------------------------------------------------------------

This contentious issue was discussed, and is partly covered under
other headings. Wookey prefers the multiarch builds, Doko prefers the
single-arch bootstrap builds. We agreed that either provides useful
cross-toolchains. It's not clear whether it's easier to fix the Ubuntu
cross-toolchain-base packages to do a bootstrap build in Debian, or to
fix the blockers for multiarch builds in the archive. Whichever is
working first should get uploaded.

Some work on both was done during the sprint, with current multiarch builds
uploaded to the [CROSS-TOOLS] repo for testing, and various fixups of the
cross-toolchain-base-armhf package:
 * remove obsolete versioned build-deps (nearly all of them)
 * update versions for unstable
 * remove the binutils part of the build as that now comes from cross-binutils
 * start looking at why build fails.

  [CROSS-TOOLS] <https://people.debian.org/~wookey/tools/debian/>


3.14. Wiki Docs
---------------

After the sprint Wookey wrote up some of this in the Debian Wiki at
<https://wiki.debian.org/CrossToolchains>.



4. Cross compile support in source packages
-------------------------------------------

4.1. Build dependency translation when cross compiling
------------------------------------------------------

If a source package explicitly depends on a versioned build tool like
gcc, g++ or cpp, then this build dependency might have to be translated
to the appropriate cross compiler when cross compiling the source
package. The [CROSS-TRANS] wiki page gives an extensive list of how
this could be accomplished. All existing solutions had considerable
disadvantages so we came up with and agreed upon a seventh solution.

  [CROSS-TRANS] <https://wiki.debian.org/CrossTranslatableBuildDeps>

Consider the following example binary packages (it does not matter from
which source package(s) they will build). The left column shows a cross
compiler for mips as installed on an amd64 machine. The right column
shows them installed on a mips machine but this time it's a cross
compiler for amd64.

               amd64           |             mips
-------------------------------+--------------------------------
                               |
Package: gcc-for-build         | Package: gcc-for-build
Architecture: all              | Architecture: all
Multi-Arch: foreign            | Multi-Arch: foreign
Depends: gcc                   | Depends: gcc
Contents: empty                | Contents: empty
                               |
Package: gcc-for-host          | Package: gcc-for-host
Architecture: mips             | Architecture: amd64
Multi-Arch: same               | Multi-Arch: same
Depends: gcc-mips-linux-gnu    | Depends: gcc-x86-64-linux-gnu
Contents: empty                | Contents: empty
                               |
Package: gcc-mips-linux-gnu    | Package: gcc-mips-linux-gnu
Architecture: amd64            | Architecture: mips
Multi-Arch: foreign            | Multi-Arch: foreign
Contents:                      | Depends: gcc
 /usr/bin/mips-linux-gnu-gcc   | Contents: empty
                               |
Package: gcc-x86-64-linux-gnu  | Package: gcc-x86-64-linux-gnu
Architecture: amd64            | Architecture: mips
Multi-Arch: foreign            | Multi-Arch: foreign
Depends: gcc                   | Contents:
Contents: empty                |   /usr/bin/x86_64-linux-gnu-gcc

With this binary package layout, source packages which need a versioned
gcc build dependency can now build depend on gcc-for-host. Should the
source package require execution of the native compiler during a cross
build, then the source package can build depend on gcc-for-build as
well. Packages depending on gcc-for-host have to call the triplet
version of gcc.

The mechanism works because the gcc-for-host is `Multi-Arch:same`.
Therefore, if it is build depended upon, its host architecture version
will be chosen. Depending on its architecture, gcc-for-host will depend
on a different gcc-<triplet> binary package. The gcc-<triplet> binary
packages in turn are `Multi-Arch:foreign` and either ship a cross
compiler or depend on the native gcc depending on their host
architecture. If gcc-x86-64-linux-gnu:amd64 is installed, then one will
get the native gcc compiler on amd64 through its dependency on gcc. If
gcc-mips-linux-gnu:amd64 is installed then one will get the cross
compiler for mips. When using gcc-for-build the native compiler will be
installed as gcc and no assumptions may be made on the architecture of
binaries created by this compiler.

The system works because dependencies on `Multi-Arch:foreign` packages
should be satisfied by the native architecture version of the package
if available. Currently, apt resolves `Multi-Arch:foreign` dependencies
in that way. On the other hand, the dependency of gcc-for-host on
gcc-mips-linux-gnu can also be satisfied by a foreign architecture
package. This package would be the native version of the foreign
architecture compiler. It would produce the code of the right
architecture but would probably not be executable on the build
architecture. This is a general property of `Multi-Arch:foreign`
packages which normally only causes problems when packages are missing.

An implementation of the above scheme has been uploaded as
`src:gcc-cross-support` to experimental.

4.2. help2man issue
-------------------

Source packages that build depend on help2man cannot be cross compiled
because they require a build architecture binary to be executed for
generating the man page. There exist different solutions to the problem:

 * one could build twice, first native then cross and run help2man on
   the native version of the binary.
 * one could introduce substitution variable support in Build-Depends in
   dpkg-dev so that the source package could build depend on the exact
   same version of the binary package it builds to generate the man page
   from it. This is filed as #751437.
 * one could use the nodoc profile.

We will start using the last option for bootstrapping new architectures
even though it does not solve the general cross compilation problem.

4.3. libtool
------------

Build dependencies on libtool make cross build dependencies of a quarter
(83) of the source packages in the transitively build-essential set
unsatisfiable because the build as well as the host architecture version
of the libtool binary package would be required but libtool cannot be
co-installed and is not Multi-Arch:foreign either.

This bug was filed as #682045 (over two years ago) and was taken care of
by Matthias. The libtool binary was split into libtool and libtool-bin
and an upload was done as agreed with the libtool maintainer. The next
step is to do an archive rebuild with a modified libtool not depending
on libtool-bin, fixing packages that fail to compile and then dropping
the runtime dependency.

4.4. Guile
----------

rebootstrap had highlighted that guile failed to cross where arm or arm64
was the host arch. Wookey and Helmut investigated, and fixed its
arch/endianness-detection.
 * Bug filed #758463, maintainer happy.


5. Bootstrap and crossbuild quality assurance
---------------------------------------------

5.1. dose-builddebcheck for cross
---------------------------------

`dose-debcheck` is run regularly for all binary packages at [DEBCHECK].
Ralf Treinen was contacted and agreed that `dose-builddebcheck` could be
run in addition. We have to contact him and ask about the
possibility to also include running `dose-builddebcheck` with a foreign
host architecture for selected targets to spot FTCBFS problems early.
Such a service would enhance the results that are already generated
at [CROSS-BOOTSTRAP].

  [DEBCHECK] <https://qa.debian.org/dose/debcheck/>
  [CROSS-BOOTSTRAP] <http://bootstrap.debian.net/cross.html>

5.2. Archive cross rebuilds
---------------------------

Just as full archive rebuilds are regularly done natively, they could
also be done cross regularly. It is fairly simple to do one-off test
rebuilds as sbuild has cross compilation support. Wookey and Colin Watson
have both done this in the past. It is harder to set up a continuous
cross-buildd as buildd tools such as rebuildd and debile need changes.

The results of those rebuilds should be displayed on the PTS or the new
tracker platform. Possible status could be
 * 'cross-builds OK', 'cross-Build-Deps-uninstallable',
   'does-not-cross-build (and never has)' and
   'does-not-cross-build (but used to)'
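
How a rebuild service might pick one of these four states can be
sketched as follows. The function and its three boolean inputs are
invented for this example; a real service would derive them from its
build logs and history.

```python
def cross_build_status(deps_installable, built_ok, ever_built_ok):
    """Map the outcome of a cross rebuild to one of the proposed states."""
    if not deps_installable:
        return "cross-Build-Deps-uninstallable"
    if built_ok:
        return "cross-builds OK"
    if ever_built_ok:
        return "does-not-cross-build (but used to)"
    return "does-not-cross-build (and never has)"


print(cross_build_status(deps_installable=True, built_ok=False, ever_built_ok=True))
```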

There is some tension between using the main archive (get realistic
results - usually 'cross-Build-Deps-uninstallable'), and using an archive
with some stuff fixed (find out if packages actually cross or not).

Wookey has a box that can be used for this and will set it up. He will
also sit down with debile people to check/add cross-support at Debconf.

5.3. `Cross-Builds:no` field
----------------------------

Once the archive is cross built regularly, there should be a method to
mark source packages as not cross buildable. Packages would be marked
as such if their build system does not allow any cross compilation.
This is for example the case if there exists no cross compiler (example:
gobject-introspection). We already store similar information in the
Architecture field where we restrict the architecture to certain values
if the software project can only be built for a subset of architectures.

Having a `Cross-Builds:yes` field would invite trouble as it is not clear
for which combination of build/host architecture the source package is
supposed to cross compile. A `Cross-Builds:no` field will mark the source
package as not cross compilable at all. With this field, an archive auto
cross builder will not attempt to cross build those source packages and
will also not warn in the PTS or tracker about the failure. Until such a
field is agreed upon, the information can also be stored local to the
cross rebuild machinery or stored in a custom `XS-Cross-Builds:no` field.

Another good heuristic is to check the build dependencies of source
packages before the build. If they build-depend on, for example,
gobject-introspection binary packages, then it is unlikely that they can
be cross compiled. This heuristic avoids having to flag tons of packages
but will not always be correct.
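
The two mechanisms from this section (the explicit field and the build
dependency heuristic) could be combined in a cross builder along these
lines. This is a hedged sketch: the function names, the
`UNCROSSABLE_BUILD_DEPS` set and the representation of a Sources stanza
as a plain dict are assumptions for illustration. The dependency parsing
is deliberately crude and only extracts package names.

```python
import re

# Hypothetical list of binary packages whose presence in Build-Depends
# suggests that a source package cannot be cross built.
UNCROSSABLE_BUILD_DEPS = {"gobject-introspection"}


def build_dep_names(build_depends):
    """Return the set of package names mentioned in a Build-Depends value."""
    names = set()
    for clause in build_depends.split(","):
        for alternative in clause.split("|"):
            # The package name is the leading token, before any version,
            # architecture or profile qualifier.
            match = re.match(r"\s*([a-z0-9][a-z0-9+.-]*)", alternative)
            if match:
                names.add(match.group(1))
    return names


def should_attempt_cross_build(stanza):
    """Decide whether an automatic cross builder should try this source."""
    for field in ("Cross-Builds", "XS-Cross-Builds"):
        if stanza.get(field, "").strip().lower() == "no":
            return False  # explicitly marked as not cross buildable
    deps = build_dep_names(stanza.get("Build-Depends", ""))
    # Heuristic only: it can be wrong in either direction.
    return not (deps & UNCROSSABLE_BUILD_DEPS)
```

A package skipped by the heuristic (rather than by the field) should
probably still be reported differently from one that was tried and
failed, since the heuristic is only a guess.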

5.4. Bootstrap info in PTS/tracker
----------------------------------

The PTS/tracker could show information about hard dependency cycles
(self-cycles). There exist two patches for the PTS (#745618 and #728298)
which have to be rewritten for the new tracker. Additionally, the
information should only be displayed once source packages with the new
build profile syntax can be uploaded to the archive. Thus, any such
information can only be available once jessie is released. In addition
it would be a good long term goal to move the machinery behind
bootstrap.debian.net onto Debian infrastructure to make collaboration
easier.

5.5. Display multiarch hints in PTS/tracker
-------------------------------------------

Downstream distributions of Debian, like Ubuntu, can be used to determine
which packages in Debian lack multiarch annotations. Ubuntu is especially
useful as a source, as multiarch cross building has been tested there for
a while and thus more packages have been fixed. Botch contains a script
which can extract this information and which will be run regularly on
bootstrap.debian.net. Once this is done we will file a bug with a patch
for the PTS to include this information.

6. Build Profiles
-----------------

6.1. New evaluation logic
-------------------------

The original build profile proposal defined an evaluation scheme of the
restriction list which was very hard to understand, contained corner
cases that would probably never be used and also limited expressiveness
a lot. We agreed that if this new syntax is introduced it should allow
for maximum expressiveness and be intuitive and easy to understand at
the same time. We decided that letting profile restrictions form a
disjunctive normal form would satisfy both requirements. The new
evaluation scheme makes more conditions specifiable and is, at the same
time, easier to understand and implement.

Here is an example:

     Build-Depends:
      foo (>= 1.0) <stage1 cross>,
      bar [amd64] <stage1> <cross>

The source package will depend on foo (>= 1.0) if both the stage1 and
cross profiles are active. The source package will depend on bar on amd64
if at least one of stage1 or cross is active. Terms within one pair of <>
brackets are AND-ed and form a conjunction. Multiple such groups are
OR-ed and form a disjunction. This also means that, in contrast to the
original proposal, the order of build profiles does not matter. With
this solution, build profiles gain the same amount of expressiveness as
Gentoo USE flags.
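
The evaluation rule described above can be sketched in a few lines of
Python. This is an illustration only and not the code from the dpkg or
apt patches; the function names are made up, and architecture qualifiers
such as [amd64] are not handled here.

```python
import re


def parse_restrictions(text):
    """Parse '<a !b> <c>' into [['a', '!b'], ['c']]."""
    return [group.split() for group in re.findall(r"<([^<>]*)>", text)]


def restriction_holds(text, active_profiles):
    """Evaluate a restriction formula in disjunctive normal form."""
    groups = parse_restrictions(text)
    if not groups:
        return True  # no restriction at all: the dependency always applies

    def term_holds(term):
        if term.startswith("!"):
            return term[1:] not in active_profiles
        return term in active_profiles

    # OR over the <...> groups, AND over the terms inside each group.
    return any(all(term_holds(t) for t in group) for group in groups)
```

With the example above, `restriction_holds("<stage1 cross>", {"stage1"})`
is false while `restriction_holds("<stage1> <cross>", {"stage1"})` is
true, and swapping the order of the groups or of the terms within a
group does not change either result.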

As a result of this change, we will also drop our efforts to make the
archive understand the build profile syntax before the jessie release.
This also led to a reassignment of #744246 to dpkg.

Patches for dpkg and apt to support this change have been written and are
currently being tested.

6.2. No "profile." prefix
-------------------------

We decided to drop the "profile." prefix from the restriction syntax.
The "profile." prefix was included in the original proposal to allow for
future extensions of the syntax. But we cannot see how the build profile
syntax as a selector of build dependencies can be extended for other use
cases. All future additions that we came up with and that were suggested
in discussions on debian-devel only imply a tagging of build dependencies
but not their selection. But a tagging mechanism, if ever introduced,
would not need to make use of the <> brackets as a meta character. With
the current meaning of the syntax, any other prefix namespace can only
ever take the role of a dependency selector and would thus fit in the
"profile." namespace. Proponents of the need for "future extensibility"
of the syntax were at no point able to give us a single concrete example
for such a use case. We therefore think that having a prefix mechanism
would only introduce unnecessary complexity in the implementation and
would make the restriction logic harder to read, write and understand
for no conceivable practical gain.

6.3. Binary package contents
----------------------------

We confirmed agreement (originally mooted at the 2006 Extremadura sprint)
that binary packages built with a build profile must contain the exact
same content as when they are built without a build profile (a full
build). This requirement is necessary to not break the dependency system.
Anybody depending on a binary package must continue to be able to make
the assumption that it ships a certain content, no matter how the binary
package was built. If binary package contents change significantly due to
a build with fewer dependencies and thus fewer features, then the binary
package has to be split into two or more individual binary packages.

This requirement should be relaxed for builds with the "stage1", "stage2"
and "nodoc" profiles. With these profiles, binary packages are allowed to
leave out content which does not expose a functional interface. Such
content would be documentation like man pages or locale information or
anything that resides in /usr/share/doc/packagename (but usually not
copyright and changelog information).

6.4. nodoc build profile
------------------------

In addition to the existing build profiles "stage1", "stage2", "nocheck",
"nobiarch" and "cross" we added a new profile called "nodoc". Its purpose
is to do builds without the build dependencies required for building
documentation. There is no policy-defined "nodoc" value for
DEB_BUILD_OPTIONS. Current source packages use "nodoc" as well as
"nodocs". We decided to go for the singular as "nocheck" is singular as
well. In addition, the prevalent convention for documentation packages
is to name them *-doc, so they have a "doc" suffix and not a "docs"
suffix. We suggest adding "nodoc" as a DEB_BUILD_OPTIONS option to policy
(#759186).

6.5. Relationship to DEB_BUILD_OPTIONS
--------------------------------------

Some DEB_BUILD_OPTIONS like "nocheck" and "nodoc" are mapped to build
profile names. In the long term, build profiles will take over the job
of these DEB_BUILD_OPTIONS. For now, DEB_BUILD_OPTIONS and
DEB_BUILD_PROFILES will coexist and package builders should specify both.
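
A builder that wants to keep both variables consistent could prepare
its environment as in the following sketch. The helper name
`build_environment` and the `MAPPED_OPTIONS` tuple are assumptions for
illustration; only the two options named above are mapped, and both
variables are treated as space separated lists.

```python
import os

# Options that, according to this section, map to a build profile of
# the same name.
MAPPED_OPTIONS = ("nocheck", "nodoc")


def build_environment(profiles, base_env=None):
    """Return an environment that sets both variables consistently."""
    env = dict(os.environ if base_env is None else base_env)
    env["DEB_BUILD_PROFILES"] = " ".join(profiles)
    # Keep any options that are already set, such as parallel=N.
    options = env.get("DEB_BUILD_OPTIONS", "").split()
    for name in MAPPED_OPTIONS:
        if name in profiles and name not in options:
            options.append(name)
    env["DEB_BUILD_OPTIONS"] = " ".join(options)
    return env
```

The resulting dict can be passed as the `env` argument of
`subprocess.run` when invoking the build tool.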

6.6. Profile name application
-----------------------------

The "stage1" and "stage2" build profiles in particular, which mean
"build the source package with fewer features", overlap with other
profile names like "nocheck" and "nodoc". If in doubt, build dependencies
should be flagged with the more specialized build profile. So instead of
writing:

   Build-Depends: texlive-latex-base <!nodoc> <!stage1> <!stage2>

It should be enough to write:

   Build-Depends: texlive-latex-base <!nodoc>

In the same way, instead of writing:

   Package: foobar-doc
   Build-Profiles: <!nodoc> <!stage1> <!stage2>

It should be enough to write:

   Package: foobar-doc
   Build-Profiles: <!nodoc>

It is the task of the builder to figure out which profiles to activate to
perform a build.


7. Rebootstrap
--------------

[REBOOTSTRAP] is a QA tool for highlighting general issues with
automatically bootstrapping new architectures. Guile crossing was fixed
(see above), but not much further effort was spent on rebootstrap itself
because infrastructure issues made it run very slowly. However,
experience with the project served as input or origin for points 1.1,
1.2, 1.3, 1.5, 3.7, 3.8, 3.10, 4.1, 4.2, 4.3, 5.1 and 8.1.

  [REBOOTSTRAP] <https://wiki.debian.org/HelmutGrohne/rebootstrap>


8. Botch
--------

8.1. Bootstrap from zero
------------------------

Johannes integrated a new script from Helmut into botch which allows the
creation of a build order without having access to the host architecture
Packages file which may be non-existent at this point. Botch did not
yet include such a facility because it catered for two other use cases:
calculating a build order for rebootstrapping existing architectures
and calculating and analyzing a dependency graph to find the best points
to break it (i.e. finding a feedback arc set). With the new script botch
will be able to also handle the case where not even the information about
the host architecture binary packages exists in the beginning. This will
also allow botch to be used to schedule transitions as the resulting
binary packages are not available in that scenario either. The challenges
of bootstrapping from nothing have been explained in the first section.
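
To illustrate the general idea of ordering builds from source package
information alone, here is a strongly simplified sketch. It is not how
botch or Helmut's script work internally: the data layout, the function
name and the use of Python's `graphlib` module (Python 3.9 or later) are
assumptions, and versions, alternatives, architecture qualifiers and the
runtime dependencies of build dependencies are all ignored.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(sources):
    """Order source packages using only Sources-style information.

    `sources` maps a source package name to a dict with the keys
    'binaries' (names it builds) and 'build_depends' (binary names it
    needs). Returns (order, None) or (None, cycle).
    """
    # Binary packages do not exist yet, so map each binary name to the
    # source package that will eventually produce it.
    built_by = {}
    for source, info in sources.items():
        for binary in info["binaries"]:
            built_by[binary] = source

    graph = {}
    for source, info in sources.items():
        graph[source] = {
            built_by[dep] for dep in info["build_depends"] if dep in built_by
        }
    try:
        return list(TopologicalSorter(graph).static_order()), None
    except CycleError as error:
        # The second argument of CycleError lists the nodes on the cycle.
        return None, error.args[1]
```

When a cycle is reported, it has to be broken first, for example by
building one of the involved source packages with a reduced build
profile, before an order can be computed.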

8.2. Miscellaneous additions and fixes
--------------------------------------

 * Man pages for all 43 commands that the botch binary package ships
   have been written.
 * Botch now uses Python3 exclusively and dropped Python2 support.
 * The botch-wanna-build-sortblockers script now supports
   bd-uninstallable output of wanna-build.
 * Botch was uploaded to NEW.

8.3. Native re-bootstrapping
----------------------------

After a bootstrap has been done on ports.debian.org it has to be done
again on debian.org. To break cycles, some binary packages from
ports.debian.org sometimes have to be pushed to debian.org. The
question is which set of binary packages is the (close to) smallest
set that needs to be pushed to debian.org so that the whole archive
can be bootstrapped. This issue arose from the ongoing arm64 and
ppc64el bootstraps. Theoretically this info can be easily calculated
by botch by listing which source packages would be chosen to build
with a build profile to re-bootstrap the archive on ports.debian.org.
Unfortunately in practice it turns out that there are many cases where
the source packages on debian.org do not have matching binary packages
on ports.debian.org or the other way round. This makes it impossible
for botch to find an exact solution. It is up to future tests whether
falling back to the closest source version gives a good enough
approximation to facilitate an automatic choice of what binaries to
upload next from ports.debian.org to debian.org.

9. Miscellaneous
----------------

9.1. Find outdated versions
---------------------------

Versioned dependencies are problematic for bootstrapping because
versioned compiler dependencies have to be translated and the versions
of binary packages are not known a priori during a bootstrap from zero.
Many packages in the archive declare versioned dependencies with a
minimum version which is even fulfilled by oldstable. We propose that
lintian warns if a version constraint is obsolete because it would even
be fulfilled by oldstable. This was filed as #758425 against Lintian.
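
The proposed check boils down to a single version comparison per
constraint. The sketch below is not the lintian implementation requested
in #758425; the function names are made up, and the default comparator
assumes that the `dpkg` binary is available on the machine running the
check.

```python
import subprocess


def dpkg_version_ge(left, right):
    """Return True if `left` >= `right` according to dpkg."""
    result = subprocess.run(["dpkg", "--compare-versions", left, "ge", right])
    return result.returncode == 0


def is_obsolete_constraint(minimum, oldstable_version, version_ge=dpkg_version_ge):
    """A '>= minimum' constraint is obsolete if oldstable already meets it."""
    return version_ge(oldstable_version, minimum)
```

For example, a dependency on foo (>= 1.0) would be reported as obsolete
if oldstable already ships foo 1.2. The comparator is a parameter so
that the logic can also be exercised where dpkg is not installed.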

9.2. dpkg can set the target architecture
-----------------------------------------

To have a common and unified interface, it was agreed with Guillem that
dpkg-buildpackage should have a --target-architecture (or similar) switch
added which sets the target architecture just as -a sets the host
architecture. Perhaps the variables might only be output if the switch
is passed, to avoid further confusing maintainers that do not usually
need those variables.

This is basically now implemented (not pushed yet though), needs some
testing and polishing, but should make it into dpkg 1.17.14.

9.3. Fixed bugs
---------------

The following bugs have been fixed during the sprint in addition to the
ones mentioned earlier:

 * #758408
 * #750478
 * #746523

9.4. Filed bugs
---------------

The following bugs have been filed during the sprint in addition to the
ones mentioned earlier:

 * #758463

9.5. Fix dpkg-maintscript-helper
--------------------------------

The pkg-config changes explained above require a dir-to-symlink switch,
and the symlink target pathname should be relative as specified by the
Debian policy. But dpkg-maintscript-helper was not handling those
correctly. Helmut prepared a patch to fix the dir_to_symlink command, and
Guillem merged it and added a check to the symlink_to_dir command to
ensure that only absolute paths are passed to it. These changes were
included in the dpkg 1.17.13 upload.

9.6. Arch:all sometimes implies M-A:Foreign for dpkg-checkbuilddeps
-------------------------------------------------------------------

Helmut demonstrated an unreliable case where dpkg-checkbuilddeps seems to
misbehave, and although the causes are not clear, Guillem agreed that a bug
should be filed so that it can be tracked down. Helmut will produce a test 
case and file a bug.

9.7. Fixing the multiarch interpreter problem
---------------------------------------------

The interpreter problem could be solved by "virtually" letting dpkg treat
arch:all packages as if they were arch:any. This is described in more
detail in <https://wiki.debian.org/HelmutGrohne/MultiarchSpecChanges>.

The proposal boils down to tracking the real architectures an arch:all
package has been configured for, which would change depending on the
surroundings. The issues that make this proposal (as currently specified)
pretty much impractical are (at least):

 * An arch:all package would need to be configured each time a new
   "virtual arch" is added to it. This implies that the dpkg caller
   (either a frontend or a human) might need to issue thousands of
   configure calls for all those packages. Those configure storms might
   get out of hand pretty quickly if combined with the rest of the
   install or upgrade process.
 * Another issue, specific to perl, is that perl-base is Essential:yes, so
   scripts that directly or indirectly use an arch:any XS module and only
   use modules from perl-base have an implicit dependency on it. This
   means dpkg would not have any way to ensure architecture coherency.
   Overcoming this may require, perhaps, moving perl out of the
   `Essential:yes` set, but that is quite a big undertaking.

Helmut and Guillem discussed this proposal at length, but Guillem was not
persuaded that it was the right way to deal with things, due to the above.

10. Epilogue
------------

We would like to thank Roberto di Cosmo for kindly hosting the Bootstrap
Sprint at IRILL in Paris, France. We'd also like to thank Debian and its
donors who covered our travel expenses and without whom this sprint could
not have happened.

Wookey
-- 
Principal hats:  Linaro, Debian, Wookware, ARM
http://wookware.org/
