Second take at DEP17 - consensus call on /usr-merge matters



Hi,

thanks for all the valuable feedback on the huge DEP17 thread. As
promised, I looked into condensing that discussion into something
shorter. That shorter thing still has more than 3000 words and is
available as source at

    https://salsa.debian.org/dep-team/deps/-/merge_requests/5

and I also put up a rendered version at

    https://subdivi.de/~helmut/dep17.html

and for those preferring to read offline, I'm also appending the main
file to this mail.

This is still a draft and it fundamentally still lacks the "proposal"
part of a DEP. What it does, however, is collect all the problems we've
encountered in the discussion and most of the proposed mitigations, and
it has gathered a few reviews already.

Consensus proposal #1:

    This updated DEP17 is a complete representation of the known and
    relevant problems and known mitigations under discussion at the time
    of this writing.

Are you missing a related problem that is important to you? Are you
missing your preferred mitigation? Please speak up, so we can record it.

Stating a goal has been quite difficult, but I think that most of the
participants agree that we want to eliminate the file move moratorium
without getting problems in return. And I also hope that my listing of 9
problems is agreeable to the project. We may need changes still and we
may encounter new problems, but this tries to capture the best we have
right now.

When we get into mitigations, consensus is hard to come by. My
impression is that we have roughly two camps. One is in favour of major
changes to dpkg and the other is not.

Arguments in favour of changing dpkg (dubbed M1 in DEP17):
 * We change one piece and doing that fixes a whole host of problems.
 * dpkg learns to understand the current situation and avoids us
   applying a lot of workarounds.
 * A solution may be more generally applicable.
 * There is no urgency with moving files to their canonical locations if
   dpkg knows to identify aliased and canonical locations.

Arguments in favour of not changing dpkg:
 * We'd implement changes for one transition, but they pose a permanent
   maintenance cost that does not become easily removable later.
 * Even if dpkg were fixed in trixie, trixie's libc6 would still be
   unpacked using bookworm's dpkg. At least for this package, we cannot
   rely on dpkg changes in trixie. Therefore we need workarounds or
   spread the transition to forky. For other packages, even a
   Pre-Depends on dpkg is not a guarantee that a changed dpkg will
   perform the unpack.
 * Changes to dpkg will not fix bootstrapping.

There also is a minority arguing in favour of doing both. I've kinda
ruled out that option already as we get the downsides of both without
any further benefit in return.

My impression is that the (vocal) majority falls into the latter
category. The major alternative here is getting into a state where all
paths shipped in binary packages are not aliased (dubbed M2 in DEP17).
Therefore, I propose a second consensus call.

Consensus proposal #2:

    The primary method for finalizing the /usr-merge transition is
    moving files to their canonical locations (i.e. from / to /usr)
    according to the dpkg fsys database (i.e. in data.tar of binary
    packages).  dpkg is not augmented with a general mechanism
    supporting aliasing nor do we encode specific aliases into dpkg.

I recognize that this is not unanimous, but I think we still have
sufficient consensus on this. I suspect that maybe Simon Richter and a
few more would disagree here. If consensus fails, we may have to put
this to a vote.

Once that is settled, the next big question is how to handle
bootstrapping. We had a number of people arguing in favour of changing
the bootstrap protocol. Such changes can be classified into generic
changes and release-dependent changes. A release-dependent change
enhances bootstrapping tools with knowledge of available releases and
adapts behaviour in release-specific ways that are encoded into the
bootstrapping tool. As it stands, the only bootstrapping tool that has
been enhanced in this way is debootstrap and that support is limited to
Debian and two derivatives. The category of generic changes includes
imposing an ordering on initial unpacks (e.g. base-files first). Others
are in favour of not changing the bootstrap protocol. In that view, some
data.tar will have to ship the symbolic links and all other essential
packages need to have their files canonicalized.

Among these options, the first has a working prototype (debootstrap),
but it is unclear how that extends to use of snapshot.d.o and how to
make it work with debootsnap and debbisect as those tend to use a suite
name rather than a codename. The last option has a prototype and relies
on uploading a number of essential packages in a short window of time.
(What could possibly go wrong?)

It is not clear to me how we can get to a consensus on these, so the
best I can do here is summarize options.

Option #3a:

    The bootstrap protocol shall be changed to contain a task for
    merging /usr as is done in debootstrap in a release-dependent way.

This is an instance of M16 from DEP17.

Option #3b:

    The bootstrap protocol shall be changed in unspecified ways to
    support the /usr-merged systems in a way that does not depend on
    matching the codename or suitename.

This is an instance of M16 from DEP17.

Option #3c:

    The bootstrap protocol shall remain unchanged. Therefore, all
    essential packages need to move their files out of aliased locations
    and the aliasing symlinks are to be installed from a data.tar of a
    binary package such as base-files.

This is M2+M11 from DEP17.

While a few people including Marco d'Itri and Sam Hartman have argued in
favour of exploring the space of #3b, no proposals have emerged in the
interim. The proposal in #3a has three significant limitations:
 * It creates compatibility issues when combining old and new suites
   unless changes to bootstrapping tools are backported to older
   releases.
 * It becomes a whack-a-mole, since we need to add codenames of every
   derivative to every bootstrapping implementation.
 * It breaks bootstrapping from snapshot.d.o and therefore breaks tools
   such as debbisect and debootsnap.

While the first of these limitations is shared with #3b, the others are
not and that would make #3b more attractive to me if there was a
concrete proposal to evaluate. The one about unpacking base-files first
seemed the most concrete to me, but it has the downside of imposing a
permanent cost on bootstrapping tools even though we only need that
behaviour temporarily, which seems like too bad of a trade-off to start
with in my opinion. Did I miss a relevant proposal for modifying the
bootstrap protocol?

On the flip side, there is a demo for #3c showing that we can move most
of the things except for a handful of packages and then flip the
switch (for bootstrapping) in unstable by uploading those packages
simultaneously. The biggest downside of this probably is the inherent
fragility of this approach. Even if this is extensively tested before
uploading, chances are good that we still break something unforeseen in
the process.

Can I get more feedback from those who would rather not have #3c
implemented as to how you see things moving forward?

Assuming that we settle on #2, the question arises whether we want to
change dpkg at all. Loss of filesystem resources is an aspect that
multiple problems (P1, P6, P7, P9) have in common and there is a
relatively isolated change to dpkg (M3) that can turn these unintended
file losses into a warning. This is a change that resolves entire
problem classes, but we can only rely on it after dpkg has been upgraded
and re-executed, which effectively means that we cannot assume the new
behaviour for upgrades from bookworm to trixie. This change possibly is
backportable to bookworm, but we currently do not require upgrading to
the latest point release before the dist-upgrade.

Option #4a:

    The finalization of the /usr-merge transition shall not rely on
    changes to dpkg, because such changes cannot be relied upon during
    the upgrade to trixie.

This is denying M1 and M3 from DEP17.

Option #4b:

    While dpkg does not gain an understanding, it shall be temporarily
    changed to become more robust about some forms of unintended file
    loss. Once the /usr-merge transition has completed, this change is
    reverted.

This is M3 from DEP17.

Probably, the biggest benefit of doing this is gaining robustness in the
presence of external repositories after a successful upgrade to trixie.
It is not clear whether this is worth the cost.

Another aspect that hasn't gathered a uniform opinion is the handling of
update-alternatives. It is a bit special, because it has turned paths
(aliased or not) into API. We can either leave these paths (as presented
to update-alternatives) unchanged and thus retain compatibility or
migrate them to canonical locations and thus have users update their
scripts and automation (e.g. ansible/chef/puppet).

Option #5a:

    The paths used to interface with update-alternatives remain
    unchanged.

This is M13 from DEP17.

Option #5b:

    update-alternatives shall be temporarily wrapped such that it deals
    with both aliased and canonical paths equally.

Unlike many other aspects, we don't have to settle this question now and
can move from #5a to #5b at a later time (but not the other way round).

And then we have a couple of less important aspects to decide, which I
hope we can leave up to those implementing this.

So I hope that this mail results in a number of responses agreeing or
disagreeing with the various consensus and opinion items. In order to
reduce list traffic, you may reply to me directly if you only include
agreement/disagreement and no technical arguments nor reasoning. I will
count such private replies and give numbers here together with public
replies. Please indicate whether you want to stay anonymous in this
case.

I also hope that this mail results in detailed disagreements that I can
use to refine DEP17 and to base further research on.

Helmut

dep17.mdwn follows:

[[!meta title="DEP-17: Improve situation around aliasing effects from `/usr`-merge"]]

    Title: Improve situation around aliasing effects from `/usr`-merge
    DEP: 17
    State: DRAFT
    Date: 2023-03-22
    Drivers: Helmut Grohne <helmut@subdivi.de>
    URL: https://dep.debian.org/deps/dep17
    Source: https://salsa.debian.org/dep-team/deps/-/blob/master/web/deps/dep17.wdwn
    License: CC-BY-4.0
    Abstract:
     This document summarizes the problems arising from our current `/usr`-merge
     deployment strategy. It goes on to analyze proposed solutions and
     their effects and finally proposes a strategy to be implemented.

Introduction
============

Debian has [chosen](https://lists.debian.org/8311745.KnC49Ya6nT@odyx.org) to implement merged `/usr` by introducing symbolic links such as `/bin` pointing to `usr/bin`.
In the presence of such links, two distinct filenames may refer to the same file on disk.
We say that a filename aliases another when this happens.
The filename that contains a symlink is called the aliased location and the filename that does not is called a canonical location.
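
To make the terminology concrete, here is a minimal Python sketch of the mapping between aliased and canonical locations.
The `ALIASES` table and the helper names `canonicalize` and `is_aliased` are illustrative assumptions and not part of `dpkg` or any Debian tool.
The table is only a subset, since the multilib links vary by architecture, and the sketch works purely on strings without inspecting a filesystem.

```python
import posixpath

# Top-level aliasing symlinks of a merged-/usr system (illustrative subset;
# multilib links such as /lib32 or /libx32 depend on the architecture).
ALIASES = {
    "/bin": "/usr/bin",
    "/sbin": "/usr/sbin",
    "/lib": "/usr/lib",
    "/lib64": "/usr/lib64",
}

def canonicalize(path: str) -> str:
    """Map an aliased location to its canonical location, purely lexically."""
    path = posixpath.normpath(path)
    for link, target in ALIASES.items():
        # Only names *below* a link are aliased; the link itself is not.
        if path.startswith(link + "/"):
            return target + path[len(link):]
    return path

def is_aliased(path: str) -> bool:
    """True if the name passes through one of the aliasing symlinks."""
    return canonicalize(path) != posixpath.normpath(path)

print(canonicalize("/bin/sh"))        # /usr/bin/sh
print(is_aliased("/usr/bin/sh"))      # False
```

Both `/bin/sh` and `/usr/bin/sh` name the same file on disk, but only the latter is its canonical location.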

At its core, `dpkg` assumes that every filename uniquely refers to a file on disk.
This assumption is violated when aliasing happens.
As a result, we exercise undefined behavior in `dpkg`.
This is known to cause problems such as unexpected file loss and is currently mitigated by a [file move moratorium](https://lists.debian.org/debian-devel/2021/10/msg00190.html).

We currently prohibit most situations that may provoke problematic behavior using policy.
This mitigation is not without cost and we want to eliminate it.
Shipping files in their canonical locations tends to simplify packaging.
Once files are moved to their canonical locations, a number of aliasing problems are effectively mitigated.
The goal of this work is to reduce the impact of these matters to the typical package maintainer.
It aims for removing the cognitive load of having to keep in mind which files must be installed to aliased locations and which files must be installed to canonical locations.

Regardless of what strategy we end up choosing here, we will likely keep some of the temporary changes even in the `forky` release to accommodate external repositories and derivatives.

Problems
========

P1: File loss during canonicalized file move
--------------------------------------------

When moving a file from its aliased location to a canonical location in the `data.tar` of a binary package and moving this file from one binary package to another, `dpkg` may unexpectedly delete the file in question during an upgrade procedure.
If the replacing package is unpacked first, the affected file is installed in its canonical location before the replaced package is upgraded or removed.
`dpkg` may then delete the affected file by removing the aliased location - not realizing that it is deleting a file that still is needed.

This problem was originally observed in [#974552](https://lists.debian.org/974552) and is the one that motivated the issuance of the moratorium.
Since the moratorium came into effect and file moves have been prevented, no new cases surfaced.
Had the moratorium been lifted for the bookworm release, we know that problems would have been caused in a small two-digit number of cases.
[For instance](https://lists.debian.org/20230426223406.GB1695204@subdivi.de), `/lib/systemd/system/dbus.socket` could have been canonicalized while it has been moved from `dbus` to `dbus-system-bus-common`.
There is an [artificial test `case1.sh` demonstrating the problem](https://lists.debian.org/20230425190728.GA1471384@subdivi.de).
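
The failure mode can also be illustrated with a deliberately simplified model.
The `ToyDpkg` class below is hypothetical and shares nothing with the real `dpkg` implementation beyond one assumption taken from the description above: the database compares filenames as plain strings, while the disk resolves the aliasing links.

```python
ALIASES = {"/bin": "/usr/bin", "/sbin": "/usr/sbin", "/lib": "/usr/lib"}

def on_disk(name):
    """Where a filename really lands on a merged-/usr filesystem."""
    for link, target in ALIASES.items():
        if name.startswith(link + "/"):
            return target + name[len(link):]
    return name

class ToyDpkg:
    def __init__(self):
        self.db = {}       # package -> set of filenames as shipped
        self.disk = set()  # existing files, keyed by their real location

    def unpack(self, package, files):
        old_files = self.db.get(package, set())
        self.db[package] = set(files)
        self.disk.update(on_disk(f) for f in files)
        # Names dropped by the new version are deleted unless some package
        # still lists exactly the same name in the database.
        owned = set().union(*self.db.values())
        for name in old_files - set(files):
            if name not in owned:
                self.disk.discard(on_disk(name))

dpkg = ToyDpkg()
old = "/lib/systemd/system/dbus.socket"
new = "/usr/lib/systemd/system/dbus.socket"
dpkg.unpack("dbus", {old})                    # old release: aliased location
dpkg.unpack("dbus-system-bus-common", {new})  # replacing package unpacked first
dpkg.unpack("dbus", set())                    # replaced package drops the file
print(new in dpkg.disk)  # False: the file is gone although it is still owned
```

If the replaced package is upgraded before the replacing package is unpacked, the same model keeps the file, which is why the unpack order matters.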

P2: Missing triggers
--------------------

When packages declare a `dpkg` file trigger interest in a location that is subject to aliasing without also declaring interest in the other location, a trigger may not be invoked even though that was expected behavior.
No issue arises when a file trigger is declared on a canonical location and all packages are shipping their files in that canonical location.
However, when the trigger is declared for an aliased location and packages move their files to the canonical location, triggers can be missed.

This problem is also currently being prevented by the moratorium.
Had the moratorium been lifted for the bookworm release, we know that problems would have been caused in two cases.
The `runit` and `udev` packages declare an interest in aliased locations and would start missing trigger invocations when canonicalizing files in other packages.
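
A small sketch of name-based trigger matching shows why the invocation is missed.
The function `activated` and the interest path used here are illustrative assumptions; the only property modelled is that a file trigger matches the declared path and anything below it by name, without resolving symbolic links.

```python
def activated(interests, shipped):
    """Return the file trigger interests matched by a shipped filename.

    Matching is by name only: an interest matches the path itself and
    everything below it. No symbolic links are resolved.
    """
    return [i for i in interests if shipped == i or shipped.startswith(i + "/")]

aliased_only = ["/lib/udev/hwdb.d"]  # hypothetical interest on an aliased path
print(activated(aliased_only, "/lib/udev/hwdb.d/20-example.hwdb"))
# ['/lib/udev/hwdb.d']
print(activated(aliased_only, "/usr/lib/udev/hwdb.d/20-example.hwdb"))
# []  (the canonicalized file no longer activates the trigger)

both = ["/lib/udev/hwdb.d", "/usr/lib/udev/hwdb.d"]
print(activated(both, "/usr/lib/udev/hwdb.d/20-example.hwdb"))
# ['/usr/lib/udev/hwdb.d']
```

Declaring the interest on both spellings restores the invocation in this model.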

P3: Ineffective diversions
--------------------------

When a package uses `dpkg-divert` to displace a file from another package, the diverted location may have become aliased due to the `/usr`-merge.
If a package whose files are being diverted were to canonicalize its files, such a diversion would become ineffective.
As a result, the content of the affected file would depend on the order of unpacks.

This problem is also currently being prevented by the moratorium.
Had the moratorium been lifted for the bookworm release, we know that problems would have been caused in a small two-digit number of cases.
[For instance](https://lists.debian.org/20230428080516.GA203171@subdivi.de), `zutils` diverts files from `gzip` below `/bin` and a number of packages such as `molly-guard` divert power management tools such as `/sbin/reboot`.
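
The following toy lookup sketches why a diversion stops applying once the diverted package canonicalizes its files.
The function `write_location`, the table layout and the concrete diversion entry are assumptions made for illustration; the real diversion database and its handling in `dpkg` are more involved.

```python
def write_location(diversions, package, shipped):
    """Return the name under which a toy unpack stores a shipped file.

    `diversions` maps a diverted filename to (divert_to, owning_package).
    The lookup uses the exact shipped name; no aliases are resolved.
    """
    if shipped in diversions:
        divert_to, owner = diversions[shipped]
        if owner != package:
            return divert_to
    return shipped

# Illustrative entry: a diversion registered for the aliased name.
diversions = {"/bin/zcat": ("/bin/zcat.gzip", "zutils")}

print(write_location(diversions, "gzip", "/bin/zcat"))      # /bin/zcat.gzip
print(write_location(diversions, "gzip", "/usr/bin/zcat"))  # /usr/bin/zcat
```

In the second call the file is written to `/usr/bin/zcat`, which is the same file on disk as the diverting package's `/bin/zcat`, so whichever package is unpacked last wins.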

Beyond diversions issued by packages, local diversions added by an administrator may also become ineffective.

P4: Disagreeing alternatives
----------------------------

When packages use `update-alternatives`, the alternative location or one of its providers may refer to an aliased location.
As packages move files to their canonical locations, they might want to move their provided alternatives as well.
Just replacing the location in the `update-alternatives --install` invocation is actively harmful in this case as the aliased location would not be removed.
If it were to be removed, a user configuration may inadvertently be deleted unless care is taken to preserve it.

Similarly, we may want to canonicalize the location of the alternative itself.
If it were just moved, `update-alternatives` would have two seemingly distinct alternatives that conflict with one another.
If such a move is desired, it must be carefully coordinated among all alternative providers.

Last but not least, the choice of alternative provider is usually referred to using an absolute path.
Therefore, this path is part of the interface and is often scripted via automation tools such as `ansible` or `puppet`.
Changing this path would break such automation.

These problems affect a [small two-digit number of cases](https://lists.debian.org/20230428201151.GA2784035@subdivi.de).
They are also mitigated by the moratorium.

P5: Ineffective `dpkg-statoverride`
-----------------------------------

When packages move their files to canonical locations, a `dpkg-statoverride` may still refer to the aliased location and thus become ineffective.
This could happen if files were moved without updating the corresponding maintainer scripts accordingly.
Usually, statoverrides are issued in the same package that contains the files being modified.
This affects a [one-digit number of cases](https://lists.debian.org/20230502135105.GA713645@subdivi.de) and the moratorium is effective here as well.

A statoverride may also be configured as an administrative change.
As files are canonicalized, such overrides become ineffective without any warning.

P6: Empty directory loss
------------------------

Andreas Beckmann discovered that an empty directory may unexpectedly disappear when a package containing an aliased location is being removed as part of a package upgrade or removal.
The first instance of this problem is with [`/usr/lib/modules-load.d` from `systemd`](https://bugs.debian.org/1036920), but it is a generic problem.
The [generic problem](https://lists.debian.org/20230530095300.GA1289743@subdivi.de) affects a [one-digit number of situations](https://lists.debian.org/debian-devel/2023/05/msg00325.html).
This problem is not mitigated by the moratorium and really affects upgrades from `bullseye` to `bookworm` and package removal on `bookworm` installations.

P7: Shared multiarch file loss
------------------------------

Just as directories can be shared between different packages (as in P6), regular files with identical content can be shared between multiple instances of a `Multi-Arch: same` package.
When upgrading one instance such that a location is canonicalized and removing another instance, the canonicalized file may be lost in the transaction if the removal happens after the upgrade.
This scenario has been [reproduced in an artificial case](https://lists.debian.org/20230530095859.GA1300602@subdivi.de).
It is not observed in practice, because it is also mitigated by the moratorium.
Had the moratorium been lifted for the bookworm release, we know that a two-digit number of cases would have been affected by this.
In most cases, the files affected by loss are `udev` rules and `hwdb` files contained in shared library packages.

P8: Bootstrapping aspects
-------------------------

Filesystem bootstrap implementations have diverged due to `/usr`-merge.
`debootstrap` now installs the aliasing symbolic links prior to the initial package extraction.
Whether it performs the merge is dependent on the chosen `--variant`.
Other tools such as `cdebootstrap`, `mmdebstrap` and `multistrap` rely on the `usrmerge` package for merging.
As we canonicalize files in packages, that latter strategy will fail once essential files such as `/bin/sh` or the dynamic linker become canonicalized, because running `usrmerge.postinst` becomes impossible.
Conversely, the former strategy fails if a package such as `base-files` were to actually contain the aliasing symbolic links.
The moratorium also prevents these effects from happening in practice.
There is a [mail with more detail](https://lists.debian.org/20230517093036.GA4104525@subdivi.de).

P9: Loss of aliasing symlinks
-----------------------------

As more and more packages release aliased locations, eventually one package is the last package that contains a location referring to a top-level symbolic link.
When upgrading or removing that package, `dpkg` sees that the location is released and deletes it - in effect deleting an important symbolic link.
For instance, `libc6:amd64` is the only package that contains `/lib64`.
Canonicalizing its files would cause `/lib64` to be deleted and make the dynamic linker unavailable.
This is prevented by the moratorium for now.

Proposed mitigations
====================

Many but not all problems relate to tools contained in the `dpkg` package.
Therefore modifying `dpkg` to improve this situation is a natural thought.
We classify modifications to `dpkg` in three different categories:
 * No modification: The relevant changes happen outside the `dpkg` package.
 * Implicit changes: The change affects the semantics of `dpkg` without adding to its API.
 * Explicit changes: Some part of `dpkg`'s API (such as adding an option or a new `control.tar` member) is changed.

If changes to `dpkg` are part of the solution, we have to further spread the transition.
`dpkg` usually picks up new `glibc` symbols and therefore gains a `Pre-Depends` on the new `libc6`.
Therefore we cannot assume a fixed `dpkg` for unpacking `libc6`.

Other than that, a number of mitigations rely on implementation-defined behaviour of `dpkg`.
In particular, the various mitigations that employ diversions use `dpkg` features in ways that they were not meant to be used (e.g. diverting directories).

M1: Teaching `dpkg` about aliases
---------------------------------

One approach is to explicitly tell `dpkg` about the relevant symbolic links such that it can identify canonical and aliased locations.
This strategy can resolve (depending on how many tools use this information) most problems, except for the bootstrapping aspects (P8).
It is considered an explicit change and can take multiple forms.
An earlier version of this document proposed the addition of an `--add-alias` option for `dpkg` to record aliasing symbolic links after their introduction.
Simon Richter proposed [adding a `control.tar` member to record aliases](https://salsa.debian.org/sjr/dpkg/-/tree/wip-canonical-paths).
A recurring suggestion is to hard code the aliases used by Debian into `dpkg`.
That latter approach has the downside of affecting unmerged installations such as old releases when using the `--root` option as well as non-Debian users of `dpkg` and is effectively ruled out for that reason.

In any case, this is a new feature in `dpkg` with a fairly involved implementation as it touches on core data structures.
Since it adds to the API in non-trivial ways, it is not something we can remove anytime soon, so it adds to the permanent maintenance cost of `dpkg`.
It is plausible that this feature interferes badly with other developments in `dpkg` such as filesystem metadata tracking.

Any package that wants to rely on the new behavior requires a versioned `Pre-Depends` on `dpkg`.
Doing this to `libc6` would introduce a dependency cycle.

M2: Canonicalize all paths
--------------------------

While moving files to their canonical location causes most of the problems, the final state of having all of them moved frees us from the majority of aliasing-related problems as well.
Once everything is moved and a new policy prohibits adding files in aliased locations, problems P1, P2, partially P3 (for package authored diversions), partially P5 (for package authored statoverrides), P6, P7 and P9 become irrelevant for the future.
For this reason, moving files also is a mitigation strategy if done exhaustively.
The process of doing this requires a combination of other mitigations in order to not break existing use cases, such as smooth upgrades or bootstrapping.

M3: Let `dpkg` use the filesystem as source of truth
----------------------------------------------------

Problems P1, P6, P7 and P9 are concerned with unexpected deletion of filesystem resources that are subject to aliasing.
As such, a targeted behavior change to the code dealing with deletion of unused filesystem resources in `dpkg` is plausible.
`dpkg` could be updated to perform a new check prior to deleting filesystem objects.
While `dpkg` normally considers its internal database as the sole source of truth, it can be changed to determine the canonicalized location according to the actual filesystem before performing its deletion.
If that resolved location happens to be known in the internal database, a warning can be emitted instead of deleting the file.
A similar mitigation already implemented in `dpkg` converts the attempt to delete a theoretically empty directory that happens to not actually be empty into a warning.
This is a significant deviation from the principles `dpkg` is built upon, but it is a fairly isolated change that is only useful during the move of filesystem resources to their canonical locations.
For systems that do not exercise aliasing, the only downside is a (hopefully minor) degradation in performance of upgrades and removals.
Once all files have been canonicalized (M2), this change can be reverted in `dpkg` given that it has warned about relying on this.
As such, this change does not impose a permanent cost to maintaining `dpkg`.

It also requires versioned `Pre-Depends` on `dpkg`, which again is impossible for `libc6`.
This mitigation needs to be part of two stable releases to accommodate derivatives and external repositories.
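
The check described for M3 could look roughly like the following sketch.
It is not taken from any `dpkg` patch: the function `safe_to_delete`, its parameters and the representation of the database as a set of filenames are assumptions, and it only covers a file below an aliased directory (as in P1), not the deletion of an aliasing link itself (P9).
It also assumes that the aliasing links are relative, as in `/bin` pointing to `usr/bin`.

```python
import os
import tempfile

def safe_to_delete(root, name, database):
    """Decide whether a toy dpkg may delete `name` inside `root`.

    `database` is the set of filenames still owned by some package.
    Returns False (and warns) when `name` resolves, via symlinked parent
    directories, to a different name that the database still knows.
    """
    parent, base = os.path.split(name.lstrip("/"))
    real_root = os.path.realpath(root)
    # Resolve only the directory part; the final component is the object
    # that would be deleted and must not be followed.
    real_parent = os.path.realpath(os.path.join(real_root, parent))
    resolved = "/" + os.path.relpath(os.path.join(real_parent, base), real_root)
    if resolved != name and resolved in database:
        print(f"warning: not deleting {name}: still owned as {resolved}")
        return False
    return True

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "usr/bin"))
    os.symlink("usr/bin", os.path.join(root, "bin"))  # relative aliasing link
    owned = {"/usr/bin/sh"}  # names still present in the toy database
    print(safe_to_delete(root, "/bin/sh", owned))     # False, after a warning
    print(safe_to_delete(root, "/etc/motd", owned))   # True
```

The extra `realpath` call per deleted object is where the performance cost mentioned above would come from.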

M4: Protective diversions for aliasing links
--------------------------------------------

For each of the aliasing symbolic links, we can introduce a diversion that redirects it to some unimportant location.
Since diversions are not intended to be used with directories, `dpkg` only applies a diversion to the exact filename that is being diverted.
When adding a diversion for one of the aliasing symbolic links, files that are installed below that directory component are unaffected by the diversion.
Any attempt to remove a diverted symbolic link will instead remove the corresponding unimportant location.
In order to avoid a `Pre-Depends` loop, the diversions are created by a dependency-less package (e.g. a new `usrmerge-support` package).
`libc6` as the sole owner of `/lib64` needs a `Pre-Depends` on that package and can declare it without introducing a loop.
`base-files` is a prominent owner of many other directories that have become symlinks and also needs such a `Pre-Depends`.
With these in place, P9 is addressed.
The diversions can be removed if the symlinks are installed into some `data.tar` or after two stable releases to accommodate external packages and derivatives.

M5: Ship symlinks as directories
--------------------------------

`base-files` contains `/bin`, `/lib` and `/sbin` as directories and as long as it contains them, `dpkg` will not delete the symbolic links that are now placed there.
If `base-files` were to also ship `/lib64` and all other multilib directories, that would effectively prevent them from being deleted, thus mitigating P9.
Its `postinst` script could still convert them to symbolic links unless already converted.
Unfortunately, `libc6` cannot declare a `Pre-Depends` on `base-files` to ensure the right unpack order, as doing so would create a loop.
We can spread this part to two releases or additionally ship the directories in a dependency-less package (e.g. `usrmerge-support`).

M6: Divert `dpkg-divert`
------------------------

`dpkg-divert` can be wrapped by a script that modifies its behavior using the diversion mechanism on itself.
Whenever a diversion is added for a location that is subject to aliasing, the diversion is duplicated to both affected locations by the wrapper.
Similarly, removal of diversions is also duplicated.
Upon installation of the wrapper, all existing diversions are also duplicated.
If combined with M4, these diversions need to be ignored.
Packages that use diversions and packages that canonicalize files affected by diversions need to issue a `Pre-Depends` on the package that installs this wrapper (e.g. a new `usrmerge-support` package).
The affected packages can be determined in a mechanical way.
The wrapper would be required for at least two stable releases.
This way P3 can be mitigated.

The wrapper can also duplicate local diversions.
However, for removing the wrapper (and thus removing the aliased diversions), users must update their scripts and automation to canonicalize the locations to be diverted.

M7: Replacing `Replaces` with `Conflicts`
-----------------------------------------

When files are moved between packages, such moves are usually accompanied by `Replaces`.
A concurrent canonicalization may render this `Replaces` ineffective and cause the file loss described in P1.
This scenario cannot happen when the replacing package is unpacked after the replaced package has been upgraded or removed.
As such, changing `Replaces` to `Conflicts` makes this scenario impossible.
The mitigation can be applied in an as-needed manner such that moving files without canonicalizing remains unaffected as does canonicalizing without moving.
The situation can be detected using automated tools and reported mechanically.
If packages had their content canonicalized in bookworm, a low two-digit number of packages would have their `Replaces` changed to `Conflicts`.
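
As a rough idea of what such a mechanical detection could look like, the sketch below compares the file lists of two releases.
It is not the tooling used for the numbers above; `risky_moves`, the `ALIASES` subset and the input format (package name mapped to the filenames in its `data.tar`) are assumptions for illustration, and the package contents are made up apart from the `dbus.socket` example from P1.

```python
ALIASES = {"/bin": "/usr/bin", "/sbin": "/usr/sbin", "/lib": "/usr/lib"}

def canonicalize(name):
    """Lexically map an aliased filename to its canonical form."""
    for link, target in ALIASES.items():
        if name.startswith(link + "/"):
            return target + name[len(link):]
    return name

def risky_moves(old, new):
    """Find files that change package and become canonical at the same time.

    `old` and `new` map package names to the filenames they ship in the
    previous and the upcoming release. Returns (from, to, old_name) tuples.
    """
    found = []
    for old_pkg, old_files in old.items():
        for name in old_files:
            canon = canonicalize(name)
            if canon == name:
                continue  # already canonical, nothing is being canonicalized
            for new_pkg, new_files in new.items():
                if new_pkg != old_pkg and canon in new_files:
                    found.append((old_pkg, new_pkg, name))
    return sorted(found)

old = {"dbus": {"/lib/systemd/system/dbus.socket", "/usr/bin/dbus-daemon"}}
new = {
    "dbus": {"/usr/bin/dbus-daemon"},
    "dbus-system-bus-common": {"/usr/lib/systemd/system/dbus.socket"},
}
print(risky_moves(old, new))
# [('dbus', 'dbus-system-bus-common', '/lib/systemd/system/dbus.socket')]
```

Each reported tuple is a candidate for turning the accompanying `Replaces` into `Conflicts`.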

This strategy mostly mitigates P1, but it can become impossible to apply when essential packages are involved (as those must not be temporarily deinstalled) or the upgrade becomes too complex for `apt` due to an excess of `Conflicts`.
These cases can be complemented with the next mitigation M8.

M8: Protective diversions for moved files
-----------------------------------------

A package that is at risk of losing files as in P1 can set up a protective diversion for each affected location in the aliased form.
The replacing `preinst` script has to set up these temporary diversions.
When the replacing `postinst` is run, the replaced package is already upgraded or removed (due to associated `Breaks`) and it can therefore remove the protective diversions.
These diversions only exist during an upgrade, but writing the maintainer scripts can be difficult to get right.
Therefore, M7 should be preferred when applicable.

M9: Protective diversions for empty directories
-----------------------------------------------

In a similar vein to M8, empty directories can be saved in a P6 scenario.
If there is one and only one package shipping the empty directory, it can set up a protective diversion in a permanent way.
Note that `dpkg` errors out when creating a diversion for an existing directory as it is not meant to be used that way, but this can be worked around by temporarily moving the directory away.
As long as the diversion exists, the package owning the diversion must ensure that the diverted location actually is a directory in the filesystem.
Otherwise, unpacking other packages may fail.
These diversions probably need to stay for two stable releases.

A directory can also be empty in multiple packages.
In most such cases, this seems to be accidental and the directory can be deleted from the `data.tar` instead.
In other cases, the empty directory can be migrated to a common package, but this is expected to not be needed in practice.

M10: Protective diversions for shared files
-------------------------------------------

In a similar vein to M8 and M9, shared files in `Multi-Arch: same` instances can be saved in a P7 scenario.
The new `preinst` can divert the aliased locations contained in the old version.
Since a diversion never applies to the package that owns it, these diversions must be assigned to some other package (e.g. `usrmerge-support`) in order to take effect for the diverting package itself.
Likewise, the new `postinst` can remove these diversions, because all other instances must have been removed or unpacked by this time.
As with M8, such diversions only exist during an upgrade.

M11: Ship symbolic links in a package
------------------------------------

In order to fix bootstrapping tools, the aliasing symbolic links can be shipped in some `data.tar` of a binary package (e.g. `base-files`).
Doing so requires that all packages participating in filesystem bootstrap have canonicalized their paths first, because we would otherwise get a directory vs symbolic link conflict.
In the presence of such a conflict, the behavior of `dpkg` depends on the order of unpacks.
Instead of avoiding the conflict, all bootstrapping tools can be updated to unpack the symbolic link package before all other packages.
Shipping the links also requires changing `debootstrap` to no longer create the links prior to extraction as it presently passes `-k` to `tar`, which would result in an unpack failure even without any conflict between the various `data.tar` contents.
With this, P8 is addressed.

M12: Explicitly duplicate triggers
---------------------------------

In order to address the missing trigger invocations from P2, the affected trigger interests (fewer than ten) can be manually duplicated.
A relation of triggering packages on the triggered package is not required, because configuration of the triggered package is equivalent to a trigger invocation and from that point on triggers work as expected.
This mitigation also needs to last for two stable releases.

However, this does nothing for trigger interests declared by external packages.
External packages declaring an interest in aliased locations need to have this mitigation applied as well.
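
The duplication itself is mechanical and can be sketched in Python.
The sketch below takes the text of a package's `triggers` control file and adds, for every file trigger interest under an aliased or canonical location, the counterpart in the other form.
The list of aliased top-level directories and the function names are assumptions of this sketch, not something this document specifies.

```python
# Assumed set of top-level directories that are aliased to /usr.
ALIASED_TOP_LEVEL = ("bin", "sbin", "lib", "lib32", "lib64", "libo32", "libx32")
INTEREST_DIRECTIVES = ("interest", "interest-await", "interest-noawait")


def counterpart(path):
    """Return the other spelling of an absolute path, or None if unaffected."""
    parts = path.split("/")
    if parts[0] != "":
        return None  # not an absolute path
    if len(parts) >= 2 and parts[1] in ALIASED_TOP_LEVEL:
        return "/usr" + path  # aliased -> canonical
    if len(parts) >= 3 and parts[1] == "usr" and parts[2] in ALIASED_TOP_LEVEL:
        return path[len("/usr"):]  # canonical -> aliased
    return None


def duplicate_interests(triggers_text):
    """Add the missing aliased/canonical twin of every file trigger interest."""
    lines = triggers_text.splitlines()
    present = {tuple(line.split()) for line in lines}
    result = []
    for line in lines:
        result.append(line)
        fields = line.split()
        if len(fields) != 2 or fields[0] not in INTEREST_DIRECTIVES:
            continue  # comments, blank lines, activate directives
        directive, name = fields
        other = counterpart(name)  # None for named (non-file) triggers
        if other is not None and (directive, other) not in present:
            result.append(f"{directive} {other}")
            present.add((directive, other))
    return "\n".join(result) + "\n"
```

Running the function on its own output changes nothing, so it can be applied repeatedly without piling up duplicate lines.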

M13: Keep alternatives aliased
-----------------------------

If we keep the alternatives and alternative providers at their aliased locations, we can still move the package contents in the `data.tar` without otherwise impacting the use of alternatives.
The major downside is that we have to eternally remember that some locations in alternatives are expressed in a non-canonical way.
In a sense, we can skip P4 by not canonicalizing alternatives.

M14: Divert `update-alternatives`
---------------------------------

A wrapper around `update-alternatives` can canonicalize all paths, so that a path may be referred to in either its aliased or its canonical form.
All packages relying on this behavior must issue a `Depends` on the package introducing the wrapper, thus solving P4.
For the trixie release, both aliased and canonical locations would be accepted such that external repositories and derivatives have a full release cycle to get updated.
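
The path handling at the core of such a wrapper can be sketched in Python as follows.
This only shows the canonicalization step; the list of aliased top-level directories and the function name are assumptions of this sketch, and a real wrapper would additionally have to parse the `update-alternatives` command line and rewrite the relevant arguments.

```python
import posixpath

# Assumed set of top-level directories that are aliased to /usr.
ALIASED_TOP_LEVEL = ("bin", "sbin", "lib", "lib32", "lib64", "libo32", "libx32")


def canonicalize(path):
    """Map an absolute path in aliased form to its canonical /usr form.

    Paths that are already canonical or not affected by aliasing are
    returned in normalized form without further changes.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute path: %r" % path)
    # Collapse repeated slashes and "." components first.
    norm = posixpath.normpath("/" + path.lstrip("/"))
    parts = norm.split("/")
    if len(parts) >= 2 and parts[1] in ALIASED_TOP_LEVEL:
        return "/usr" + norm
    return norm
```

With this, `/bin/sh` and `/usr/bin/sh` both map to `/usr/bin/sh`, which is what allows either form to be accepted.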

M15: Manually migrate statoverrides
-----------------------------------

The few packages that install statoverrides can migrate them on their own, mitigating P5.

This leaves local statoverrides and statoverrides from external packages entirely unaddressed.

M16: Change bootstrap protocol
------------------------------

We can consider modifications to the bootstrap protocol to alleviate the problems.
Thus far no concrete proposals to this end have emerged.

Comparison
==========

In the following table, we map mitigations to their properties.
The most fundamental property is which problems they fully (✓) or partially (\*) address.
We also record whether explicit (E), implicit (I) or no (-) changes are required to `dpkg`.
The number of affected packages is estimated by order of magnitude as "significantly changed packages + mechanically changed packages".
A mitigation is considered temporary if the relevant changes can be reverted after two stable releases.
A prototype is linked when available.
Finally, some mitigations cannot be combined with others.

|           | M1   | M2   | M3  | M4   | M5   | M6   | M7   | M8   |  M9  | M10  | M11 | M12 | M13 | M14 | M15 | M16 |
| --------- | ---- | ---- | --- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | --- | --- | --- | --- | --- | --- |
| P1        |  ✓   |  \*  |  ✓  |      |      |      | \*   |  ✓   |      |      |     |     |     |     |     |     |
| P2        |  ✓   |  \*  |     |      |      |      |      |      |      |      |     |  ✓  |     |     |     |     |
| P3        |  ✓   |  \*  |     |      |      |  ✓   |      |      |      |      |     |     |     |     |     |     |
| P4        |  ✓   |      |     |      |      |      |      |      |      |      |     |     |  ✓  |  ✓  |     |     |
| P5        |  ✓   |  \*  |     |      |      |      |      |      |      |      |     |     |     |     |  ✓  |     |
| P6        |  ✓   |  \*  |  ✓  |      |      |      |      |      |  ✓   |      |     |     |     |     |     |     |
| P7        |  ✓   |  \*  |  ✓  |      |      |      |      |      |      |  ✓   |     |     |     |     |     |     |
| P8        |      |  \*  |     |      |      |      |      |      |      |      |  ✓  |     |     |     |     |     |
| P9        |  ✓   |  \*  |  ✓  |  ✓   |  ✓   |      |      |      |      |      |  ✓  |     |     |     |     |  ✓  |
| dpkg      |  E   |      |  I  |      |      |      |      |      |      |      |     |     |     |     |     |     |
| affected  | 1+0  | many | 1+0 | 2+10 | 1+0  | 1+30 | 0+10 | 10+0 | 10+0 | 30+0 | 2+0 | 0+2 | 0+0 | 1+0 | 4+0 | 4+0 |
| temporary |      |      |  ✓  |  ✓   |  \*  |  ✓   |  ✓   |  ✓   |  ✓   |  ✓   |     |  ✓  |     |  ✓  |     |     |
| prototype | [#](https://salsa.debian.org/sjr/dpkg/-/tree/wip-canonical-paths) | [#](https://lists.debian.org/debian-dpkg/2023/05/msg00080.html) | [#](https://git.hadrons.org/cgit/debian/dpkg/dpkg.git/log/?h=pu/aliasing-workaround) | [#](https://lists.debian.org/20230517093036.GA4104525@subdivi.de) | | | [#](https://lists.debian.org/20230425190728.GA1471384@subdivi.de) | [#](https://lists.debian.org/20230425190728.GA1471384@subdivi.de) | | | [#](https://lists.debian.org/20230517093036.GA4104525@subdivi.de) | | | | | |
| precludes | many |      | M1  |  M1  | M11  |  M1  |      |  M1  |  M1  |  M1  | M5  |     | M14 | M1,M13 |  |     |

In effect, the most fundamental decision becomes how much change we want in `dpkg`.
On one end, we can make it aware of aliasing and move files to their canonical location only as a measure to simplify packaging (M1).
As a middle ground, we can accept a temporary behavior difference (M3) to ease a complete move (M2) with additional mitigations (to be selected).
On the other end, we can work around the problematic behavior (to be selected) while moving the files (M2) without applying any changes to the `dpkg` source code.

The other fundamental decision becomes how to deal with architecture bootstrap.
On one end is an underspecified idea of changing the bootstrap protocol somehow (M16).
The other end is shipping the aliasing symlinks in some package (M11), which in turn implies a complete move (M2).
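
To try out candidate combinations against the table, a small Python sketch can help.
It transcribes only the full (✓) entries and the explicit entries of the "precludes" row; partial (\*) entries are ignored, the unnamed "many" for M1 is not modelled, and treating "precludes" as symmetric is an assumption of this sketch.
The function names are likewise made up for illustration.

```python
# Mitigations that fully address each problem, transcribed from the table.
FULL = {
    "P1": {"M1", "M3", "M8"},
    "P2": {"M1", "M12"},
    "P3": {"M1", "M6"},
    "P4": {"M1", "M13", "M14"},
    "P5": {"M1", "M15"},
    "P6": {"M1", "M3", "M9"},
    "P7": {"M1", "M3", "M10"},
    "P8": {"M11"},
    "P9": {"M1", "M3", "M4", "M5", "M11", "M16"},
}

# Explicit entries of the "precludes" row. M1 is listed as precluding "many"
# mitigations without naming them, so it has no entry of its own here.
PRECLUDES = {
    "M3": {"M1"}, "M4": {"M1"}, "M5": {"M11"}, "M6": {"M1"},
    "M8": {"M1"}, "M9": {"M1"}, "M10": {"M1"}, "M11": {"M5"},
    "M13": {"M14"}, "M14": {"M1", "M13"},
}


def _number(label):
    """Numeric part of a label such as 'M12' or 'P3', used for sorting."""
    return int(label[1:])


def uncovered_problems(selection):
    """Problems that no selected mitigation fully addresses."""
    chosen = set(selection)
    return [p for p in sorted(FULL, key=_number) if not FULL[p] & chosen]


def conflicts(selection):
    """Pairs of selected mitigations where one precludes the other."""
    chosen = set(selection)
    pairs = set()
    for mitigation in chosen:
        for other in PRECLUDES.get(mitigation, set()) & chosen:
            pairs.add(tuple(sorted((mitigation, other), key=_number)))
    return sorted(pairs, key=lambda pair: (_number(pair[0]), _number(pair[1])))
```

For example, selecting only M1 leaves P8 uncovered, while a selection without any `dpkg` changes such as M2, M6, M8, M9, M10, M11, M12, M13 and M15 leaves no problem uncovered and reports no conflicts.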

Proposal
========

This section will have a proposal once consensus has emerged.


