Hi,

On Fri, Apr 21, 2023 at 02:15:54PM +0200, Raphael Hertzog wrote:
I'd like to express some disappointment that nobody replied publicly so far.
There were a few replies on the dpkg mailing list.
Last year's developer survey concluded that "Debian should complete the merged-/usr transition" was the most important project for Debian [1] (among those proposed in the survey). That's what we are trying to do here and it would be nice to build some sort of consensus on what it means in terms of changes for dpkg.
The first thing we need consensus on, IMO, is the definition of "complete". The maintainers of the usrmerge package consider the status quo an acceptable technical solution, so their definition of "complete" is to roll out the change to the remaining users.
Other people view the transition as complete when the transition handling code can be removed, which, with the current implementation, is bookworm+2 because that is the first time we could ever have ld.so in /usr/lib. The handling code for bookworm can never be removed from the various bootstrapping tools, because users will rightfully expect to be able to install older versions. I believe cross debootstrap is also currently broken.
The problem for dpkg is that there is already a considerable number of cases it handles. Paths are owned by specific packages, ownership can be transferred, and paths can be marked as conffiles and diverted. That alone gives us a massive testsuite already, including such elaborate cases as a package providing two files that are subsequently replaced by files in two other packages so no files remain, which automatically makes the first package disappear.
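To make the "disappearing package" case concrete, here is a minimal sketch of path ownership with transfer. It is a toy model, not dpkg's data structures: the `FileDb` class, its method names and the `replaces` argument are all made up for illustration.

```python
from collections import defaultdict


class FileDb:
    """Toy model of path ownership (hypothetical, not dpkg's real database)."""

    def __init__(self):
        self.owner = {}                # path -> owning package
        self.files = defaultdict(set)  # package -> paths it still owns

    def install(self, package, paths, replaces=()):
        """Register paths for package; return packages that disappeared."""
        # First pass: refuse the whole operation before changing anything.
        for path in paths:
            previous = self.owner.get(path)
            if previous not in (None, package) and previous not in replaces:
                raise ValueError(f"{path} is owned by {previous}")
        # Second pass: transfer ownership and note emptied packages.
        disappeared = []
        for path in paths:
            previous = self.owner.get(path)
            if previous not in (None, package):
                self.files[previous].discard(path)
                if not self.files[previous]:
                    del self.files[previous]
                    disappeared.append(previous)
            self.owner[path] = package
            self.files[package].add(path)
        return disappeared
```

With a package "a" owning two paths, installing "b" over the first path leaves "a" in place, and installing "c" over the second one makes "a" disappear. Conffiles and diversions are left out entirely, and each of them multiplies the number of cases again.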
Extending these cases to include aliases is a monumental task, and limiting the allowed combinations does not reduce that, it only inverts the expected return code from that test case. We need defined behaviour for what happens if someone tries to register an alias for /etc (not allowed because there are conffiles inside), and we need error handling paths for cases like /lib -> /usr/lib, where creation of the /lib symlink fails and the original /lib needs to be restored before exiting with an error.
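The "/etc" rule could be sketched roughly as follows. This is only an illustration of the kind of check that needs defined behaviour; `check_alias`, `AliasError` and the way conffiles are passed in are assumptions, not an existing dpkg interface.

```python
from pathlib import PurePosixPath


class AliasError(Exception):
    """Raised when a requested alias is not a supported configuration."""


def check_alias(alias, target, conffiles):
    """Refuse aliases that would cover registered conffiles.

    alias and target are absolute paths such as "/lib" and "/usr/lib";
    conffiles is an iterable of absolute conffile paths (assumed input).
    """
    alias_path = PurePosixPath(alias)
    target_path = PurePosixPath(target)
    if not alias_path.is_absolute() or not target_path.is_absolute():
        raise AliasError("alias and target must be absolute paths")
    for conffile in conffiles:
        # An alias for a directory containing conffiles is rejected outright.
        if alias_path in PurePosixPath(conffile).parents:
            raise AliasError(f"{alias} contains conffile {conffile}")
```

Each rejected combination like this still needs its own test case; the test merely expects an error instead of success.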
There are several possible designs here, including a variant that is similar to dpkg-divert, which is called from a maintainer script, safely performs the filesystem transformation and updates dpkg's database, which is then reloaded after the maintainer script finishes. This variant has the drawback that even in future releases we can never move ld.so to /usr in the packages, because this would be safe only if we could guarantee that the alias is registered. We can approximate that by giving ownership of the alias to a package, which allows declaring a dependency, but then the maintainer scripts need to be able to handle transfer of ownership at some point or we're again creating technical debt.
Another variant would be a declarative approach, which would, conveniently, work for initial installation for foreign architectures, and permit moving files to /usr eventually (one or two releases later). The drawback there is that we need a defined point when that declaration is read, and at this point the main dpkg binary (which for cross installation is the host binary) is responsible for moving the files over. This also gives strong ownership of the alias to a package, but now it is dpkg that is responsible for ownership transfer and removal of the alias when the last package providing it vanishes.
That solution again opens up a massive space of possible edge cases. Either we handle the alias through the normal file handling code, which gives us ownership transfer almost for free, but introduces lots of special cases into that code, or we add separate handling, which introduces the special case where all files in a package providing an alias are gone and the package should be deinstalled -- do we deinstall when the alias is gone as well, or when all regular files are?
This is the problem with doing it right: there are so many corner cases that need to be handled, even if the handling consists of an error message about an unsupported configuration and a rollback to the last valid state.
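As an illustration of what "rollback to the last valid state" means for something like /lib -> /usr/lib, here is a rough sketch of the error path only. The function names, the backup suffix and the injectable `link` parameter are assumptions for the example; moving the directory contents into the target is deliberately omitted.

```python
import os
import tempfile


def replace_dir_with_symlink(directory, target, link=os.symlink):
    """Turn directory into a symlink to target; restore it if that fails.

    Moving the directory's contents into target is deliberately left out.
    """
    backup = directory + ".alias-backup"  # hypothetical backup name
    os.rename(directory, backup)
    try:
        link(target, directory)
    except OSError:
        # Put the original directory back before reporting the error.
        os.rename(backup, directory)
        raise
    return backup


def simulate_failed_symlink():
    """Run the failure path in a scratch directory; True if lib survived."""
    def failing_link(target, path):
        raise OSError("simulated symlink failure")

    with tempfile.TemporaryDirectory() as root:
        lib = os.path.join(root, "lib")
        os.mkdir(lib)
        with open(os.path.join(lib, "ld.so"), "w"):
            pass
        try:
            replace_dir_with_symlink(lib, "usr/lib", link=failing_link)
        except OSError:
            pass
        restored = os.path.isfile(os.path.join(lib, "ld.so"))
        return restored and not os.path.islink(lib)
```

Even this small sketch has a window between the rename and the symlink creation in which the directory does not exist, which for /lib means no dynamic loader. That is exactly the kind of corner case meant here.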
The alternative would be a consensus that dpkg is simply not expected to always leave the system in a useful state if it encounters certain invalid situations, and hoping that we will also be able to point to a few million installations where that has not exploded and call it a success, but that would need to be communicated.
I know that Guillem (dpkg's maintainer) is generally opposed to the approach that Debian has followed to implement merged-/usr but I have yet to read his concerns on the changes proposed here (besides the fact that they ought to not be needed because we should redo the transition in another way, but that's a ship that has sailed a long time ago...).
A sensible solution needs to be able to perform the transition on its own, because that will be required for new installations for quite some time. Picking up a transitioned or half-transitioned system and bringing it into a consistent and fully-transitioned state would also be on my list of requirements, as well as a way to safely remove and update aliases in case someone in the future decides that this was a bad idea, or requirements change, like where a lib32 alias points to.
In essence, that is a full implementation of the transition as it should have been done in the first place.
The rough project consensus seems to be that we should modify dpkg to avoid the cases where some files can disappear upon upgrades. Most people don't really care how we modify dpkg for this, and I can't blame them, but given that dpkg's maintainer seems unwilling to work on this problem, someone else has to come up with a design, implement it and get it applied on Debian's version of dpkg.
This is a non-trivial problem. I have tried. All simple implementations have new interesting bugs.
The most promising for me was the approach in my branch, where I create duplicate database entries for aliased paths that point back to the file name in the original package, but so far I have exactly zero code for adding, removing or changing aliases, and there are a few design decisions in the dpkg database that make it fast, but also make it difficult to perform updates that modify the primary key column.
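The idea of duplicate entries that point back to the original name can be sketched like this. It is a toy model, not the code in the branch: `AliasedDb` and its methods are hypothetical, and the alias table is fixed at construction time, which mirrors the missing support for adding, removing or changing aliases.

```python
class AliasedDb:
    """Toy path database with duplicate entries for aliased names."""

    def __init__(self, aliases):
        self.aliases = dict(aliases)  # e.g. {"/bin": "/usr/bin"}
        self.entries = {}             # path -> (package, path as shipped)

    def _other_names(self, path):
        """Yield the other names under which path is reachable."""
        for alias, target in self.aliases.items():
            if path == alias or path.startswith(alias + "/"):
                yield target + path[len(alias):]
            elif path == target or path.startswith(target + "/"):
                yield alias + path[len(target):]

    def register(self, package, path):
        names = [path, *self._other_names(path)]
        # Check every name first so a conflict leaves the table untouched.
        for name in names:
            entry = self.entries.get(name)
            if entry is not None and entry[0] != package:
                raise ValueError(f"{name} already belongs to {entry[0]}")
        for name in names:
            # Each duplicate records the file name from the original package.
            self.entries[name] = (package, path)

    def owner(self, path):
        return self.entries.get(path)
```

With the alias /bin -> /usr/bin, registering /bin/test for package a makes a lookup of /usr/bin/test return a as well, so a second package shipping /usr/bin/test is detected as a conflict. Because the path is the lookup key, changing the alias table later means rewriting keys, which is where the difficulty mentioned above comes from.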
Testing alone will be an absolute nightmare because we can enter invalid states through multiple avenues, for example, if I have a conflict
    a.deb: /bin/test
    b.deb: /usr/bin/test
    c.deb: /bin -> /usr/bin
then I need to handle all possible installation orders:
    # fail installing c, leave /bin/test and /usr/bin/test installed
    dpkg -i a.deb b.deb c.deb
    dpkg -i b.deb a.deb c.deb
    # fail installing b, move a's file to /usr, install symlink
    dpkg -i a.deb c.deb b.deb
    dpkg -i c.deb a.deb b.deb
    # fail installing a, leave b's file, install symlink
    dpkg -i b.deb c.deb a.deb
    dpkg -i c.deb b.deb a.deb
    # transfer ownership from a to b, deinstall a and run its pre/postrm
    dpkg -i a.deb c.deb
    dpkg --force-overwrite -i b.deb
The latter case is also what should happen if b declares "Replaces: a".
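The six orderings of all three packages can be enumerated mechanically. The following is a toy simulation, assuming a simplified model in which a package is refused if, after resolving aliases, one of the resulting paths would have two owners; the `PACKAGES` table and all function names are made up for this sketch, and nothing here calls dpkg.

```python
from itertools import permutations

# Hypothetical package contents matching the a/b/c example.
PACKAGES = {
    "a": {"files": ["/bin/test"], "alias": None},
    "b": {"files": ["/usr/bin/test"], "alias": None},
    "c": {"files": [], "alias": ("/bin", "/usr/bin")},
}


def resolve(path, aliases):
    """Map a path through the registered aliases to its real location."""
    for alias, target in aliases.items():
        if path == alias or path.startswith(alias + "/"):
            return target + path[len(alias):]
    return path


def install(owners, aliases, name):
    """Install one package; return False and change nothing on conflict."""
    package = PACKAGES[name]
    new_aliases = dict(aliases)
    if package["alias"]:
        alias, target = package["alias"]
        new_aliases[alias] = target
    new_owners = {}
    candidates = list(owners.items()) + [(p, name) for p in package["files"]]
    for path, owner in candidates:
        real = resolve(path, new_aliases)
        if new_owners.setdefault(real, owner) != owner:
            return False
    owners.clear()
    owners.update(new_owners)
    aliases.clear()
    aliases.update(new_aliases)
    return True


def first_failure(order):
    """Return the first package in order that is refused, or None."""
    owners, aliases = {}, {}
    for name in order:
        if not install(owners, aliases, name):
            return name
    return None


results = {"".join(o): first_failure(o) for o in permutations("abc")}
```

In this model the package that is refused is always the last one of the three, matching the comments in the list above. The model ignores maintainer scripts, unpack versus configure, and --force options, which is where the real test matrix grows.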
    # move file to /usr, install symlink, then remove symlink, move back
    dpkg -i a.deb c.deb
    dpkg --remove c
    # leave file in /usr
    dpkg -i b.deb c.deb
    dpkg --remove c
What happens if a.deb declares "Replaces: c" and "Conflicts: c", and --auto-deconfigure is active?
I'd posit that the reason we don't have a patch for dpkg that we can apply "locally" in Debian is that anyone who tried to produce one is somewhat discouraged because it is a lot of work, the ideal outcome will be completely invisible to users, and the people who should have done the work in the first place will not remain silent while other people wipe up their mess.
Simon