Re: move to merged-usr-only?
On Fri, 20 Nov 2020 at 11:12:20 +0100, Ansgar wrote:
> On Fri, 2020-11-20 at 10:19 +0100, Adam Borowski wrote:
> > On Fri, Nov 20, 2020 at 09:35:42AM +0100, Ansgar wrote:
> > > As far as I know nothing broke catastrophically over the last releases
> > > with merged-/usr.
> > 
> > Unless you look at dpkg or attempts at speeding up bootstrap.
> > 
> > See https://salsa.debian.org/glibc-team/glibc/-/commit/49d137c4392cb1144f2313f78f31466aaa169b75
> > for an example.

Trying to transition from /lib to /usr/lib in the way that is advocated
by some people who don't like usrmerge (one package at a time, without
using usrmerge or something equivalent) can unfortunately *also* cause
transitional brokenness: see
<https://salsa.debian.org/gnome-team/glib/-/commit/1f9c505abe2639296615d13c048227a3df378576>.
(I would appreciate wider testing of the glib2.0 versions in experimental
that attempt to resolve this, particularly by people whose systems are
*not* merged-/usr. Merge requests against the debian/experimental branch
are also welcome.)

> The good news here is that shipping bash as /usr/bin/bash (instead of
> /bin/bash) solves that problem as dpkg will just install it under
> /usr/bin/bash.  No aliasing over symlinks involved.

I do agree with Ansgar that it seems good to have a timeline for making
merged /usr the only supported layout. I don't agree with the implication
that it's trivial to get there. There will probably be regressions -
but we have a lot of clever people in this project, and if we pull the
lever soon after bullseye is released, we have an entire release cycle to
sort out regressions and still be on time for Ansgar's proposed timeline.

(In an attempt to forestall people disputing that merged /usr has benefits:
please see my responses to previous threads on this topic, in particular
<https://lwn.net/ml/debian-devel/20181121140542.GA31273@espresso.pseudorandom.co.uk/>
almost exactly 2 years ago, for quite a lot of words about that which
I will not repeat here.)

I agree that if we were designing a new dpkg-based OS that did not have
legacy/upgrade-path constraints, we should put all the OS files in /usr
in the .deb, and do something centralized to provide symlinks
bin -> usr/bin and so on (either ship them in base-files, or use something
resembling systemd-tmpfiles to make them exist declaratively, or just
create them once during installation).

However, we *do* have legacy/upgrade-path constraints:
* Existing packages have paths like bin/bash in the .deb.
* Existing installed systems have files on disk at paths like /bin/bash.
* We cannot avoid the need for the path /bin/bash to exist (whether
  it's a regular file or involves a symlink /bin -> usr/bin or
  /bin/bash -> /usr/bin/bash), because existing scripts start with
  #!/bin/bash (and similarly /lib/ld-linux.so.2,
  /lib64/ld-linux-x86-64.so.2 etc. must continue to exist at those paths,
  with or without the help of symlinks, otherwise no binaries will run).
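
The last constraint can be checked on a toy tree. The following is only a sketch in Python, not anything dpkg or usrmerge does: the helper names (build_root, bash_reachable) and the layout labels are made up for illustration, and the per-file symlink uses a relative target so that it stays inside the temporary directory. It builds each of the three layouts mentioned above and confirms that the path bin/bash still leads to a regular file.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical labels for the three ways /bin/bash can keep existing.
LAYOUTS = ("regular-file", "dir-symlink", "file-symlink")


def build_root(root: Path, layout: str) -> None:
    """Populate a toy root tree in one of the three layouts (illustrative only)."""
    if layout not in LAYOUTS:
        raise ValueError(layout)
    if layout == "regular-file":
        # Traditional layout: bash really lives in bin/.
        (root / "bin").mkdir()
        (root / "bin" / "bash").write_text("fake bash\n")
        return
    # In both other layouts the real file lives in usr/bin/.
    (root / "usr" / "bin").mkdir(parents=True)
    (root / "usr" / "bin" / "bash").write_text("fake bash\n")
    if layout == "dir-symlink":
        # Directory aliasing: bin -> usr/bin
        os.symlink("usr/bin", root / "bin")
    else:
        # Per-file compat symlink: bin/bash -> ../usr/bin/bash
        # (relative here so the link resolves inside the toy tree).
        (root / "bin").mkdir()
        os.symlink("../usr/bin/bash", root / "bin" / "bash")


def bash_reachable(layout: str) -> bool:
    """True if bin/bash resolves to a regular file in the given layout."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_root(root, layout)
        return (root / "bin" / "bash").is_file()


if __name__ == "__main__":
    for layout in LAYOUTS:
        print(layout, bash_reachable(layout))
```
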
So we need a way to get there from here. The only possibilities I can
see for that within the framework of a dpkg-based OS are:
* Aliasing via symlinks:
  Something magics /bin -> usr/bin, etc., symlinks into existence,
  and dpkg continues to tolerate unpacking over them, providing
  a per-installation flag day to move to merged /usr. usrmerge is
  an implementation of this approach, debootstrap --merged-usr is
  another. Individual systems can do this transition at any time (many
  have already done so), and they will get the benefit of merged /usr
  (in particular, a class of avoidable bugs can no longer affect them)
  after they have passed the flag day.
* Package-by-package migration:
  For every package that ships a file in /bin, /sbin, /lib*, the
  maintainer of that package takes action to move the file into /usr.
  If the file's full path might have been hard-coded elsewhere
  (/bin/bash, etc.), the maintainer
  scripts must additionally create an unmanaged symlink
  /bin/bash -> /usr/bin/bash, etc. - similar to what was done to move
  /usr/bin/chacl in src:acl to /bin, but in reverse. And then, to get the
  benefits of merged /usr, we have to wait for *every* package to stop
  shipping anything in /bin, /sbin, /lib*, and *then* have a usrmerge-like
  per-installation flag day at which the collection of symlinks /bin/bash
  -> /usr/bin/bash, etc. are replaced by a single symlink /bin -> usr/bin.
* Combine those two strategies, while arguing about which one is better,
  and hopefully minimizing the extent to which proponents of one approach
  prevent the other approach from proceeding. (In the absence of consensus,
  this is what is happening in practice.)
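
To make the "per-installation flag day" in the first strategy concrete, here is a deliberately simplified Python sketch. It is not how usrmerge is implemented; is_merged_usr and merge_dir are hypothetical names, the check only looks at top-level bin, sbin and lib* entries, and the move is not atomic (a real tool has to cope with running programs, conflicts and interrupted runs).

```python
import os
import tempfile
from pathlib import Path


def legacy_dirs(root: Path) -> list:
    """Top-level bin, sbin and lib* entries that exist under root."""
    names = ["bin", "sbin"] + sorted(p.name for p in root.glob("lib*"))
    return [root / n for n in names if os.path.lexists(root / n)]


def is_merged_usr(root: Path) -> bool:
    """True if every legacy directory is a symlink to its usr/ counterpart."""
    dirs = legacy_dirs(root)
    return bool(dirs) and all(
        d.is_symlink() and os.readlink(d) in (f"usr/{d.name}", f"/usr/{d.name}")
        for d in dirs
    )


def merge_dir(root: Path, name: str) -> None:
    """Toy flag day for one directory: move entries into usr/<name>, then alias."""
    legacy = root / name
    target = root / "usr" / name
    if legacy.is_symlink():
        return  # already aliased, nothing to do
    target.mkdir(parents=True, exist_ok=True)
    if legacy.is_dir():
        for entry in legacy.iterdir():
            dest = target / entry.name
            if os.path.lexists(dest):
                # A real tool needs a conflict policy; the sketch just stops.
                raise FileExistsError(f"conflict: {dest}")
            os.rename(entry, dest)
        legacy.rmdir()
    # Relative target, matching the "bin -> usr/bin" form used in the text.
    os.symlink(f"usr/{name}", legacy)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "bin").mkdir()
        (root / "bin" / "bash").write_text("fake bash\n")
        print("before:", is_merged_usr(root))
        merge_dir(root, "bin")
        print("after:", is_merged_usr(root))
```

After the toy flag day, bin/bash and usr/bin/bash name the same file, which is the aliasing that dpkg then has to tolerate when unpacking packages that still ship bin/bash.
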
Packages like bash that install files whose traditional paths are on
the root filesystem and are hard-coded into other packages cannot safely
move their files into /usr, without having to take other action to ensure
compatibility, until one of those strategies has already happened.

In all three cases, Debian-as-a-project does not get the full benefit of
merged /usr until all installations that remain supported have had their
per-installation transition; and in all three cases, we cannot ship
/bin -> usr/bin, etc. in base-files until after every supported system
has undergone that transition. So if Debian 12 is the first to make merged
/usr mandatory during upgrade, as Ansgar proposes, then Debian 13 would be
the first release where base-files could safely ship those symlinks.

I agree that aliasing via symlinks has a complexity cost for low-level
packages like dpkg and debootstrap clones. One way to mitigate this
would be to make the transitional period as short as we can (which is
essentially what Ansgar is proposing, I think), so that some release of
Debian (12 in Ansgar's proposal) can guarantee to be merged-/usr, after
which we have the option to take away the scaffolding that we used to get
there (Ansgar proposes that this should happen between Debian 12 and 13).
There's certainly an argument to be made that since we are *already*
paying this complexity cost, we might as well get the benefit from it
as soon as we can.

Package-by-package migration touches a large number of packages, and we
don't get any benefit from merged /usr until it has *finished*. Previous
experience in Debian from migration to multiarch library directories
(started in around 2012, highly unlikely to finish in 2020) and migration
to /usr/share/doc (started in 1999, finished in 2008) suggests that this
is not going to be fast - to speed this up, I think we would have to be
a lot more willing to accept NMUs of core packages. However, the longer
we take, the longer the period will be in which we are encountering
transitional bugs. It is not obvious to me that this is desirable, and
in particular the need to set up per-executable compat symlinks actively
works against the ongoing goal of reducing the amount of imperative code
in maintainer scripts.
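
As an illustration of the kind of imperative step a per-executable compat symlink implies, here is a rough Python sketch. Real maintainer scripts are shell, and this is not taken from any package; ensure_compat_symlink and its root parameter are hypothetical, and it assumes the real file has already been unpacked at the target path.

```python
import os
import tempfile
from pathlib import Path


def ensure_compat_symlink(root: Path, legacy: str, target: str) -> bool:
    """Create root/legacy -> target unless something already exists there.

    Returns True if a symlink was created. `target` is stored verbatim as
    the link text, e.g. "/usr/bin/bash".
    """
    link = root / legacy.lstrip("/")
    if os.path.lexists(link):
        # Regular file, existing symlink, or a path reached through an
        # already-aliased directory: leave it alone.
        return False
    link.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(target, link)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "usr" / "bin").mkdir(parents=True)
        (root / "usr" / "bin" / "bash").write_text("fake bash\n")
        print(ensure_compat_symlink(root, "/bin/bash", "/usr/bin/bash"))
        print(os.readlink(root / "bin" / "bash"))
```

Every package that moves a hard-coded path would need its own variant of this, plus matching cleanup on removal, which is the per-package imperative code referred to above.
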
The third option combines the weird transitional states affecting
low-level packages from the first option with the unbounded duration of
the second option, while letting individual maintainers delay the first
option happening by asserting that it's wrong and we should be following
the second.

Looking at our friends in other distros, it's perhaps instructive to
compare Fedora with openSUSE (note that both use the same package
manager). Fedora took the aliasing-via-symlinks approach, and as I
understand it, was able to go from the traditional layout to merged
/usr everywhere in a single 6-month release cycle. openSUSE took a
package-by-package approach, and as far as I'm aware, is still working
on it.

    smcv