Bug#1050001: Unwinding directory aliasing [and 3 more messages]



Hi Ian,

On Sat, Aug 26, 2023 at 11:24:33AM +0100, Ian Jackson wrote:
> Helmut Grohne writes ("Bug#1050001: Unwinding directory aliasing"):
> > On Wed, Aug 23, 2023 at 05:04:36PM +0100, Ian Jackson wrote:
> > > And, the approach being taken very seriously privileges Debian itself,
> > > and those well-staffed derivatives able to do the necessary transition
> > > auditing (albeit, indeed, with tooling from Debian).  I am
> > > firmly ideologically opposed to such a tradeoff.
> > 
> > I have difficulties disagreeing with that. Getting this done is more
> > important to me though.
> 
> I have hoisted this to the start of the mail.  I think it is a hugely
> important point.
> 
> Debian is not simply a technical project trying to thread its way in a
> complicated world.  Debian is an ideological project.  At its best,
> Debian is the infrastructure that enables vast swathes of people to
> massively enhance their own technological autonomy.  Many of our most
> controversial decisions served this goal, and stood the test of time.
> 
> That's why *I'm* involved in Debian.  Our technical choices should
> serve those goals, always.
> 
> (To an extent, this divergence in goals may explain why at times our
> comments have been talking slightly past each other.)

I think I understand this and it resonates with me, but there are limits
to that. I don't think that Debian is the technological leader that you
perceive it to be. I hoped that other distributions would adopt the
multiarch directory layout to regain compatibility with Debian and none
did even though there is a clear, technical advantage to doing this.
Debian does not exist in isolation. It is dependent on a lot of
upstreams and in order for that relationship to be healthy, there needs
to be cooperation.

> If you want to think about it on more practical (or even, selfish)
> level, we want Debian to continue to be the preferred choice, when
> someone is choosing an upstream.  We didn't get where we are today by
> following bad technical decisions made by others.

In the grand scheme of things, the aliasing symlinks may be a suboptimal
technical approach. Please keep in mind though, that this change in
large parts is about people rather than something deeply technical.
After we stopped supporting booting without /usr being mounted in the
initramfs, the split between / and /usr was effectively random and stuff
was moving back and forth every so often and inconsistent between
different distributions or releases. I still remember iptables being an
annoying instance in this regard. So leaving technical aspects aside,
having large parts of the open source community agree on having those
aliasing symlinks already is a significant benefit even if it has
technical downsides.

In order to prefer Debian over something else, we want it to be similar
enough to make switching feasible while making it differ from others
enough to make switching worthwhile. Not having aliasing symlinks hardly
seems like an aspect that makes Debian more suitable to people. I guess
that you disagree with this and that is fine.

> This is indeed a plausible practical reason to do it the aliased way.
> From my point of view, it amounts to saying "everyone else has made
> this mistake, so to be compatible, we must too".

I wouldn't say it that way, but it comes quite close.

> But I think that seriously underestimates our influence.  Debian
> derivatives make up well over half of all Linux installations.
> They're the default basis for most CI images.  If we decide this was a
> failed experiment, then indeed there will be some pain for a while,
> but fairly soon people will stop making this assumption.

Quite evidently, we judge this differently. The two of us value the
benefit of each end state and the cost of getting there differently.
Therefore we arrive at different conclusions.

> I don't like the phrase "symlink farm" because it suggests that all,
> most, or even a substantial minority, of files have these symlinks.
> True, at the start, there will be at least a symlink allotment
> but I'm hoping that in the end it'll be a symlink flowerbed.

Let me suggest that this is wishful thinking. It's not only me saying
this, but you can read this from other responses as well. I encourage
you to use codesearch.d.n to see how your flowerbed really is a farm.
Let me give some examples to get you started:
 libreoffice -> /bin/python
 ghostscript, imagemagick, mesa -> /bin/env
 bind9, manpages, net-snmp, qtbase-opensource-src -> /bin/perl

So I see that we either get a symlink farm or we get to include a lot of
Debian-specific patches or we get to argue with a lot of upstreams about
something that may seem entirely pointless to them. In any of these
cases, I consider that a significant cost.
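To put a number on the farm-versus-flowerbed question, one could tally
which interpreter paths outside /usr show up in shebang lines. What
follows is only a minimal sketch: the function name and the sample input
are made up for illustration, it only looks at shebangs, and it is not
how codesearch.d.n works.

```python
from collections import Counter


def aliased_interpreters(first_lines):
    """Count shebang interpreters located directly below /bin or /sbin."""
    counts = Counter()
    for line in first_lines:
        if not line.startswith("#!"):
            continue
        fields = line[2:].split()
        if not fields:
            continue
        interpreter = fields[0]
        if interpreter.startswith(("/bin/", "/sbin/")):
            counts[interpreter] += 1
    return counts


# Hypothetical first lines of scripts, not real search results.
sample = [
    "#!/bin/python",
    "#! /bin/env perl",
    "#!/usr/bin/perl -w",
    "#!/bin/perl",
    "not a shebang",
]
print(aliased_interpreters(sample))
```

Every distinct key in the result is a path that would need a symlink, a
Debian-specific patch or an upstream change.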

> But pushing ahead won't lead to such a state.  As I say, I think
> people will keep introducing new references to files by their
> non-physical names, and we'll keep getting lossage, essentially
> forever.  (Adopting Simon's terminology.)

Simon and others also kept telling you that this is fine. The majority
of such references are read-only and all read-only references are fine.
Of the references used for updating, only a small fraction actually
causes loss events. When dealing with Debian binary packages, we can
systematically make them refer to physical locations using a debhelper
utility. The two of us judge the frequency of these events vastly
differently. Only time will tell.

> Or to put it another way, the delays to completion of this project
> have not been due to the political opposition.  They have been
> because the project encountered technical problems.  Problems whose
> existence was predicted by subject matter experts but dismissed at the
> time as FUD.  Problems which were apparently not regarded as real by the
> non-expert decisionmakers on the TC.  Problems which still remain in
> large part unresolved, albeit in some cases "mitigated".

That's one way to look at it. Again, my view on this differs
significantly. For one thing, we have disagreement on the end state of
this. My impression has always been that the disagreement on the end
state was involving a minority. A significant group seems to not see any
benefit in doing the /usr-merge (one way or another) and wants to not
stand in the way. Then we had CTTE decisions that basically said
aliasing is the end goal. The disagreement on how to get there (which is
not the one we are discussing here) seems more severe to me. In
particular, I've seen a lot of disagreement on what is a problem worth
fixing and what is not. For instance, usrmerge is formally rc-buggy in
bookworm and trixie. Quite obviously, we disagree on whether that's
worth fixing. So the bigger disagreement I see here - and that's more of
a political one - is about what is supported / worth fixing and what is
not. Those people in favour of doing the aliasing approach seem to be
fine with the problems we face now (or don't recognize them as problems
at all) and don't see a need for fixing them. From their point of view,
the transition is done and all that's left is an annoying moratorium.

> > > Aliasing is EBW, and "Only use canonical names" is not good enough
> > > ==================================================================
> > > 
> > > There is basically one underlying technical reason for preferring the
> > > un-aliased usrmerge approach: aliasing directories in this way leads
> > > to great complication in file management, especially in package
> > > management software and in individual packages.
> > 
> > I'm not sure I follow this argument precisely.
> 
> This argument is basically drawing together the common factor in the
> DEP-17 problem list.

And that's precisely why I don't understand your argument. All of the
DEP-17 problems are supposed to go away. So it appears as if you cite a
list of problems that I expect to not be problems for much longer as a
reason for changing the end goal.

> I think "package management" is the wrong term here.  It's not just
> our tools and packages that are affected.  All forms of operating
> system management are affected: anything that changes the software,
> and not just its configuration.
>
> Affected tooling includes not just our .debs, but also remote
> configuration management systems like Ansible, image construction
> (Dockerfiles), and 3rd-party software installation programs (including
> both 3rd-party .debs and 3rd-party script-based installation systems).

I don't follow the reasoning. Many of the tasks you'd carry out with
(wlog) ansible - even when updating files - will continue to work in the
aliasing layout. The reason that dpkg is more affected is that it has an
inventory of files and reasons about their ownership in terms of
packages. That's not how any kind of configuration management operates.
If you just "make install" something, chances are that it'll just work
with an aliasing layout even when installing with --prefix=/. I continue
to argue that the problems we are seeing are quite specific to dpkg in
large parts.
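To illustrate why a tool with a file inventory is affected while a plain
"make install" is not, here is a toy model. It is a sketch under stated
assumptions and not dpkg's actual data structures: the set of aliased
top-level directories, the package names and the helper names are all
chosen for illustration.

```python
import posixpath

# Assumption: top-level directories that are symlinks into /usr.
ALIASED_TOP_LEVEL = {"bin", "sbin", "lib", "lib32", "lib64", "libx32"}


def physical(path):
    """Map an absolute path to its physical location on an aliased layout."""
    normalized = posixpath.normpath(path)
    parts = normalized.split("/")
    # parts[0] is "" for an absolute path, parts[1] the top-level directory.
    if len(parts) > 2 and parts[1] in ALIASED_TOP_LEVEL:
        return "/usr" + normalized
    return normalized


def conflicts(inventory, key=lambda path: path):
    """Return the paths claimed by more than one package."""
    owners = {}
    for package, paths in inventory.items():
        for path in paths:
            owners.setdefault(key(path), set()).add(package)
    return {path: pkgs for path, pkgs in owners.items() if len(pkgs) > 1}


# Hypothetical inventory: two packages shipping the same file on disk.
inventory = {
    "pkg-a": ["/bin/foo"],
    "pkg-b": ["/usr/bin/foo"],
}
print(conflicts(inventory))                # plain strings: no overlap seen
print(conflicts(inventory, key=physical))  # both packages own /usr/bin/foo
```

A tool that merely writes files never performs the first comparison, so
it has nothing to get wrong here.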

> And yes, actual *end users* (especially of something like Ubuntu) are
> largely unaffected because they don't do much operating system
> management.  Regarding Ubuntu specifically: Ubuntu's approach to 3rd
> party software during upgrades is (for very plausible reasons) quite
> sledgehammery.

I'm sorry, but without evidence, you can argue the exact opposite in
roughly the same way. Ubuntu has way more 3rd party software as PPAs and
therefore they're way more affected. They also enabled it way earlier
and therefore more of their users are affected.

> Well, this is a key part of the problem.  IMO we need to be able to
> state simple and clear rules, which when followed, will result in
> reliable construction of working software systems.

This hits more the core of the matter. Simon, myself and others argue
that due to the aliasing, it does not matter whether you refer to a file
by its physical or aliased path in most cases. The one known exception
is package management. And there the simple rule is to always use /usr.
To me, this feels simple enough.

> We build our systems by building on layers of abstractions.  Layers of
> abstraction allow us to narrow the scope of our consideration.  In our
> systems, the filesystem is a pervasive abstraction.  A filesystem with
> directory aliasing is a much more complicated and subtle abstraction.

It is, which is why I see us having and mitigating problems. The DEP-17 plan
to deal with this is not to remove aliasing from the filesystem, but
from the affected components (package management). Once that has
happened, we return to a less complicated abstraction.

> I would be much more convinced that all of this isn't a problem, if
> there existed, and had been formally adopted (eg by existing in some
> manual somewhere) a short and clear set of rules about what is and
> isn't allowed - not just rules for us within Debian, but rules for
> everyone, everywhere, referring to and modifying files.

It appears to me that the empty set of rules (outside packaging) is too
simple for you to consider.

> I think one reason that hasn't been done is that it's hard.  Another
> is perhaps that some of the rules required for successful and reliable
> operation contradict some of the ostensible goals of the aliasing.

This again is the disagreement of what is considered to be supported.
When you say reliable, you have something much different in mind than
the people who want that aliasing approach. To them, coming up with that
empty set of rules does not feel hard at all, but to you that set does
not yield the properties you seek and you therefore dismiss it.

> > We already have the Debian Usr Merge Analysis Tool available at
> > https://salsa.debian.org/helmutg/dumat and its output at
> > https://subdivi.de/~helmut/dumat.yaml. As explained on d-d@l.d.o, I want
> > to turn those findings into automatic RC bugs. Does that alleviate your
> > concerns to some extent?
> 
> That's certainly helpful for the transition now.  Are we going to
> maintain this or something like it indefinitely ?

I don't think so. Once packages have moved to /usr, we can simply have a
lintian-checkable policy of not shipping stuff in aliased paths. What's
the benefit of continuing this crazy approach then?
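Such a policy check can be very small. The following is a minimal
sketch, not an actual lintian check: the list of aliased prefixes is an
assumption and the shipped paths are a made-up example.

```python
# Assumption: directories that are aliased to their /usr counterparts.
ALIASED_PREFIXES = ("/bin/", "/sbin/", "/lib/", "/lib32/", "/lib64/", "/libx32/")


def aliased_paths(shipped):
    """Return the shipped paths located below an aliased directory."""
    return [path for path in shipped if path.startswith(ALIASED_PREFIXES)]


# Hypothetical file list of a binary package.
shipped = [
    "/usr/bin/foo",
    "/lib/systemd/system/foo.service",
    "/etc/foo.conf",
]
violations = aliased_paths(shipped)
for path in violations:
    print("ships file in aliased location:", path)
```

Unlike dumat, this needs no knowledge of other packages or of upgrade
history. It only looks at one package's own file list.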

> But I don't think it addresses the point I intended.  We need to be
> able to spot when a user installs a .deb, that they got from
> "somewhere", when a directory-aliasing thing is going to go wrong.

This is the disagreement of what we consider supported.

> But, a goal of the directory aliasing is to be compatible with other
> systems, so that 3rd party software is *allowed* to refer to, and,
> presumably, ship files in /bin (because "there's no difference now,
> right?").

If they install without dpkg, yes. If they install with dpkg, they
better ship files below /usr (even if this is a very late step in
package construction). If they really ship in /bin, it is unsupported
but likely still works for some time (until dpkg understands directory
vs symlink conflicts).

> Our downstreams (of all kinds) are more likely to use other
> tooling (of all kinds).

Tooling that does not have some kind of inventory of the filesystem is
mostly unaffected. Given the wide adoption of the aliasing approach, I
think it is fair to ask for evidence at this time.

> We can get rid of the former bug class simply by moving everything to
> /usr, once.  We'll *experience* those bugs now, but if we do it as
> part of a coordinated programme we can have tooling to spot it.  When
> that transition is complete, those bugs won't arise any more.

The really good thing about this is that DEP-17 moves us closer to what
you want as in effect it now is about moving everything to /usr (just
without removing the aliasing links) and indeed it inherits the property
of removing entire classes of bugs.

> As I say, I don't think the directory aliasing situation is ever going
> to be finished.  We can revert it, or have it forever be weird.

Given the state of discussion, I think both of us understand quite
precisely what we disagree about. We don't seem to be moving beyond this
state of understanding one another. How can we move forward?

If you still see Debian reverting to a non-aliased layout (and you seem
to), I see the following possible steps forward:

 * Become more precise about what your layout looks like. Our
   exchange makes it clear that we're not exactly sure whether it is
   more a farm or a flower bed. Doing quantitative analysis can help
   here. Which of the paths need symlinks? Which do not? How many
   upstreams need to change something to become compatible with the
   suggested layout?

 * Gauge the problems induced by moving to that layout. For instance,
   cron temporarily dropped /bin from its $PATH default. Can we classify
   (in a similar way to DEP-17) the problems that we run into by
   reverting? Can we automate that and do quantitative analysis?

 * Refine the migration path. I expect that you want to base your
   migration on dpkg-fsys-usrunmess. This tool has significant
   limitations and is known to be a bit fragile.

 * At least systemd upstream (and probably others) have declared that
   they do not wish to support the split layout. This extends to the
   current systemd maintainers in Debian. How do you see maintaining the
   support that is pending to be removed upstream?

Please don't jump into quick replies to these items. My understanding is
that each of these steps requires significant effort. If you have a
ready answer here, we likely have more misunderstandings.

From a CTTE pov, I think we should close this issue for now as not being
actionable. While you succeeded in explaining the existence of technical
advantages of your approach, too many pieces seem missing at this stage
to be able to seriously consider this. You will understand that we
have to require a more detailed transition plan on this matter as the
previous attempt based on trust didn't work as well as was hoped.

I also note that I do not see DEP-17 negatively affecting your view,
because it also moves files from / to /usr. It also adds a number of
mitigations that may become unnecessary in your view, but those are
temporary and will need to be reverted later anyway. As such, my
expectation is that moving from where we are to your idea is not any
easier than moving from a post-DEP-17 state. Therefore, I do not see any
need to delay DEP-17 work.

Helmut

