Bug#1050001: Unwinding the directory aliasing mistake



Hi Ian,

While I have a CTTE hat, this response should be considered a
Freexian/personal response rather than an official CTTE response.

On Fri, Aug 18, 2023 at 07:57:14AM +0100, Ian Jackson wrote:
> Thanks to work funded by Freexian we now have a list of many of these
> malfunctions[2] (although new ones keep popping up (eg [3]).

Let me point out that the DEP17 document referenced as [3] now has a
proposal section. Let me also point out that as a result of this work,
we've started systematically addressing limitations e.g.:

https://udd.debian.org/cgi-bin/bts-usertags.cgi?user=debian-qa%40lists.debian.org&tag=fileconflict
https://udd.debian.org/cgi-bin/bts-usertags.cgi?user=helmutg%40debian.org&tag=dep17p2
https://udd.debian.org/cgi-bin/bts-usertags.cgi?user=helmutg%40debian.org&tag=dep17p6
https://salsa.debian.org/systemd-team/systemd/-/merge_requests/204
https://salsa.debian.org/systemd-team/systemd/-/merge_requests/203
https://salsa.debian.org/systemd-team/systemd/-/merge_requests/205
https://salsa.debian.org/installer-team/debootstrap/-/merge_requests/93
https://salsa.debian.org/installer-team/debootstrap/-/merge_requests/96
https://salsa.debian.org/installer-team/debian-installer/-/merge_requests/39

> The most serious malfunction is the disappearing files bug, which has
> prompted a seriously inconvenient file move moratoriam which has now
> been in place for many years.  After 4 years, we still don't have a
> clear path forward to resolving that and other problems [4].

I disagree that this is the most serious malfunction. It was the first
observed one, but the resulting moratorium has prevented us from
observing others. As we start moving files, we'll be observing those
others.

> It should now be clear to most observers that the decision to go down
> this path was a mistake.[5]

I was not part of that decision back then and I feel somewhat similarly,
but keep in mind that what we know now differs from what was known back
then. I thought that I knew a lot about the /usr-merge back then, and in
working on it, I learn again and again that I still don't see the whole
picture. It appears to me that you call it a mistake now as we progress
with seeing that picture and finding a way out.

> Fixing the mistake
> 
> I think we can probably fix it by backing out this change, and then
> doing usrmerge the traditional Debian way by making changes to
> debhelper, so that we move the files package by package, in the .debs.

There are two sides to this. One is that backing out the change is
something that sounds easy but in reality is not. This keeps popping up
as a suggestion and I've even suggested it myself about a year ago, but
to this date, we don't have a reliable way to unmerge a system. So if we
consider backing it out, we must consider the amount of work our users
have to go through and that's a huge cost.

The other side is that there may have been better ways of doing the
migration. It's an academic question that I've given thought to as
well. While there is something to learn from here, this doesn't really
give us more options as the cost of reversion is huge.

> But given the atmosphere, such a proposal would need clear political
> backing from the TC.  It wouldn't be worth anyone's time or emotional
> energy to attempt to make a concrete and detailed plan if the TC is
> not minded to consider it.

I disagree. While the atmosphere has been hot at times and we had to
invoke DAM more than once, I also think that progress is happening. The
d-devel@l.d.o discussion has resulted in quite a few consensus items
already and that's recorded in the latest DEP17 draft. So what you think
not to be worth anyone's time is actually happening right now.

> So I would like to pose the following questions for the TC:
> 
>  * Given the information that the Committee has now, that it didn't
>    have in 2019, does the Commitee agree that the decision in 2019
>    was questionable ?

All decisions are made with limited information. I don't think this adds
any value.

>  * Is the Committee open to the idea of a plan being developed which
>    reverses the mistake, rather than merely "mitigating" the problems ?
>    Would such a plan be considered on its merits ?

Anyone is entitled to propose a way forward. Such work is considered
design work and therefore not carried out by the CTTE itself. I caution
though that the amount of analysis work to be done here is non-trivial.
For one thing, we'd need a convincing argument that there is a
non-painful method to reverse the layout. For another, we need an
understanding of how the resulting breakage to packages is being dealt
with. For instance, cron no longer has /bin in its search path
(#1042894, thanks to pabs for pointing this out). And then we need to
compare it with DEP17 to see which of the plans is favourable.
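
To make that kind of breakage concrete, here is a minimal Python sketch.
It is not taken from cron: the search path "/usr/bin", the helper name
find_command and the file name "tool" are assumptions for illustration
only. It shows that a search path without /bin still finds a program on
an aliased system, but not on an unmerged one where the program lives
in /bin.

```python
import os


def find_command(root, search_path, name):
    """Look up name in each entry of search_path, resolved below root.

    Returns the path as seen from inside root, or None if not found.
    """
    for entry in search_path.split(":"):
        candidate = os.path.join(root, entry.lstrip("/"), name)
        # Follows symlinks, so an aliased /bin -> usr/bin is transparent.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return "/" + os.path.join(entry.strip("/"), name)
    return None
```

On an unmerged tree, find_command(root, "/usr/bin", "tool") only
succeeds once the program has actually been moved to /usr/bin, or once
/bin is added back to the search path.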

> I appreciate that I'm asking the TC to revisit a previous and
> controversial decision.  That this isn't a step we should take
> lightly, but I think in this case it's clearly warranted.

My feeling is that this is not worth the time, but I am totally biased
due to being the main driver of DEP17. From my point of view, the question
is less whether we can revisit this. The question is whether there are
convincing arguments that the resulting cost is warranted.

> Timeline for a fix
> 
> Any plan to solve this would probably take a few releases (ie, many
> more years) to sort out, sadly.  We would probably need to wait with
> shipping packages that move files in a conventional-for-Debian way,
> until we have our userbase's systems restored to a non-aliased state.
> So I think we would need trixie to undo the aliasing, and in trixie+1
> we could move the files.

This is a huge problem. Support for split /usr is deprecated in systemd
and scheduled for removal at the end of 2023. So in addition to
explaining how the reversion works, you also need to explain how to
support split /usr in systemd as upstream is unwilling to do so.

To me you are explaining that the cost of reversion is even bigger than
I originally anticipated.

> This delay is unfortunate, but - unlike pressing forward - it has
> relatively low levels of risk, most of which is front-loaded.  I think
> we can develop tools which will reliably de-alias a system; and, once
> users' systems have been de-aliased, the actual file movement is
> routine work that Debian knows how to do.

The two of us assess the involved risks quite differently. The usrmerge
package has been part of Debian for multiple releases, but it saw a
number of bug reports in the last year. This indicates that the kind of
surgery it performs is non-trivial. Conversely, dpkg-fsys-usrunmess has
a lot more limitations and risks.
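
For reference, whether a given system is aliased can be checked with a
few lines of Python. This is only a sketch: the list of directory names
is an assumption about the usual merged-/usr aliases, and aliased_dirs
is a hypothetical helper, not part of usrmerge or dpkg. It only detects
the aliasing symlinks; it says nothing about how to undo them safely.

```python
import os

# Top-level names that merged-/usr turns into symlinks (assumed list).
ALIAS_CANDIDATES = ("bin", "sbin", "lib", "lib32", "lib64", "libx32")


def aliased_dirs(root="/"):
    """Return the top-level names below root that are symlinks to usr/<name>."""
    found = []
    for name in ALIAS_CANDIDATES:
        path = os.path.join(root, name)
        if os.path.islink(path):
            target = os.readlink(path)
            # Accept both relative ("usr/bin") and absolute ("/usr/bin") targets.
            if target.lstrip("/") == os.path.join("usr", name):
                found.append(name)
    return found
```

Calling aliased_dirs() inspects the running system, while passing
another root allows checking a chroot.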

On the flip side, our recent research has gained us quite a good
understanding of the kinds of problems we expect from aliasing. We may
still find new problem categories, which is an unknown risk indeed. The
problem categories that we know are now being analyzed:
https://subdivi.de/~helmut/dumat.yaml (updated 4 times a day)

At this time, your argument for "low level of risk" is unconvincing to
me.

> I can see that there could possibly be ways of going straight to the
> desired state (un-aliased systems but with nothing much in /), but I
> haven't given them much thought them through because they seem risky
> to me and involve some grievous hacks.

I have given them much thought. I agree that there is risk involved and
that they require much reliance on implementation-defined behavior of
dpkg as well as ugly hacks. On the flip side, I argue that the amount of
hacks we need to employ (according to dumat.yaml) is fairly limited and
that the time we need to cope with them (remove them after forky) also
is. So the appeal of this way forward is that it allows most maintainers
to not pay much attention (protecting your mental health!) and gets us
done sooner rather than later.

You see, when I started DEP17, I argued for changing dpkg. The resulting
discussion made me reconsider that and I've moved towards working on the
most consensual opinion. This is what I am doing now. If you can come up
with a plan and argue for it on d-devel in a way that achieves even minor
consensus, I'm happy with helping analyze it. Having more options to
choose from can only be for the better. I won't be initiating a
reversion-based approach on my own though.

> Protecting my mental health
> 
> I will try to avoid regularly reading this thread.  I hope that now
> that I have made the suggestion, others will be able to carry the
> conversation.  I will be configuring my mail client to disregard my
> personal copies of messages sent to this bug; when I want to read
> the thread I will look at the mailing list.

This is unfortunate. Unless you actively start working on your approach,
I don't see it as actionable in any way to the CTTE.

> Also, if you disagree with my decision to raise this now, please don't
> email me.  If you feel I have overstepped a boundary, please contact
> the Community Team or DAM.

It feels rather strange to submit this as a bug report when you do not
want feedback. A mail to the list is easier to not respond to, but a bug
needs closure one way or another eventually.

> If the Comittee gives a positive indication, I will be happy to
> re-engage at the level of technical work to try to make it happen.

I'm not sure whether what I write counts as positive. I guess I am
discouraging you (with technical reasons) while still being open to
consideration.

Helmut

