
Re: DEP 17: Improve support for directory aliasing in dpkg



Hi!

On Sat, 2023-04-22 at 10:27:26 +0200, Helmut Grohne wrote:
> On Sat, Apr 08, 2023 at 04:35:25AM +0200, Guillem Jover wrote:
> > Let's also get back to the very basics. dpkg manages objects shipped
> > in binary packages, on the filesystem. It assumes this managing role in
> > exclusivity, it will for example overwrite unmanaged files. It preserves
> > admin changes with interfaces specifically provided for that (diversions,
> > statoverrides, conffile changes) or the unfortunate symlink redirects.
> > These shipped objects define the filesystem layout (not the other way
> > around). Due to the missing fsys metadata, where it does not have all
> > such metadata at hand when necessary (it might only have the one for
> > the currently unpacked .deb), it might use heuristics or check the
> > filesystem for such metadata, because it does not have anything else,
> > but that should not be taken to mean that the filesystem is the source
> > of truth, as most of those will be unnecessary once it has such
> > metadata at hand.
> 
> This captures an insight I previously didn't have in that clarity and
> that I find agreeable conceptually.
> 
> > So the reason this proposal is still conceptually wrong is manifold:
> > 
> > * dpkg cannot safely and atomically perform such switches (and I don't
> >   see it ever being able to portably do so, so I don't see ever
> >   supporting that).
> 
> I agree, but the proposal also does not ask dpkg to perform such
> switches, so I kinda fail to see how this is a relevant argument.

It is relevant because it affects the end state, and what solutions
are going to be appropriate then. See below.

This might perhaps also have been a source of misunderstanding. My
thinking is not focused solely on this particular instance, but on how
this interacts with other current or long term behavior and upcoming
features, and on what this would all look like in the end.

> > * No package ships those symlinks (and none should! as that would
> >   currently imply having the same pathname contain different file types
> >   on the same system, introducing ordering issues and file type
> >   conflicts).
> 
> I disagree with this argument on two levels. For one thing, I think that
> the transition only is complete once these symlinks are shipped in a
> package. In particular, that notion of complete likely encompasses that
> no aliasing occurs anymore as all aliased files have been moved to their
> canonical location somehow (<- and this likely will be a quite difficult
> thing to do). For another, no package actually ships those symlinks now.
> They are created behind dpkg's back in some postinst. This is
> unfortunate and I agree with Simon Richter that this kinda is a policy
> violation, but at this time, it is an aspect we have to deal with
> whether we want to or not.
> 
> I suspect that you disagree with the notion that we have to deal with
> this situation, which I consider to be our fundamental disagreement.

I don't think we disagree (?), I probably didn't express myself clearly.
The fact that no package ships those symlinks *is* and *has* been a
problem, and as I've been saying all along, shipping them will be the
only correct way to let dpkg know whether there will be aliasing in play.
At the same time, what I was trying to say is that we cannot ship those
symlinks right now: even though dpkg does not yet track fsys metadata
(it should, and that is one requirement to be able to be
aliasing-aware), it would be an implicit file type conflict, which
dpkg (currently) would not know about or be able to do anything
meaningful with, and which might make unpacks fail in the future
(depending on the ordering of packages being unpacked).

Coming now back to the atomic and safe switches, and the ordering,
as I think I've mentioned elsewhere, dpkg should eventually be made
aliasing-aware, in that it should know about all fsys file types and
be able to detect these cases during unpack (once these symlinks are
properly shipped in a package). But given these mentioned constraints
it cannot be made to support (as in accept) unpacking files inside
aliased directories (it should be able to unpack the symlinks creating
those aliased directories though!).

There are several reasons for that:

 * One is that the expected behavior for file types tracked by dpkg
   is to switch their file type if this is data.tar initiated and the
   operation can be done because the dirs are empty (so as to get rid
   of these dpkg-maintscript-helper parts), and otherwise to abort.
   The symlink←→dir preservation behavior should only be applied (if
   at all) for admin initiated changes on the fsys.
 * Another is that dpkg would need to allow those pathnames to have at
   the same time two sets of metadata attributes (mode, perms, xattrs,
   file type, and for one of them a symlink target), which is a
   terrible interface.
 * But more importantly this causes ordering issues and unpredictability.
   If there is a package A shipping a directory and package B shipping
   an aliasing symlink on the same pathname, and package C shipping also
   contents within that directory, and we have established that dpkg
   cannot always safely perform such file type switch, then depending on
   the unpack order and whether the "directory" is empty or not, dpkg
   would be able to perform the file type switch or not, and you might
   end up with files appearing in two "directories" and with an
   aliased directory or not. This is also terrible behavior. And that's
   why I say dpkg should simply refuse that, and that it is something
   that should not be supported.
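
The ordering problem in the last point can be made concrete with a toy
model. What follows is only a sketch, not dpkg code: the package names
A, B and C come from the paragraph above, but the pathnames, the
in-memory "filesystem" and the rule "a directory may only be switched
to a symlink while it is empty" are simplifying assumptions made for
illustration.

```python
from itertools import permutations

# Hypothetical packages: each maps a pathname to the object it ships.
PACKAGES = {
    "A": {"/bin": ("dir", None)},
    "B": {"/bin": ("symlink", "/usr/bin")},
    "C": {"/bin/tool": ("file", None)},
}


def is_empty(fs, directory):
    """True if no known pathname lives below the given directory."""
    return not any(p.startswith(directory + "/") for p in fs)


def unpack(fs, contents):
    """Apply one package to the toy filesystem (assumed rules, not dpkg's)."""
    for path, (kind, target) in contents.items():
        current = fs.get(path, (None, None))[0]
        if kind == "dir":
            # An existing symlink is preserved instead of becoming a dir.
            if current != "symlink":
                fs[path] = ("dir", None)
        elif kind == "symlink":
            # Assumed rule: a non-empty directory cannot be switched.
            if current == "dir" and not is_empty(fs, path):
                continue
            fs[path] = ("symlink", target)
        else:
            parent, _, name = path.rpartition("/")
            # A missing parent directory is created implicitly.
            parent_kind, parent_target = fs.setdefault(parent, ("dir", None))
            if parent_kind == "symlink":
                # The file lands wherever the aliasing symlink points.
                path = f"{parent_target}/{name}"
            fs[path] = ("file", None)


def final_state(order):
    """Unpack the packages in the given order, starting from a bare /usr/bin."""
    fs = {"/usr/bin": ("dir", None)}
    for name in order:
        unpack(fs, PACKAGES[name])
    return fs


def outcomes():
    """Final toy filesystem for every possible unpack order."""
    return {order: final_state(order) for order in permutations("ABC")}


if __name__ == "__main__":
    for order, fs in outcomes().items():
        print("".join(order), sorted(fs.items()))
```

In this model, every order that unpacks B before C ends with /bin as a
symlink and the file under /usr/bin, while every order that unpacks C
before B ends with /bin as a real directory holding the file. The same
three packages give two different layouts depending only on ordering.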

> > * This introduces a series of commands to let dpkg know that a
> >   filesystem change that was not shipped in any .deb (even though that
> >   should have been the way to do it), has been done, which:
> >   - Switches the source of truth from the .deb to the fsys.
> 
> While this is correct on some level, the aim of this change is to put
> that truth back into dpkg of course.

Sure, the problem is the price that will need to be paid to get there,
in terms of problematic interfaces or behavior and what kind of
workarounds or hacks that will entail, and for how long.

> >   - Confuses admin initiated changes from distro initiated ones.
> 
> I think we already do this with dpkg-divert, dpkg-statoverride and other
> tools. While this may not be nice, it certainly has prior art and is
> consistent with how we have been doing things in the past.

dpkg-divert distinguishes between local and package level changes. It
is true that dpkg-statoverride does not (currently) have that
distinction, although it is primarily an admin tool, and TBH I don't
think it makes much sense to support something like declarative
package statoverrides once we can ship fsys metadata (perhaps
conditional one though).

> > * Wants to be a generic change but it is really targeted to this
> >   specific mess. We have been doing similar aliasing transitions for
> >   many doc dirs, by stopping shipping files within, shipping that
> >   pathname as a symlink and then switching the directories to symlinks
> >   to match (via the dpkg-maintscript-helper hack because we miss fsys
> >   metadata). This means we'd need to then register all these directories
> >   too? Meh.
> 
> I would love to agree with this, but I believe that this ship has
> sailed. This likely is part of our fundamental disagreement.

The comment was not focused on how this could have been done, but on
the fact that this is a common operation we do, which would need to
get the same treatment, and that seems bad.

> > * This information can get out of sync with reality, as it adds an
> >   additional and unconnected with anything source of truth, that dpkg
> >   cannot do anything about if it diverges (in contrast to diversions
> >   or statoverrides f.ex.). This can never happen when that information
> >   comes from the real source of truth (the fsys metadata via the .deb).
> 
> I have difficulties accurately capturing the argument. The problem of
> information getting out of sync with reality should affect every aspect
> of dpkg and indeed, that kinda is the status quo where upgrades can
> lose files, because dpkg has an incomplete picture of reality. The aim
> of this change is to allow us to re-sync the status quo into dpkg. My
> view is that dpkg's information presently is out of sync with reality
> and the proposed change partially fixes that.

The current problem stems from both dpkg lacking fsys metadata and
Debian holding dpkg wrong in an unsupported way, but where ideally
both of these will eventually go away (?). My objection was that the
proposal introduces a mechanism which makes things worse because it
adds more information sources that can/will get out of sync.

> >   [ As an aside, I think ideally eventually nothing distro provided should
> >     be allowed to be installed within an aliased dir, and dpkg should
> >     eventually just error out in those cases, which eventually would get
> >     rid of the aliasing problems and any such complexity (I'm not sure how
> >     or when that would be feasible though, but obviously in Debian at
> >     least not until nothing ships files there). ]
> 
> It seems to me that this is something everyone agrees on. So our
> disagreement resides in the way to get there rather than where to get
> to.

If that's the case, then great. My impression though is that some
people expect dpkg will be able to unpack content within aliased
directories (?), which I don't see happening for the reasons I
mentioned above. This will imply that you cannot install any old
package that ships content there, which might be unexpected, but I
don't see any other sane way to handle this. :/

> > So this still looks like a terrible interface, like it did at the time
> > it was discarded; founded on a hack, an interface that seems to want to
> > be kind of a file-type override but it cannot be, and cannot even
> > properly act as record tracker, etc…
> 
> I agree that in a perfect world, we would not need this. Let me circle
> back to our fundamental disagreement.
> 
> My impression is that at this time basically everyone except you agrees
> that we have to deal with the aliasing problems that have been rolled
> out to users and will be forced in bookworm. I believe that this is the
> state that we have to consider as starting point and that we cannot
> magically turn this transition back to perform it in a better way. And
> indeed, I believe that there would have been a better way[1] that no
> longer is available to us.

I think I've mentioned before multiple times, that dpkg should
eventually be able to be aliasing-aware. I think I've also mentioned
that to get there we need to move all files out of aliased directories,
otherwise several of the changes required for that "support" might not
be even able to be deployed.

> On the other hand, my impression is that you continue to see the
> transition as fundamentally broken and in a state that we cannot work
> from. You appear to believe that if we want to do it, we must start over
> in a better way. That better way must not cause aliasing problems to
> dpkg.

Well, it should be obvious by now that this so-called transition is
fundamentally broken, and I also see that there is no magic simple and
clean way to get out of it. And every way out is through further
complexity, workarounds or badness. Of course given the corner Debian
has painted itself into, there needs to be a way out, my objection is
about what kind of price to pay for that.

> > I thought it would be clear that if there is stuff that depends on
> > any of this kind of changes to dpkg, relying on those changes in
> > Debian would not be possible until after trixie+1. Of course there is
> > always the route to further pile up over the Jenga tower of hacks,
> > by for example adding huge amounts of Pre-Depends…
> 
> I agree that we probably will deal with this until at least trixie+1.
> This is precisely why I would like to have a plan to finish it sooner
> rather than later.

Also note that even if the way out were through some dpkg
workaround, which would even get backported to bookworm, AIUI upgrades
are never guaranteed to start from the last point release, so that
would not seem to help much anyway.


So coming back to workarounds and hacks, I'm finding the diversions
stuff to be rather bad, as it requires bypassing an explicit dpkg
refusal to deal with diverted directories, so it's going into further
unsupported territory. :/ My other concern is that this might end up
leaving unsupported directory diversions around, which could break dpkg
if it starts refusing to work on them during unpack, not just during
diversion additions.

I did a PoC (untested) implementation for the partial upgrade deletion
prevention workaround to see how bad that might look, and in
comparison to the diverted stuff it is bad but not as bad. As I
mentioned in our talks, this needs to imply emitting a warning,
because otherwise this might end up as relied-on behavior that should
not be supported, and it would be a temporary hack for Debian and
derivatives until things have moved out.

  https://git.hadrons.org/git/debian/dpkg/dpkg.git/log/?h=pu/aliasing-workaround

Also, in case there is any confusion, this is a _partial_ workaround
that does not cover much of the other badness, such as file overwrites
and disappearances in other stages of the package life-cycle, in
other tools from the dpkg suite, from local packages, or from admin
initiated changes via supported interfaces.
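
To illustrate the kind of disappearance meant here, a minimal sketch
follows. It is not the code on the branch above, and the check it uses
(do the old and new pathname refer to the same file?) is only an
assumption about what a deletion prevention check could look like. The
directory layout and the names is_aliased and demo are hypothetical.

```python
import os
import tempfile


def is_aliased(old_path, new_path):
    """True if both pathnames exist and refer to the same file."""
    try:
        return os.path.samefile(old_path, new_path)
    except FileNotFoundError:
        return False


def demo():
    """Show how removing the 'old' pathname deletes the 'new' file."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical aliased layout: <root>/bin -> usr/bin
        os.makedirs(os.path.join(root, "usr", "bin"))
        os.symlink("usr/bin", os.path.join(root, "bin"))

        new = os.path.join(root, "usr", "bin", "tool")
        old = os.path.join(root, "bin", "tool")
        with open(new, "w") as handle:
            handle.write("payload")

        aliased = is_aliased(old, new)
        # A naive cleanup of the pathname the old version shipped:
        os.unlink(old)
        survived = os.path.exists(new)
        return aliased, survived


if __name__ == "__main__":
    print(demo())
```

Because the two pathnames are the same file seen through the aliasing
symlink, removing the old one also removes the new one, which is why a
check of this sort would have to run before any such removal.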

I still think all the proposed workarounds are pretty terrible, TBH.

Thanks,
Guillem

