
Re: DEP 17: Improve support for directory aliasing in dpkg



Hi,

On 21.04.23 15:03, Raphael Hertzog wrote:

Here you are considering all files, but for the purpose of our issue,
we can restrict ourselves to the directories known by dpkg. We really
only care about directories that have been turned into symlinks (or
packaged symlinks that are pointing to directories). That's a much lower
number of paths that we would have to check.

Having all paths in the database is cheaper, because doubling the number of paths adds only log_{262144} 2 = 1/18 to the (average) lookup cost, and we do significantly more lookups than inserts.
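To make the arithmetic concrete, here is a minimal sketch. It assumes lookup cost grows like log_B(n) for a balanced tree with branching factor B = 262144 = 2^18; that cost model is an interpretation of the figure quoted above, and the function name is made up for illustration.

```python
import math

BRANCHING = 262144  # 2**18, the logarithm base used in the estimate above


def lookup_depth(n_paths: int, branching: int = BRANCHING) -> float:
    """Idealised lookup cost: depth of a balanced tree holding n_paths keys."""
    return math.log(n_paths, branching)


# Doubling the number of stored paths adds log_B(2) to the depth,
# independent of how many paths there were to begin with.
extra = lookup_depth(2_000_000) - lookup_depth(1_000_000)
```

Under this model `extra` is log_{262144} 2 = 1/18, so the database would have to double eighteen times before a lookup needs one more level.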

The other problem is that we do not know all of these paths, because the file system has been modified externally without informing dpkg. The closest thing we can do is scan everything that is supposed to be a directory.
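A rough sketch of such a scan, assuming we already have the list of paths that the package database believes are directories. The function name and the `root` parameter are hypothetical and not taken from dpkg; real code would also have to deal with races and permission errors.

```python
import os


def find_aliased_dirs(expected_dirs, root="/"):
    """Map each expected directory that is really a symlink to its link target.

    expected_dirs: absolute paths that the database believes are directories.
    root: file system root to check against (useful for chroots and tests).
    """
    aliases = {}
    for path in expected_dirs:
        full = os.path.join(root, path.lstrip("/"))
        # islink() inspects the path itself; isdir() follows the link, so
        # both being true means "a symlink that points at a directory".
        if os.path.islink(full) and os.path.isdir(full):
            aliases[path] = os.readlink(full)
    return aliases
```

This only inspects the final component of each path, so it relies on parent directories being in the list as well. It also has to touch every expected directory, which is why it is only affordable as a one-off operation.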

As an additional complication, dpkg silently resolves symlink-vs-directory conflicts in favour of the directory (which happens seldom, but third-party tools sometimes generate broken packages like that, so it is useful to keep it that way).

Thus this time-consuming operation would be done once, the first
time that the updated dpkg starts and when /var/lib/dpkg/aliases
does not yet exist.

That is still a public interface. :/

In any case, now that you have a database of aliases, you can do the other
modifications to detect conflicting files and avoid file losses.

How does that sound?

Alas, that is the easy part. My branch already implements most of that, including the logic to trigger a reload after a maintainer script if the stat information changed (like for diversions).
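The reload trigger can be sketched roughly as follows. This is not the code from the branch; which stat fields to compare is an assumption, and the names are invented for illustration.

```python
import os


def stat_signature(path):
    """Return (device, inode, mtime_ns, size) for path, or None if it is absent."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino, st.st_mtime_ns, st.st_size)


class ReloadGuard:
    """Remembers a file's stat signature and reports when it has changed."""

    def __init__(self, path):
        self.path = path
        self.signature = stat_signature(path)

    def changed(self):
        """True if the file changed since the last check; updates the snapshot."""
        current = stat_signature(self.path)
        if current != self.signature:
            self.signature = current
            return True
        return False
```

The idea is to call `changed()` after each maintainer script returns and to re-read the database only when it reports `True`, so the common case costs a single stat call.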

The proposal I made above is not a real database in the sense that we
don't record what was shipped by the .deb when we installed the files...
it's rather the opposite, it analyzes the system to detect possible
conflicts with dpkg's view of the system.

That is going to be slow, and it changes dpkg's public interface to a more complex one where our tight loop that handles unpacking files gains additional error states.

It can be seen as complementary to it. In any case, I don't see how
implementing metadata tracking would help to solve the problem that we
have today. dpkg would know that all .deb have /bin as a directory and
not as a symlink, and it would be able to conclude that the directory
has been replaced by a symlink by something external, but that's it.

It should still accept that replacement and do its best to work with it.

That means there are two sources of truth: packages and the file system. We then need a (lowercase) policy on how to resolve conflicts between these, which becomes our public interface, and thus part of (uppercase) Policy.

I'd also single out the usrmerge transition here. This package operates in a grey area of Policy where technically a grave bug is warranted because it manipulates files belonging to other packages without going through one of the approved interfaces, but since we accidentally shipped that, we need to deal with it now. That does not mean this is acceptable; it just wasn't enforced.

To me it would also be acceptable to just hardcode "if usrmerge or usr-is-merged is installed, take over the known aliases and silently discard that package", then salt the earth in dak that no package of this name can ever be shipped again until bookworm+3.

That would be significantly easier than finding a generic solution that covers all existing use cases.
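As a sketch of how small the hardcoded variant could be: the alias table below is an illustrative subset (merged-/usr also covers the multilib directories on some architectures), and the function name and data shapes are hypothetical, not dpkg internals.

```python
# Illustrative subset of the merged-/usr aliases; not a complete list.
KNOWN_ALIASES = {
    "/bin": "/usr/bin",
    "/sbin": "/usr/sbin",
    "/lib": "/usr/lib",
}

TAKEOVER_PACKAGES = {"usrmerge", "usr-is-merged"}


def take_over_aliases(installed):
    """Return (aliases, remaining_packages) for a set of installed package names.

    If either takeover package is installed, adopt the known aliases and
    drop those packages from the installed set; otherwise change nothing.
    """
    if installed & TAKEOVER_PACKAGES:
        return dict(KNOWN_ALIASES), installed - TAKEOVER_PACKAGES
    return {}, set(installed)
```

For example, `take_over_aliases({"dpkg", "usrmerge"})` yields the alias table together with `{"dpkg"}`.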

   Simon

