Re: Multiarch file overlap summary and proposal
I was thinking more about this, and I was finally able to put a finger on
why I don't like package splitting as a solution.
We know from prior experience with splitting packages for large
arch-independent data that one of the more common mistakes that we'll make
is to move the wrong files: to put into the arch-independent package a
file that's actually arch-dependent.
Look at the failure mode when that happens with the sort of package that
we're talking about splitting out of m-a:same packages:
* The arch-independent package gets arch-dependent content that happens to
match the architecture of the maintainer's build machine, since that's
the only place the arch-independent package is built. The maintainer
will by definition not notice, since the content is right for their
system.
* The maintainer is probably using a popular system type (usually either
i386 or amd64), and everyone else on that system type will also not
notice, so the bug can be latent for some time.
* Systems with the wrong architecture will get data files that have the
wrong format or the wrong information. This is usually not a case that
the software is designed to detect, so the result is normally random
segfaults or similar sorts of major bugs. The failure case for header
files is *particularly* bad: C software will generally compile fine with
the wrong-sized data types and then, at *runtime*, happily pass the
wrong data into the library, resulting in random segfaults and possibly
even data corruption. Since nothing fails at build time, this can go
undetected for long periods of time.
This is a particularly nasty failure mode due to how long it can stay
undetected and how much havoc it causes.
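To make the header-file case concrete, here is a minimal sketch in Python
rather than C, using the struct module to stand in for the two ABIs. The
record layout and the function names are made up for illustration. The
assumption is a header generated on amd64 that declares the id field as an
8-byte integer, while the i386 library was built with a 4-byte one.

```python
import struct

# Hypothetical record: an integer id followed by an int flags field.
# Layout baked into the header that was generated on amd64:
AMD64_LAYOUT = "<qi"   # 8-byte id, 4-byte flags
# Layout the i386 library was actually compiled with:
I386_LAYOUT = "<ii"    # 4-byte id, 4-byte flags


def pack_as_amd64(ident, flags):
    """What the caller does after compiling against the wrong header."""
    return struct.pack(AMD64_LAYOUT, ident, flags)


def read_as_i386(buf):
    """What the library does with the bytes it is handed."""
    return struct.unpack_from(I386_LAYOUT, buf)


record = pack_as_amd64(5, 7)
seen = read_as_i386(record)   # the library reads flags from the wrong offset
```

Nothing raises an error here. The library sees an id of 5 but reads the
upper half of the caller's id where it expects flags, so it gets 0 instead
of 7 and carries on with the wrong data.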
Now, compare to the failure mode with refcounting if the maintainer
doesn't realize that an arch-specific file can't be shared:
* Each arch-specific package will continue to get the appropriate files
for that architecture. Each package will still be usable and consistent
independently, so users who don't care about multiarch won't ever see a
problem.
* Users who want to co-install separate architectures will immediately
encounter a dpkg error saying that the files aren't consistent. This
means they won't be able to co-install the packages, but dpkg will
prevent any actual harm from happening. The user will then report a bug
and the maintainer will realize what happened and be able to find some
way to fix it.
* Even better, we can automatically detect this error case by scanning the
archive for architecture pairs that have non-matching overlapping files
and deal with it proactively.
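The archive scan in the last point could be sketched roughly as follows.
This is only an illustration: the manifests input (architecture mapped to
a path-to-checksum dict for one m-a:same package) is a hypothetical
structure, not something dpkg or the archive tools provide in this form.

```python
from itertools import combinations


def find_conflicts(manifests):
    """Return (path, arch_a, arch_b) for every path that two architectures
    both ship but with different contents.

    manifests: {arch: {path: checksum}} for one package (hypothetical input).
    """
    conflicts = []
    for arch_a, arch_b in combinations(sorted(manifests), 2):
        files_a = manifests[arch_a]
        files_b = manifests[arch_b]
        # Only paths present in both builds can collide on co-installation.
        for path in sorted(files_a.keys() & files_b.keys()):
            if files_a[path] != files_b[path]:
                conflicts.append((path, arch_a, arch_b))
    return conflicts


example = {
    "amd64": {
        "/usr/include/foo.h": "aaa",
        "/usr/share/doc/foo/copyright": "ccc",
        "/usr/lib/x86_64-linux-gnu/libfoo.so.1": "111",
    },
    "i386": {
        "/usr/include/foo.h": "bbb",
        "/usr/share/doc/foo/copyright": "ccc",
        "/usr/lib/i386-linux-gnu/libfoo.so.1": "222",
    },
}
report = find_conflicts(example)
```

In this made-up example only the header is reported: the copyright file is
identical on both architectures and the libraries live in different paths,
so neither would trip the check.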
The refcounting failure mode is just completely superior here.
And this *is* a mistake that we're going to make frequently; we know that
from past experience with splitting packages. Note that this problem
often happens because, when the maintainer originally split the package,
there was nothing arch-specific in the file, but upstream made it
arch-specific later on and the maintainer didn't notice. (It's very easy
to miss.) This is particularly common with header files.
Note that arch-qualifying all of the files does not have the problems of
package splitting, but it's also a much more intrusive fix.
--
Russ Allbery (rra@debian.org) <http://www.eyrie.org/~eagle/>