
Re: Multiarch file overlap summary and proposal

I was thinking more about this, and I was finally able to put a finger on
why I don't like package splitting as a solution.

We know from prior experience with splitting packages for large
arch-independent data that one of the more common mistakes that we'll make
is to move the wrong files: to put into the arch-independent package a
file that's actually arch-dependent.

Look at the failure mode when that happens with the sort of package that
we're talking about splitting out of m-a:same packages:

* The arch-independent package gets arch-dependent content that happens to
  match the architecture of the maintainer's build machine, since that's
  the only place the arch-independent package is built.  The maintainer
  will by definition not notice, since the content is right for their
  system.

* The maintainer is probably using a popular system type (usually either
  i386 or amd64), and everyone else on that system type will also not
  notice, so the bug can be latent for some time.

* Systems with the wrong architecture will get data files that have the
  wrong format or the wrong information.  This is usually not a case that
  the software is designed to detect, so the result is normally random
  segfaults or similar sorts of major bugs.  The failure case for header
  files is *particularly* bad: C software will generally compile fine with
  the wrong-sized data types and then, at *runtime*, happily pass the
  wrong data into the library, resulting in random segfaults and possibly
  even data corruption.  Since nothing fails at build time, the problem
  could go undetected for long periods of time.

This is a particularly nasty failure mode due to how long it can stay
undetected and how much havoc it causes.
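
To make the header-file case concrete, here is a minimal sketch in Python
that uses the struct module to stand in for two C ABIs.  The layouts, the
field names, and both helper functions are hypothetical and only
illustrate the mechanism: a caller built against a header with a 4-byte
long hands a record to a library that expects an 8-byte long.  Nothing
raises an error; the library just reads a different value.

```python
import struct

# Hypothetical layouts for the same C declaration
#     struct record { long id; int flags; };
# under two ABIs.  "<" selects standard sizes with no padding, so these
# format strings are a simplification of real compiler layout rules.
LAYOUT_32 = "<ii"  # 4-byte long: what the wrong header describes
LAYOUT_64 = "<qi"  # 8-byte long: what the installed library expects


def caller_writes(record_id, flags):
    """Pack a record the way code built against the 32-bit header would."""
    return struct.pack(LAYOUT_32, record_id, flags)


def library_reads_id(buf):
    """Read the id field the way the 64-bit library would (8-byte long)."""
    return struct.unpack_from("<q", buf, 0)[0]


buf = caller_writes(1, 7)
wrong_id = library_reads_id(buf)

# No exception is raised: the id and flags fields are silently merged
# into one bogus 64-bit value.
print(wrong_id)
```

The caller stored an id of 1, but the library sees the id and flags bytes
fused into a single large number, which is the kind of silent runtime
corruption described above.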

Now, compare to the failure mode with refcounting if the maintainer
doesn't realize that an arch-specific file can't be shared:

* Each arch-specific package will continue to get the appropriate files
  for that architecture.  Each package will still be usable and consistent
  independently, so users who don't care about multiarch won't ever see a
  problem.

* Users who want to co-install separate architectures will immediately
  encounter a dpkg error saying that the files aren't consistent.  This
  means they won't be able to co-install the packages, but dpkg will
  prevent any actual harm from happening.  The user will then report a bug
  and the maintainer will realize what happened and be able to find some
  way to fix it.

* Even better, we can automatically detect this error case by scanning the
  archive for architecture pairs that have non-matching overlapping files
  and deal with it proactively.
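
The archive scan in the last point could look roughly like the following
Python sketch.  It assumes we already have, for one m-a:same package, a
mapping from architecture to {path: checksum}; how that mapping is
extracted from the archive is left out, and find_conflicts and the sample
data are hypothetical names invented for the illustration.

```python
from itertools import combinations


def find_conflicts(files_by_arch):
    """Return (arch_a, arch_b, path) for every path shipped by both
    architectures whose contents (here: checksums) differ."""
    conflicts = []
    for arch_a, arch_b in combinations(sorted(files_by_arch), 2):
        files_a = files_by_arch[arch_a]
        files_b = files_by_arch[arch_b]
        # Only paths present in both packages can clash on co-install.
        for path in sorted(files_a.keys() & files_b.keys()):
            if files_a[path] != files_b[path]:
                conflicts.append((arch_a, arch_b, path))
    return conflicts


# Made-up sample data: the header differs between architectures, the
# documentation file is identical, and the library path is arch-qualified
# so it never overlaps.
sample = {
    "amd64": {
        "/usr/include/foo.h": "aaa1",
        "/usr/share/doc/libfoo/README": "ccc3",
        "/usr/lib/x86_64-linux-gnu/libfoo.so.1": "ddd4",
    },
    "i386": {
        "/usr/include/foo.h": "bbb2",
        "/usr/share/doc/libfoo/README": "ccc3",
        "/usr/lib/i386-linux-gnu/libfoo.so.1": "eee5",
    },
}

for arch_a, arch_b, path in find_conflicts(sample):
    print(f"{path}: differs between {arch_a} and {arch_b}")
```

Identical shared files and arch-qualified paths produce no report; only
the overlapping file whose contents differ is flagged, which is the same
condition under which dpkg would refuse the co-installation.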

The failure mode with refcounting is just completely superior here.
And this *is* a mistake that we're going to make frequently; we know that
from past experience with splitting packages.  Note that this problem
often happens because, when the maintainer originally split the package,
there was nothing arch-specific in the file, but upstream made it
arch-specific later on and the maintainer didn't notice.  (It's very easy
to miss.)  This is particularly common with header files.

Note that arch-qualifying all of the files does not have the problems of
package splitting, but it's also a much more intrusive fix.

Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>
