Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 15:14:35 -0800, Steve Langasek wrote:
> So I had a look at the Ubuntu archive, which already has a large collection
> of packages converted to Multi-Arch: same, to provide some hard facts for
> this discussion.
> - 2197 files are shipped in /usr/share by these packages, outside of
> /usr/share/doc - which, by and large, are files that can actually be
> shared between architectures.
> - These files are distributed between 47 different subdirectories:
> 703 ./usr/share/man
> 11 ./usr/share/info
> 3 ./usr/share/java
These three are always compressed, so they would need to be split out anyway.
> 187 ./usr/share/lintian
> 53 ./usr/share/bug
> 4 ./usr/share/mime-info
> 4 ./usr/share/menu
> 3 ./usr/share/applications
These should usually be pkgname based, and thus can just be kept arch-qualified.
I've not checked the rest in detail, but just with these, the 2197 files
get reduced to 1229 which might not need moving out otherwise.
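For reference, the arithmetic behind that figure can be sketched in a few lines of Python. It uses only the counts quoted above; the variable names and the grouping into two dictionaries are mine, for illustration.

```python
# Counts quoted above from the survey of /usr/share (outside /usr/share/doc).
total_files = 2197

# Directories whose contents are always compressed, so would be split anyway.
always_compressed = {"man": 703, "info": 11, "java": 3}

# Directories whose contents are usually pkgname based, so can stay
# arch-qualified.
pkgname_based = {
    "lintian": 187,
    "bug": 53,
    "mime-info": 4,
    "menu": 4,
    "applications": 3,
}

accounted_for = sum(always_compressed.values()) + sum(pkgname_based.values())
remaining = total_files - accounted_for

print(accounted_for, remaining)
```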
> - For many of these files, it would be actively harmful to use
> architecture-qualified filenames. Manpages included in -dev packages
> should not change names based on the architecture; having
> /usr/share/pam-config contain multiple files for the same profile, one
> for each architecture of the package that's installed, would not work
> correctly; etc.
I said that arch-qualifying should apply to things that are currently
pkgname based, but never that this should be used to avoid any file
conflict; for the rest, the correct solution would be to just split them
> - If we needed to split the arch-indep contents out of the M-A: same
> package instead of reference counting in dpkg, that would be roughly 170
> new binary packages. 139 of them would contain 10 files or less
> (exclusive of /usr/share/doc).
Given that several of those would need to be created regardless due to
the many compressed files above, and several others do not need to be
split at all, the resulting number of packages does not seem onerous
to me; it actually seems like the right thing to do.
Riku mentioned as an argument that this increases the data to download
due to slightly bigger Packages files, but pdiffs were introduced
exactly to fix that problem. And as long as the packages do not get
updated, one should not get pdiff updates. And with the splitting of
Description there's even less data to download now.
> I think there are pretty solid benefits to proceeding with a dpkg that
> allows sharing files across M-A: same packages. Even if we decided we
> couldn't rely on gzip, there are still lots of other cases where this
While there are obviously some benefits, otherwise we'd not have
considered shared files an option at all, I don't think they outweigh
the problems and fragility they introduce.
> And besides, consider that a M-A: same package shipping contents in a
> non-architecture-qualified path that vary by architecture is *always* a bug
> in that package, which will need to be fixed. Requiring that M-A: same
> packages don't use non-architecture-qualified paths even for files which
> *don't* vary by architecture doesn't help much to ensure that we won't have
> bugs. It would be easier for lintian to spot errors in M-A: same packages
> if we can say that any file that doesn't have an architecture-qualified path
> is buggy, but at this point we already have Jakub's reports anyway, which we
> could make a regular part of our archive consistency checks. So I don't
> believe that having dpkg be more strict about files that *could* be shared
> will make the user experience any better; it just presents more occasions
> for packages to be regarded as buggy and for dpkg to error out.
Without automatic checks or actual installation testing, any such issues
can be introduced; this is not specific to M-A: same packages. We have
similar problems when moving files between two packages, or when stomping
over other packages' namespaces, etc.
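To make "automatic checks" concrete, here is a minimal sketch of the kind of archive consistency check I have in mind. It is not what lintian or Jakub's reports actually do. The function names, the triplet list and the in-memory `{arch: {path: bytes}}` layout are all assumptions made for illustration; a real check would read the paths and contents from the .debs themselves.

```python
import hashlib
from collections import defaultdict

# Hypothetical set of multiarch triplets; a real check would obtain these
# from the architectures actually present in the archive.
ARCH_TRIPLETS = ("x86_64-linux-gnu", "i386-linux-gnu")


def is_arch_qualified(path, triplets=ARCH_TRIPLETS):
    """Return True if any path component is one of the given triplets."""
    return any(part in triplets for part in path.split("/"))


def conflicting_shared_files(contents_by_arch):
    """Find non-arch-qualified paths whose contents differ between arches.

    contents_by_arch maps an architecture name to {path: file contents}
    for one M-A: same package built on that architecture.
    """
    digests = defaultdict(dict)
    for arch, files in contents_by_arch.items():
        for path, data in files.items():
            if not is_arch_qualified(path):
                digests[path][arch] = hashlib.sha256(data).hexdigest()
    return sorted(
        path
        for path, per_arch in digests.items()
        if len(per_arch) > 1 and len(set(per_arch.values())) > 1
    )


# Made-up package contents, purely for illustration.
example = {
    "amd64": {
        "usr/lib/x86_64-linux-gnu/libfoo.so.1": b"amd64 code",
        "usr/share/man/man3/foo.3.gz": b"compressed output A",
        "usr/share/foo/data": b"same",
    },
    "i386": {
        "usr/lib/i386-linux-gnu/libfoo.so.1": b"i386 code",
        "usr/share/man/man3/foo.3.gz": b"compressed output B",
        "usr/share/foo/data": b"same",
    },
}

print(conflicting_shared_files(example))
```

In the made-up example only the man page is reported: the libraries live under arch-qualified paths, and the data file is byte-identical on both architectures.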
Adding shared file support to dpkg introduces additional unneeded
complexity that can never be taken out, and it seems clear to me that
this should be dealt with at the package level instead.