Re: Please test gzip -9n - related to dpkg with multiarch support
On Wed, 2012-02-08 at 11:56:06 -0800, Russ Allbery wrote:
> Riku Voipio <riku.voipio@iki.fi> writes:
> > That is a major waste of space of having multiple copies of identical
> > files with different arch-qualified names. Is that really better
> > architecture to have multiple copies of identical files on user systems?
>
> Is it really, though? The files we're talking about are not generally
> large. I have a hard time seeing a case where the files would be large
> enough to cause any noticeable issue and you wouldn't want to move them
> into a separate -common or -doc package anyway.
Exactly. In addition, this is already an “issue” with lots of packages
(regardless of multi-arch) which do not use a common symlinked doc dir.
These are some numbers I'm getting on my system (w/ the attached
quickly hacked up script), all wild approximations, just to get a feel
of it:
Approx. installed m-a:same lib waste (w/o -dev,-doc): 20051501
Approx. installed m-a:same lib waste (w/ -dev,-doc): 23310229
Approx. installed m-a:same lib waste per package (23310229 / 293): 79557.09
Approx. predicted lib waste per arch (779 * 79557.09): 61974973.11
Approx. total lib waste per arch (4003 * 79557.09): 318467031.27
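For scale, the byte counts above can be converted to MiB. A small sketch; to_mib is only an illustrative helper and not part of the attached script:

```shell
#!/bin/sh
# Convert a byte count to MiB with two decimals (1 MiB = 1048576 bytes).
# to_mib is a hypothetical helper name used only for this illustration.
to_mib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1048576 }'
}

to_mib 61974973.11    # predicted waste per arch for the installed libs
to_mib 318467031.27   # estimate if every lib in the archive were installed
```

The first figure comes out just under 60 MiB and the second at a bit over 300 MiB.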
So, supposedly, if all the libs I have installed were to be
multiarchified, I'd waste about 60 MiB for each additional architecture
I enable, and only if I wanted all of them installed for that
architecture. Which is not going to be the case. But if it was and
60 MiB were such a problem I could just as well use
«dpkg --path-exclude» support.
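As a rough illustration of that last option, a dpkg configuration fragment along these lines would skip most of the documentation files on unpack. The file name is made up and the patterns are only an example of what one might choose to exclude:

```
# /etc/dpkg/dpkg.cfg.d/01_nodoc  (hypothetical file name)
# Drop everything under /usr/share/doc when unpacking packages ...
path-exclude=/usr/share/doc/*
# ... but keep the copyright files.
path-include=/usr/share/doc/*/copyright
```

As far as I know these filters are applied when a package is unpacked, so files that are already installed stay in place until the package is reinstalled.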
Also I think there's probably some room for improvement which would
benefit non-multiarch installations too. For example TODO, USAGE and
lots of similar files should be moved to the -dev packages. AUTHORS,
THANKS and CREDITS files should probably be already represented in
copyright, etc. Probably a lintian warning could be introduced for
this.
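To get a feel for what such a check might flag, here is a minimal sketch that filters a file list for those names. The function name doc_candidates and the package name in the usage comment are hypothetical, and this is not how lintian itself would implement it:

```shell
#!/bin/sh
# Read file paths on stdin and print those under /usr/share/doc/<pkg>/
# named TODO, USAGE, AUTHORS, THANKS or CREDITS (optionally gzipped).
# doc_candidates is an illustrative name, not an existing tool.
doc_candidates() {
    grep -E '^/usr/share/doc/[^/]+/(TODO|USAGE|AUTHORS|THANKS|CREDITS)(\.gz)?$'
}

# Possible usage on a Debian system (libfoo1 is a placeholder):
#   dpkg -L libfoo1 | doc_candidates
```

Only the file names listed in the paragraph above are matched; "lots of similar files" would need a longer pattern.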
regards,
guillem
#!/bin/sh
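# List the shared files in installed Multi-Arch: same library packages
# (skipping -dev and -doc packages and the usual doc files), with sizes.
echo "List of files that might be candidates to be split out"
grep-status -n -sPackage -FMulti-Arch same | \
grep -E -v -e '-(dev|doc)' | xargs dpkg -L | grep '/usr/share/' | \
grep -E -v '(copyright|changelog|NEWS|README)' | \
while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
du -bsch --files0-from -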
# Total bytes under /usr/share for m-a:same packages, without -dev and -doc.
waste_libs=$(grep-status -n -sPackage -FMulti-Arch same | \
grep -E -v -e '-(dev|doc)' | xargs dpkg -L | grep '/usr/share/' | \
while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/o -dev,-doc): $waste_libs"
# Same total, this time including -dev and -doc packages.
waste_same=$(grep-status -n -sPackage -FMulti-Arch same | \
xargs dpkg -L | grep '/usr/share/' | \
while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/ -dev,-doc): $waste_same"
inst_same=$(grep-status -n -sPackage -FMulti-Arch same|wc -l)
waste_per_lib=$(echo "scale=2; $waste_same / $inst_same" | bc -l)
echo "Approx. installed m-a:same lib waste per package ($waste_same / $inst_same): $waste_per_lib"
inst_libs=$(grep-status -n -r -sPackage -FSection libs | \
grep -E -v '(common|data|-bin)' | wc -l)
waste_inst=$(echo "scale=2; $inst_libs * $waste_per_lib" | bc -l)
echo "Approx. predicted lib waste per arch ($inst_libs * $waste_per_lib): $waste_inst"
total_libs=$(grep-aptavail -n -r -sPackage -FSection libs | \
grep -E -v '(common|data|-bin)' | wc -l)
waste_total=$(echo "scale=2; $total_libs * $waste_per_lib" | bc -l)
echo "Approx. total lib waste per arch ($total_libs * $waste_per_lib): $waste_total"