
Re: Please test gzip -9n - related to dpkg with multiarch support



On Wed, 2012-02-08 at 11:56:06 -0800, Russ Allbery wrote:
> Riku Voipio <riku.voipio@iki.fi> writes:
> > That is a major waste of space of having multiple copies of identical
> > files with different arch-qualified names. Is that really better
> > architecture to have multiple copies of identical files on user systems?
> 
> Is it really, though?  The files we're talking about are not generally
> large.  I have a hard time seeing a case where the files would be large
> enough to cause any noticeable issue and you wouldn't want to move them
> into a separate -common or -doc package anyway.

Exactly. In addition, this is already an “issue” with lots of packages
(regardless of multi-arch) that do not use a common symlinked doc dir.

These are some numbers I'm getting on my system (with the attached
quickly hacked-up script). All sizes are in bytes and all are wild
approximations, just to get a feel for it:

  Approx. installed m-a:same lib waste (w/o -dev,-doc): 20051501
  Approx. installed m-a:same lib waste (w/ -dev,-doc): 23310229
  Approx. installed m-a:same lib waste per package (23310229 / 293): 79557.09
  Approx. predicted lib waste per arch (779 * 79557.09): 61974973.11
  Approx. total lib waste per arch (4003 * 79557.09): 318467031.27
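
For scale, the two projected byte counts above can be converted to MiB
with a small helper. This is only a sketch: bytes_to_mib is a
hypothetical name and is not part of the attached script.

```shell
#!/bin/sh
# bytes_to_mib: hypothetical helper, not part of the attached script.
# Converts a byte count to MiB (1 MiB = 1048576 bytes), rounded to
# two decimals by awk's printf.
bytes_to_mib() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / 1048576 }'
}

# Figures taken from the list above.
echo "installed libs, per arch: $(bytes_to_mib 61974973.11) MiB"
echo "all available libs, per arch: $(bytes_to_mib 318467031.27) MiB"
```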

So, supposedly, if all the libs I have installed were to be
multiarchified, I'd waste about 60 MiB for each additional architecture
I enable, in case I wanted all of them installed for each one. Which is
not going to be the case. But if it were, and 60 MiB were such a
problem, I could just as well use the «dpkg --path-exclude» support.
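
As a rough illustration of the path filtering mentioned above, the
sketch below writes a dpkg configuration fragment that skips package
documentation but keeps the copyright files. The function name
write_nodoc_cfg, the file name 01-nodoc and the choice of paths are all
assumptions made for the example; only the path-exclude and
path-include options themselves come from dpkg.

```shell
#!/bin/sh
# write_nodoc_cfg: hypothetical helper that writes a dpkg config
# fragment into the directory given as $1. The file name "01-nodoc"
# and the excluded paths are illustrative assumptions.
write_nodoc_cfg() {
  dir=$1
  mkdir -p "$dir" || return 1
  cat > "$dir/01-nodoc" <<'EOF'
# Do not unpack package documentation...
path-exclude=/usr/share/doc/*
# ...but keep the copyright files.
path-include=/usr/share/doc/*/copyright
EOF
}

# Example (as root), using dpkg's configuration fragment directory:
#   write_nodoc_cfg /etc/dpkg/dpkg.cfg.d
```

The filters are applied when dpkg unpacks a package, so files that are
already installed should stay in place until the package is reinstalled
or upgraded.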

Also, I think there's probably some room for improvement which would
benefit non-multiarch installations too. For example, TODO, USAGE and
lots of similar files should be moved to the -dev packages. The
contents of AUTHORS, THANKS and CREDITS files should probably already
be represented in copyright, etc. A lintian warning could probably be
introduced for this.
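
To get a feel for what such a check might flag, here is a minimal
sketch that filters a file list for the names mentioned above. It is
not an actual lintian check; doc_candidates is a hypothetical helper,
and the name list and the optional .gz suffix are assumptions.

```shell
#!/bin/sh
# doc_candidates: hypothetical filter, not a real lintian check.
# Reads one path per line on stdin and prints those directly under
# /usr/share/doc/<package>/ whose name is one of the files discussed
# above, optionally gzip-compressed.
doc_candidates() {
  grep -E '^/usr/share/doc/[^/]+/(TODO|USAGE|AUTHORS|THANKS|CREDITS)(\.gz)?$'
}

# Example, where libfoo1 stands for any installed package:
#   dpkg -L libfoo1 | doc_candidates
```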

regards,
guillem
#!/bin/sh

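# List /usr/share files shipped by installed Multi-Arch: same packages
# (skipping -dev/-doc packages and the usual doc files), with sizes.
echo "List of files that might be candidates to be split out"
grep-status -n -sPackage -FMulti-Arch same | \
  grep -E -v -e '-(dev|doc)' | xargs dpkg -L | grep '/usr/share/' | \
  grep -E -v '(copyright|changelog|NEWS|README)' | \
  while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
  du -bsch --files0-from -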

# Total bytes under /usr/share for m-a:same packages, without -dev/-doc.
waste_libs=$(grep-status -n -sPackage -FMulti-Arch same | \
  grep -E -v -e '-(dev|doc)' | xargs dpkg -L | grep '/usr/share/' | \
  while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
  du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/o -dev,-doc): $waste_libs"

# Same total, this time including the -dev and -doc packages.
waste_same=$(grep-status -n -sPackage -FMulti-Arch same | \
  xargs dpkg -L | grep '/usr/share/' | \
  while IFS= read -r f; do test -f "$f" && printf '%s\0' "$f"; done | \
  du -bc --files0-from - | tail -n 1 | cut -f1)
echo "Approx. installed m-a:same lib waste (w/ -dev,-doc): $waste_same"

# Average waste per installed m-a:same package.
inst_same=$(grep-status -n -sPackage -FMulti-Arch same | wc -l)
waste_per_lib=$(echo "scale=2; $waste_same / $inst_same" | bc -l)
echo "Approx. installed m-a:same lib waste per package ($waste_same / $inst_same): $waste_per_lib"

# Projection over all installed packages in a libs section.
inst_libs=$(grep-status -n -r -sPackage -FSection libs | \
  grep -E -v '(common|data|-bin)' | wc -l)
waste_inst=$(echo "scale=2; $inst_libs * $waste_per_lib" | bc -l)
echo "Approx. predicted lib waste per arch ($inst_libs * $waste_per_lib): $waste_inst"

# Projection over all available packages in a libs section.
total_libs=$(grep-aptavail -n -r -sPackage -FSection libs | \
  grep -E -v '(common|data|-bin)' | wc -l)
waste_total=$(echo "scale=2; $total_libs * $waste_per_lib" | bc -l)
echo "Approx. total lib waste per arch ($total_libs * $waste_per_lib): $waste_total"
