
Install size estimation (using du -S data)



Hi,

	We have conflicting data here. Mrvn says that the total du
 data is only 76k. Charles says that the data is about 400k (which is
 much more in line with my off-the-cuff calculations).

	I do not have the time at the moment to run the data collection
 myself, but I think we should have this reconciled. I have
 personally copied everyone who seemed interested in this endeavor,
 and maybe we should go offline while this is resolved.


	I am inclined to believe the 400k figures. I would, for
 scalability reasons, advocate that we re-run our scripts on a _full_
 i386 mirror (which I do not have at the moment -- ran out of space).

	I also would strongly advocate *NOT* stuffing this data into
 the Packages or the Available files, but keeping it separate, both on
 the archive and, once downloaded, on the user's disk.

	manoj

>>"Brederlow" == Brederlow  <goswin.brederlow@student.uni-tuebingen.de> writes:

 Brederlow> I have generated a complete du index for all packages via
 Brederlow> a small bash script. The packages.n files contain only n
 Brederlow> levels of subdirectories and are therefore slightly smaller.
 Brederlow> The untrimmed one is 74 K for binary-i386 (including
 Brederlow> binary-all).

 Charles> I've been working on the 'du -S' stuff over the last week (approx),
 Charles> and I think it's time I 'went public' with it.  I'm afraid there's a
 Charles> lot of stuff here; I thought it worth presenting the supporting data to
 Charles> back up my arguments...  Put it down to practice for the PhD thesis if
 Charles> you want.  ;-)

 Charles> I've written a few simple tools to help analyse things.
 Charles> They are included in the uuencoded gzipped tar at the end of
 Charles> this message.

 Charles> Here are the sizes of hamm's Packages files:

compressed  uncompr. ratio uncompressed_name
    21390     77711  72.4% Packages.hamm.contrib.orig
   332143   1108583  70.0% Packages.hamm.main.orig
    58041    184713  68.5% Packages.hamm.non-free.orig

 Charles> Now, if we run ./gen-du on each, which downloads each
 Charles> package, extracts its contents, runs 'du -S' on it, and adds
 Charles> the output as a 'Du:' entry in the Packages file, we get
 Charles> these file sizes:

compressed  uncompr. ratio uncompressed_name
   130250    871106  85.0% Packages.hamm.contrib.du
   409086   1482499  72.4% Packages.hamm.main.du
    70646    243962  71.0% Packages.hamm.non-free.du
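
One way to read the two tables above is to subtract each original size from the corresponding size with 'Du:' entries, which gives the cost of the du data alone. The following is a minimal sketch using only the figures quoted above; the helper name du_overhead is hypothetical and not part of Charles's tools.

```shell
#!/bin/bash
# du_overhead ORIG_BYTES DU_BYTES
# Prints how many bytes the 'Du:' entries add to a Packages file.
# (Hypothetical helper; the numbers below are copied from the tables above.)
du_overhead() {
  echo $(( $2 - $1 ))
}

# Compressed sizes: original Packages file vs. the same file with 'Du:' entries
echo "contrib:  $(du_overhead 21390 130250)"
echo "main:     $(du_overhead 332143 409086)"
echo "non-free: $(du_overhead 58041 70646)"
```

On these figures the compressed increase for main is 76943 bytes; the 409086-byte total also includes the Packages data that was already there.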

 Charles> Here's a sample of the output:

| Package: stow
| Version: 1.3.2-9
[...]
| installed-size: 140
| Du: 1 usr
|  16   usr/bin
|  1    usr/doc
|  13   usr/doc/stow
|  74   usr/doc/stow/html
|  19   usr/info
|  1    usr/lib
|  2    usr/lib/menu
|  1    usr/man
|  7    usr/man/man8
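
Because 'du -S' sizes exclude subdirectories, the space a package needs under any one directory can be estimated by adding up the entries at or below that directory. The sketch below assumes the 'Du:' data has already been reduced to plain "SIZE PATH" lines; the function name sum_under is hypothetical and the input is the stow sample above.

```shell
#!/bin/bash
# sum_under PREFIX
# Reads "SIZE PATH" lines (du -S style, sizes exclude subdirectories)
# on stdin and prints the total size of PREFIX and everything below it.
sum_under() {
  awk -v p="$1" '
    $2 == p || index($2, p "/") == 1 { total += $1 }
    END { print total + 0 }
  '
}

# The stow entries from the sample above, without the "Du:" label
stow_du='1 usr
16 usr/bin
1 usr/doc
13 usr/doc/stow
74 usr/doc/stow/html
19 usr/info
1 usr/lib
2 usr/lib/menu
1 usr/man
7 usr/man/man8'

echo "usr/doc: $(printf '%s\n' "$stow_du" | sum_under usr/doc)"
echo "usr:     $(printf '%s\n' "$stow_du" | sum_under usr)"
```

For the stow sample this gives 88 for usr/doc and 135 for usr, against the installed-size of 140 shown in the same entry.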




 Script and Data from Mrvn:
______________________________________________________________________
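#!/bin/bash
# MIRROR stands in for the "<debian mirror>" placeholder: the root of a local mirror.
cd "${MIRROR:?set MIRROR to your Debian mirror}/hamm/hamm/binary-i386/" || exit 1
# Go through all deb files
for FILE in */*.deb; do
  echo "Processing $FILE"
  echo "--- $FILE" >> packages.du
  mkdir tmp && cd tmp || exit 1
  # Unpack only the data archive of the package
  ar -x "../$FILE" data.tar.gz
  tar -xzf data.tar.gz
  rm data.tar.gz
  # Record cumulative per-directory usage (plain du, not du -S)
  du >> ../packages.du
  cd ..
  rm -rf tmp
done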

Here's the result:

-rw-------   1 zxmqu18  zx         150743 May 28 13:24 packages.2
-rw-------   1 zxmqu18  zx          27769 May 28 13:28 packages.2.bz2
-rw-------   1 zxmqu18  zx          36115 May 28 13:27 packages.2.gz
-rw-------   1 zxmqu18  zx         242973 May 28 13:24 packages.3
-rw-------   1 zxmqu18  zx          41957 May 28 13:27 packages.3.bz2
-rw-------   1 zxmqu18  zx          54169 May 28 13:27 packages.3.gz
-rw-------   1 zxmqu18  zx         303213 May 28 13:25 packages.4
-rw-------   1 zxmqu18  zx          52802 May 28 13:27 packages.4.bz2
-rw-------   1 zxmqu18  zx          66980 May 28 13:27 packages.4.gz
-rw-------   1 zxmqu18  zx         371491 May 28 13:25 packages.5
-rw-------   1 zxmqu18  zx          62824 May 28 13:27 packages.5.bz2
-rw-------   1 zxmqu18  zx          79203 May 28 13:27 packages.5.gz
-rw-------   1 zxmqu18  zx         452649 May 28 13:18 packages.du
-rw-------   1 zxmqu18  zx          74014 May 28 13:23 packages.du.bz2
-rw-------   1 zxmqu18  zx          91843 May 28 13:23 packages.du.gz

The packages.n files contain only n levels of subdirectories and are
therefore slightly smaller.
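
The message does not show how the packages.n files were produced. As a rough sketch of one possible approach, the filter below trims packages.du-style input to n directory levels by counting slashes in each path; the function name trim_depth is hypothetical, and it assumes du's usual tab between size and path. GNU du's --max-depth option would be another way to get a similar result at collection time.

```shell
#!/bin/bash
# trim_depth N
# Reads packages.du-style input on stdin and keeps "--- package" headers
# plus du lines whose path is at most N directory levels below ".".
trim_depth() {
  awk -F'\t' -v n="$1" '
    /^--- / { print; next }          # package header line
    {
      path = $2
      depth = gsub("/", "/", path)   # "." -> 0, "./usr" -> 1, "./usr/man" -> 2
      if (depth <= n) print
    }
  '
}

# Usage (file names as in the listing above):
#   trim_depth 2 < packages.du > packages.2
```

Since plain du reports cumulative sizes, dropping the deeper lines loses detail but not the totals of the directories that remain.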

The file looks as follows (only the beginning):

--- admin/acct_6.3.2-4.deb
2       ./etc/cron.daily
2       ./etc/cron.monthly
4       ./etc/init.d
9       ./etc
29      ./usr/bin
12      ./usr/info
6       ./usr/man/man1
5       ./usr/man/man8
12      ./usr/man
49      ./usr/sbin
63      ./usr/doc/acct
64      ./usr/doc
2       ./usr/lib/menu
3       ./usr/lib
170     ./usr
1       ./var/account
2       ./var
182     .
--- admin/adjtimex_1.5-1.deb
1       ./usr/bin
33      ./usr/sbin
6       ./usr/man/man8


-- 
 Date: 17 Mar 90 22:34:02 GMT From: merlyn@iwarp.intel.com (Randal
 Schwartz) @X=split(//,'Just another Perl hacker,');*Y=*X;print @Y;
Manoj Srivastava  <srivasta@acm.org> <http://www.datasync.com/%7Esrivasta/>
Key C7261095 fingerprint = CB D9 F4 12 68 07 E4 05  CC 2D 27 12 1D F5 E8 6E



