Install size estimation (using du -S data)
Hi,
We have conflicting data here. Mrvn says that the total du
data is only 76k. Charles says that the data is about 400k (which is
way more in line with my off-the-cuff calculations).
I do not have the time at the moment to run the data collection
myself, but I think we should have this reconciled. I have
personally copied everyone who seemed interested in this endeavor,
and maybe we should go offline while this is resolved.
I am inclined to believe the 400k figures. I would, for
scalability reasons, advocate that we re-run our scripts on a _full_
i386 mirror (which I do not have at the moment -- ran out of space).
I also would strongly advocate *NOT* stuffing this data into
the Packages or the Available files, but keeping it separate, both on
the archive and, once downloaded, on the user's disk.
manoj
>>"Brederlow" == Brederlow <goswin.brederlow@student.uni-tuebingen.de> writes:
Brederlow> I have generated a complete du index for all packages via
Brederlow> a small bash script. The packages.n files contain only n
Brederlow> levels of subdirectories and are therefore slightly smaller.
Brederlow> The untrimmed one is 74 K for binary-i386 (including
Brederlow> binary-all).
Charles> I've been working on the 'du -S' stuff over the last week (approx),
Charles> and I think it's time I 'went public' with it. I'm afraid there's a
Charles> lot of stuff here; I thought it worth presenting the supporting data to
Charles> back up my arguments... Put it down to practice for the PhD thesis if
Charles> you want. ;-)
Charles> I've written a few simple tools to help analyse things.
Charles> They are included in the uuencoded gzipped tar at the end of
Charles> this message.
Charles> Here are the sizes of hamm's Packages files:
  compressed  uncompr.  ratio  uncompressed_name
       21390     77711  72.4%  Packages.hamm.contrib.orig
      332143   1108583  70.0%  Packages.hamm.main.orig
       58041    184713  68.5%  Packages.hamm.non-free.orig
Charles> Now, if we run ./gen-du on each, which downloads each
Charles> package, extracts its contents, runs 'du -S' on it, and adds
Charles> the output as a 'Du:' entry in the Packages file, we get
Charles> these file sizes:
  compressed  uncompr.  ratio  uncompressed_name
      130250    871106  85.0%  Packages.hamm.contrib.du
      409086   1482499  72.4%  Packages.hamm.main.du
       70646    243962  71.0%  Packages.hamm.non-free.du
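To compare these figures with Mrvn's, it helps to look at how many
compressed bytes the 'Du:' entries add to each Packages file. The
following is only a sketch of that arithmetic; the function name
"overhead" is made up for illustration and is not part of Charles's
tools.

```bash
#!/bin/bash
# Hypothetical helper: bytes added by the Du: entries, given the
# compressed size without them (ORIG) and with them (DU).
# usage: overhead ORIG_BYTES DU_BYTES
overhead() {
    echo $(( $2 - $1 ))
}
```

For example, "overhead 332143 409086" gives the growth of the
compressed main Packages file, using the two sizes listed above.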
Charles> Here's a sample of the output:
| Package: stow
| Version: 1.3.2-9
[...]
| installed-size: 140
| Du: 1 usr
| 16 usr/bin
| 1 usr/doc
| 13 usr/doc/stow
| 74 usr/doc/stow/html
| 19 usr/info
| 1 usr/lib
| 2 usr/lib/menu
| 1 usr/man
| 7 usr/man/man8
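Because 'du -S' reports each directory without its subdirectories, the
space a package needs below some directory is just the sum of the
matching entries. A minimal sketch, assuming the input has already been
reduced to plain "size path" lines (no "| " or "Du:" prefix) and that
paths contain no spaces; the name "sum_under" is hypothetical:

```bash
#!/bin/bash
# Sum the 'du -S' sizes of a directory and everything below it.
# Reads "size path" lines on stdin.
# usage: sum_under PREFIX < du-lines
sum_under() {
    local prefix=$1
    awk -v p="$prefix" '
        # match the directory itself, or any path starting with "PREFIX/"
        $2 == p || index($2, p "/") == 1 { total += $1 }
        END { print total + 0 }'
}
```

Fed the stow entries above, "sum_under usr/doc" adds up the usr/doc,
usr/doc/stow and usr/doc/stow/html lines.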
Script and Data from Mrvn:
______________________________________________________________________
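#!/bin/bash
# Build a du index (packages.du) for every .deb in the i386 binary tree.
# Replace <debian mirror> with the path to your local mirror.
cd "<debian mirror>/hamm/hamm/binary-i386/" || exit 1

# Go through all deb files
for FILE in */*.deb; do
    echo "Processing $FILE"
    echo "--- $FILE" >> packages.du
    mkdir tmp
    cd tmp || exit 1
    # Pull only the data archive out of the .deb and unpack it
    ar -x "../$FILE" data.tar.gz
    tar -xzf data.tar.gz
    rm data.tar.gz
    # Append the disk usage of the unpacked tree to the index
    du >> ../packages.du
    cd ..
    rm -rf tmp
done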
Here's the result:
-rw------- 1 zxmqu18 zx 150743 May 28 13:24 packages.2
-rw------- 1 zxmqu18 zx 27769 May 28 13:28 packages.2.bz2
-rw------- 1 zxmqu18 zx 36115 May 28 13:27 packages.2.gz
-rw------- 1 zxmqu18 zx 242973 May 28 13:24 packages.3
-rw------- 1 zxmqu18 zx 41957 May 28 13:27 packages.3.bz2
-rw------- 1 zxmqu18 zx 54169 May 28 13:27 packages.3.gz
-rw------- 1 zxmqu18 zx 303213 May 28 13:25 packages.4
-rw------- 1 zxmqu18 zx 52802 May 28 13:27 packages.4.bz2
-rw------- 1 zxmqu18 zx 66980 May 28 13:27 packages.4.gz
-rw------- 1 zxmqu18 zx 371491 May 28 13:25 packages.5
-rw------- 1 zxmqu18 zx 62824 May 28 13:27 packages.5.bz2
-rw------- 1 zxmqu18 zx 79203 May 28 13:27 packages.5.gz
-rw------- 1 zxmqu18 zx 452649 May 28 13:18 packages.du
-rw------- 1 zxmqu18 zx 74014 May 28 13:23 packages.du.bz2
-rw------- 1 zxmqu18 zx 91843 May 28 13:23 packages.du.gz
The packages.n files contain only n levels of subdirectories and are
therefore slightly smaller.
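How the packages.n files were produced is not shown. One way to trim
packages.du to n levels might look like the sketch below. It assumes
the file consists of "--- <file>" header lines followed by plain 'du'
lines, and that paths contain no spaces; the name "trim_levels" is
invented for this example.

```bash
#!/bin/bash
# Keep only du entries that are at most N directory levels deep.
# "." counts as level 0, "./etc" as level 1, and so on.
# usage: trim_levels N < packages.du > packages.N
trim_levels() {
    local max_depth=$1
    awk -v max="$max_depth" '
        /^---/ { print; next }            # package header lines pass through
        {
            path = $2
            depth = gsub("/", "/", path)  # slash count = nesting depth
            if (depth <= max) print
        }'
}
```

Since plain 'du' (without -S) already includes subdirectories in each
directory's total, dropping the deeper lines loses detail but not the
overall size.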
The file looks as follows (only the beginning):
--- admin/acct_6.3.2-4.deb
2 ./etc/cron.daily
2 ./etc/cron.monthly
4 ./etc/init.d
9 ./etc
29 ./usr/bin
12 ./usr/info
6 ./usr/man/man1
5 ./usr/man/man8
12 ./usr/man
49 ./usr/sbin
63 ./usr/doc/acct
64 ./usr/doc
2 ./usr/lib/menu
3 ./usr/lib
170 ./usr
1 ./var/account
2 ./var
182 .
--- admin/adjtimex_1.5-1.deb
1 ./usr/bin
33 ./usr/sbin
6 ./usr/man/man8
--
Date: 17 Mar 90 22:34:02 GMT From: merlyn@iwarp.intel.com (Randal
Schwartz) @X=split(//,'Just another Perl hacker,');*Y=*X;print @Y;
Manoj Srivastava <srivasta@acm.org> <http://www.datasync.com/%7Esrivasta/>
Key C7261095 fingerprint = CB D9 F4 12 68 07 E4 05 CC 2D 27 12 1D F5 E8 6E