On Thu, Apr 11, 2002 at 10:40:31PM -0700, Robert Tiberius Johnson wrote:
> On Wed, 2002-04-10 at 02:28, Anthony Towns wrote:
> > I'd suggest your formula would be better off being:
> > bandwidthcost = sum( x = 1..30, prob(x) * cost(x) / x )
> I think it depends on what you're measuring. I can think of two ways to
> measure the "goodness" of these schemes (there are certainly others):
>
> 1. What is the average bandwidth required at the server?
> 2. What is the average bandwidth required at the client?
I don't think the bandwidth at the server is a major issue to anyone,
although obviously improvements there are a Good Thing.
Personally, I think "amount of time spent waiting for apt-get update
to finish" is the important measure (well, "apt-get update; apt-get
dist-upgrade" is important too, but I don't think we've seen any feasible
ideas for improving the latter).
> prob2(i)=(prob1(i)/i)*norm,
>
> where norm is a normalization factor so the probabilities sum to 1.
> I've been looking at question 2, and you're suggesting that I look at
> question 1, except you forgot the normalization factor. I think this is
> what you mean. Please correct me if I've misunderstood.
No, I'm not. I'm saying that "the amount of time spent waiting for
apt-get update" needs to count every apt-get update you run, not just
the first. So if, over a period of a week, I run it seven times and you
run it once, I wait seven times as long as you do, making it seven times
more important to speed things up for me than for you.
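To make the difference concrete, here's a little Python sketch of the
measures we're each talking about. The prob() and cost() functions are
made-up placeholders for illustration, not numbers from either of our
mails:

    # prob(x): fraction of clients that run "apt-get update" every x days
    # cost(x): bytes such a client must fetch to catch up (toy diff scheme:
    #          x daily diffs if they're still on the server, otherwise the
    #          whole Packages.gz). All figures here are illustrative only.

    def prob(x):
        return 1.0 / 30                  # assume intervals 1..30 equally likely

    def cost(x, days_kept=15, diff=12000, full=1800000, overhead=1000):
        if x <= days_kept:
            return x * (diff + overhead) # fetch each missing daily diff
        return full + overhead           # fall back to the full file

    days = range(1, 31)

    # Robert's measure 2: average bytes per update run, for a random client.
    per_run = sum(prob(x) * cost(x) for x in days)

    # My formula: weight each client by how often it runs apt-get update
    # (1/x runs per day), so frequent updaters count proportionally more.
    per_day = sum(prob(x) * cost(x) / x for x in days)

    # Robert's prob2 normalisation turns that back into a per-download average.
    per_download = per_day / sum(prob(x) / x for x in days)

    print(per_run, per_day, per_download)

(With or without the normalisation, the ranking of schemes comes out the
same, since the normalisation factor depends only on prob(), not on the
scheme being compared.)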
> Anyway, here are the results you asked for. I'm NOT including the
> normalization factor for easier comparison with your numbers. My diff
> numbers are a little different from yours mainly because I charge 1K of
> overhead for each file request.
Merging, and reordering by decreasing estimated bandwidth. The ones marked
with *'s aren't worth considering, because there's a method that both
requires less bandwidth and takes up less diskspace. The ones without
stars are thus ordered by increasing diskspace and decreasing bandwidth.
> days/
> bsize    dspace      ebwidth
> -------------------------------
Having the "ebwidth" of the current situation (everyone downloads the
entire Packages file) for comparison would be helpful.
>     1   12.000K      342.00K  [diff]
>    20   312.50K   *  173.70K  [cksum/rsync]
>     2   24.000K   *  171.20K  [diff]
>     3   36.000K   *  95.900K  [diff]
>    40   156.30K   *  89.300K  [cksum/rsync]
>    60   104.20K   *  62.200K  [cksum/rsync]
>     4   48.000K   *  58.500K  [diff]
>    80   78.100K   *  49.300K  [cksum/rsync]
>   100   62.500K   *  42.200K  [cksum/rsync]
>     5   60.000K   *  38.800K  [diff]
>   120   52.100K   *  37.900K  [cksum/rsync]
>   400   15.600K      37.700K  [cksum/rsync]
>   380   16.400K      36.800K  [cksum/rsync]
>   360   17.400K      35.900K  [cksum/rsync]
>   140   44.600K   *  35.300K  [cksum/rsync]
>   340   18.400K      35.100K  [cksum/rsync]
>   320   19.500K      34.300K  [cksum/rsync]
>   300   20.800K   *  33.600K  [cksum/rsync]
>   160   39.100K   *  33.600K  [cksum/rsync]
>   280   22.300K      33.000K  [cksum/rsync]
>   180   34.700K   *  32.700K  [cksum/rsync]
>   260   24.000K      32.500K  [cksum/rsync]
>   240   26.000K      32.200K  [cksum/rsync]
>   200   31.300K   *  32.200K  [cksum/rsync]
>   220   28.400K      32.100K  [cksum/rsync]
>     6   72.000K      27.900K  [diff]
>     7   84.000K      21.800K  [diff]
>     8   96.000K      18.200K  [diff]
>     9   108.00K      16.100K  [diff]
>    10   120.00K      14.900K  [diff]
>    11   132.00K      14.100K  [diff]
>    12   144.00K      13.700K  [diff]
>    13   156.00K      13.400K  [diff]
>    14   168.00K      13.300K  [diff]
>    15   180.00K      13.100K  [diff]
180k is roughly 10% of the size of the corresponding Packages.gz, so it's
relatively trivial. Since we'll probably do it at the same time as
dropping the uncompressed Packages file (sid/main/i386 alone is 6MB),
this is pretty negligible.
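As for the "ebwidth" of the current situation I asked about above: with
the same weighting it's just the full Packages.gz download scaled by how
often people run apt-get update. A rough sketch, where the ~1.8MB
Packages.gz size is inferred from the 10% figure above and the flat
distribution of update intervals is purely an assumption:

    # Baseline: everyone downloads the whole Packages.gz on every update,
    # so the cost is constant regardless of how stale the client is.
    FULL = 1800000      # bytes; ~10x the 180K quoted above (assumed)
    OVERHEAD = 1000     # per-request overhead, as in Robert's figures

    def prob(x):
        return 1.0 / 30 # made-up flat distribution of update intervals

    baseline = sum(prob(x) * (FULL + OVERHEAD) / x for x in range(1, 31))
    print("baseline ebwidth: %.1fK" % (baseline / 1000.0))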
Cheers,
aj
--
Anthony Towns <aj@humbug.org.au> <http://azure.humbug.org.au/~aj/>
I don't speak for anyone save myself. GPG signed mail preferred.
``BAM! Science triumphs again!''
-- http://www.angryflower.com/vegeta.gif