
Re: big Packages.gz file



From: Hamish Moffatt <hamish@debian.org>
Subject: Re: big Packages.gz file
Date: Tue, 9 Jan 2001 23:40:01 +1100

> On Tue, Jan 09, 2001 at 06:04:58PM +0900, Miles Bader wrote:
> > The packages file gets downloaded _every single time_ you do an update,
> > and for those of us with a slow modem link, that really sucks.

This is only a small part of the whole story, IMHO. See my other email
replying to you. ;)

> Maybe there could be another version of Packages.gz without the
> extended descriptions -- I imagine they would take something like
> 33% of the Packages file, in line count at least.

Exactly. With the DIFF or RSYNC method for APT (as Goswin pointed out),
or by just separating the Descriptions out (as I pointed out, and as you
suggest too), nearly 66% of the bits are saved. But this is only a
hack, albeit an efficient one.

Because this does not solve the package pool's problem within the
package pool system itself. It only works around it on the protocol and
client tool side.
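
To make the Descriptions hack concrete, here is a minimal sketch in
Python. It assumes the usual Packages layout (stanzas separated by a
blank line, continuation lines starting with whitespace). The function
name split_descriptions is hypothetical, not part of APT or any archive
tool.

```python
def split_descriptions(packages_text):
    """Split a Packages-style index into (slim_index, descriptions).

    slim_index is the same text with every Description field removed;
    descriptions maps each package name to its description text.
    """
    slim_stanzas = []
    descriptions = {}
    # Assumption: stanzas are separated by exactly one blank line.
    for stanza in packages_text.strip().split("\n\n"):
        name = None
        kept = []
        desc_lines = []
        in_desc = False
        for line in stanza.splitlines():
            if line.startswith("Description:"):
                in_desc = True
                desc_lines.append(line[len("Description:"):].strip())
            elif in_desc and line[:1] in (" ", "\t"):
                # Continuation line of the extended description.
                desc_lines.append(line.strip())
            else:
                in_desc = False
                kept.append(line)
                if line.startswith("Package:"):
                    name = line.split(":", 1)[1].strip()
        slim_stanzas.append("\n".join(kept))
        if name is not None:
            descriptions[name] = "\n".join(desc_lines)
    return "\n\n".join(slim_stanzas) + "\n", descriptions
```

The slim index could then be fetched on every update, and the
descriptions only on demand. It still changes nothing on the pool side.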

1) AIUI, the package pool should be a storage system, with a smart
algorithm for deleting packages that no distribution or other package
references. (Garbage collection by reference counts.)

2) A distribution, setting aside the work of our honoured release
manager, should be a partial package index listing. Thus, it should be
separated from the storage system. The current ``testing'' distribution
doesn't do it well enough. (Thus, it has a regulation on upload
frequency.)
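
The two points above can be sketched in Python as follows. This is
only an illustration under my own assumptions: the pool is modelled as
a dict from a package id to the ids it depends on, a distribution is
just a list of package ids, and the name collect_garbage is
hypothetical. It is not how any existing archive tool works.

```python
from collections import Counter

def collect_garbage(pool, distributions):
    """Return the sorted package ids in the pool that nothing references.

    pool:          {pkg_id: [pkg_ids it depends on]}   (the storage system)
    distributions: {dist_name: [pkg_ids it lists]}     (the index listings)
    """
    refcount = Counter({pkg: 0 for pkg in pool})
    # References held by distribution index listings.
    for index in distributions.values():
        for pkg in set(index):
            refcount[pkg] += 1
    # References held by other packages in the pool.
    for deps in pool.values():
        for dep in set(deps):
            refcount[dep] += 1
    dead = []
    queue = [pkg for pkg in pool if refcount[pkg] == 0]
    while queue:
        pkg = queue.pop()
        dead.append(pkg)
        # Deleting a package releases the references it held.
        for dep in set(pool[pkg]):
            refcount[dep] -= 1
            if refcount[dep] == 0 and dep in pool:
                queue.append(dep)
    return sorted(dead)
```

With this split, dropping a package from a distribution only edits an
index; the pool deletes the file later, once the count reaches zero.
Plain reference counting never frees packages that depend on each
other in a cycle, so a real pool would need more than this.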

With these two things in mind, RSYNC can help very little, and the
package pool's indexing problem remains. In my previous letters, I
tried to start a discussion of one of my humble attempts to help. ;)

As soon as I have enough time, and enough discussion, I may write a
more thoroughly prepared document. But I need discussion first. Thanks!

--
echo <<EOF |cpp - -|egrep -v '(^#|^$)'
/*   =|=X ++
 *   /\+_ p7 <zw@debian.org> */
EOF


