Re: Debian's problems, Debian's future

On Wed, Mar 27, 2002 at 05:47:45PM -0600, Colin Watson wrote:
> On Wed, Mar 27, 2002 at 09:12:40PM +0100, Jeroen Dekkers wrote:
> > On Wed, Mar 27, 2002 at 07:35:26PM +0100, Eduard Bloch wrote:
> > > I cannot follow, provide links. I must guess, and I don't think that
> > > making multiple Packages files would improve the speed or RAM usage.
> > 
> > It also does other things, like making distribution creation more
> > flexible. I'm thinking of having some kind of package file for every
> > source package. That would include the current information and maybe a
> > lot more things like URL of upstream, license, etc. This file would be
> > stored in every package pool directory
> > (i.e. pool/main/f/foobar/Packages). 
> > 
> > Then we create a lot of bigger Packages files, only including the
> > packagename, version number and some other things which might be
> > useful (but not too much). Those bigger Packages files can be a lot
> > more flexible, for example we could have a different Package file for
> > different licenses, different upstream projects (gnome, kde, gnu, X,
> > etc), different use of machines (server, desktop), etc.
> > 
> > I'm not sure it will increase the speed really much, it would at least
> > make the Packages files a lot smaller.
> My considered guess (no, I haven't benchmarked it, please do if you want
> to dispute this ...) is that it would slow things down substantially for
> most users. Going back and forth with individual HTTP or FTP requests
> for every source package - 5000-odd when bootstrapping a system - will
> introduce huge amounts of latency, especially on slow connections. If
> you try to solve this by parallelizing downloads then you'll probably
> create unreasonable load on the server, especially in the case of
> servers like Apache which have trouble scaling well to large numbers of
> simultaneous connections. And you're still going to need a master index
> of all the miniature Packages files if you don't want to have to send
> If-Modified-Since: requests for all 5000-odd of them every time you
> update.

You need a master index, but you can build that master index more
flexibly. It can also be smaller, and could include descriptions or
not, etc. You can have many indexes for special purposes, for example
a 'server' index, a 'workstation' index, and others.
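
A minimal sketch of such a slimmed-down index, in Python. It assumes the
per-package files use the same "Field: value" stanza layout as today's
Packages files. The "Purpose" field and both function names are
hypothetical, invented only to illustrate the idea; nothing in the
archive works this way today.

```python
def parse_stanzas(text):
    """Parse control-style text into a list of dicts, one per stanza."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line, e.g. the long part of a Description.
                fields[key] += "\n" + line
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas


def slim_index(stanzas, keep=("Package", "Version", "Depends"), purpose=None):
    """Render an index holding only the fields in `keep`.

    If `purpose` is given, only stanzas whose (hypothetical) Purpose
    field lists it are included.
    """
    entries = []
    for stanza in stanzas:
        purposes = [p.strip() for p in stanza.get("Purpose", "").split(",")]
        if purpose is not None and purpose not in purposes:
            continue
        entries.append("\n".join(f"{k}: {stanza[k]}" for k in keep if k in stanza))
    return "\n\n".join(entries) + "\n" if entries else ""


# Made-up sample data, not real archive contents.
SAMPLE = """\
Package: foobar
Version: 1.0-1
Depends: libc6
Purpose: server
Description: example package
 A longer description that the slim index drops.

Package: bazgui
Version: 2.3-4
Purpose: workstation
Description: another example
"""

if __name__ == "__main__":
    print(slim_index(parse_stanzas(SAMPLE), purpose="server"))
```

The same parsed stanzas can feed any number of views: calling
slim_index() without a purpose gives a small index of everything, and
passing a different `keep` tuple decides whether descriptions are
included.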

Also, you don't need all those 5000 files if you just add the
necessary things to the main Packages file. People who aren't
interested in the extra information don't have to download it. People
who are interested can download the individual files if they just want
one or two descriptions, or download the big file with all of them if
they want everything.

> Fundamentally, streaming a large download is much faster than trying to
> download a hundred little bits a hundredth of the size.

Yes, that's true. But if you only want to upgrade your system you
don't need the file with all descriptions.
> Given that you can already create your own Packages files with different
> views into the package pool anyway, I'm afraid I don't see how splitting
> Packages like this is terribly useful or practical.

You can do other things, like adding a lot more tags, which would
bloat the main Packages file too much. It would also make the creation
of all the other Packages files easier.

Jeroen Dekkers
Jabber supporter - http://www.jabber.org Jabber ID: jdekkers@jabber.org
Debian GNU supporter - http://www.debian.org http://www.gnu.org
IRC: jeroen@openprojects
