
Re: split descriptions Re: PROPOSAL to sarge+1 - Split main in sub-repositories



On Sun, Sep 05, 2004 at 10:36:22AM +0200, Andrea Mennucc wrote:

> Matt Zimmerman wrote:
> >If you skipped downloading descriptions, then you would not have
> >descriptions for any new packages, and the ones that you have could be
> >wrong (some of them do contain version-specific data).
> 
> you would have short description at least
> 
> it would also be quite easy to add a web interface:
> APT would download the packages dependencies and  then download
> descriptions of packages that are missing (using the above interface)

And the mirrors would run this web interface as well?  Or all descriptions
would then be served from a single server?  Either way, it doesn't scale.

> >This is what general-purpose data compression does; there is no need to
> >invent a macro language.
> 
> quite on the opposite: I study and teach compression, and I can tell you
> that macros could provide a benefit

You say this, but you do not show it.  If the Packages file is suitably
ordered, gzip should come quite close to the efficiency of your macro
scheme, without the incredible increase in complexity that you propose,
which would then require implementation in the hundreds of tools which parse
the Packages file.
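
A minimal sketch of why general-purpose compression already handles repeated
text, assuming a made-up set of stanzas that share one block of description
boilerplate. The package names and the boilerplate are hypothetical, and the
sketch says nothing about how a macro scheme would compare; it only shows
that gzip collapses the repetition when the related stanzas sit close
together.

```python
import gzip

# Hypothetical boilerplate shared by a family of related packages.
BOILERPLATE = (
    " This package is part of the hypothetical foo suite, a collection of\n"
    " plugins that all repeat the same paragraph in their descriptions.\n"
)


def make_stanzas(count: int) -> bytes:
    """Build `count` Packages-style stanzas that repeat BOILERPLATE."""
    stanzas = []
    for i in range(count):
        stanzas.append(
            f"Package: foo-plugin-{i}\n"
            f"Description: plugin {i} for foo\n"
            f"{BOILERPLATE}\n"
        )
    return "".join(stanzas).encode("utf-8")


def gzip_ratio(data: bytes) -> float:
    """Return compressed size divided by original size."""
    return len(gzip.compress(data)) / len(data)


if __name__ == "__main__":
    sample = make_stanzas(50)
    print(f"raw bytes: {len(sample)}, gzip ratio: {gzip_ratio(sample):.2f}")
```

Deflate can only refer back to text within a 32 KB window, which is why the
ordering of the file matters: repeated text must appear near its earlier
copy to be compressed away.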

> depends on the point of view....
> 
> when APT memory-maps all those files, that is a LOT of memory (for older 
> systems): a macro language would decrease that and ease older systems 
> (and also newer ones)

APT doesn't mmap the Packages files.

> >The fact is, Debian unstable only gets
> >new versions of packages once a day.
> 
> if you call that "only"

I do.

> >A long time ago, I set up a cron job
> >which runs:    ......
> 
> your solution assumes that
> 1) you keep your PC on 24h/day
> 2) you do not pay for connection

It assumes neither.  It only assumes that you are capable of scheduling a
job to run at a time when it is convenient for you.  You can buy a device
for a few USD which switches current on/off based on the time of day, and
modern PCs can switch themselves on based on a BIOS setting.

Whether you pay for bandwidth is also irrelevant.  You would pay for that
bandwidth regardless of whether you download attended or unattended.  My
unstable system seems to download on the order of 10-20M of debs per day for
an upgrade, and about 2M of Packages files, so even if you could shrink the
Packages files by 50%, that saves only 5-10% of the total download.
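
The arithmetic behind that estimate can be sketched as follows, using the
figures quoted above (10-20M of debs, about 2M of Packages files, a 50%
reduction). The function name and parameters are illustrative only. Dividing
by debs plus Packages files gives slightly lower numbers than dividing by the
debs alone, but the conclusion is the same.

```python
def total_saving(debs_mb: float, packages_mb: float, reduction: float) -> float:
    """Fraction of the whole download saved by shrinking only the Packages files."""
    saved = packages_mb * reduction
    return saved / (debs_mb + packages_mb)


if __name__ == "__main__":
    for debs_mb in (10, 20):
        saving = total_saving(debs_mb, packages_mb=2, reduction=0.5)
        print(f"{debs_mb}M of debs: {saving:.1%} of the total download saved")
    # 10M of debs: 8.3% of the total download saved
    # 20M of debs: 4.5% of the total download saved
```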

As is so unfortunately common when someone offers a "solution" to a problem
like this in Debian, the "facts" are unverified, the claims misleading, and
the solutions are worse than the original problem.

If you would prefer not to download package descriptions, I suggest that you
provide Packages files with descriptions filtered as a service for yourself
and anyone who is interested.  Perhaps you can convince your local mirror to
host a symlink farm which lets you implement this easily.  I do not think
that this approach is appropriate for the official Debian archive.
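
For anyone who wants to try this, a filter along these lines might be a
starting point. It is only a sketch: it assumes the usual control-file layout
in which the Description field's first line is the short description and the
continuation lines (starting with a space or tab) hold the long text, and it
keeps the short description while dropping the rest. The function name and
the sample stanzas are made up for illustration.

```python
def strip_long_descriptions(packages_text: str) -> str:
    """Drop the continuation lines of every Description field, keep all else."""
    kept = []
    in_description = False
    for line in packages_text.splitlines():
        if line.startswith("Description:"):
            # Keep the short description; start skipping its continuation lines.
            in_description = True
            kept.append(line)
        elif in_description and line[:1] in (" ", "\t"):
            # Continuation line of the long description: skip it.
            continue
        else:
            # Any other field, or the blank line between stanzas.
            in_description = False
            kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical two-stanza sample in Packages-file style.
SAMPLE = (
    "Package: foo\n"
    "Version: 1.0\n"
    "Description: short summary of foo\n"
    " Long text line one.\n"
    " .\n"
    " Long text line two.\n"
    "\n"
    "Package: bar\n"
    "Description: short summary of bar\n"
    " More long text.\n"
)

if __name__ == "__main__":
    print(strip_long_descriptions(SAMPLE), end="")
```

The filtered file would then have to be compressed and published next to the
unmodified archive contents; that part is left out here.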

-- 
 - mdz


