
split descriptions Re: PROPOSAL to sarge+1 - Split main in sub-repositories



hi

I encountered the same problem.

So I also had this solution in mind: split out the descriptions.

Here is a summary of the idea:

- Change the file "Packages" so that it does not contain the descriptions
 (it compresses MUCH better since it then has a limited vocabulary!
  see below)

- Provide several "Descriptions.lang" files that contain the
 descriptions of the packages in different languages

- We may moreover "compress" the descriptions by using macros, see below

Here I explain this in more detail.

----
 There is a HUGE saving in downloads. Try the following comparison:

 cp /var/lib/apt/lists/*sarge_main_binary-i386_Packages /tmp/main_all
 cd /tmp
 egrep  '^ |^Description|^Packa' main_all >  main_descr
 egrep -v '^ |^Description' main_all >  main_data
 ls -s main_*
 bzip2 -v main_*

As you can see, the above gives impressive savings: due to the
 limited vocabulary of main_data, the file main_data.bz2 is
 38% of the size of main_all.bz2, amounting to only 800kB
 (whereas the uncompressed main_data is 50% of main_all).
 Even the sum of main_data.bz2 and main_descr.bz2
 is 4% smaller than main_all.bz2.
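
The comparison above can be wrapped into a few reusable functions. This
is only a sketch: the function names (split_packages, bz_size, percent)
are made up for illustration, and it assumes that Description is the only
multi-line field in the Packages file, so that every line starting with a
space belongs to a description.

```shell
# split_packages INPUT DATA_OUT DESCR_OUT
# Hypothetical helper: split a Packages file into dependency data and
# descriptions, in the same way as the egrep commands above.
split_packages() {
    # Description fields and their continuation lines (leading space),
    # plus the Package: lines needed to tell the stanzas apart.
    grep -E '^ |^Description|^Package:' "$1" > "$3"
    # Everything else: names, versions, dependencies, checksums, ...
    grep -E -v '^ |^Description' "$1" > "$2"
}

# bz_size FILE: size in bytes of FILE after bzip2 compression.
bz_size() {
    bzip2 -c "$1" | wc -c
}

# percent PART WHOLE: PART as an integer percentage of WHOLE (rounded down).
percent() {
    echo $(( $1 * 100 / $2 ))
}

# Possible usage, with the files from the comparison above:
#   split_packages main_all main_data main_descr
#   percent "$(bz_size main_data)" "$(bz_size main_all)"
```

The second usage line prints the compressed size of main_data as a
percentage of the compressed size of main_all; the exact figure depends
on the Packages file used.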

---
 People who use other languages do not need to download the English
 descriptions; they download only the descriptions in their own language.

----
 Descriptions do not change that often, and changes to them are not
 relevant to the dependency system:
 we may add a flag in APT so that people using slow modems can
 decide to download the file "Descriptions.lang" only if their copy is
 older than 2 weeks, or to skip it, or whatever.
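
The age check could look roughly like this. It is a sketch only: APT has
no such flag today, and the function name needs_refresh is hypothetical.
Note that find(1) rounds file ages down to whole days, so "+14" means
"at least 15 days old".

```shell
# needs_refresh FILE MAX_AGE_DAYS
# Hypothetical helper: succeed (exit status 0) if FILE is missing or was
# last modified more than MAX_AGE_DAYS days ago, fail otherwise.
needs_refresh() {
    # A missing file always has to be downloaded.
    [ -e "$1" ] || return 0
    # find prints the path only when the file is old enough.
    [ -n "$(find "$1" -mtime +"$2")" ]
}

# Possible usage (the path is only an example):
#   if needs_refresh /var/lib/apt/lists/Descriptions.en 14; then
#       echo "would download Descriptions.en again"
#   fi
```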

---
 APT will work faster! Indeed, APT memory-maps the Packages files,
 and Packages files without descriptions are half the size.

---
 We will save hard disk space.
 On 20 August I tried to install Debian on an old notebook with 800MB
 memory and I failed; one reason was that I had 24196MB used
 by /var/lib/apt

---
 Many times the description is just a standard part, describing the
 source package, plus a standard part, saying that this is the library
 of that package; so I propose a macro system such as the following.

old version of description:

vvvvvvvvvvv
Source: foo
Description: Foo really does it
 Foo is a package that really does it , blah blah

Package: libfoo
Description: library from Foo
 Foo is a package that really does it , blah blah
 .
 This package contains the library from package foo.

Package: libfoo-dev
Description: library from Foo
 Foo is a package that really does it , blah blah
 .
 This package contains the development files from package foo.
^^^^^^^^^^^^^^^


new version of description, as found in "debian/control" and in
"Descriptions.lang"
vvvvvvvvvvv
Source: foo
Description: Foo really does it
 Foo is a package that really does it

Package: libfoo
Description: library from Foo
 @SRC(foo)@
 .
 @LIB(foo)@

Package: libfoo-dev
Description: devel library from Foo
 @SRC(foo)@
 .
 @DEV(foo)@
^^^^^^^^^^^

Note that currently the "description of a source package"
is in debian/control, but it
 is NOT distributed: I propose to distribute that as well.
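
To make the macro idea concrete, here is a rough sketch of an expander.
The macro names are the ones proposed above; the texts produced for
@LIB()@ and @DEV()@ are taken from the old-style example, and the
function name expand_descriptions is hypothetical. It assumes that the
Source stanzas are in the same file as the Package stanzas and that each
macro stands alone on its line.

```shell
# expand_descriptions FILE
# Hypothetical helper: print FILE with @SRC(x)@, @LIB(x)@ and @DEV(x)@
# replaced by the corresponding description text.
expand_descriptions() {
    # The file is read twice: once to collect, once to expand.
    awk '
    # Pass 1: remember the long description of every Source stanza.
    NR == FNR {
        if (/^Source: /)                src = substr($0, 9)
        else if (/^Package: / || /^$/)  src = ""
        else if (/^ / && src != "")
            body[src] = body[src] (body[src] == "" ? "" : "\n") $0
        next
    }
    # Pass 2: expand lines that consist of exactly one macro.
    /^ @(SRC|LIB|DEV)\([^)]*\)@$/ {
        kind = substr($0, 3, 3)
        name = substr($0, 7, length($0) - 8)
        if (kind == "SRC") {
            # Leave the macro untouched if the source is unknown.
            if (name in body) print body[name]; else print
        } else if (kind == "LIB")
            print " This package contains the library from package " name "."
        else
            print " This package contains the development files from package " name "."
        next
    }
    { print }
    ' "$1" "$1"
}

# Possible usage:
#   expand_descriptions Descriptions.en
```

Run on the new-style example above, this gives back the long
descriptions of the old-style example; the short descriptions are left
exactly as written.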



a.





Daniel Ruoso wrote:
Hi,

One thing everybody knows and many worry about is that the Packages file
is getting too big. The bad news (or good news) is that it tends to grow
even more, as most people would agree.

I'm having a weird experience trying to keep a 486 with 8MB RAM
updated in sid. The three main problems are: I have to download through
PPP on the serial line; I have only 8MB of RAM, so I have to create a
specific Packages file with a small set of packages in my local mirror;
and I have to keep this file updated, or else I would have to wait a few
days to process the entire Packages file.

What I planned is the following...
split main into the sections inside it, so as to have separate Packages
files for each section, so I can choose which sections I don't want at
all, like kde and gnome, and maybe even games, or sound (since I don't
have a sound card in this computer)...

The potential advantage is that this could be used on any computer. For
example, I don't use KDE and I always ensure that I don't have any kde
library installed (I have something personal against it :), so I could
just avoid downloading the kde Packages.gz file, saving not only
bandwidth, but also memory when processing it.

What do you think?




