Packages file (long email) [WAS: Splitting Packages]
I have done some analysis of the Packages file and tried to split and
optimize it to make it more usable, always keeping very old systems in
mind as the primary target (which does not mean that it doesn't also
help newer ones).
I'm pretty sure that some of the conclusions were already discussed, but
I will raise them again. Maybe another discussion/proposal can bring
out new ideas. BTW, this is only my idea ;)
As a first step, let's take a look at the structure of the Packages file.
The structure of each package entry can be optimized. These are just
"micro" optimizations, but repeated *numpkgs times they can improve
overall performance. I have noticed that there are somehow too many
exceptions inside the structure; look at 2 packages for example (bash and
ax25-xtools):
Maintainer: Patrick Ouellette <email@example.com>
Depends: libc6 (>= 2.2.4-4), libfltk1, libgl1, libstdc++2.10-glibc2.2
 (>= 1:2.95.4-0.010810), xlibs (>> 4.1.0)
Suggests: talkd, ax25-apps, ax25-tools
Description: AX-25 Tools (X versions)

Maintainer: Matthias Klose <firstname.lastname@example.org>
Replaces: bash-doc (<= 2.05-1), bash-completion
Depends: base-files (>= 2.1.12)
Pre-Depends: libc6 (>= 2.2.4-4), libncurses5 (>= 5.2.20020112a-1)
Description: The GNU Bourne Again SHell
The first thing that jumped out at me was the entry "Essential: yes".
Since there's already a priority structure, why not use that? Having an
"essential" priority makes more sense to me than having an exception that
needs to be parsed and that gives exactly the same result anyway.
The same concept can apply to many other things.
Source: ax25-tools, for example.
Why is there a Source field in an optional pkg and not in bash, for
example?
Now that's just the general idea that pushed me to think of a sort of "new"
structure:
1) where priority should include "essential"
2) Instead of using the different keywords such as Conflicts:, Pre-Depends:,
Replaces: etc. etc., I would rather suggest one line only that will
include all of them, where each pkg listed can be prefixed with some info, ex:
etc. etc. (just an example!)
so that it looks like
Depends: +base-files (>= 2.1.12), !libc6 (>= 2.2.4-4), *bash-completion
and so on...
In this way it is possible to save some lines in the file, and the parser
can benefit from that.
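A minimal sketch of how a parser could split such a combined line back
into the classic relations. The marker meanings are an assumption: they
are inferred only by comparing the example line with the bash entry shown
earlier ("+" as Depends, "!" as Pre-Depends, "*" as Replaces), and the
names MARKERS and parse_combined are made up for illustration.

```python
import re

# ASSUMPTION: marker meanings are inferred from the single example line in
# this mail compared with the bash entry; the proposal leaves them open.
MARKERS = {"+": "Depends", "!": "Pre-Depends", "*": "Replaces"}

# One entry: marker, package name, optional "(constraint)".
ENTRY_RE = re.compile(
    r"^(?P<marker>[+!*])(?P<name>[^\s(]+)\s*(?:\((?P<constraint>[^)]*)\))?$"
)


def parse_combined(line):
    """Split a combined 'Depends:' line into {relation: [(name, constraint)]}."""
    field, _, value = line.partition(":")
    if field.strip() != "Depends":
        raise ValueError("not a combined Depends line: %r" % line)
    relations = {}
    for raw in value.split(","):
        entry = raw.strip()
        if not entry:
            continue
        match = ENTRY_RE.match(entry)
        if match is None:
            raise ValueError("cannot parse entry: %r" % entry)
        relation = MARKERS[match["marker"]]
        relations.setdefault(relation, []).append(
            (match["name"], match["constraint"])
        )
    return relations


if __name__ == "__main__":
    example = "Depends: +base-files (>= 2.1.12), !libc6 (>= 2.2.4-4), *bash-completion"
    for relation, packages in parse_combined(example).items():
        print(relation, packages)
```

The point of the sketch is only that the parser has one field to look for
and a one-character dispatch per package, instead of several optional
keywords per entry.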
My idea of having this small and static structure is to remove as much
redundant info as possible, like Source, Arch, Maintainer, and to optimize
the file parser
1) reduce the general size of the file
2) reduce time to parse the file
3) even reduce the flexibility a bit (yeah I know... but why lie???)
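To get a feeling for the size effect, here is a rough sketch that parses
one entry and drops the fields named above. It is only an illustration:
parse_stanza and slim_stanza are hypothetical helpers, "Architecture" is
assumed to be the field meant by "Arch", and the sample entry in the usage
part is a shortened, made-up one.

```python
# ASSUMPTION: these are the fields considered redundant in this proposal;
# "Architecture" stands in for what the mail calls "Arch".
REDUNDANT = {"Source", "Architecture", "Maintainer"}


def parse_stanza(text):
    """Parse one 'Field: value' entry, keeping continuation lines attached."""
    fields = {}
    last = None
    for line in text.splitlines():
        if line[:1] in (" ", "\t") and last is not None:
            # Continuation lines start with whitespace and extend the last field.
            fields[last] += "\n" + line
        elif ":" in line:
            name, _, value = line.partition(":")
            last = name.strip()
            fields[last] = value.strip()
    return fields


def slim_stanza(text, drop=REDUNDANT):
    """Return the entry re-serialized without the fields listed in drop."""
    kept = {k: v for k, v in parse_stanza(text).items() if k not in drop}
    return "\n".join("%s: %s" % (k, v) for k, v in kept.items())


if __name__ == "__main__":
    entry = (
        "Package: bash\n"
        "Maintainer: Some Maintainer <maintainer@example.org>\n"
        "Depends: base-files (>= 2.1.12)\n"
        "Description: The GNU Bourne Again SHell"
    )
    slim = slim_stanza(entry)
    print(slim)
    print(len(entry), "->", len(slim), "bytes")
```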
For keywords like Section/Priority/Version I made some notes at the end of
the mail that might be interesting.
The next step in my idea is to split the Packages file into several files
according to Section and Priority.
I have figured out 2 possible scenarios.
First scenario (keywords: Section/Priority)
one entry in the sources.list will look like:
deb http://<mirror>/debian unstable main/base/* main/net/important contrib/*
where * means all the priorities that belong to main/base/ and contrib/
Second scenario (keywords: Priority/Section)
one entry in the sources.list will look like:
deb http://<mirror>/debian unstable main/essential/* contrib/optional/web/*
where * means all the sections that exist in that priority.
Deciding which one is better than the other is very difficult, but don't
forget that they can really easily be implemented in parallel because they can
The Packages file that resides in main/net/optional is exactly the same
as the one in main/optional/net/, so a symlink is more than enough, giving
people the freedom to choose the way they prefer. apt can easily take care
of avoiding duplicate downloads of Packages files.
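A small sketch of how duplicate downloads could be avoided when both
layouts are offered side by side. It reduces each path from the
sources.list to the same (component, section, priority) key regardless of
the order used. Assumptions: section names never collide with priority
names, the priority list is the usual Debian one plus the "essential"
proposed above, and canonical_key / unique_downloads are hypothetical
names, not anything apt provides.

```python
# ASSUMPTION: the usual priority names plus the proposed "essential".
PRIORITIES = {"essential", "required", "important", "standard", "optional", "extra"}


def canonical_key(spec):
    """Map 'component/section/priority' or 'component/priority/section'
    to one (component, section, priority) tuple; '*' means 'all'."""
    parts = spec.strip("/").split("/")
    component, rest = parts[0], parts[1:]
    priority = next((p for p in rest if p in PRIORITIES), "*")
    section_parts = [p for p in rest if p not in PRIORITIES and p != "*"]
    section = "/".join(section_parts) if section_parts else "*"
    return (component, section, priority)


def unique_downloads(specs):
    """Keep the first spec seen for every distinct Packages file."""
    seen = {}
    for spec in specs:
        seen.setdefault(canonical_key(spec), spec)
    return list(seen.values())


if __name__ == "__main__":
    wanted = ["main/net/optional", "main/optional/net/", "contrib/*"]
    # The first two name the same Packages file, so only one is fetched.
    print(unique_downloads(wanted))
```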
Now Ben Armstrong pointed out 1 problem: What if I need only ONE package
that is not in a section mentioned in the sources.list???
Well, I thought about 2 possible solutions that can coexist at the same
time; honestly, I don't find them to be "the best" (I will really appreciate idea
One could be an external file in /etc/apt where people can specify
that apt should take care of.
Two could be the possibility to specify single pkgs directly in the
sources.list extending the entry to (ex)
Now, before yelling, since I'm 100% sure that all of you will ask:
what about the version control/dependencies of that pkg??? ;)
please continue reading :)
The general approach of dividing the Packages file raises one issue:
What about dependencies???
Keeping the current way of handling dependencies will break for one
reason: if a pkg depends on another package that is in a section not
listed in the sources.list, then there's a problem.
The solution is to have in /main /contrib /non-free a file called Available.
That file should contain one entry per line, and every line should look like:
According to some stats I made over unstable/main it does not get bigger
than 50KB (.gz), so I think it's an acceptable size for over 8300 pkgs.
This file can really be used for many things.
1) can keep version control for single pkgs (here we go, and since
are declared also inside the .deb a small check will avoid to
Packages file for that section/priority)
2) can be used to correct broken dependencies (read above)
3) fast search over the entire archive even if a specific
not present in the sources.list
4) diffed against the old one, it can be used to decide which Packages files
be downloaded according to the sources.list (this can be an interesting
to analyze in order to reduce network load, probably even for mirroring
5) permits removing the Section / Priority / Version lines from the
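A rough sketch of uses 2) and 4), just to show that the logic is small.
The line layout used here ("name version section priority", separated by
whitespace) is purely an assumption made for this sketch, not the layout
proposed in this mail, and the package versions and sections in the usage
part are made up. Entry, parse_available, stale_buckets and buckets_for
are hypothetical names.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    version: str
    section: str
    priority: str


def parse_available(text):
    """Parse an Available-style index into {package name: Entry}.

    ASSUMPTION: one package per line, four whitespace-separated fields:
    name, version, section, priority.
    """
    index = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        name, version, section, priority = fields
        index[name] = Entry(version, section, priority)
    return index


def stale_buckets(old, new):
    """Use 4): (section, priority) pairs whose Packages file changed."""
    stale = set()
    for name in old.keys() | new.keys():
        before, after = old.get(name), new.get(name)
        if before == after:
            continue
        for entry in (before, after):
            if entry is not None:
                stale.add((entry.section, entry.priority))
    return stale


def buckets_for(index, names):
    """Use 2): find where dependencies live, even outside sources.list."""
    return {(index[n].section, index[n].priority) for n in names if n in index}


if __name__ == "__main__":
    # Made-up data, only to exercise the functions.
    old = parse_available("bash 1.0 base required\nax25-xtools 1.0 hamradio optional\n")
    new = parse_available(
        "bash 1.1 base required\nax25-xtools 1.0 hamradio optional\nnewpkg 0.1 net optional\n"
    )
    print(sorted(stale_buckets(old, new)))
    print(buckets_for(new, ["bash", "not-in-archive"]))
```

Only the Packages files for the pairs returned by stale_buckets that are
also requested in the sources.list would need to be fetched again.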
Now in conclusion. Using this approach will:
1) increase flexibility to handle pkgs/archive etc.
2) reduce in general network load
3) increase performance in order to handle small systems (ALWAYS
what the admin expects from a small system!)
4) require a rewrite of code in many pkgs for its implementation
(but this is also the case if another implementation is chosen)
5) the transition to this system is smooth since it's a reimplementation
of what is already in place
Well, I hope I didn't waste my time. I'm sorry that I didn't manage to
fake archive to show how it looks, but all these conclusions/ideas came out
trying to build a script to generate it. The script is far from being nice,
and fast, so it doesn't make sense for me to show it (IT'S SO HUGLYYYY!!! ;)
like my idea ehehhe)