
Using "language packs" in Debian Installer

On Saturday 09 September 2006 13:28, Javier Fernández-Sanguino Peña wrote:
> Since there was some misconception of what language packages are, I
> have written a description of what they are and how they are used.

IMO language packages as described by Javier are not the correct solution 
for the installer for two important reasons:
- we basically want to keep supporting all translations [1], so having the
  option to install language packs is not the issue;
- udebs are normally not uploaded in sync, so keeping strings in language
  packs in sync with all udebs would be a major pain unless you
  create a language pack per (group of) languages per udeb; however, the
  resulting explosion in the number of udebs would be a major issue all
  by itself.

There are currently two issues with having so many translations in the 
installer, both of which need to be addressed:
1) all strings for all translations are included uncompressed by default
   in /var/lib/cdebconf/templates.dat; as the filesystem is kept in
   memory, this takes up a _lot_ of memory (currently 8MB!) that could
   be put to better use considering that at most three languages (the
   selected language, a fallback language and English) will actually be
   used in an install;
2) cdebconf currently loads the whole templates file into its memory, so
   effectively 16MB! is currently used for translations (#329743).

For the rest of this mail I will concentrate on the first issue as the 
second one is pretty well documented in that bug report.

What would help very much is if the unused part of templates.dat could be 
compressed [2] and translations only copied in as needed when a different 
language is selected in localechooser.
I have done some tests and this would reduce the size of templates.dat to 
about 500 kB, with the compressed file(s) with unused translations taking 
about 2.5MB, so a net saving of 5MB!

There are a few places where support for this would need to be added:
- while building udebs: you'd need to split the templates file into
  English (uncompressed) and translations (compressed);
- in the build system of the installer: for udebs included into the
  initrd, the uncompressed templates file would be treated as usual,
  the compressed one would be saved in a specific location (possibly
- you'd need a program that can be called from localechooser that
  extracts the selected language (and its fallbacks) from available
  compressed templates files and merges them into the regular one;
- udpkg would need to install the compressed templates file in the
  correct location when a new udeb is installed;
- you'd need a program that can be called from udpkg that will merge
  the currently selected language (and fallbacks) from the compressed
  file into the regular one when a new udeb is installed;
- probably lowmem would need to be changed (or could be simplified).
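
To make the split and merge steps above a bit more concrete, here is a
rough sketch of the logic in Python. It is only an illustration, not a
proposed implementation (the real tool would be written in C). It assumes
that templates are blank-line separated stanzas with a "Name" field and
that translated fields are named like "Description-de.UTF-8". The function
names, the sample data and the choice of one gzip blob per language are
all made up for this example.

```python
import gzip
import re

# Assumption: translated fields look like "Description-de.UTF-8".
LOCALIZED = re.compile(r"^(?P<field>[A-Za-z_]+)-(?P<lang>[A-Za-z_@]+)\.UTF-8$")

# Made-up sample data, not taken from a real udeb.
SAMPLE = (
    "Name: demo/language\n"
    "Type: select\n"
    "Description: Choose a language:\n"
    "Description-de.UTF-8: Sprache wählen:\n"
    "Description-fr.UTF-8: Choisissez une langue :\n"
    "\n"
    "Name: demo/hostname\n"
    "Type: string\n"
    "Description: Hostname:\n"
    "Description-de.UTF-8: Rechnername:\n"
)


def parse_stanzas(text):
    """Parse blank-line separated stanzas into lists of (key, value)."""
    stanzas = []
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block:
            continue
        fields = []
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and fields:
                # Continuation line: keep it attached to the previous field.
                key, value = fields[-1]
                fields[-1] = (key, value + "\n" + line)
            else:
                key, _, value = line.partition(":")
                fields.append((key, value.strip()))
        stanzas.append(fields)
    return stanzas


def format_stanzas(stanzas):
    """Serialize stanzas back to text."""
    return "\n\n".join(
        "\n".join(f"{key}: {value}" for key, value in fields)
        for fields in stanzas
    ) + "\n"


def split_templates(text):
    """Return (english_text, {lang: gzip-compressed stanzas})."""
    english, per_lang = [], {}
    for fields in parse_stanzas(text):
        name = dict(fields)["Name"]
        kept = []
        for key, value in fields:
            match = LOCALIZED.match(key)
            if match is None:
                kept.append((key, value))
            else:
                by_name = per_lang.setdefault(match["lang"], {})
                by_name.setdefault(name, [("Name", name)]).append((key, value))
        english.append(kept)
    packed = {
        lang: gzip.compress(format_stanzas(list(by_name.values())).encode("utf-8"))
        for lang, by_name in per_lang.items()
    }
    return format_stanzas(english), packed


def merge_languages(english_text, packed, languages):
    """Merge the given languages back into the uncompressed templates."""
    stanzas = parse_stanzas(english_text)
    index = {dict(fields)["Name"]: fields for fields in stanzas}
    for lang in languages:
        blob = packed.get(lang)
        if blob is None:
            continue  # no translation shipped for this language
        for fields in parse_stanzas(gzip.decompress(blob).decode("utf-8")):
            target = index.get(dict(fields)["Name"])
            if target is not None:
                target.extend(kv for kv in fields if kv[0] != "Name")
    return format_stanzas(stanzas)


english, packed = split_templates(SAMPLE)
merged = merge_languages(english, packed, ["de"])
```

Here "english" holds what would stay uncompressed, "packed" holds what
would be stored compressed, and "merged" is what the regular templates
file would contain after German is selected. The language list passed to
merge_languages is where the fallback language(s) would go as well.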

The programs (or program) to merge from a compressed templates file to the 
regular one should be written in C and should probably be part of 

So basically my proposal would be not to implement language packs as 
described by Javier, but rather make sure that unused translations are 
saved compressed so that they don't take up space unnecessarily.
Note that implementing this would also make the second issue mentioned in 
the beginning much less urgent (though still very much worth fixing).

This is of course only a proposal; if people have alternative solutions 
that would get the same memory savings, please bring them forward.

I very much hope that someone will be willing to work on this (my 
knowledge of C is definitely not enough to implement this).
If this is not tackled soon (probably post-Etch though), we may be forced 
to freeze the addition of new languages in the installer.


[1] At least until it really becomes totally impossible to do so.
