Re: Re: Google Summer of Code 2009: Debian's Shortlist
> On 2009-04-11, Filipus Klutiero <firstname.lastname@example.org> wrote:
> > Obey Arthur Liu wrote:
> >> === And the details: ===
> > [...]
> > These descriptions are very short. Assuming these are the abstracts,
> > that's not the students' fault. The abstracts were shortened this year
> > to 500 characters. I struggled to shorten mine to fit this. At this
> > length, it's probably impossible to fit a decent summary of most
> > projects. It would normally make sense to use abstracts for this use
> > case. Maybe Google should be asked to change the limit. Otherwise I'd
> > like to see a custom description which describes a little further. I
> > currently can't comment on all projects presented.
> > That said, this shortlist remains useful, and I thank you for this great
> > jump in transparency.
> Mind to tell us what your proposed project was? For more transparency?
> Kind regards,
> Philipp Kern
Here is my application, with only the personal section removed. I will not
submit this idea again. Students should feel free to use this application as
they wish. The project needs a student with a good understanding of Debian
package management. Most importantly, a mentor familiar with APT needs to be
found; such mentors have been very scarce over the last 3 years.
= Project Title =
Improved package management of language packs
= Origins =
There are currently two methods to distribute localized data:
* Bundling localized data for all languages with the application package. The
main issue with this approach is the size of the package.
* Architecture-independent packages associated with the application packages
providing localized data. There is typically one package per language. Since
they 'enable' the translation of that software when installed, they
are called language packages or language packs. The main issue with this
approach is that the language package for an application is not installed
automatically.
The first method is suboptimal while the second is less usable.
For more information, see
= Project =
The intention is to optimize distribution of localized data by improving the
usability of the second method, which should encourage its use and diminish
the usage of the first method.
Concretely, the first goal is that installation of language packs happens
automatically. For example, French-speaking users should get
openoffice.org-l10n-fr installed automatically when openoffice.org is installed.
The second goal is to control the effects of growing the number of packages.
The growth of the Packages file and the number of packages returned by searches
should be avoided or limited.
= Benefits to Debian =
The Debian groups which would benefit from this project are users and mirror
providers. Administrators of non-English systems are particularly targeted.
== Direct benefits (improvements to language packs handling) ==
The first direct benefit to users is that administrators will no longer need to
specifically select the language packages they want to install in order to make
The second direct benefit is that the size of Packages and the number of
packages matching a search should be reduced. Note that the actual extent of
this second benefit will depend on exactly how the effects of the growing
number of packages are controlled.
== Indirect benefits (avoiding packages bundling l10n data) ==
The indirect benefit of this project is that increasing the interest in
language packs should reduce the number of application packages bundling
localized data. Concretely, the following issues of bundling will be avoided:
* Localized data increases (for all architectures) the binary package size.
* On multi-architecture mirrors, architecture-specific packages increase disk
usage and bandwidth usage for synchronizations.
* Increases bandwidth usage for users and uploading mirrors.
* Increases disk space usage for users. localepurge, considered a hack,
exists to diminish this issue.
* Time for installs is increased due to getting and unpacking a larger .deb.
* Localized data is in the same binary package and therefore has to be built
from the same source package as the application.
* Localized data cannot be handled by different maintainers.
* Translation updates cannot be made independently from the application
binary package and could cause a regression in the application package. It is
risky to do translation updates during a freeze.
* A translation update means that the application binary package needs to be
rebuilt. This causes larger updates (mostly more bandwidth usage) and
increased buildd usage, so maintainers tend to wait for a new software release
before providing the translation updates. The delay for translators' work to
reach users tends to increase (e.g. debconf updates sitting in the BTS).
Work from maintainers will be needed to obtain these indirect benefits.
Nevertheless, I expect the indirect benefits to be greater than the direct
ones.
= Deliverables =
*APT installing language packs for given language(s) automatically
*Means to control the effects of growing the number of packages
*Depending on developer feedback, improved development tools for building
*Advice to developers about when language packs should be used and tips to do
= Project Details =
Installing language packages automatically should only require changes to APT
Means to control the effects of growing the number of packages need more
discussion. The current proposal, to have new components, would require
changes to APT and archive maintenance tools. Changes to APT front-ends and
tools may be another way.
== Implementation of "APT installing language packs for given language(s) automatically" ==
Currently the main idea to implement this is based on a desired language(s)
setting. It has basically 2 steps:
*Map the application package to the language package(s) using the desired
language(s) setting
*Install the language package(s) when installing the application package
I do not have a clear idea of how to implement the first step for now, mainly
due to the "dialects". For example, foo-l10n-fr-ca should be used if a system
has "fr_CA" as its desired language(s) setting, but it should also be used for
a "fr" setting if there is neither a foo-l10n-fr nor a foo-l10n-fr-fr package.
The second step should be easy. The desired language(s) setting should be an
apt configuration option.
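
To make the two steps and the "dialect" fallback more concrete, here is a
minimal sketch in Python. It is only an illustration, not APT code. It assumes
a purely name-based scheme of the form <app>-l10n-<lang>[-<country>], and the
function names (split_language, find_language_pack, expand_install) are
hypothetical. The fallback from "fr_CA" to a generic foo-l10n-fr pack and the
choice of the alphabetically first dialect as a last resort are my own
assumptions; the proposal does not settle those points.

```python
def split_language(language):
    """Split 'fr_CA' (or 'fr_CA.UTF-8') into ('fr', 'ca'); country may be ''."""
    base = language.split(".")[0].lower()
    lang, _, country = base.partition("_")
    return lang, country


def find_language_pack(app, language, available):
    """Step 1: map an application package to a language pack name, or None.

    `available` is the set of package names known to the package manager.
    """
    lang, country = split_language(language)
    prefix = f"{app}-l10n-{lang}"
    candidates = []
    if country:
        candidates.append(f"{prefix}-{country}")  # e.g. foo-l10n-fr-ca
    candidates.append(prefix)                     # e.g. foo-l10n-fr
    candidates.append(f"{prefix}-{lang}")         # e.g. foo-l10n-fr-fr
    for name in candidates:
        if name in available:
            return name
    # Assumption: otherwise accept any dialect of the same base language,
    # picking the alphabetically first one to stay deterministic.
    dialects = sorted(p for p in available if p.startswith(prefix + "-"))
    return dialects[0] if dialects else None


def expand_install(apps, languages, available):
    """Step 2: add the matching language packs to an install request."""
    result = list(apps)
    for app in apps:
        for language in languages:
            pack = find_language_pack(app, language, available)
            if pack and pack not in result:
                result.append(pack)
    return result


available = {"foo", "foo-l10n-fr-ca", "foo-l10n-de",
             "openoffice.org", "openoffice.org-l10n-fr"}
print(expand_install(["openoffice.org", "foo"], ["fr"], available))
```

With the package set above, a "fr" setting selects openoffice.org-l10n-fr for
openoffice.org and falls back to foo-l10n-fr-ca for foo, since neither
foo-l10n-fr nor foo-l10n-fr-fr exists.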
For now I think it will be possible to map the application package to the
language pack simply using package names, but it would be possible and perhaps
cleaner to use new control fields (e.g. Provides-l10n: iceweasel, L10n-
So this could be done with a change to policy mentioning how language packages
should be named or documenting the new control fields. Changes to APT will of
course be needed. Installing the new apt version for the first time should
preset the desired language(s) setting to, for example,
debian-installer/language. Changes to archive maintenance tools may be needed for new
= Project Schedule =
I can work on this project during the entire summer. I expect the main part of
this project to take about 7 weeks.
Week 1
Determine the implementation and request comments.
Week 2
Integrate feedback, refine the proposal and review the project schedule
for the remaining time.
Week 3 and 4
Modify APT and Policy to allow automatic installation of language packs.
Week 5 to 7
Provide means to control the effects of growing the number of packages. For now
I believe this should consist in modifications to APT and archive maintenance
The remaining time should be plenty to deal with unexpected issues, bugs or
over-optimistic schedule items. If these do not take all the time, I will
improve developer tools. I may also produce patches for packages to start
adapting their language packs to the new specification. At this point, if it
was not done before, I will write the documentation for developers. If there
is still time, I may produce patches to modularize application packages
bundling l10n data.
The first 2 weeks will also be used to familiarize myself with the software
that will require changes, that is at least APT and APT front-ends.
= Summer commitments =
During summer 2009, I will either graduate or work on this project, depending
on whether this offer is accepted. Otherwise, I have no commitment or plan for
= Plans for Debian =
I must confess I have already been involved with Debian since at least 2005. I
mainly provide support and work on quality and a bit on documentation. The idea
of maintaining packages has been tempting, but never quite enough to get started.
Since the end of 2006, this temptation has been tempered by serious issues with
the BTS that make it impractical for me to start maintaining any package (long
story, I'm still working on fixing this). I do not have real plans for Debian
after the summer (at least, none I'd expect my schedule to allow). It is
nevertheless possible, depending on my schedule, the progress on my BTS
issues and how much I enjoy working on APT, that I give in to the temptation of
working on APT or an APT front-end (I use Synaptic, but I'd prefer a good
Qt/KDE front-end, which is not yet in sight).
= About this document =
This application is a small update of one sent in 2007 and 2008. Nothing has
been done in this area since 2007, so the project is almost identical. In March
2009, Neil Williams submitted a DEP draft about "Tdeb"-s, which targets some
of the issues covered by this project.
== Credits ==
Javier Fernández-Sanguino created
http://wiki.debian.org/i18n/TranslationDataDistribution from which some of
this proposal's content comes. Aigars Mahinovs and Eddy Petrișor wrote
http://wiki.debian.org/i18n/TranslationDebs which inspired the implementation
Thanks to Steve McIntyre and Erich Schubert for respectively backing up and
sending me back the 2007 version of this application, which I had lost.