
(my) summary about translated description with dpkg (still RFC)



Hello,

I will try to summarize my response to all the mails in a single one. I speak
here of translating package descriptions, but my text could apply with minor
changes to other Debian-specific material, like menu entries or debconf
templates, or to more generic material, like regular po files for
application translations or manuals. (Po files and manuals have some specific
problems, like their size or the need for coordination with upstream, which I
won't address here.) I restrict the topic of my investigation that way just
because one has to begin somewhere. I do think po files have problems which
should be addressed as well, but we have to handle problems one after the
other.

The problem
===========

Allowing the users who want it to see the descriptions of packages in
their local language. That is to say:
 - the *output* of the -s and -l options of dpkg
 - any *output* of dselect
 - the *output* of apt-cache {search,show,...}

That is not to say:
 - changing any status database in any way.
 - changing the way any tool is internationalized for its own messages.

The current solution
====================

Grisu has set up a mail server which allows the translators to do their
job. For now, he generates a translated Packages.gz file, so that apt
retrieves the translated descriptions. It's a hack because the DB files
of apt contain only the translation, which is problematic when, for
example, different administrators don't want to see the descriptions in
the same language. It's also a hack because these changes are
invisible to dpkg and dselect.

The most natural solution
=========================

It could be to change dpkg and all other package management tools so
that they understand and honor a Description-fr: field for the
French translation, Description-de: for German, and so on for each
language (just like rpm does, by the way).

So the translation goes in the control file, and the Debian world is
not changed that much.
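
To make the idea concrete, here is a minimal sketch in Python of what
honoring such fields could look like. It is only an illustration, not
dpkg code: the parsing is simplified, and the function names
parse_stanza and pick_description are hypothetical. Only the
Description-fr:/Description-de: field names come from the text above.

```python
def parse_stanza(text):
    """Parse one control-file stanza into a dict.

    Simplified on purpose: a line starting with a space or a tab is
    treated as a continuation of the previous field.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and current is not None:
            fields[current] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields


def pick_description(fields, languages):
    """Return the first available Description-<lang> field, in the
    user's order of preference, or the untranslated Description."""
    for lang in languages:
        key = "Description-" + lang
        if key in fields:
            return fields[key]
    return fields.get("Description", "")
```

With a stanza carrying both Description: and Description-fr:, calling
pick_description(fields, ["de", "fr"]) would return the French text,
and a user asking only for "de" would get the English one.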

Our proposal
============

Don't put the translation in the control file, but in a po file (the
natural format for translations). These po files are kept separate
from the usual ones which translate the program. A package can even
distribute this po file without any gettext support in its binary.

All the package-description po files go to a still-to-be-decided place
on the user's disk. Then, after each installation, a still-to-be-written
tool merges all these files into one big mo file containing all
translations of all packages (regardless of their origin). This file
can then be used by all package management tools to *output* the
translated description. Of course, in such a mechanism, the files
included in the regular package will have priority over the ones
issued by the centralized mechanism...
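
The priority rule can be sketched as follows, in Python. This is only
an illustration of the merge order, not the still-to-be-written tool:
it assumes the po files have already been read into plain dictionaries
mapping the English description to its translation, and the names
merge_catalogs and translate are hypothetical.

```python
def merge_catalogs(central_catalogs, package_catalogs):
    """Merge description catalogs into a single mapping.

    Catalogs are applied in order, so entries shipped by the packages
    themselves are applied last and override the centralized ones.
    Empty translations are skipped so they never hide a real one.
    """
    merged = {}
    for catalog in list(central_catalogs) + list(package_catalogs):
        for msgid, msgstr in catalog.items():
            if msgstr:
                merged[msgid] = msgstr
    return merged


def translate(merged, description):
    """Return the translated description, or the original English
    text when no translation is known."""
    return merged.get(description, description)
```

A real tool would write the merged result out as a mo file instead of
keeping it in memory, but the order in which the sources are applied
would be the same.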

Comparing the proposal with the more natural solution
=====================================================

In the natural solution, we have to reinvent the wheel in dpkg to find
and handle outdated descriptions, and we have an encoding problem for
the control file. Debconf has the same problems and tries to handle
them, but still has issues. Gettext exists, and handles all these
problems well.

The most controversial point of our proposal is that we were planning
to centralize the translation in a way that keeps the maintainer out
of the loop. But that's not the key point; it's an add-on which eases
the work of translators. Each package can still provide the translation
and be 'self contained'. For MIA maintainers, or the ones not
willing to deal with the translation stuff, the po file providing
the translation is in a translator-maintained package. There is no
dependency between the regular package and the translation one. The
second enhances the first when installed. When no translation is
installed, the system will be the same as it is now, and will
display the description in good old English.

The problem with putting all translations in the regular package is
that the maintainer can't review all languages. In our system, he can
review the languages he is fluent in, and delegate the responsibility
of translating into the other languages to maintainers who can
translate them.

The problem with putting the translations of all packages in a single
file is scalability: a lot of unwanted material ends up on the user's
machine, and the size of the resulting package is also a
problem. The best solution would be to split each package into, taking
emacs as an example, emacs20, emacs20-fr, emacs20-de, emacs20-ja,
emacs20-kr, emacs20-ru.... The first one would contain the binaries,
and each of the others would contain the translations of menu entries,
debconf templates, man pages, info or docbook manuals and regular po
files in only one language. That would be easy to do with some kind of
dh_i18n script, with no extra work for the maintainer, but the number
of packages in Debian would be multiplied by 10 or more. That's not
feasible (mainly because of the lack of scalability of dpkg's textual
database, but that's not a problem we have a solution for).

For package descriptions, if the user chooses 2 or 3 languages, the
quantity of extra material will be quite reasonable, because there is
not that much text to translate compared to the manuals or po
files, for example. And since each package can still include its own
translation set, the scalability problem recedes.

I don't see our solution as a hack. We allow the management tools to
look up the translated descriptions in a dedicated text domain which
can be shared among all of them, and we do that with the plain
regular tools of the gettext family.
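
As an illustration of such a lookup, here is a small sketch using the
gettext module from the Python standard library. It is not the dpkg
patch: the domain name "package-descriptions" and the function name
translated_description are assumptions made for the example, since the
name and location of the shared catalog are not decided.

```python
import gettext

# Hypothetical name for the text domain shared by all management tools.
DESCRIPTION_DOMAIN = "package-descriptions"


def translated_description(description, languages, localedir):
    """Look up a package description in the dedicated text domain.

    With fallback=True, gettext returns a null catalog when no mo file
    is found, so the English description is shown unchanged.
    """
    catalog = gettext.translation(
        DESCRIPTION_DOMAIN,
        localedir=localedir,
        languages=languages,
        fallback=True,
    )
    return catalog.gettext(description)
```

The fallback matters here: a system without any translation installed
keeps showing the original English description, and the program's own
messages stay in their usual text domain.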

Current status
==============

Grisu builds po files of all the translations he gets through his ddts
system, so we can build packages from them.

We are missing the merging tool. As a proof of concept, we can use
Grisu's files and discard any other source of translation. That's not
good, but it would be enough to demonstrate the idea.

We have a patch which allows dpkg and dselect to look up the translated
description in the right text domain, but Wichert doesn't want to hear about
it.


OK, this mail is long enough. Sorry about that.

Bye, Mt.


