
encodings for debconf templates


I am now trying to solve Bug#148490, where Debconf cannot convert
translated messages (such as Description) into the proper encoding
for the current LC_CTYPE locale.

(For example, messages should be output in UTF-8 in the fr_FR.UTF-8
locale, while ISO-8859-15 should be used in the fr_FR@euro locale.)

To achieve this, I wrote a small script that converts a given
string into the encoding specified by the current LC_CTYPE locale.
It is available at http://bugs.debian.org/148490 .
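
The idea can be sketched in a few lines of Python. This is not the
script attached to the bug report, only a minimal illustration of the
technique; the function name to_locale_encoding and its parameters are
made up for this example, and locale.nl_langinfo is assumed to be
available (it is on glibc systems, but not on every platform).

```python
import locale
from typing import Optional


def to_locale_encoding(raw: bytes, source_encoding: str,
                       target_encoding: Optional[str] = None) -> bytes:
    """Re-encode raw from source_encoding into the locale's encoding.

    If target_encoding is not given, it is taken from the LC_CTYPE
    locale set in the environment (LANG, LC_ALL, LC_CTYPE).
    """
    if target_encoding is None:
        # Adopt the user's LC_CTYPE setting, then ask for its codeset,
        # e.g. "UTF-8" for fr_FR.UTF-8 or "ISO-8859-15" for fr_FR@euro.
        locale.setlocale(locale.LC_CTYPE, "")
        target_encoding = locale.nl_langinfo(locale.CODESET)
    text = raw.decode(source_encoding)
    # Characters the target encoding cannot represent become "?".
    return text.encode(target_encoding, errors="replace")
```

Passing target_encoding explicitly is only a convenience for trying
the function out; in normal use it would be left unset so that the
locale decides.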

However, for the script to work well, I have to know which encodings
the original debconf translations are written in.  For example,
Japanese translations (Description-ja:) are written in EUC-JP
and Polish translations are written in ISO-8859-2.

A list of already-translated languages is available at
http://www.debian.org/international/l10n/templates/ .

bg     CP1251
ca     ISO-8859-1 ? ISO-8859-15 ?
cs     ISO-8859-2
da     ISO-8859-1
da_DK  ISO-8859-1
de     ISO-8859-1 ? ISO-8859-15 ?
dk     ?
es     ISO-8859-1 ? ISO-8859-15 ?
fi     ISO-8859-1 ? ISO-8859-15 ?
fr     ISO-8859-1 ? ISO-8859-15 ?
gl     ISO-8859-1 ? ISO-8859-15 ?
gl_ES  ISO-8859-1 ? ISO-8859-15 ?
hu     ISO-8859-2
hu_HU  ISO-8859-2
it     ISO-8859-1 ? ISO-8859-15 ?
ja     EUC-JP
ko     EUC-KR
lt     ISO-8859-13
nl     ISO-8859-1 ? ISO-8859-15 ?
no     ISO-8859-1
pl     ISO-8859-2
pt     ISO-8859-1 ? ISO-8859-15 ?
pt_BR  ISO-8859-1
ro     ISO-8859-2
ru     KOI8-R ? ISO-8859-5 ?
se     ?
sv     ISO-8859-1 ? ISO-8859-15 ?
tr_TR  ISO-8859-9

I guessed these encodings from the /usr/share/i18n/SUPPORTED file.
Which encodings are actually used for these languages?
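
To show how such a list would be used, here is a rough Python sketch
of a lookup from a translated field name to its source encoding.  The
names GUESSED_ENCODINGS and encoding_for_field are invented for this
example.  Only the entries listed above without a "?" are included,
and they are still guesses, not confirmed values.

```python
# Guessed source encodings, keyed by the language suffix of a
# translated field.  Entries marked "?" in the list are left out.
GUESSED_ENCODINGS = {
    "bg": "cp1251",
    "cs": "iso-8859-2",
    "da": "iso-8859-1",
    "da_DK": "iso-8859-1",
    "hu": "iso-8859-2",
    "hu_HU": "iso-8859-2",
    "ja": "euc-jp",
    "ko": "euc-kr",
    "lt": "iso-8859-13",
    "no": "iso-8859-1",
    "pl": "iso-8859-2",
    "pt_BR": "iso-8859-1",
    "ro": "iso-8859-2",
    "tr_TR": "iso-8859-9",
}


def encoding_for_field(field_name):
    """Return the guessed encoding for a field such as 'Description-ja'.

    Returns None for untranslated fields ('Description') and for
    languages whose encoding is not known.
    """
    _base, sep, lang = field_name.partition("-")
    if not sep:
        return None  # no language suffix, so nothing to look up
    return GUESSED_ENCODINGS.get(lang)
```

A caller would have to decide what to do when None comes back, for
example leaving the text unconverted.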

Alternatively, to internationalize Debconf, it may be a good idea to
convert all Debconf templates to UTF-8.  The merits of this approach
are: (1) I think it is the right way, (2) my script doesn't need
to keep and maintain a list of encodings, which may change in the
future, and (3) a template file won't be broken even if it
contains many languages, because all languages will use the same
encoding.  Of course, with this approach the question above doesn't
need to be answered.
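
Point (3) can be illustrated with a small Python sketch.  The two
sample translations and the helper names build and decodes_as are
made up for this example; real templates also have multi-line fields,
which are ignored here.

```python
# Two translated fields, as they might appear in one template file.
fields = {
    "Description-ja": "説明",
    "Description-pl": "Opis usługi",
}
# Legacy situation: each language uses its own encoding.
legacy = {"Description-ja": "euc_jp", "Description-pl": "iso-8859-2"}


def build(fields, encoding_for):
    """Serialize the fields, encoding each value as encoding_for(name)."""
    return b"".join(
        name.encode("ascii") + b": " + text.encode(encoding_for(name)) + b"\n"
        for name, text in fields.items()
    )


def decodes_as(raw, encoding):
    """Report whether the whole byte string is valid in one encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


mixed = build(fields, legacy.__getitem__)        # one encoding per language
unified = build(fields, lambda name: "utf-8")    # UTF-8 everywhere

# The unified file can be read back with a single decode step.
print(decodes_as(unified, "utf-8"))
# The mixed file is not valid UTF-8 as a whole; a reader has to know
# the encoding of every field separately.
print(decodes_as(mixed, "utf-8"))
```

With the mixed file, a single-byte encoding such as ISO-8859-2 may
still decode every byte without raising an error, but the Japanese
text would then come out as garbage, which is the breakage meant here.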

However, (a) we will still need a mechanism to convert encodings
so that Debconf can be used in non-UTF-8 locales, because many of us
still use them and we should not drop support for them even in the
future, (b) all packages that have translated Debconf templates will
have to be modified, (c) we don't have many UTF-8 text editors for
editing Debconf templates yet, and (d) such a large amount of work
should be done *after* the release of Woody.

Problem (a) will be solved by the script I wrote.
For (c), we already have Yudit, Vim, Emacs + mule-ucs, and so on.
I think (b) will take a large amount of labor, so we should wait
for the release of Woody so that the work does not disrupt the
release.

Tomohiro KUBOTA <kubota@debian.org>
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/
