
[D-I] Testing the validity of encodings for Debian Installer translations



The Norwegian translators recently discovered that their translation
contained a few encoding errors. The PO file for Debian Installer was
declared as using UTF-8 encoding, but in a few places it contained
invalid characters.

This prompted me to test all of these files. Although I've read that
iconv cannot detect every possible error, I've used it on all
files.

For each language, there are two possibilities:

-if the translation uses UTF-8, I try converting it to another
 encoding which is suited to that language. For that, I've built a
 table by gathering information here and there and using my own
 knowledge of these languages. See the end of this mail and please
 correct it where needed.
 In a few cases (fa, hi), I simply do not have any alternate encoding and the
 script does nothing.

-if the translation uses another encoding, I try converting it to
 UTF-8
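
For readers who want to reproduce the idea without iconv, the two cases
above can be sketched in Python. This is only an illustration, not the
script I used: the function name check_po_bytes and the three result
strings are made up for the example, and Python's codecs stand in for
iconv, so results may differ from iconv's in corner cases.

```python
import codecs


def check_po_bytes(data: bytes, declared: str, alternate: str) -> str:
    """Hypothetical helper: test one file's bytes as described above.

    declared  -- the encoding the PO file claims to use
    alternate -- the per-language alternate encoding from the table
    """
    # Normalise names so "UTF-8", "utf8" and "utf-8" compare equal.
    declared_name = codecs.lookup(declared).name
    alternate_name = codecs.lookup(alternate).name

    if declared_name == "utf-8" and alternate_name == "utf-8":
        # No alternate encoding is known (the fa and hi cases): do nothing.
        return "skipped"

    try:
        text = data.decode(declared_name)
        if declared_name == "utf-8":
            # UTF-8 translation: convert to the alternate encoding.
            text.encode(alternate_name)
        else:
            # Any other encoding: convert to UTF-8.
            text.encode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        return "INVALID"
    return "valid"
```

A call such as check_po_bytes(open("nb.po", "rb").read(), "UTF-8",
"iso-8859-1") would then play the role of one iconv run (the file name
here is only an example).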

Languages which do not use the master file (pt, nl, uk...) have not
been tested.

Below is the result. Please note that some "INVALID" warnings may be
false alarms because the chosen alternate encoding is
inappropriate. In such cases, please check the alternate encoding
table.
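
To make the false-alarm case concrete, here is a small Python sketch
that separates the two ways a UTF-8 file can fail the conversion. It is
an assumption-laden illustration (the function name classify and its
return strings are invented for the example, and it uses Python's
codecs rather than iconv): a file can be perfectly valid UTF-8 and
still fail simply because it contains a character the alternate
encoding cannot represent.

```python
def classify(data: bytes, alternate: str) -> str:
    """Hypothetical helper: say why a UTF-8 file fails the conversion."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not well-formed UTF-8: a real encoding error.
        return "invalid UTF-8"
    try:
        text.encode(alternate)
    except UnicodeEncodeError:
        # Well-formed UTF-8, but some character has no equivalent in
        # the alternate encoding: a likely false alarm.
        return "valid UTF-8, not representable in " + alternate
    return "ok"
```

For instance, the euro sign is valid UTF-8 but has no place in
iso-8859-1, while it does exist in iso-8859-15, so the choice of
alternate encoding alone decides whether such a file is flagged.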

Testing gl...UTF-8 to iso-8859-15 --> INVALID
Testing ja...UTF-8 to EUC-JP --> INVALID
Testing nb...UTF-8 to iso-8859-1 --> INVALID
Testing se...UTF-8 to iso-8859-1 --> INVALID
Testing sl...UTF-8 to iso-8859-2 --> INVALID
Testing sq...UTF-8 to iso-8859-1 --> INVALID

All others were found to be valid.

The alternate encodings table used:

bg:cp1251
bs:iso-8859-2
ca:iso-8859-1
cs:iso-8859-2
cy:iso-8859-14
da:iso-8859-1
de:iso-8859-1
el:iso-8859-7
es:iso-8859-1
et:iso-8859-15
eu:iso-8859-1
fa:utf-8
fi:iso-8859-15
fr:iso-8859-1
ga:iso-8859-15
gl:iso-8859-15
he:iso-8859-8
hi:utf-8
hr:iso-8859-2
hu:iso-8859-2
id:iso-8859-1
ja:EUC-JP
ko:EUC-KR
lt:iso-8859-13
lv:iso-8859-13
mg:iso-8859-1
mk:cp1251
nb:iso-8859-1
nn:iso-8859-1
pl:iso-8859-2
pt_BR:iso-8859-1
ro:iso-8859-2
ru:koi8-r
se:iso-8859-1
sk:iso-8859-2
sl:iso-8859-2
sq:iso-8859-1
sr:iso-8859-5
sv:iso-8859-1
tl:iso-8859-1
tr:iso-8859-9
vi:tcvn5712-1
zh_CN:gb2312
zh_TW:big5
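
For anyone who wants to review or reuse the table, the lang:encoding
lines can be loaded with a few lines of Python. This is a sketch only;
the function names parse_table and unknown_encodings are invented for
the example. Note that it checks names against Python's codec
registry, which is not the same list as iconv's, so a name reported
here may still be perfectly fine for iconv.

```python
import codecs


def parse_table(text: str) -> dict:
    """Hypothetical helper: turn 'lang:encoding' lines into a dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        lang, _, encoding = line.partition(":")
        table[lang] = encoding
    return table


def unknown_encodings(table: dict) -> list:
    """Return the languages whose encoding name Python cannot resolve."""
    unknown = []
    for lang, encoding in table.items():
        try:
            codecs.lookup(encoding)
        except LookupError:
            unknown.append(lang)
    return unknown
```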





