[D-I] Testing the validity of encodings for Debian Installer translations
The Norwegian translators recently discovered that their translation
contained a few encoding errors. The PO file for Debian Installer was
declared as using UTF-8 encoding, but in a few places it contained
invalid characters.
This prompted me to test these files. Although I have read that
iconv cannot detect every possible error, I ran it on all the
files.
For each language, there are two possibilities:
-if the translation uses UTF-8, I try converting it to another
encoding which is suited to that language. For that, I've built a
table by grabbing information here and there and using my own
knowledge of these languages. See the end of this mail and please
correct it.
In a few cases (fa, hi), I simply do not have any alternate encoding and the
script does nothing.
-if the translation uses another encoding, I try converting it to
UTF-8.
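
The two cases above can be sketched roughly as follows. This is only an
illustration, not the script that was actually run: it uses Python's
built-in codecs instead of iconv, and the function name
check_translation and its return strings are made up for the example.

```python
UTF8_NAMES = {"utf-8", "utf8"}


def check_translation(data: bytes, declared: str, alternate: str) -> str:
    """Classify the raw bytes of one PO file (hypothetical helper).

    declared  -- encoding the file claims to use
    alternate -- entry for this language in the alternate encodings table
    """
    declared_is_utf8 = declared.lower() in UTF8_NAMES
    alternate_is_utf8 = alternate.lower() in UTF8_NAMES

    if declared_is_utf8 and alternate_is_utf8:
        # No alternate encoding is known (as for fa and hi): do nothing.
        return "skipped"

    # UTF-8 files go to the alternate encoding, everything else to UTF-8.
    target = alternate if declared_is_utf8 else "utf-8"
    try:
        data.decode(declared).encode(target)
    except (UnicodeDecodeError, UnicodeEncodeError):
        return "INVALID"
    return "valid"


if __name__ == "__main__":
    good = "blåbær".encode("utf-8")
    bad = b"bl\xe5b\xe6r"  # ISO-8859-1 bytes inside a file declared as UTF-8
    print(check_translation(good, "UTF-8", "iso-8859-1"))
    print(check_translation(bad, "UTF-8", "iso-8859-1"))
```

Like iconv, this only reports that a conversion failed. It cannot tell
whether a successfully converted string is what the translator meant.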
Languages which do not use the master file (pt, nl, uk...) have not
been tested.
Below is the result. Please note that some "INVALID" warnings may be
false alarms because the chosen alternate encoding is
inappropriate. In such cases, please check the alternate encoding
table.
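
To illustrate how such a false alarm can arise: a file may be perfectly
valid UTF-8 and still fail the conversion because it contains a
character that the alternate encoding cannot represent. The sketch
below (Python codecs rather than iconv, with a hypothetical function
name classify) separates the two situations by decoding strictly as
UTF-8 first. It does not say which situation applies to the languages
listed in this mail.

```python
def classify(data: bytes, alternate: str) -> str:
    """Separate broken UTF-8 from characters missing in `alternate`."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes themselves are not well-formed UTF-8: a real error.
        return "invalid UTF-8"
    try:
        text.encode(alternate)
    except UnicodeEncodeError as err:
        # Valid UTF-8, but the alternate encoding lacks this character:
        # a false alarm caused by the choice of alternate encoding.
        return f"valid UTF-8, but {text[err.start]!r} is not in {alternate}"
    return "valid"


if __name__ == "__main__":
    sample = "10 €".encode("utf-8")
    print(classify(sample, "iso-8859-1"))   # euro sign is missing here
    print(classify(sample, "iso-8859-15"))  # euro sign exists here
    print(classify(b"\xff", "iso-8859-1"))  # not UTF-8 at all
```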
Testing gl...UTF-8 to iso-8859-15 --> INVALID
Testing ja...UTF-8 to EUC-JP --> INVALID
Testing nb...UTF-8 to iso-8859-1 --> INVALID
Testing se...UTF-8 to iso-8859-1 --> INVALID
Testing sl...UTF-8 to iso-8859-2 --> INVALID
Testing sq...UTF-8 to iso-8859-1 --> INVALID
All others were found to be correct.
The alternate encodings table used:
bg:cp1251
bs:iso-8859-2
ca:iso-8859-1
cs:iso-8859-2
cy:iso-8859-14
da:iso-8859-1
de:iso-8859-1
el:iso-8859-7
es:iso-8859-1
et:iso-8859-15
eu:iso-8859-1
fa:utf-8
fi:iso-8859-15
fr:iso-8859-1
ga:iso-8859-15
gl:iso-8859-15
he:iso-8859-8
hi:utf-8
hr:iso-8859-2
hu:iso-8859-2
id:iso-8859-1
ja:EUC-JP
ko:EUC-KR
lt:iso-8859-13
lv:iso-8859-13
mg:iso-8859-1
mk:cp1251
nb:iso-8859-1
nn:iso-8859-1
pl:iso-8859-2
pt_BR:iso-8859-1
ro:iso-8859-2
ru:koi8-r
se:iso-8859-1
sk:iso-8859-2
sl:iso-8859-2
sq:iso-8859-1
sr:iso-8859-5
sv:iso-8859-1
tl:iso-8859-1
tr:iso-8859-9
vi:tcvn5712-1
zh_CN:gb2312
zh_TW:big5
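
For anyone who wants to check or correct the table, here is a small
sketch that reads it in the "language:encoding" form used above and
lists the entries whose encoding name is unknown to the conversion
library. It assumes Python's codec registry rather than iconv, so a
name reported here may still be fine for iconv. The function names are
invented for the example.

```python
import codecs


def parse_table(text: str) -> dict:
    """Turn 'lang:encoding' lines into a {lang: encoding} mapping."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        lang, _, encoding = line.partition(":")
        table[lang] = encoding
    return table


def unknown_encodings(table: dict) -> list:
    """Return the languages whose encoding name Python cannot look up."""
    unknown = []
    for lang, encoding in table.items():
        try:
            codecs.lookup(encoding)
        except LookupError:
            unknown.append(lang)
    return unknown


if __name__ == "__main__":
    # "xx" is a made-up entry used only to show the failure case.
    sample = "nb:iso-8859-1\nja:EUC-JP\nxx:no-such-encoding\n"
    print(unknown_encodings(parse_table(sample)))
```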