
Re: Help about localization packages



Christian Perrier wrote:
> Of course the very quirky way of upstream to name its files does not
> help. Have they ever heard about ISO 639 and ISO 3166?

It's not quirky at all. In fact they are very correct, and in many cases
"over-correct" (see RFC 4646). They simply state the language, writing
script and country for every language pack they provide. That follows
RFC 4646 and is IMHO how language packs and locales (!) should
be handled.
1. Many languages can be written in multiple scripts, even within a
country. This is quite common in Asia. Using the script tag in these
cases does make sense.
2. The same language can have different vocabularies in different
countries / regions. For example, the vocabulary in Hong Kong differs
from that in Taiwan on many occasions. Therefore the language handling
for Traditional Chinese should actually be: a zh-Hant translation
with all the strings that are the same in all regions, plus
zh-Hant-TW and zh-Hant-HK (if there is a difference between Hong Kong
and Macao, an additional zh-Hant-MO would be appropriate) with those
strings that differ. The user would choose the zh-Hant-{TW|HK|MO}
translation, which falls back to zh-Hant for the common strings.
However, they could of course all be shipped in the same package, which
would simply be zh-Hant.
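
To make the fallback idea concrete, here is a minimal sketch in Python.
It is only an illustration, not how any particular application does it:
the function names and the tiny catalogs are made up for this mail, and
the two "taxi" strings are just one example of a Hong Kong / Taiwan
vocabulary difference.

```python
def fallback_chain(tag):
    """Return the tag followed by progressively shorter prefixes.

    "zh-Hant-HK" yields "zh-Hant-HK", then "zh-Hant", then "zh".
    """
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


# Hypothetical catalogs: zh-Hant holds the strings shared by all regions,
# the regional catalogs hold only the strings that differ.
CATALOGS = {
    "zh-Hant": {"file": "檔案"},
    "zh-Hant-TW": {"taxi": "計程車"},
    "zh-Hant-HK": {"taxi": "的士"},
}


def translate(key, tag, catalogs=CATALOGS):
    """Look up key in the most specific catalog that contains it."""
    for candidate in fallback_chain(tag):
        strings = catalogs.get(candidate, {})
        if key in strings:
            return strings[key]
    # Nothing found anywhere in the chain: hand back the untranslated key.
    return key
```

With these catalogs, translate("taxi", "zh-Hant-HK") comes from the
regional catalog, while translate("file", "zh-Hant-HK") falls back to
zh-Hant.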

>> To do that, I begin with the help of the iceweasel-l10n-* packages.
> 
> Please note that some of them are, imho, incorrect:
> 
> iceweasel-l10n-uk-ua
> iceweasel-l10n-cy-gb
> iceweasel-l10n-dz-bt
> iceweasel-l10n-et-ee
> .../...
> indeed all those using a country code modifier, except pt-br, zh-cn
> and zh-tw

Not incorrect at all, but maybe "over-correct" :). See my statement
above. I think the country code is redundant in these cases, however.

> am=Amharic: localization package should be koha-l10n-am
> es-VE stands for "Spanish in Venezuela". I would personnally advise
> against using that and just use "koha-l10n-es" by using what they call
> es_ES upstream

Are you sure that there are no differences in vocabulary between Spain
and Venezuela? ;)

> hy-Armn: Armenian. Localization package should be koha-l10n-hy
> kn-Knda: Kannada (language from India, state of Karnataka mostly)
>          --> koha-l10n-kn
> lo-Laoo: Lao --> koha-l10n-lo
> mi-NZ: Maori --> koha-l10n-mi
> tet: Tetun, a language from Indonesia. That language has fairly few
>      speakers. Is that what upstream intended?
> ur-Arab: Urdu. Nothing to do with "Arab". Urdu is a national
>          language of Pakistant --> koha-l10n-ur

These comments are all correct, although ur-Arab means Urdu, written in
Arabic script. But, yes, it is redundant here.
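
As a rough sketch of what "redundant" means here, the following Python
splits a tag into language, script and region and drops the script
subtag when it is the usual one for that language. Everything in it is
an assumption for illustration: the regular expression only covers the
simple language[-script][-region] shape (not the full RFC 4646 grammar),
DEFAULT_SCRIPT is a small hand-written table, and package_name() and its
"koha-l10n" prefix merely mirror the naming discussed in this thread.
Region subtags are left alone, since whether they are redundant is
exactly what is being debated.

```python
import re

# Simplified pattern: 2-3 letter language, optional 4-letter script,
# optional 2-letter or 3-digit region. "-" and "_" are both accepted.
TAG_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:[-_](?P<script>[A-Za-z]{4}))?"
    r"(?:[-_](?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)

# Hand-written assumption: the script each language is normally written in.
DEFAULT_SCRIPT = {"hy": "Armn", "kn": "Knda", "lo": "Laoo", "ur": "Arab"}


def parse_tag(tag):
    """Split a simple language tag into (language, script, region)."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError("unsupported language tag: %r" % tag)
    language = match.group("language").lower()
    script = match.group("script")
    region = match.group("region")
    return (
        language,
        script.title() if script else None,
        region.upper() if region else None,
    )


def package_name(tag, prefix="koha-l10n"):
    """Build a package name, omitting a script that is the default one."""
    language, script, region = parse_tag(tag)
    if script == DEFAULT_SCRIPT.get(language):
        script = None
    parts = [prefix, language]
    parts += [p.lower() for p in (script, region) if p]
    return "-".join(parts)
```

So package_name("ur-Arab") drops the script, while a tag such as
zh-Hant-TW keeps it because the script carries real information there.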

Cheers
Arne

-- 
Arne Götje (高盛華) <arne@linux.org.tw>
PGP/GnuPG key: 1024D/685D1E8C
Fingerprint: 2056 F6B7 DEA8 B478 311F  1C34 6E9F D06E 685D 1E8C
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.

