
Bug#517854: Haven't install Chinese fonts and input engine in locale zh_HK

Christian Perrier wrote:
> Quoting Arne Goetje (arne@linux.org.tw):
>> Christian Perrier wrote:
>>> Damn. This is one of those cases where we suffer from the silly trick
>>> of using zh_CN and zh_TW to differentiate between two different
>>> *scripts*. I really dream of different ISO-639 codes for the two
>>> different written versions of Chinese: Traditional and Simplified.
>> Err, no. Just use RFC4647 for the whole locale system.
>> zh-Hans = simplified
>> zh-Hant = traditional
> Which, if I'm correct, would mean having zh-Hans_CN and
> zh-Hant_TW|zh-Hant_HK locales, right ?

Almost. :)
RFC 4647 uses hyphens instead of underscores (the tag syntax itself is
defined in its companion, RFC 4646).
The format is: langcode-script-territory-variation-x-usertags
 langcode = ISO 639-{1|2|3}
 script = ISO 15924
 territory = ISO 3166
 variation = one or more tags from a predefined (registered) list
 x = separator between "official" tags and "user defined tags"
     (literally 'x')
 usertags = user defined tags
Everything except langcode is optional.
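
To make the format concrete, here is a rough Python sketch that splits
a tag into the parts listed above. It is only an illustration: the
pattern is deliberately simplified (no extension subtags, no
grandfathered tags), and the names TAG_RE and parse_tag are made up
for this example, not taken from any library.

```python
import re

# Simplified pattern for langcode-script-territory-variation-x-usertags.
# Assumption: extension subtags and grandfathered tags are not handled.
TAG_RE = re.compile(
    r"^(?P<langcode>[A-Za-z]{2,3})"                    # ISO 639 code
    r"(?:-(?P<script>[A-Za-z]{4}))?"                   # ISO 15924 code
    r"(?:-(?P<territory>[A-Za-z]{2}|[0-9]{3}))?"       # ISO 3166 or UN M.49
    r"(?P<variation>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)"
    r"(?:-[xX]-(?P<usertags>[A-Za-z0-9]{1,8}(?:-[A-Za-z0-9]{1,8})*))?$"
)


def parse_tag(tag):
    """Split a hyphen-separated language tag into its named parts."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError("not a well-formed tag: %r" % tag)
    parts = match.groupdict()
    # Turn the raw "-a-b" strings into lists; absent parts become [].
    parts["variation"] = [v for v in parts["variation"].split("-") if v]
    parts["usertags"] = parts["usertags"].split("-") if parts["usertags"] else []
    return parts


if __name__ == "__main__":
    for tag in ("zh", "zh-TW", "zh-Hant-HK", "cmn-Hans-CN", "zh-Hant-x-foo"):
        print(tag, parse_tag(tag))
```

Note that an underscore form such as zh_HK is rejected by this sketch,
which matches the "hyphens instead of underscores" point above.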

So, we would have locales zh-Hans-CN, zh-Hans-SG, zh-Hant-TW, zh-Hant-HK
and zh-Hant-MO. These of course could be shortened to what we have today
(zh-CN, zh-SG, zh-TW, zh-HK, zh-MO), but we would lose the ability to
have a simple fallback mechanism.

RFC 4647 was designed to provide a simple fallback mechanism, meaning
zh-Hant-MO would fall back to zh-Hant, which would fall back to zh.
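
As a rough illustration of that fallback, the following Python sketch
truncates a tag from the right, one subtag at a time. The function
names, the example set of available locales and the "C" default are
assumptions made up for this example, and tags are assumed to be in
consistent case already (real matching is case-insensitive).

```python
def fallback_chain(tag):
    """Return the tag followed by its progressively shorter fallbacks."""
    subtags = tag.split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        subtags.pop()
        # A single-character subtag such as "x" cannot end a tag,
        # so drop it together with the subtag that followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return chain


def lookup(requested, available, default="C"):
    """Pick the first entry of the fallback chain that is available."""
    for candidate in fallback_chain(requested):
        if candidate in available:
            return candidate
    return default


if __name__ == "__main__":
    available = {"zh-Hant", "zh-Hans-CN", "en"}  # hypothetical locale set
    print(fallback_chain("zh-Hant-MO"))
    print(lookup("zh-Hant-MO", available))
    # The shortened form skips the script level entirely:
    print(fallback_chain("zh-MO"))
    print(lookup("zh-MO", available))
```

With the shortened tag zh-MO the chain is just zh-MO, then zh, so a
lookup never reaches zh-Hant. That is the lost fallback mentioned
above.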

<nitpick> Given that zh is actually a macrolanguage tag for "any Chinese
language", it would probably even make sense to finally define what we
mean by "zh", namely "Mandarin Chinese", which has the ISO 639-3
language tag "cmn". That means the locales should actually be
cmn-{Hans|Hant}-{CN|SG|TW|HK|MO}. </nitpick> ;)

I'm just convinced that upstream glibc won't want to walk that path.

> Also meaning renaming PO files progressively as well.
> It would be nice if we could implement this.

Yes, I'm still dreaming... ;)

