Re: splitting root floppy by languages

Christian Perrier wrote:

> Quoting Joey Hess (joeyh@debian.org):
>> Everything except the Asian langs currently still fits on one floppy,
>> but that is unlikely to last. I need to find some way to split the Latin
>> languages, or perhaps split out the Cyrillic ones.
> This would give us:
> Asian root: 4 languages: ja, ko, zh_TW, zh_CN
> Cyrillic root: 3 languages: ru, uk, bg
> BIDI root: 2 languages: ar, he
> Others root: 29 languages
Hmm.  Indonesian (id) has its own alphabet.  So does Turkish (tr).

> (the current target is 38 supported languages, 2 of which are not yet
> started)
> The BIDI support heavily depends on current work. If we do not succeed
> in finding a way to display RTL languages, we will be forced to drop
> them from languagechooser.
> If this technical (or pseudo-technical) split is enough, it has my
> preference. Otherwise, we could very quickly run into non-technical
> problems (is Lithuanian part of Scandinavian or East-European or
> whatever....).
You could go with the language families.  These are mostly in ISO 639a,
though that doesn't have bs (Bosnian), or nb and nn (Norwegian was still
just "no" there).  As far as I know, they're also linguistically
accurate.  :-)

ASIAN: ja, ko, zh_CN, zh_TW
SEMITIC: ar, he
SLAVIC: bg, bs, cs, pl, ru, sk, sl, uk
ROMANCE: ca, es, fr, gl, it, pt, pt_BR, ro
GERMANIC: da, de, en, nl, nb, nn, sv 
other families: cy, el, id, lt, sq, tr
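
For anyone who wants to play with the grouping, here is a minimal sketch
of the list above as a lookup table in Python.  The names FAMILIES,
root_for and group_sizes are made up for illustration; this is not how
the installer build is actually organised, just the table above in
machine-readable form.

```python
# Hypothetical lookup table: root group name -> language codes,
# copied from the family list in this message.
FAMILIES = {
    "asian": ["ja", "ko", "zh_CN", "zh_TW"],
    "semitic": ["ar", "he"],
    "slavic": ["bg", "bs", "cs", "pl", "ru", "sk", "sl", "uk"],
    "romance": ["ca", "es", "fr", "gl", "it", "pt", "pt_BR", "ro"],
    "germanic": ["da", "de", "en", "nl", "nb", "nn", "sv"],
    "other": ["cy", "el", "id", "lt", "sq", "tr"],
}


def root_for(lang):
    """Return the name of the group that lists `lang`, or None."""
    for family, langs in FAMILIES.items():
        if lang in langs:
            return family
    return None


def group_sizes():
    """Return the number of listed languages per group."""
    return {family: len(langs) for family, langs in FAMILIES.items()}
```

Calling root_for("uk") looks up the group for Ukrainian, and
group_sizes() gives a quick check of how evenly the listed languages
are spread across the groups.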

(Incidentally, this indicates that the translations do tend to be of a
fairly small number of language families, mostly the European ones.  For
instance, there isn't a single native Indian language translation or one
for any traditional sub-Saharan African language.  I guess this shouldn't
have surprised me, but it did.)

Of course, en will probably be on every disk.  Looking at number of
languages, this would be a split-up with lots of breathing room:

Asian root (4 languages + lots and lots of non-Latin characters)
Semitic root (2 languages + BIDI + 2 non-Latin scripts)
Slavic root (8 languages + Cyrillic + some accented Latin characters,
punctuation)
Romance root (8 languages + some accented Latin characters, punctuation)
Germanic root (7 languages + some accented Latin characters, punctuation)
Other root (8 languages with few similarities + some accented Latin
characters, punctuation)

Make sure your vote will count.