After our discussions at DC11, we concluded that the best way to respect the need to have the two different variants of the Serbian language represented would be to use a modifier.

So, the conclusion was:

sr: Ekavian variant, written in Cyrillic
sr@latin: ditto in Latin
sr@ijekavian: Ijekavian variant, Cyrillic
sr@ijekavianlatin: Ijekavian variant, Latin

(I hope I'm now writing "ijekavian" the right way... if I'm not, please accept my apologies and correct me, in the hope that it doesn't happen anymore.. :-))

So, that solves a great part of the problem.

However, for these variants to be used, we need locale files to exist so that people can define them in their environment (this is indeed what D-I does when the language is chosen: the appropriate locale is defined in the user's environment, from the combination of chosen language and country).

As of now, glibc has three "Serbian" locales:

cperrier@mykerinos:~$ ls -l /usr/share/i18n/locales/sr*
-rw-r--r-- 1 root root 4940 2011-08-09 01:03 /usr/share/i18n/locales/sr_ME
-rw-r--r-- 1 root root 9856 2011-08-09 01:03 /usr/share/i18n/locales/sr_RS
-rw-r--r-- 1 root root 5465 2011-08-09 01:03 /usr/share/i18n/locales/sr_RS@latin

Indeed, from what I see, the sr_ME locale seems to be ijekavian. A diff between sr_ME and sr_RS (converted from <Uxxxx> notation to UTF-8 with the attached script) gives things like:

 LC_TIME
-abday "нед";"пон";"уто";"сри";"чет";"пет";"суб"
-day "недјеља";"понедељак";"уторак";"сриједа";"четвртак";"петак";"субота"
+
+abday "нед";"пон";"уто";"сре";"чет";"пет";"суб"
+day "недеља";"понедељак";"уторак";"среда";"четвртак";"петак";"субота"

...which, from my very basic understanding of the language, is a good illustration of the differences between ekavian and ijekavian.

So, it seems that a good basis for a locale using sr@ijekavian as language would be sr_ME.
(By the way, it seems that using ijekavian in sr_ME is not a very good idea... this is indeed the same "trick" I was originally proposing with "sr_BA" being an ijekavian locale.)

If we go this way, the "only" thing left to do is to choose the "country" part (as, of course, the country-related things like postal codes, currency, etc. can't be copied from those of Montenegro).

Of course, this might not be as easy as just saying it... as the only choice we can make is indeed sr_BA@ijekavian (and sr_BA@ijekavianlatin).

Writing the locale is very easy: it requires basic knowledge about language+country and we can do it easily in a few days here on the list.

But, of course, we first need to be sure about the locale name.

Comments?
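
To make the naming question concrete, here is a minimal sketch of how a language code that carries a modifier could be combined with a country code into a locale name. It is only an illustration of the naming scheme discussed above and not the code D-I actually uses; the function name build_locale is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: combine a language code, which may already carry
# a modifier (as in "sr@ijekavian"), with a country code. The modifier
# has to move to the end of the name: language_COUNTRY@modifier.
sub build_locale {
    my ($language, $country) = @_;
    my ($lang, $modifier) = split /\@/, $language, 2;
    my $name = "${lang}_${country}";
    $name .= "\@$modifier" if defined $modifier;
    return $name;
}

# The four language codes agreed on at DC11, combined with country BA.
for my $language ('sr', 'sr@latin', 'sr@ijekavian', 'sr@ijekavianlatin') {
    printf "%-18s + BA -> %s\n", $language, build_locale($language, 'BA');
}
```

With this sketch, sr@ijekavian and BA give sr_BA@ijekavian, which is the name that would then need a matching file under /usr/share/i18n/locales/.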
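#! /usr/bin/perl

use encoding 'utf8';

# Convert every <Uxxxx> escape in a string to the character it names.
sub c {
    my $text = shift;
    my $ret = '';
    my $lastpos = 0;
    while ($text =~ m/\G(.*?)<U(....)>/g) {
        $lastpos = pos($text);
        $ret .= $1;
        my $n = hex($2);
        if ($n < 0x80) {
            $ret .= pack("U", $n);
        } elsif ($n < 0xc0) {
            # U+0080..U+00BF: the two bytes of the UTF-8 encoding, written out by hand
            $ret .= pack("UU", 0xc2, $n);
        } elsif ($n < 0x100) {
            # U+00C0..U+00FF: same, with lead byte 0xC3
            $ret .= pack("UU", 0xc3, $n & 0xbf);
        } else {
            $ret .= pack("U", $n);
        }
    }
    # Keep whatever follows the last escape unchanged.
    return $ret.substr($text, $lastpos);
}

my $last = '';
while (<>) {
    # Prepend a pending continuation line, if any.
    if ($last ne '') {
        $_ = $last . $_;
        $last = '';
    }
    # A trailing "/" continues the line: strip it and hold the text for the next round.
    if (m/\/\s*$/s) {
        s/\/\s*$//s;
        $last = $_;
        next;
    }
    # Decode the escapes inside each quoted string.
    s/"([^"]*)"/'"'.c($1).'"'/eg;
    # Drop whitespace between consecutive list items.
    s/";\s*"/";"/g;
    print;
}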