
Re: Automating of localizations



On Thu, 2003-10-30 at 16:58, Gunnar Wolf wrote:
> Alastair McKinstry dijo [Thu, Oct 30, 2003 at 01:47:38PM +0000]:
> > >Ok... I will at least do my part... But, I agree, this deserves a
> > >Debconf dialog - it is the only sane way I think of. I think that
> > >adding a third specifier (i.e. es_MX_MX, es_MX_HR, es_MX_TJ, etc.)
> > >would be too non-standard. Besides, we would have to come up with a
> > >way to unpredictably mix es_MX_TJ and en_MX_TJ, as in the border
> > >cities in general (Tijuana [TJ] would be the most prominent example)
> > >half of what is said is said in English ;-)
> > >
> > >Greetings,
> > 
> > I agree with the optional "third specifier". Something like this
> > already exists: according to the standard (I can send ptrs later, from home),
> > you can add modifiers, so instead of
> > es_MX_TJ, you would do es_MX@TJ.  There is no standard for
> > _what_ the modifiers are, IIRC.
> > 
> > If we add support, I recommend we use ISO-639-3 codes: these
> > are subdivision codes defined for regions in countries.
> > e.g. in France, départements; in Ireland, counties; in
> > Germany, Länder.
> > 
> > We could add support to check for the subdivision, and
> > use that to pick a better default; e.g. in the US, it's hard to
> > pick a default timezone for the country as a whole, but more
> > straightforward for a given State (which is what is used for
> > the ISO-639-3 subdivision in the US).
> 
> Ummm... I doubt ISO-639-3 is the proper answer for this - It only
> specifies languages, not even countries, even less regions [1]. 

Duh. Thinko. The standard I meant is ISO 3166-2, the subdivision part
of ISO 3166. Both ISO 639 and ISO 3166 are present in the iso-codes
package, which I maintain.

Using the subdivision code may be overkill in most cases, but if we
do subdivide, I strongly recommend using ISO 3166-2.
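
As a rough illustration of how a subdivision code could refine the
default, here is a minimal Python sketch. The table contents and the
function name default_timezone are hypothetical, made up for this
example; nothing like this exists in a package today, and a real
implementation would derive the tables from data files.

```python
# Hypothetical lookup tables, hard-coded only for illustration.
COUNTRY_DEFAULT = {
    "MX": "America/Mexico_City",
    "US": "America/New_York",
}

# Keys are (country code, subdivision code), e.g. CA for California.
SUBDIVISION_DEFAULT = {
    ("US", "CA"): "America/Los_Angeles",
    ("US", "IL"): "America/Chicago",
    ("MX", "SON"): "America/Hermosillo",
}


def default_timezone(country, subdivision=None):
    """Prefer a subdivision-specific zone, else fall back to the country."""
    if subdivision is not None:
        zone = SUBDIVISION_DEFAULT.get((country, subdivision))
        if zone is not None:
            return zone
    # Unknown subdivision or none given: use the country-wide default.
    return COUNTRY_DEFAULT.get(country)
```

An unknown subdivision simply falls through to the country default, so
adding subdivision support would never make the guess worse than it is
now.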

> I peeked around a little bit on the libc6 package - it includes the
> timezones information in /usr/share/zoneinfo. Inside this directory
> hierarchy, the only files that are not timezone data (and can thus
> help us organize the 1609 files that _are_ timezone data) are
> iso3166.tab and zone.tab. iso3166.tab contains (obviously) the
> ISO-3166-1 alpha-2 country codes - the names for 262 countries or
> geographic entities (i.e., 'UM' for 'US minor outlying islands'). 
> 
> The other file will be much more useful for us: It contains -organized
> by country- the names of the timezones, an explanation for that name,
> and the geographic coordinates for its center. As for Mexico, the
> following zones are defined:
> 
> MX      +1924-09909     America/Mexico_City     Central Time - most locations
> MX      +2105-08646     America/Cancun  Central Time - Quintana Roo
> MX      +2058-08937     America/Merida  Central Time - Campeche, Yucatan
> MX      +2540-10019     America/Monterrey       Central Time - Coahuila, Durango, Nuevo Leon, Tamaulipas
> MX      +2313-10625     America/Mazatlan        Mountain Time - S Baja, Nayarit, Sinaloa
> MX      +2838-10605     America/Chihuahua       Mountain Time - Chihuahua
> MX      +2904-11058     America/Hermosillo      Mountain Standard Time - Sonora
> MX      +3232-11701     America/Tijuana Pacific Time
> 
> We do not have different timezones for each of those areas, but they
> roughly (even politically) define our country, and were a region of
> Mexico to change timezone, it would probably be reflected by this
> division. 
Yes, the zone.tab file is well designed for the task. The third column
_is_ the official timezone name, even if in modern times the actual time
is identical in two areas (e.g. America/Chihuahua, America/Hermosillo).
When picking a default, the first line matching the country code is the
most populous zone for that country.
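
To make the "first matching line" rule concrete, here is a small
Python sketch. It assumes zone.tab is tab-separated with the country
code in the first column and the zone name in the third, and that
comment lines start with '#'. The function names and the SAMPLE text
are made up for this example.

```python
def zones_by_country(zone_tab_text):
    """Map each country code to its zone names, preserving file order."""
    zones = {}
    for line in zone_tab_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # malformed line; skip rather than guess
        country, tz = fields[0], fields[2]
        zones.setdefault(country, []).append(tz)
    return zones


def default_zone(zone_tab_text, country):
    """Return the first zone listed for the country, or None if absent."""
    candidates = zones_by_country(zone_tab_text).get(country)
    return candidates[0] if candidates else None


# Two of the Mexican entries quoted above, for a self-contained demo.
SAMPLE = (
    "# country\tcoordinates\tTZ\tcomments\n"
    "MX\t+1924-09909\tAmerica/Mexico_City\tCentral Time - most locations\n"
    "MX\t+3232-11701\tAmerica/Tijuana\tPacific Time\n"
)
```

On a real system one would pass in the contents of
/usr/share/zoneinfo/zone.tab instead of SAMPLE.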

> I think we could even do away with the %lang_map declaration in
> locale-config-skolelinux file conffiles.d/timezone - This is a hash
> based on the name of the locale. The locale, in my case, is es_MX - a
> simple regex would do:
> 
> $lang =~ /^([a-z]{2})           # language code
>            _([A-Z]{2})          # country code
>            (?:\.([A-Z0-9\-]+))? # optional charset
>            (?:@(.+))?/x;        # optional extra information
> 
> This was able to parse for me every locale specified in
> /etc/locale.gen and (adding at the beginning '(\S+)\s+' for the
> language name) also in /etc/locale.alias - We can infer from our
> chosen locale and libc6's /usr/share/zoneinfo/zone.tab all the needed
> information for finding out the corresponding timezones. 
> 
> Like it? :-)
> 
Yes. That's how it was designed :-).
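
For anyone who wants to experiment outside Perl, the regex quoted above
can be sketched in Python roughly as follows. The pattern is kept as
close to the original as possible (including the uppercase-only charset
class and the lack of an end anchor); the name parse_locale and the
dictionary keys are my own choices for this example.

```python
import re

# Verbose-mode translation of the quoted Perl pattern.
LOCALE_RE = re.compile(
    r"""^([a-z]{2})             # language code
        _([A-Z]{2})             # country code
        (?:\.([A-Z0-9\-]+))?    # optional charset
        (?:@(.+))?              # optional extra information
    """,
    re.VERBOSE,
)


def parse_locale(name):
    """Split a locale name into its parts; return None if it does not match."""
    match = LOCALE_RE.match(name)
    if match is None:
        return None
    language, country, charset, modifier = match.groups()
    return {
        "language": language,
        "country": country,
        "charset": charset,
        "modifier": modifier,
    }
```

The country part is what you would then look up in zone.tab. Note that,
like the original, this does not anchor at the end of the string, so
trailing text it does not understand is silently ignored.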

> Now, as a final note: There are many overlapping definitions, at least
> for my country - The more traditional timezones. Until a couple of
> years ago, in every Unix system I had used, I knew I lived in
> Mexico/General. We have also Mexico/BajaNorte and Mexico/BajaSur. I
> would prefer using the newer timezones as defined in the zone.tab file,
> but we can/should still support the older scheme (although not suggest
> it in an automatic way, perhaps).
> 
> Greetings,
> 
> -----
> 
> [1] ISO 639 Language Codes
>     http://www.w3.org/WAI/ER/IG/ert/iso639.htm
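
On supporting the older names (Mexico/General and friends) without
suggesting them: one option is to accept them on input and translate
them to the newer names. A minimal Python sketch follows. The mapping
reflects my understanding of the backward-compatibility links in the
timezone data and should be checked against the installed files; the
names LEGACY_ALIASES and canonical_zone are made up for this example.

```python
# Old-style names mapped to their newer equivalents (assumed from the
# timezone data's backward-compatibility links; verify before relying on it).
LEGACY_ALIASES = {
    "Mexico/General": "America/Mexico_City",
    "Mexico/BajaNorte": "America/Tijuana",
    "Mexico/BajaSur": "America/Mazatlan",
}


def canonical_zone(name):
    """Translate a legacy zone name; pass newer names through unchanged."""
    return LEGACY_ALIASES.get(name, name)
```

That way an existing configuration naming Mexico/General keeps working,
while anything we offer automatically comes from zone.tab only.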


