
Re: Li18nux Locale Name Guideline Public Review



On Tue, Jan 22, 2002 at 12:49:56PM +0900, Stephen J. Turnbull wrote:
> However, it's important to remember that a bad standard is better than
> no standard.  It is extremely difficult to change a bad standard, it
> is true.  But it's even harder to change "no standard", and in the
> meantime users suffer much more.

I'm not sure I agree. A lot of programming languages and a lot of
systems have done well without a formal standard - Perl, Python, Fortran
prior to 1966. But a bad standard, one that's hard to implement or
painful to use, will drive away users and implementers and discourage
the creation of a new standard.
 
> Telling the
> relevant Li18nux/LSB working group "Debian has looked at the Li18nux
> proposal.  However, we intend to {use the IANA names, not impose
> unstandardized names, deprecate IBM code pages to compatibility
> packages} for these reasons: ...." would be great.  The Debian name
> commands a fair amount of respect because of Debian's continuing
> commitment to standards, both international and internal.

I can't honestly say I speak for Debian. I don't think anybody can
honestly say they speak for Debian on this, besides maybe Ben Collins
(libc maintainer). It's the whole "herd of cats" thing.
 
> >>>>> "David" == David Starner <starner@okstate.edu> writes:
>     David> Why all the IBM code pages? glibc currently supports two -
>     David> 1251 (be_BY, bg_BG) and 1255 (yi_US).
> 
> What do you mean by "support"?  For code pages, I would say "iconv" is
> the relevant functionality.  

I have no argument with iconv supporting any charset in use. But we're
talking about locale charsets, the charsets that every program can be
expected to handle, the master charsets for a user. Users should be able
to expect that they can send a file from one Linux box to another in the
same locale without having to recode it. While this isn't universally
true, adding charsets that aren't better than ones already in use
doesn't help anything. Furthermore, if possible, a charset should leave
C1 free of graphic characters, like ISO-8859-1 and EUC-JP do, and
UTF-8 does in a ham-handed way, and must leave C0 free of graphic
characters.
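
To make the C1 point concrete, here is a rough Python sketch that
counts which bytes in 0x80-0x9F a single-byte charset decodes to
something other than a control character. The function name is made
up for illustration, it relies on Python's own codec tables rather
than glibc's charmaps, and it only says something useful about
single-byte charsets (a lone byte of a multibyte charset simply fails
to decode and is skipped).

```python
import unicodedata

C1_RANGE = range(0x80, 0xA0)  # bytes 0x80-0x9F, the C1 control area


def c1_graphic_bytes(encoding):
    """Return the C1-range bytes that `encoding` decodes to non-control characters."""
    graphic = []
    for byte in C1_RANGE:
        try:
            char = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # unassigned byte, or an incomplete multibyte sequence
        if unicodedata.category(char) != "Cc":
            graphic.append(byte)
    return graphic


# Hypothetical comparison of three charsets from the attached list.
for name in ("iso-8859-1", "cp1251", "koi8-r"):
    print(name, len(c1_graphic_bytes(name)))
```

ISO-8859-1 should come out with no graphic characters in that range,
while CP1251 and KOI8-R fill it with letters and box-drawing
characters.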

What I mean by support is that it is included in the list of tested and
supported locales (/usr/share/doc/locales/SUPPORTED.gz on my system) -
attached to the bottom of this message.
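
For anyone who wants to check the count themselves, a minimal sketch
along these lines would pull the code-page locales out of a list in
that "locale charset" format. The helper names are invented here, and
the sample is just a handful of lines copied from the attached list,
not the whole file.

```python
def parse_supported(text):
    """Split 'locale charset' lines into (locale, charset) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank or malformed lines
            pairs.append((fields[0], fields[1]))
    return pairs


def code_page_locales(pairs):
    """Return the locales whose charset name starts with 'CP'."""
    return [locale for locale, charset in pairs if charset.startswith("CP")]


# A few lines taken from the attached list, for illustration only.
SAMPLE = """\
be_BY CP1251
bg_BG CP1251
en_US ISO-8859-1
en_US.UTF-8 UTF-8
yi_US CP1255
"""

print(code_page_locales(parse_supported(SAMPLE)))
```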

>     David> As a final note - why does this exist? Linux has a locale
>     David> standard, in the same way that Perl has a standard
> 
> Aka, "why I use Python". :-)

Does Python have a formal standard? It would surprise me.
 
>     David> If you feel compelled to write a formal standard, you have
>     David> to write one that defines what the standard implementation
>     David> does.
> 
> Note what taking that to extremes implies: forget POSIX, which
> doesn't describe any real OS.  

Large parts of POSIX are directly based on existing implementations.
Also, POSIX needed implementing; there were many diverging Unixes.
There's one locale implementation used on Linux - glibc's.

> While that's mostly a joke, there's something important here.  And
> that is that if we stick consistently to the "specify, then implement"
> approach, we end up with something workable not so far from where we
> actually are.  

This sounds an awful lot like creationism.

- Jargon File (4.3.0, 30 APR 2001) [jargon]:

creationism n. The (false) belief that large, innovative software
designs can be completely specified in advance and then painlessly
magicked out of the void by the normal efforts of a team of normally
talented programmers. In fact, experience has shown repeatedly that good
designs arise only from evolutionary, exploratory interaction between
one (or at most a small handful of) exceptionally able designer(s) and
an active user population -- and that the first try at a big new idea is
always wrong. Unfortunately, because these truths don't fit the planning
models beloved of {management}, they are generally ignored.
  
We've had the evolutionary, exploratory interaction, and now, for the
most part, glibc supports the locales and charsets people need.

> I'm not recommending knuckling under to Emerson's hobgoblin, but I
> hope Debian will lean toward specifying desiderata (== standards)
> independently of current implementations, rather than falling into the
> trap of making the standards overly dependent on the implementations.

Why haven't you standardized Emacs yet? What would you do with an Emacs
standard that ignored many of the good points of recent Emacsen? IMO, we
have a poorly thought-out standard, in an area without multiple
implementations and hence without the need for a standard.

af_ZA ISO-8859-1
ar_AE ISO-8859-6
ar_BH ISO-8859-6
ar_DZ ISO-8859-6
ar_EG ISO-8859-6
ar_IN UTF-8
ar_IQ ISO-8859-6
ar_JO ISO-8859-6
ar_KW ISO-8859-6
ar_LB ISO-8859-6
ar_LY ISO-8859-6
ar_MA ISO-8859-6
ar_OM ISO-8859-6
ar_QA ISO-8859-6
ar_SA ISO-8859-6
ar_SD ISO-8859-6
ar_SY ISO-8859-6
ar_TN ISO-8859-6
ar_YE ISO-8859-6
be_BY CP1251
bg_BG CP1251
br_FR ISO-8859-1
bs_BA ISO-8859-2
ca_ES ISO-8859-1
ca_ES@euro ISO-8859-15
cs_CZ ISO-8859-2
cy_GB ISO-8859-14
da_DK ISO-8859-1
de_AT ISO-8859-1
de_AT@euro ISO-8859-15
de_BE ISO-8859-1
de_BE@euro ISO-8859-15
de_CH ISO-8859-1
de_DE ISO-8859-1
de_DE.UTF-8 UTF-8
de_DE@euro ISO-8859-15
de_LU ISO-8859-1
de_LU@euro ISO-8859-15
el_GR ISO-8859-7
el_GR.UTF-8 UTF-8
en_AU ISO-8859-1
en_BW ISO-8859-1
en_CA ISO-8859-1
en_DK ISO-8859-1
en_GB ISO-8859-1
en_GB.UTF-8 UTF-8
en_HK ISO-8859-1
en_IE ISO-8859-1
en_IE@euro ISO-8859-15
en_IN UTF-8
en_NZ ISO-8859-1
en_PH ISO-8859-1
en_SG ISO-8859-1
en_US ISO-8859-1
en_US.UTF-8 UTF-8
en_ZA ISO-8859-1
en_ZW ISO-8859-1
es_AR ISO-8859-1
es_BO ISO-8859-1
es_CL ISO-8859-1
es_CO ISO-8859-1
es_CR ISO-8859-1
es_DO ISO-8859-1
es_EC ISO-8859-1
es_ES ISO-8859-1
es_ES@euro ISO-8859-15
es_GT ISO-8859-1
es_HN ISO-8859-1
es_MX ISO-8859-1
es_NI ISO-8859-1
es_PA ISO-8859-1
es_PE ISO-8859-1
es_PR ISO-8859-1
es_PY ISO-8859-1
es_SV ISO-8859-1
es_US ISO-8859-1
es_UY ISO-8859-1
es_VE ISO-8859-1
et_EE ISO-8859-1
eu_ES ISO-8859-1
eu_ES@euro ISO-8859-15
fa_IR.UTF-8 UTF-8
fi_FI ISO-8859-1
fi_FI@euro ISO-8859-15
fo_FO ISO-8859-1
fr_BE ISO-8859-1
fr_BE@euro ISO-8859-15
fr_CA ISO-8859-1
fr_CH ISO-8859-1
fr_FR ISO-8859-1
fr_FR.UTF-8 UTF-8
fr_FR@euro ISO-8859-15
fr_LU ISO-8859-1
fr_LU@euro ISO-8859-15
ga_IE ISO-8859-1
ga_IE@euro ISO-8859-15
gl_ES ISO-8859-1
gl_ES@euro ISO-8859-15
gv_GB ISO-8859-1
he_IL ISO-8859-8
hi_IN.UTF-8 UTF-8
hr_HR ISO-8859-2
hu_HU ISO-8859-2
id_ID ISO-8859-1
is_IS ISO-8859-1
it_CH ISO-8859-1
it_IT ISO-8859-1
it_IT@euro ISO-8859-15
iw_IL ISO-8859-8
ja_JP.EUC-JP EUC-JP
ja_JP.UTF-8 UTF-8
ka_GE GEORGIAN-PS
kl_GL ISO-8859-1
ko_KR.EUC-KR EUC-KR
ko_KR.UTF-8 UTF-8
kw_GB ISO-8859-1
lt_LT ISO-8859-13
lv_LV ISO-8859-13
mi_NZ ISO-8859-13
mk_MK ISO-8859-5
mr_IN.UTF-8 UTF-8
ms_MY ISO-8859-1
mt_MT ISO-8859-3
nl_BE ISO-8859-1
nl_BE@euro ISO-8859-15
nl_NL ISO-8859-1
nl_NL@euro ISO-8859-15
nn_NO ISO-8859-1
no_NO ISO-8859-1
oc_FR ISO-8859-1
pl_PL ISO-8859-2
pt_BR ISO-8859-1
pt_PT ISO-8859-1
pt_PT@euro ISO-8859-15
ro_RO ISO-8859-2
ru_RU ISO-8859-5
ru_RU.UTF-8 UTF-8
ru_RU.KOI8-R KOI8-R
ru_UA KOI8-U
sk_SK ISO-8859-2
sl_SI ISO-8859-2
sq_AL ISO-8859-1
sr_YU ISO-8859-2
sr_YU@cyrillic ISO-8859-5
sv_FI ISO-8859-1
sv_FI@euro ISO-8859-15
sv_SE ISO-8859-1
ta_IN UTF-8
te_IN UTF-8
tg_TJ KOI8-T
th_TH TIS-620
tl_PH ISO-8859-1
tr_TR ISO-8859-9
uk_UA KOI8-U
ur_PK UTF-8
uz_UZ ISO-8859-1
vi_VN.UTF-8 UTF-8
yi_US CP1255
zh_CN GB2312
zh_CN.GB18030 GB18030
zh_CN.GBK GBK
zh_CN.UTF-8 UTF-8
zh_HK BIG5-HKSCS
zh_HK.UTF-8 UTF-8
zh_TW BIG5
zh_TW.EUC-TW EUC-TW
zh_TW.UTF-8 UTF-8

-- 
David Starner - starner@okstate.edu, dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing 
with the youth. -- Information Society, "Peace and Love, inc."


