
Re: Unicode flame war (Was Re: Don't abolish non-unicode locales)



On Mon, Aug 06, 2001 at 09:48:12AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> At Fri, 3 Aug 2001 07:11:05 +0100,
> David Starner <dstarner98@aasaa.ofe.org> wrote:
> 
> >> I understand I can't ask Germans to use double-width characters for
> >> any non-ASCII characters.  Similarly, you can't ask Japanese to use
> >> single-width non-letter-symbols (like triangle, star, rule elements,

Is a triangle or a star part of the Japanese writing system?
No. (They are part of older Japanese encodings, that's true.)

> >> and so on) and letters (like Greeks and Cyrillics).

Double-width Greek and Cyrillic???
OK, I want to use single-width kana. You can't ask me to use
double-width kana. So I want to change the Unicode width properties
to have single-width kana.
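
For reference, here is a rough sketch of the width property being argued about. It assumes Python and its standard unicodedata module; the sample characters and the "ambiguous_is_wide" switch are my own illustrative choices, and combining characters (zero width) are deliberately ignored.

```python
import unicodedata

def width_class(ch: str) -> str:
    """Return the East Asian Width property: Na, N, A, W, F or H."""
    return unicodedata.east_asian_width(ch)

def columns(ch: str, ambiguous_is_wide: bool) -> int:
    """Guess how many terminal columns one character occupies.

    'A' (ambiguous) characters are the contested ones: legacy CJK
    terminals draw them double-width, everyone else single-width.
    The ambiguous_is_wide flag is a hypothetical knob for that choice.
    """
    cls = width_class(ch)
    if cls in ("W", "F"):
        return 2
    if cls == "A" and ambiguous_is_wide:
        return 2
    return 1

if __name__ == "__main__":
    samples = {
        "Latin A": "A",
        "Greek alpha": "\u03b1",
        "Cyrillic Ya": "\u042f",
        "black star": "\u2605",
        "black triangle": "\u25b2",
        "katakana A": "\u30a2",
        "halfwidth katakana A": "\uff71",
    }
    for name, ch in samples.items():
        print(name, width_class(ch), columns(ch, False), columns(ch, True))
```

Greek, Cyrillic, the star and the triangle all fall into the ambiguous class, which is exactly why their width depends on who you ask; kana are unambiguously wide unless you pick the separate halfwidth code points.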

> >
> > There's a slight difference between being totally wrong for a language, and
> > making older texts look ugly and out of proportion.
> 
> You mean, double-width German looks ugly?  There doesn't exist such
> a big difference.  And more, I never forced Germans to use doublewidth
> diacritic alphabets while your discussion seems to be willing to force
> Japanese people to use singlewidth non-letter-symbols and so on.

There is a slight difference between letters that belong to the language
and graphical symbols that were thrown into an encoding just for good measure,
so that you can do ASCII art.
(I gave up some graphical symbols when I went from the Kamenicky encoding on
MS-DOS machines to Latin 2 on Linux. I do not miss them.)

> 
> 
> > Well, and efficiency, and so that simple non-combining implementations could handle
> > the language. But, yes, that's true. There's a lot of ways Unicode could
> > have been better, but that would have killed it in practice.
> 
> Then do you think precombined diacritic alphabets in Unicode are evil?
> I don't think so.  It makes it much easier for European languages speakers
> to migrate into Unicode.
> 

If Unicode had gone for combining characters only, technical development
would have gone completely differently; that does not mean it would have
been more difficult. But discussion of this belongs in
soc.history.what-if
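
To make the precomposed versus combining distinction concrete, here is a minimal sketch assuming Python's standard unicodedata module. The helper names are mine, and the sample letters are only illustrations.

```python
import unicodedata

def to_combining(text: str) -> str:
    """Decompose precomposed letters into base letter + combining marks (NFD)."""
    return unicodedata.normalize("NFD", text)

def to_precomposed(text: str) -> str:
    """Compose base letter + combining marks back into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)

if __name__ == "__main__":
    for ch in ("\u00e9", "\u013e"):   # e with acute, l with caron
        decomposed = to_combining(ch)
        print([hex(ord(c)) for c in ch], "->", [hex(ord(c)) for c in decomposed])
```

Both forms are valid Unicode for the same text; the precomposed code points are what makes a one-to-one mapping from Latin 1 or Latin 2 possible.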

> The problem is that the easiness of migration differs from language to
> language.

True.

> 

On Mon, Aug 06, 2001 at 09:51:19AM +0900, Tomohiro KUBOTA wrote:
> Hi,
> 
> At Fri, 3 Aug 2001 11:56:42 +0200,
> Radovan Garabik <garabik@melkor.dnp.fmph.uniba.sk> wrote:
> 
> > Well, your reasoning suggested that you do not care about multilingual
> > people, because they are a minority (no offence intended, I know you
> > did not mean it this way)
> 
> Ok, as you understand my will, multilingual people can use UTF-8.
> Also, people who only use their mother tongue can use UTF-8 if
> they want.  It should be easily achieved only by setting LANG
> variable to *.UTF-8 (for example, en_US.UTF-8 or zh_CN.UTF-8).

IMHO the whole fundamental design of locales is flawed, but now that it is
here we have to live with it. Your suggestion is probably the best
available way.
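
As a small illustration of how a program might decide from LANG whether it is running in a UTF-8 locale, here is a sketch in Python. The function names are hypothetical, and it only parses the conventional language_TERRITORY.codeset@modifier form; real programs would normally ask the C library instead.

```python
from typing import Optional

def locale_charset(lang: str) -> Optional[str]:
    """Extract the codeset part of a locale name such as 'en_US.UTF-8'."""
    name = lang.split("@", 1)[0]      # drop a modifier such as '@euro'
    if "." not in name:
        return None                   # e.g. 'C' or 'ja_JP' name no codeset
    return name.split(".", 1)[1]

def is_utf8_locale(lang: str) -> bool:
    """True if the codeset is UTF-8, accepting the 'utf8' spelling too."""
    charset = locale_charset(lang)
    return charset is not None and charset.replace("-", "").lower() == "utf8"

if __name__ == "__main__":
    for lang in ("en_US.UTF-8", "zh_CN.UTF-8", "ja_JP.eucJP", "C"):
        print(lang, locale_charset(lang), is_utf8_locale(lang))
```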

> 
> 
> > I was not speaking about forcing users, I was talking about
> > default encoding of Packages (I repeat myself: you cannot have
> > proper Packages in any other commonly used encoding other than unicode)
> 
> However, you insisted that the garbage produced by outputting a UTF-8
> stream on other multibyte terminals is not a significant problem.  Imagine
> a screen is scrolled accidentally.  The dselect cursor to indicate
> a name of package may indicate a wrong package.  The contents at
> the top line will be lost.  Thus, leaving this situation is just
> like forcing users to use UTF-8 locales.
> 

I would be willing to live with it.
Anyway, the whole flamewar had some good points in it:
we agreed that a separate Packages-ascii (or Packages-utf8) is a
good solution.

> 
> >> http://www.debian.or.jp/~kubota/unicode-symbols.html ?  Did I write
> >> only about the yen-sign problem?
> > no, but almost all of the problems (sans CJK unification) has nothing
> > to do with unicode, but with the need to support legacy encodings
> 
> We need to _migrate_ from local encodings to UTF-8, which mean that
> Unicode has to supply enough compatibility to local encodings.

Doesn't it?
There are problems, like the one with the yen sign (which is a bug in
the previous encodings, anyway) and double-width graphical symbols
(that's worse), but overall, for texts (and that is what counts),
Unicode works for Japanese.

> 
> If everyone in the world used Debian, the problem would be
> simple.  However, we need to read and write mails in ISO-2022-JP,
> exchange data for/from Windows/Macintosh, and so on.


Well, Debian has no official support for the Central European CP1250
encoding.
Yet when I get a mail in that encoding, mutt converts it
to Latin 2 and displays it without problems. I see no reason
why that would not work for ISO-2022-JP too (once "good enough"
conversion tables are made).
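
The kind of conversion I mean can be sketched like this. This is not how mutt does it internally; it is just an illustration using Python's built-in codecs, with a hypothetical recode() helper, and characters that have no counterpart in the target encoding are replaced by "?".

```python
def recode(data: bytes, src: str, dst: str) -> bytes:
    """Convert bytes from one legacy encoding to another via Unicode.

    Characters missing from the destination encoding become '?',
    which is the 'good enough' part of the conversion.
    """
    text = data.decode(src)
    return text.encode(dst, errors="replace")

if __name__ == "__main__":
    # Central European mail: CP1250 in, Latin 2 out.
    czech = "\u017dlu\u0165ou\u010dk\u00fd k\u016f\u0148"
    latin2 = recode(czech.encode("cp1250"), "cp1250", "iso8859_2")
    print(latin2.decode("iso8859_2"))

    # Japanese mail: ISO-2022-JP in, EUC-JP out.
    japanese = "\u65e5\u672c\u8a9e"
    eucjp = recode(japanese.encode("iso2022_jp"), "iso2022_jp", "euc_jp")
    print(eucjp.decode("euc_jp"))
```

The low double quotation mark that CP1250 has and Latin 2 lacks is an example of what gets lost; letters survive, which is the point.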

> 
> 
> > (Notice: I feel neutral about CJK unification - mostly because I
> > do not use Han characters :-), but I really understand both sides)
> 
> "Both sides are reasonable" means that both sides have defects.
> I think you now understand there exist people who need to use
> non-UTF-8 locales, which is my aim of Unicode-related discussion now.

I do.

> 
> 
> I feel the discussion is too diverged.  Now I think you and I don't

Me too :-)
I suggest we stop flaming about Unicode and Japanese encodings, since
we are just commenting on each other's previous mails; it could
take an eternity to die out and would lead nowhere.
(And besides, I have found myself several times in the position of
defending things I do not care much about :-))

> disagree on the really important points.  What I think is important
> is:
> 
> 1. UTF-8 is one of important encodings in the world and should be
>    supported by Debian.

True.

> 2. However, developers must not force users to use UTF-8 locales.
>    Instead, Debian should continue supporting locales which Debian
>    supports now.

True.

> 3. To achieve (1) and (2), developers have to design software so
>    that it works well in the various locales which Debian supports.

True.


And:

4. We have to decide on the overall Debian infrastructure (which
   shows up in the Packages file): whether to use ASCII or UTF-8
   (since we probably agree there is no real third possibility). Note that
   this _does not_ concern localized Debians (Packages-jp could still be in
   EUC-JP or whatever, Packages-sk could be in Latin 2, and either
   Packages-ascii would be in, well, ASCII, or Packages-utf8 would be in
   UTF-8 and Packages in ASCII)
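
To illustrate the ASCII versus UTF-8 choice in point 4, here is a rough sketch in Python of a check that tells the two apart. The classify() helper and the maintainer name in the example are hypothetical; this is not an existing Debian tool.

```python
def classify(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8' or 'other'.

    Every pure-ASCII file is also valid UTF-8, so ASCII is tested first.
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"

if __name__ == "__main__":
    # Hypothetical stanza lines, with and without a non-ASCII name.
    plain = "Maintainer: Jan Novak\n"
    accented = "Maintainer: J\u00e1n Nov\u00e1k\n"
    print(classify(plain.encode("ascii")))
    print(classify(accented.encode("utf-8")))
    print(classify(accented.encode("iso8859_2")))
```

Since ASCII is a subset of UTF-8, a tool that reads UTF-8 also reads an ASCII Packages file unchanged; the reverse is what breaks.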

-- 
 -----------------------------------------------------------
| Radovan Garabik http://melkor.dnp.fmph.uniba.sk/~garabik/ |
| __..--^^^--..__    garabik @ melkor.dnp.fmph.uniba.sk     |
 -----------------------------------------------------------
Antivirus alert: file .signature infected by signature virus.
Hi! I'm a signature virus! Copy me into your signature file to help me spread!


