Re: support for multilingual Packages files?



Hi,

At Mon, 30 Jul 2001 13:59:55 +0200,
Radovan Garabik <garabik@melkor.dnp.fmph.uniba.sk> wrote:

> Problems should be made visible and discussed, and solutions
> should be found, instead of just saying "unicode is bad, we are never
> going to accept it" (no, I am not talking about you, Tomohiro, I know you are 
> reasonable, but I know several people with this attitude)
> IMHO there is no other (better) alternative for global encoding than unicode,
> and unicode is not _that_ bad, and is already getting a strong position
> elsewhere.

You are too optimistic.  Every time I send mail to the Unicode Consortium,
I become more pessimistic about the future of Unicode.
I feel the Unicode Consortium is a place for political disputes between
major vendors, and that it cannot supply usable CJK support.  (I know
some members are struggling for users' interests.  However, I am not
sure the Unicode Consortium as a whole can think for users.)

Don't think about a future when all users will use UTF-8.  Think
about a future in which UTF-8 is one of many encodings that
Debian supports.

As I said, in multibyte encodings including UTF-8, the numbers of
characters, bytes, and columns differ from one another.  This can cause,
for example, a line to exceed the right margin and trigger an unwanted
scroll of the screen.  Another example: a garbage 0x80-0xff byte is
output and the next byte is a control code.  The garbage byte is regarded
as the first byte of a multibyte character, and the control code doesn't work.
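
To make the first point concrete, here is a minimal Python sketch of how
the three counts diverge.  The helper names (display_columns, measure) are
my own illustration, not part of any Debian tool, and the column count is
only an approximation based on the East Asian Width property; real
terminals may disagree, especially for "ambiguous" width characters.

```python
import unicodedata

def display_columns(text):
    """Approximate terminal columns (assumption: wide/fullwidth = 2,
    combining marks = 0, everything else = 1)."""
    cols = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cols

def measure(text):
    """Return (characters, UTF-8 bytes, approximate columns)."""
    return len(text), len(text.encode("utf-8")), display_columns(text)

# "resume", "résumé" (precomposed), and the Japanese word for "Japanese"
for sample in ("resume", "r\u00e9sum\u00e9", "\u65e5\u672c\u8a9e"):
    chars, nbytes, cols = measure(sample)
    print(f"{sample!r}: {chars} chars, {nbytes} bytes, {cols} columns")
```

For plain ASCII all three numbers agree (6, 6, 6).  For "résumé" the byte
count grows to 8 while characters and columns stay at 6, and for the
three-character Japanese word the result is 3 characters, 9 bytes, and 6
columns.  A program that wraps lines by counting bytes or characters
will therefore misplace the right margin.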


> It will mostly concern names. You will not see names properly. Yes, this
> is not ideal, but not much else can be done about it.
> (oh, well, and occasional diacritics in english words like rôle and résumé,
> and I think it is reasonable leaving this up to maintainer's common sense - 
> to decide if he prefers "correct" usage no matter what or leaves diacritics 
> out and saves people without utf-8 console some headache)

I don't think we can hope for such "common sense".  In almost all cases,
maintainers use non-ASCII characters because they are unaware
that foreign people cannot read those characters.



> >    e) mandate ASCII; UTF-8 is optional
> 
> optional in the same Packages file?

Either will do.  I imagine the size won't be so large, and thus it
can be in the same Packages file.


> or they could decide if they prefer not to include the ASCII version at all,
> so that nobody is confused by incorrect variant of their name (I am talking
> now about latin-script names with diacritics)

It is YOU who want to avoid confusion between characters with and without
diacritics.  Why can you say that all people with Latin-script names
would rather see a question mark than have the diacritics dropped?

And please note, though Japanese people (I cannot speak for other
people) know the 26 letters of the alphabet very well, they don't know
how many types of diacritics exist for Latin scripts.  Such diacritics
are less familiar than Greek characters.

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


