Re: Personal names in Kanji



Hi,

At Fri, 23 Aug 2002 13:50:19 +0200,
Gerfried Fuchs wrote:

>  Sorry for my late response, but I'm just questioning what the advantage
> for the users might be?  Those reading the english pages usually don't
> know what to do with the names in kanji, those who know kanji usually
> don't read the english pages anyway and/or should be happy with the
> "translated" names nevertheless.

There are several reasons.

A. Cultural Aspects.

1. There seem to be a certain number of people who enjoy learning how
   a person's name is written in its original, correct form, even if
   they cannot read it.  And of course there are a few people who can
   read it.

2. People who cannot read Kanji (or Cyrillic, Greek, Thai, Hangul,
   etc.) can simply ignore them, because the Latin transcription is
   given as well.  The original forms do them no harm.
   (The Latin transcription is sometimes more important than the
   original form, because it is what we usually use on English
   mailing lists, as I am doing now.)

3. Japanese (and, I imagine, Chinese) people tend to want to see
   Japanese and Chinese names in Kanji, even though most Japanese
   people cannot read Chinese and vice versa.  So Japanese people want
   to read Chinese names in Kanji on English web pages, and Chinese
   people want to read Japanese names in Kanji.  I don't know how
   Korean people feel.  (Korean people have names in Kanji, but they
   often write them in Hangul.)

4. As Osamu pointed out, it is impossible to convert the Latin
   transcription of a Japanese name back into the original Kanji
   algorithmically.  (One exception:  for ISHIKAWA Mutsumi, I used
   the Hiragana form because he always uses Hiragana in the Japanese
   Linux communities.)

B. Technical Aspects.

5. For now, while Unicode is becoming popular but is not yet
   widespread, ASCII characters are the only characters that are
   truly portable throughout the world.  In other words, any
   non-ASCII character may fail to be displayed.  However, accented
   letters, which are not ASCII characters, are sometimes used in
   English pages.  Apparently, such characters cannot be displayed
   by the text browsers used by people whose languages are not
   written in Latin script, such as Asians and Russians.  However,
   we don't complain about that.  From the viewpoint of equality, if
   ISO-8859-1 characters are permitted, Kanji should be permitted
   too.  Of course, internationalized software can display both
   accented letters and Kanji.

6. About the &#xxxxx; notation.  It is painful to prepare such
   references by hand.  However, they are written in ASCII characters
   and are harmless for translators (they can just copy and paste
   them).  FYI, I used the following Perl script to generate the
   &#xxxxx; form of Japanese names.  For Russian and Chinese names, I
   needed more tricks.

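   -----
   #!/usr/bin/perl
   # Convert EUC-JP text to ASCII, writing each non-ASCII character
   # as a numeric character reference (&#NNNNN;).

   use strict;
   use warnings;
   use Text::Iconv;

   # Ask for little-endian UCS-2 explicitly, so that the result does
   # not depend on the byte order of the machine.
   my $converter = Text::Iconv->new("EUC-JP", "UCS-2LE");

   while (<>) {
       my $string = $converter->convert($_);
       next unless defined $string;   # skip lines that failed to convert
       # "v*" unpacks a list of 16-bit little-endian code units.
       for my $c (unpack("v*", $string)) {
           if ($c > 126) { print '&#', $c, ';'; }
           else          { print chr($c); }
       }
   }
   -----

   Please change "EUC-JP" to the encoding you use.

   Note that the script asks iconv for little-endian output
   ("UCS-2LE") and unpacks it as little-endian, so it should not
   depend on the byte order of the system it runs on.

   This script Depends: on the libtext-iconv-perl package.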
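
   As a rough sketch of an alternative that avoids handling UCS-2
   bytes by hand, the same conversion can probably be done with the
   Encode module that ships with Perl 5.8 and later.  The function
   name to_ncr and the sample input are made up for illustration;
   the threshold of 126 is taken from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# to_ncr is a hypothetical helper name, not part of any library.
# It decodes a byte string from the given encoding into Perl
# characters, then writes every character above 126 as a numeric
# character reference and passes everything else through unchanged.
sub to_ncr {
    my ($bytes, $encoding) = @_;
    my $text = decode($encoding, $bytes);
    return join '', map {
        my $cp = ord($_);
        $cp > 126 ? "&#$cp;" : $_;
    } split //, $text;
}

# Sample input (an assumption for this sketch): the EUC-JP bytes
# for the two Kanji U+65E5 and U+672C, followed by a newline.
print to_ncr("\xC6\xFC\xCB\xDC\n", 'EUC-JP');
```

   To use it as a filter like the script above, one would call
   to_ncr on each line read from the input.  Because decode returns
   characters rather than bytes, byte order does not come into it.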


7. I hope more people will become aware that Asian people (and other
   people whose languages are not written in the Latin alphabet)
   exist and also use Debian.  I hope people (especially developers)
   will give some thought to the problems we have.  Developers who
   speak European languages sometimes tend to write software that
   cannot handle Kanji.


* "Kanji", "Hanzi", and "Hanja" are the Japanese, Chinese, and Korean
  words for the system of ideographs that originated in China (CJK
  Han ideographs).  They are sometimes called "Chinese characters".

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/
