
Re: Kanji characters?



At Wed, 17 Feb 1999 23:10:16 +0100,
 <homega@vlc.servicom.es> wrote:

> P.D.  is there any substantial difference between the various characters
> (Auto Detect, Shift-JIS, and EUCP-JP)?

For Japanese text, there are three major encodings in use on the
Internet:
  - ISO-2022-JP (also called JIS), a variant of ISO-2022 that uses
   ESC sequences to switch character sets.
   This encoding is used for Japanese e-mail and NetNews articles.

  - EUC-JP, also a variant of ISO-2022, but it uses the GR area (bytes
   with the high bit set) for the Japanese character sets, so it needs
   all 8 bits of each byte.
   This encoding is usually used for Japanese text on UNIX boxes.

  - Shift_JIS, a version of JIS with shifted code points.
   This encoding is usually used for Japanese text on DOS/Windows/Mac boxes.
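
To make the difference between the three encodings concrete, here is a
small sketch that encodes one string three ways.  It assumes Python 3 and
its standard codecs "iso2022_jp", "euc_jp" and "shift_jis"; the sample
string and the hex_bytes helper are only for illustration.

```python
# Encode the same Japanese string with the three encodings and show the bytes.
# Assumption: Python 3 with its standard CJK codecs.

text = "日本語"  # illustrative sample string


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


jis = text.encode("iso2022_jp")   # ESC sequences switch character sets
euc = text.encode("euc_jp")       # every kanji byte has the high bit set
sjis = text.encode("shift_jis")   # shifted code points

for name, data in (("ISO-2022-JP", jis), ("EUC-JP", euc), ("Shift_JIS", sjis)):
    print(f"{name:12} {hex_bytes(data)}")
```

In the ISO-2022-JP output, the leading 1b 24 42 is the ESC sequence that
switches to the kanji character set, and the trailing 1b 28 42 switches
back to ASCII.  Everything in between stays within 7 bits, unlike the
other two encodings.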

Some people use ISO-2022-JP for HTML, others use EUC-JP, and others use
Shift_JIS.  Auto Detect will automatically detect which encoding is used.
However, some byte patterns are valid in both EUC-JP and Shift_JIS, so
Auto Detect cannot always tell those two apart.
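
The ambiguity can be illustrated with a rough sketch.  This is not how
any browser's Auto Detect is implemented; it is a naive trial decode
using Python 3's standard codecs, and candidate_encodings is a
hypothetical helper name.

```python
# Naive detection: keep every encoding that decodes the bytes without error.
# Assumption: Python 3 standard codecs; real detectors use smarter heuristics.

CANDIDATES = ("iso2022_jp", "euc_jp", "shift_jis")


def candidate_encodings(data: bytes) -> list[str]:
    """Return the candidate encodings under which data decodes cleanly."""
    matches = []
    for encoding in CANDIDATES:
        try:
            data.decode(encoding)
        except UnicodeDecodeError:
            continue
        matches.append(encoding)
    return matches


# "ねこ" in EUC-JP is a4 cd a4 b3.  In Shift_JIS each of those bytes is
# also a valid single-byte half-width katakana or punctuation character.
data = "ねこ".encode("euc_jp")

as_euc = data.decode("euc_jp")
as_sjis = data.decode("shift_jis")

print(candidate_encodings(data))
print(repr(as_euc), repr(as_sjis))
```

Both decodes succeed, but they produce different text, so the bytes
alone do not say which encoding was meant.  A detector has to fall back
on guesses such as which result looks more like ordinary Japanese.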

Regards,
Fumitoshi UKAI / Debian JP Project

