
Re: First beta version of the Debian SGML/XML HOWTO



Hi,

Wed, 10 Nov 1999 16:26:38 +0100, Stephane Bortzmeyer <bortzmeyer@debian.org> wrote about Re: First beta version of the Debian SGML/XML HOWTO (<199911101526.QAA22331@ezili.sis.pasteur.fr>):
bortzmeyer> On Wednesday 10 November 1999, at 22 h 5, the keyboard of Taketoshi Sano 
bortzmeyer> <xlj06203@nifty.ne.jp> wrote:
bortzmeyer> I have a small problem, see hereunder.
bortzmeyer> 
bortzmeyer> > > PS: what encoding will you use for a Japanese text? 
bortzmeyer> > >      I'm not sure all the XML tools will support it.
bortzmeyer> > 
bortzmeyer> > It is written in EUC-JP. 
bortzmeyer> 
bortzmeyer> Why not UTF-8? I remember a discussion/flamewar on debian-devel a few months 
bortzmeyer> ago about whether Debian programs should or should not support multi-byte 
bortzmeyer> options, Unicode, etc.

Yes, UTF-8 is a good solution, but at present it has some problems.

 1. The only web browser that supports UTF-8 is Netscape
    Communicator/Navigator.
    I mainly use w3m and lynx; both are text-based and neither supports
    UTF-8.
 2. jade's UTF-8 handling seems poor. It translates every 2-byte
    character into an entity: &#... &#... &#... ...
    (This is only a rumor I heard, so I may be mistaken.)
 3. No good UTF-8-based text editor exists. I use xemacs, but
    it doesn't support UTF-8. (Of course, I can convert to UTF-8 with "lv".)

    Oh, a few days ago Takuo Kitame filed an ITP for mule-ucs, which will
    support UTF-8/UTF-16. Could this be the good news we need?
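
To make points 2 and 3 concrete, here is a minimal Python sketch of the two
conversions discussed above: transcoding EUC-JP bytes to UTF-8, and turning
every non-ASCII character into a numeric character reference (the
"&#...;" form). This only illustrates the idea; it is not how jade or lv
work internally, and the sample string and function names are assumptions
made up for this example.

```python
# Hypothetical sample text; any Japanese string would do.
SAMPLE = "日本語"


def euc_jp_to_utf8(data: bytes) -> bytes:
    """Transcode EUC-JP encoded bytes to UTF-8 encoded bytes."""
    return data.decode("euc_jp").encode("utf-8")


def to_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Start from EUC-JP bytes, as the Japanese HOWTO text is stored.
euc_bytes = SAMPLE.encode("euc_jp")
utf8_bytes = euc_jp_to_utf8(euc_bytes)

# The same text takes a different number of bytes in each encoding.
print(len(euc_bytes), len(utf8_bytes))

# ASCII-only output that any browser can read, at the cost of readability
# in the source file.
print(to_char_refs(SAMPLE))
```

The character-reference form stays pure ASCII, which is why a tool might
choose it, but it makes the generated file hard to read or edit by hand.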

bortzmeyer> > jade is 8bit through, so EUC-JP text can be handled by jade. libxml seems
bortzmeyer> > to have no problem also.
bortzmeyer> 
bortzmeyer> XML::Parser, the Perl module which is used to implement pre-processing 
bortzmeyer> (conditional sections) does not know this encoding:

Yes, this is a problem...
and a very difficult one.

bortzmeyer> In the meantime, I used x-euc-jp-unicode. It seems to work. Look at 
bortzmeyer> <http://www.debian.org/~bortz/SGML-HOWTO/potato/howto.ja.html> and, if it is 
bortzmeyer> fine, I'll commit.

That page seems to be in UTF-8.
Please look at it with w3m or lynx... Oops.
-- 
Kenshi Muto
kmuto@debian.org
http://www.debian.org/~kmuto/

