
Re: UTF-8 editor support in Debian?



Hi,

At Thu, 5 Jul 2001 23:55:00 +1000,
Drew Parsons <dparsons@emerall.com> wrote:

> > > Can we do more to support UTF-8?
> > 
> > Feel free to submit patches to improve UTF8 support to the various
> > programs.
> 
> I knew someone was going to say that... ;)
> 
> What are the main issues that need to be thought about?  Is it as
> simple as using w_char instead of char in the code, or is there more
> to it than that?  Can a general "plan of attack" be summarised in a
> couple of paragraphs?

Please consult my document "Introduction to I18N", which is included
in the Debian Documentation Project.  To tell the truth, we do not yet
have a sufficient foundation for UTF-8/I18N support, so it cannot be
summarized in a brief set of instructions.

The important thing is: please don't "improve" software to handle
UTF-8 directly.  Instead, please improve software to be sensitive
to the LC_CTYPE locale.  (The internal encoding can be Unicode,
provided that I/O is done in the LC_CTYPE encoding.)
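
As a minimal sketch of this idea (written in Python for brevity, not
taken from any existing package): the program asks the locale for the
LC_CTYPE encoding instead of hard-coding UTF-8, decodes input into
Unicode for internal processing, and encodes output back into the
locale encoding.  The helper names locale_encoding, read_text and
write_text are hypothetical.  In C, the analogous steps would be
setlocale(LC_CTYPE, "") followed by mbstowcs() and wcstombs().

```python
import codecs
import locale


def locale_encoding():
    """Return the encoding selected by the user's LC_CTYPE locale."""
    try:
        # Honour the environment (LC_ALL, LC_CTYPE, LANG).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unknown or uninstalled locale: stay in the current locale.
        pass
    name = locale.getpreferredencoding(False)
    try:
        # Normalise aliases such as "ANSI_X3.4-1968" to a codec name.
        return codecs.lookup(name).name
    except LookupError:
        # Assumption for this sketch: fall back to ASCII if unknown.
        return "ascii"


def read_text(raw, encoding=None):
    """Decode external bytes into an internal Unicode string."""
    return raw.decode(encoding or locale_encoding(), errors="replace")


def write_text(text, encoding=None):
    """Encode an internal Unicode string for output."""
    return text.encode(encoding or locale_encoding(), errors="replace")
```

With these helpers the same program works unchanged in a UTF-8
locale, an EUC-JP locale or a KOI8-R locale, because the encoding is
chosen by the user's environment and not by the source code.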

I think the most basic requirement for UTF-8 support is a
UTF-8-enabled console and terminal.  The Unicode support of the Linux
console covers only a small part of the Unicode character set.  Is
anyone interested in this area?  Though I am not familiar with it, I
think the "kon2" and "jfbterm" packages may give us a hint.

A UTF-8-enabled xterm is now under development.  I think xterm should
be LC_CTYPE-sensitive.  Since the core part of xterm has already been
developed to use Unicode, we will need an encoding converter
to simulate LC_CTYPE sensitivity.  On the i18n@xfree86 mailing list,
two implementations, Robert Brady's patch and "luit", are under
development for this purpose.  I have been a member of this list for
more than half a year.
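
The converter idea can be sketched roughly as follows.  This is not
how luit or Robert Brady's patch is implemented; it only illustrates
the principle of translating a byte stream between the locale encoding
and UTF-8.  The class name StreamConverter is hypothetical.  The
incremental codecs matter because a terminal receives data in
arbitrary chunks, so a chunk boundary may fall in the middle of a
multibyte character.

```python
import codecs


class StreamConverter:
    """Convert a byte stream from one encoding to another, chunk by chunk."""

    def __init__(self, source, target="utf-8"):
        # The incremental decoder remembers an incomplete multibyte
        # sequence between calls until the remaining bytes arrive.
        self._decoder = codecs.getincrementaldecoder(source)("replace")
        self._encoder = codecs.getincrementalencoder(target)("replace")

    def feed(self, chunk, final=False):
        """Convert one chunk; pass final=True for the last chunk."""
        text = self._decoder.decode(chunk, final)
        return self._encoder.encode(text, final)
```

A terminal would need two such converters: one from the locale
encoding to UTF-8 for application output, and one from UTF-8 to the
locale encoding for keyboard input.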

There are a few projects for UTF-8-enabled editors: Yudit, Mule-UCS,
Vim, and so on.  However, note that a multilingual editor will need
a multilingual input method.  You would not want to input Japanese
and Russian as UCS code points, would you?  I have heard that "im-sdk"
(from Li18nux, http://www.li18nux.org) is available for multilingual
input.  You will also want to study how to develop an XIM (X Input
Method) client.

X clients which are already internationalized can display UTF-8
in UTF-8 locales.  For example, see the Blackbox and Twm examples at
http://www.debian.or.jp/~kubota/mojibake/window-managers.html .
I did the I18N of Twm myself, and you can read a detailed explanation
(limited to the essential part) in my "Introduction to I18N".
This type of improvement to software can be done easily.
The poor proportions in the image are due to the XLC_LOCALE
definition, not to Blackbox/Twm.  Thus, you may want to develop
more beautiful XLC_LOCALE definition files for XFree86.

Almost all basic text handling (character counting, insertion,
substitution, search, and so on) can be rewritten easily using
wchar_t.
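
A small sketch of why this matters, again in Python, where the str
type plays the role that a wchar_t array plays in C.  The function
names are hypothetical.  Each operation decodes the external bytes
first, works on characters, and encodes the result again, so that an
index always refers to a character and never to a byte inside one.

```python
def char_count(raw, encoding):
    """Count characters, not bytes."""
    return len(raw.decode(encoding))


def find_char(raw, needle, encoding):
    """Return the character index of needle, or -1 if absent."""
    return raw.decode(encoding).find(needle)


def insert_at(raw, index, piece, encoding):
    """Insert piece before the character at position index."""
    text = raw.decode(encoding)
    return (text[:index] + piece + text[index:]).encode(encoding)


def substitute(raw, old, new, encoding):
    """Replace every occurrence of old with new."""
    return raw.decode(encoding).replace(old, new).encode(encoding)
```

Doing the same operations directly on the byte string would count
each Japanese character in UTF-8 as three units and could split a
character in half on insertion.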

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


