
UTF-8 locales



Hi,

I am interested in support for various character codes.

I suppose a number of developers are interested in UTF-8
support.  They are trying to add UTF-8 support to their software,
such as Xterm, GNU roff, and so on.

I believe that UTF-8 support should be implemented using
LOCALE technology, i.e., calling setlocale(LC_ALL, ""), using
wchar_t instead of char, and leaving everything else to the OS.
The advantages of this method are:
 - the software will support not only UTF-8 but also many other
   character codes in the world (including multibyte ones).
   This helps users migrate to UTF-8 smoothly and gradually.
 - the software can provide a unified way to determine the
   character code to be used, i.e., the LANG variable and so on.
   Otherwise users have to remember a different method of
   enabling UTF-8 mode for every piece of software they use.
   (For example, the '-u8' option for Xterm.)
 - software which is already written using LOCALE technology
   doesn't need any modification.  In other words, such software
   already supports UTF-8.
Note that LOCALE programming is no more difficult or troublesome
than UTF-8 programming.
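
As a minimal sketch of the LOCALE approach described above (the
function names use_environment_locale and count_characters are
only illustrative and not taken from any existing program), counting
characters rather than bytes could look roughly like this:

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Select the locale named by LANG, LC_ALL and friends.
 * Returns 1 on success, 0 if that locale cannot be selected. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}

/* Count characters (not bytes) in a multibyte string, decoding it
 * according to the current LC_CTYPE locale.  Returns (size_t)-1 if
 * the string contains an invalid or truncated sequence. */
size_t count_characters(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);    /* start in the initial shift state */
    while (remaining > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, remaining, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;          /* invalid or incomplete sequence */
        if (n == 0)
            break;                      /* decoded a terminating null */
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

A program would call use_environment_locale() once at startup and
then count_characters() on its input.  Nothing in the code names an
encoding, so the same binary should handle EUC-JP, UTF-8, or plain
ASCII text depending only on the user's locale settings.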

Solaris follows this model.  See
http://docs.sun.com/ab2/coll.651.1/SOLUNICOSUPPT
for details.

However, the current woody system (with locale 2.1.97-1) has only
one UTF-8 locale, ko_KR.utf8.  UTF-8 locales are needed for this
model to work well.  Why is this the only one?
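
To check whether a particular locale is installed and which
character code it implies, something like the following could be
used.  This is only a sketch: locale_codeset is a made-up helper
name, and it assumes the POSIX nl_langinfo(CODESET) interface is
available on the system.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Copy the codeset name of locale `name` (e.g. "ko_KR.utf8") into
 * buf.  Returns 0 on success, -1 if the locale cannot be selected,
 * which usually means it is not installed.  The previous LC_CTYPE
 * setting is restored before returning. */
int locale_codeset(const char *name, char *buf, size_t buflen)
{
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);

    /* Save the current locale name; setlocale may reuse its buffer. */
    if (current == NULL || strlen(current) >= sizeof saved)
        return -1;
    strcpy(saved, current);

    if (setlocale(LC_CTYPE, name) == NULL)
        return -1;                      /* locale not available */

    snprintf(buf, buflen, "%s", nl_langinfo(CODESET));
    setlocale(LC_CTYPE, saved);         /* restore the caller's locale */
    return 0;
}
```

Calling it with "ko_KR.utf8" and with other names such as
"ja_JP.utf8" would show which UTF-8 locales a system really
provides.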

---
Tomohiro KUBOTA <kubota@debian.org>
http://surfchem0.riken.go.jp/~kubota/


