
Re: UTF-8 editor support in Debian?



On Thu, 5 Jul 2001, dparsons@emerall.com wrote:
> 
> What are the main issues that need to be thought about?  Is it as
> simple as using w_char instead of char in the code, or is there more
> to it than that?  Can a general "plan of attack" be summarised in a
> couple of paragraphs?

That depends.  wchar_t is in general not portable, since it is defined
by the relevant standards as locale dependent, and could be as little
as 8 bits.  Moreover, the standard C mbs* and wcs* interfaces are
extremely limited and hard to use for any real work.  Even worse,
there are a lot of extremely buggy implementations around.  You can in
general only count on wchar_t being Unicode based if you are on a
glibc based system.  In particular, this is not so for any of the free
BSDs or Solaris etc.
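
To make the portability point concrete, here is a minimal sketch in C.
It assumes a C99 compiler, and the function names (wchar_is_unicode,
wchar_bits, utf8_decode) are made up for this illustration rather than
taken from any library.  The first two functions report what the
platform promises about wchar_t; the third sidesteps wchar_t entirely
by decoding UTF-8 into a fixed-width uint32_t code point.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Returns 1 if the implementation promises that wchar_t values are
   ISO 10646 (Unicode) code points, 0 otherwise.  The macro is the
   standard C99 way for an implementation to announce this. */
int wchar_is_unicode(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Width of wchar_t in bits on this platform. */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Hypothetical helper: decode one UTF-8 sequence from s (at most n
   bytes) into *out.  Returns the number of bytes consumed, or 0 if
   the input is empty, truncated, overlong, a surrogate, or above
   U+10FFFF. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
    uint32_t cp, min;
    size_t len, i;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {                      /* plain ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 2-byte lead */
        cp = s[0] & 0x1F; len = 2; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 3-byte lead */
        cp = s[0] & 0x0F; len = 3; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 4-byte lead */
        cp = s[0] & 0x07; len = 4; min = 0x10000;
    } else {
        return 0;                           /* stray continuation or invalid byte */
    }

    if (n < len)
        return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)          /* must be 10xxxxxx */
            return 0;
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    *out = cp;
    return len;
}
```

A program would call wchar_is_unicode() once at startup and fall back
to its own code point type when the answer is 0.  Keeping text in
UTF-8 and decoding on demand, as utf8_decode does, is one way to avoid
depending on wchar_t at all; it is not the only one.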

I recommend that you have a look at Markus Kuhn's Unicode page.  I
don't have the URL handy, but do a google search for "markus kuhn
unicode".  You will also want to have a look at Bruno Haible's
libiconv, libcharset and libutf8 packages, which provide partial (but
extremely useful) solutions to the portability problems mentioned
above.
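
As a rough illustration of what an iconv-style interface buys you,
the sketch below converts ISO-8859-1 text to UTF-8 through the
standard iconv_open/iconv/iconv_close calls, which libiconv provides
on systems whose C library lacks them.  The wrapper name
latin1_to_utf8 is hypothetical, and the encoding names "UTF-8" and
"ISO-8859-1" are assumed to be accepted by the local implementation
(on some systems you may also need to link with -liconv).

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper: convert inlen bytes of ISO-8859-1 text at in
   to UTF-8 in out (capacity outcap).  Returns the number of bytes
   written, or (size_t)-1 if the converter is unavailable or the
   output buffer is too small. */
size_t latin1_to_utf8(const char *in, size_t inlen, char *out, size_t outcap)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv takes char **, but does not modify the input */
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap;
    size_t rc;

    cd = iconv_open("UTF-8", "ISO-8859-1");   /* arguments are (to, from) */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1 || inleft != 0)
        return (size_t)-1;
    return outcap - outleft;
}
```

The point of going through iconv is that the program never needs to
know how wchar_t is laid out on the host; it only deals in named
encodings and byte buffers.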

As for editors, emacs20 with mule-ucs is fine but has a few caveats
due to "philosophic differences" between Unicode and ISO-2022 (which
is what the mule internal coding is based on).  The next version of
emacs, emacs 21, is said to include native and much smoother support
for non-CJK Unicode, though I haven't seen it myself.  Other than
that, you might want to try yudit.

-- 
Gaute Strokkenes                        http://www.srcf.ucam.org/~gs234/
Let's climb to the TOP of that MOUNTAIN and think about STRIP MINING!!


