
Re: UTF-8 editor support in Debian?



Hello,

DP> So what exactly is the state of support for creating UTF-8 files
DP> at the moment?

It's pretty good, actually.  The main issue currently is the failed
brain surgery that the MULE folks have carried out on Emacs.

As Tomohiro has mentioned, there are two ways of implementing UTF-8
support: hard-wired UTF-8 characters internally (this approach has a
number of advantages), or traditional use of the libc support using
LC_CTYPE.

Debian's testing distribution currently has versions of libc and Xlib
with satisfactory support for UTF-8 locales.  This means that most
applications that use the traditional approach can be used for editing
UTF-8.  For example, you can do

  echo 'XEdit*international: yes' | xrdb -merge
  LC_ALL=en_US.UTF-8 xedit &

to get a reasonably functional UTF-8 editor.  Note, however, that
UTF-8 Asian input method support has only become reliable in XFree86
4.1.0 (currently in unstable), and that UTF-8 compose and dead-key
support will not appear before 4.2.0.  Single-keystroke input,
however, should work fine no matter how exotic your keysyms.
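
Since the traditional approach depends entirely on the locale, it can
help to check what the locale actually reports before blaming the
editor.  The following is only a minimal sketch: the function name
is_utf8_locale is made up for illustration, and it assumes that your
system provides the POSIX `locale' utility.

```shell
# Succeeds if the active locale's character map is UTF-8.
# `locale charmap` is the POSIX way to ask; the function name is
# hypothetical and not part of any package.
is_utf8_locale() {
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

if is_utf8_locale; then
    echo "current locale is UTF-8"
else
    echo "current locale is not UTF-8; try e.g. LC_ALL=en_US.UTF-8"
fi
```

If the check fails even with LC_ALL set, the locale has probably not
been generated on your system; `locale -a' lists the ones that have.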

(A side note: putting `*international: true' in your X defaults file
is currently not a good idea, as it breaks a number of Xaw clients.
I've just noticed that xfontsel is affected; I'll fix it ASAP.)

As has been mentioned before, both the console and XTerm (as well as
reportedly gnome-terminal) have reasonable support for a UTF-8 mode;
this means that UTF-8 console applications should work fine, whether
they use the hardcoded or traditional approach.  Currently, configuring
xterm to be usable in UTF-8 mode is not as trivial as I'd like
(you typically have to use both the -u8 and -fn flags), but we're
working on fixing that upstream.

(The suggested command line is

  $ xterm -u8 -fn '-misc-fixed-medium-r-normal--14-130-75-75-c-70-iso10646-1'

but even then the fonts menu will not work right.)

Finally, if all else fails, you can always use your native encoding
and convert to UTF-8 post hoc.  The magic command line is

  $ iconv -f ISO-8859-1 -t UTF-8 <foo >bar

Use `iconv -l' to see the horribly bloated list of supported encodings.
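
As a rough sketch of that post hoc conversion, the snippet below wraps
the same iconv invocation in a filter and dumps the resulting bytes so
that you can see what changed.  The name latin1_to_utf8 is made up for
illustration, and ISO-8859-1 is assumed as the native encoding.

```shell
# Hypothetical helper: convert ISO-8859-1 on stdin to UTF-8 on stdout.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# 0xE9 (octal 351) is e-acute in ISO-8859-1; in UTF-8 it becomes the
# two bytes C3 A9.  od prints the converted bytes in hex.
printf 'caf\351\n' | latin1_to_utf8 | od -An -tx1
```

To check that a file is well-formed UTF-8, you can usually run it
through `iconv -f UTF-8 -t UTF-8 <bar >/dev/null' and look at the exit
status; the iconv implementations I know of fail on invalid input.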

Hope that helps,

                                        Juliusz


