
Re: Man pages and UTF-8



David Given <dg@cowlark.com> writes:

> What, then, should I be doing? Is it legitimate to include UTF-8 in my
> man page and assume that it'll be fixed (some day)? This
> seems... un-Debian-like.  Is there an alternative way of representing
> Unicode in troff that might work better?

> Of course, a perfectly viable solution is to simply not put non-ASCII
> characters in my man page, but this seems kinda cheating. Particularly
> since I went out of my way to ask the author how to spell his name in
> kanji...

Fixing UTF-8 support in man pages is a Debian-wide project.  It will
require a fair bit of coordination, including tracking down and dealing
with all of the man pages that currently use legacy encodings (at least
those that aren't in some locale with a different default charset).  I
think it's more than you could be expected to tackle for your package,
although it's something that we need to do at some point once groff is
ready.

In the meantime, your options are:

 * Install the man page into a locale that's defined to have UTF-8 input.
   As near as I can tell, the way the current setup works is that certain
   locales in the man tree are defined to have certain charsets.  I'm not
   sure where that code is, though.  And it sounds like that doesn't work
   for your situation.

 * Rewrite the man page to use named character escapes instead of raw
   UTF-8 input.  See the groff_char(7) man page for more details.
   However, I don't know if this will work for Asian characters.  It's the
   best solution for European characters, but a quick skim doesn't show a
   way to enter an arbitrary Unicode code point.
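
A possible sketch of the second option, for anyone who wants to try it:
groff_char(7) in current groff releases documents a \[uXXXX] form (four to
six uppercase hex digits) for naming a glyph by its Unicode code point.
The helper below is hypothetical (the name to_groff_escapes is not part of
any tool), and it assumes the installed groff and output device accept
that form and have a glyph for the character, which is not guaranteed for
kanji.  It leaves ASCII, including existing roff backslash escapes,
untouched.

```python
def to_groff_escapes(text: str) -> str:
    """Replace each non-ASCII character with a groff \\[uXXXX] escape.

    Hypothetical helper: assumes the target groff understands the
    \\[uXXXX] glyph-name form described in groff_char(7).
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII (including roff escapes) passes through unchanged.
            out.append(ch)
        else:
            # At least four uppercase hex digits; more for code points
            # above U+FFFF.
            out.append("\\[u%04X]" % cp)
    return "".join(out)


if __name__ == "__main__":
    import sys

    # Filter a UTF-8 man page source from stdin to stdout.
    sys.stdout.write(to_groff_escapes(sys.stdin.read()))
```

Run as a filter over the UTF-8 source of the page, this would produce an
ASCII-only file; whether the result displays correctly still depends on
the reader's terminal and groff setup.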

So, not good.  :/

It might be worth raising this on debian-devel, since it's been a sore
point in UTF-8 support for a while.

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>


