Re: UTF-8 manual pages

On Wed, Oct 17, 2007 at 06:03:34PM +0930, Clytie Siddall wrote:
> Another note about Colin's original, and very well-thought-out post:  
> I think Yelp _does_ support UTF-8. I'm pretty sure I tested my pilot  
> Vietnamese manpage in it (as well as groff-utf8) a year or two back.  

Actually I was thinking about this a few days ago too.  I just tried it
with Simplified Chinese (zh_CN):

On up-to-date unstable with yelp 2.20.0-1 and man-db 2.5.0-3:
The one-line description (in the search result page) is displayed
correctly.  The manpage itself is not: I see Latin characters, as if the
Chinese text were being parsed literally as ISO-8859-*.

On a not-so-up-to-date unstable with yelp 2.18.1-1 and man-db 2.4.4-4:
The one-line description is in English.  The manpage itself is broken
too, but in a different way -- I see Chinese (and some Japanese)
characters, but not the ones that should be there, just nonsense.
Something is wrong in the decoding-encoding process.

I am testing in an en_US.UTF-8 locale, and use "LANG=zh_CN.UTF-8 yelp" to
start yelp.  The man page used is chsh(1), shipped in the passwd package
itself; it is encoded in GB2312/GBK.  If I use "LANG=zh_CN.GBK yelp"
instead, everything is the same except that the one-line description on
up-to-date unstable is broken too (in yet a third way).
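
For anyone who wants to see this kind of breakage outside yelp, here is a
minimal Python sketch.  It assumes the page bytes are GBK and are simply
decoded with the wrong codec.  The sample string, the misdecode() helper,
and the choice of EUC-JP for the second case are my own illustrations; I
have not confirmed which conversion yelp or man-db actually performs.

```python
# Hypothetical sample text, not taken from the chsh(1) page.
sample = "中文手册"


def misdecode(text, actual="gbk", assumed="iso-8859-1"):
    """Encode text with its real encoding, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed, errors="replace")


# Case 1: GBK bytes read as ISO-8859-1.  Every byte becomes one
# accented Latin character or symbol, so the text doubles in length.
as_latin = misdecode(sample, assumed="iso-8859-1")

# Case 2 (a guess at the cause): GBK bytes read with another double-byte
# CJK codec.  The result looks like CJK text but is the wrong characters.
as_wrong_cjk = misdecode(sample, assumed="euc_jp")

print(as_latin)
print(as_wrong_cjk)
```

The first case is reversible: re-encoding the garbled string as
ISO-8859-1 and decoding it as GBK gives the original text back, which is
one way to check whether a given piece of output was produced this way.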

Just a data point for anyone who is interested in more digging.


