i18n of man-db improved; please test
I've uploaded man-db 2.4.2-1 to unstable, which includes many i18n
fixes. Specifically, it no longer makes the mistake of assuming that the
encoding of the source manual page has anything much to do with the
encoding of groff input or the locale you're using, and it will recode
input to or output from groff as necessary to take account of groff's
idiosyncrasies. Every locale I've managed to test now displays its man
pages properly, including in particular ISO-8859-2, KOI8-R, and UTF-8
encodings: http://people.debian.org/~cjwatson/man-db/freaky.png is a
particularly fun example of the last. :) You'll need groff >= 1.18.1-11
for anything involving source manual page encodings other than EUC-JP or
All of this requires man to carry a set of tables recording things like
required groff devices, LESSCHARSET values, and so on. In particular,
there's a big table at the start of src/encodings.c listing the source
encoding of each subdirectory of /usr/share/man and friends, which is
very likely to be missing a number of entries.
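
To make the idea of that table concrete, here is a rough sketch in Python
(man-db itself is C, and this is not its code). The SOURCE_ENCODINGS
entries, and the source_encoding() and recode() helpers, are names and
values I've made up for illustration; only the ISO-8859-1 default for
/usr/share/man/man* and EUC-JP for Japanese come from the text of this
mail. It looks up the assumed source encoding for a subdirectory and
recodes the page bytes to whatever the next stage wants.

```python
# Hypothetical table: man hierarchy subdirectory -> assumed source encoding.
# Entries are illustrative guesses, not copied from src/encodings.c.
SOURCE_ENCODINGS = {
    "man": "iso-8859-1",  # untranslated pages under /usr/share/man/man*
    "ja": "euc-jp",
    "pl": "iso-8859-2",
    "ru": "koi8-r",
}


def source_encoding(subdir):
    """Return the assumed source encoding, falling back to ISO-8859-1."""
    return SOURCE_ENCODINGS.get(subdir, "iso-8859-1")


def recode(page_bytes, subdir, target_encoding):
    """Decode page bytes from their source encoding, re-encode for the target."""
    text = page_bytes.decode(source_encoding(subdir))
    # Characters the target cannot represent are replaced rather than fatal.
    return text.encode(target_encoding, errors="replace")


# A KOI8-R page fragment recoded for display in a UTF-8 locale.
koi8_page = "Привет".encode("koi8-r")
utf8_page = recode(koi8_page, "ru", "utf-8")
print(utf8_page.decode("utf-8"))
```

A subdirectory missing from the table silently falls back to ISO-8859-1
in this sketch, which is exactly the kind of gap that shows up as
garbled output.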
Please test the new version in your locale and let me know if you have
any problems. If you do, I'd appreciate the output of 'locale' and a
sample man page I can use for testing.
Known problems:

* Pages like latin2(7) don't work, because /usr/share/man/man* is
assumed to be ISO-8859-1. Maybe one day there'll be a way for pages
to specify a non-default encoding, but it's not there yet, and
probably won't be there properly until groff 2. I've been thinking
of adding a --source-encoding option to man as a stopgap measure.
* Although I haven't actually tried it, I'd be astonished if CJKV
languages other than Japanese worked. Volunteers to fix this would
be appreciated. I'd suggest starting by looking at 'man -d' output
in a Japanese locale, finding the groff pipeline that man runs, and
seeing if you can use a similar one by hand for your locale. I
believe that the nippon device should work for other multibyte
encodings, but I've never had the chance to verify this.
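
To illustrate the first item in the list above: the same byte decodes to
different characters depending on which encoding is assumed. This is a
small standalone Python sketch, not anything from man-db, and the byte
chosen is just an example.

```python
# One byte as it might appear in an ISO-8859-2 page such as latin2(7).
raw = b"\xb1"

as_latin2 = raw.decode("iso-8859-2")  # what the page author intended
as_latin1 = raw.decode("iso-8859-1")  # what an ISO-8859-1 assumption yields

# The two interpretations differ: a-ogonek versus the plus-minus sign.
print(repr(as_latin2), repr(as_latin1))
```

Any page in /usr/share/man/man* that uses such bytes gets the second
interpretation, which is why it displays wrongly.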
I'm not subscribed to -i18n, sorry, so I'd appreciate copies of any
replies, but I will check the archives every so often.
Colin Watson [email@example.com]