
i18n of man-db improved; please test



Hi,

I've uploaded man-db 2.4.2-1 to unstable, which includes many i18n
fixes. Specifically, it no longer makes the mistake of assuming that the
encoding of the source manual page has anything much to do with the
encoding of groff input or the locale you're using, and it will recode
input to or output from groff as necessary to take account of groff's
idiosyncrasies. Every locale I've managed to test now displays its man
pages properly, including in particular those using the ISO-8859-2,
KOI8-R, and UTF-8 encodings:
http://people.debian.org/~cjwatson/man-db/freaky.png is a particularly
fun example of the last. :) You'll need groff >= 1.18.1-11
for anything involving source manual page encodings other than EUC-JP or
ISO-8859-1.
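
To give a rough idea of what "recoding" means here, the following is a
minimal sketch in Python, not man-db's actual C code. The function name
recode() and the choice of example encodings are assumptions made purely
for illustration; the real program also has to pick the right groff
device, which this does not attempt.

```python
def recode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Re-encode page bytes from source_encoding to target_encoding.

    Hypothetical helper for illustration; not man-db's implementation.
    Characters the target encoding cannot represent become '?'.
    """
    text = data.decode(source_encoding)
    return text.encode(target_encoding, errors="replace")


# A Polish word stored as ISO-8859-2 bytes, recoded for a UTF-8 terminal.
page = "żółw".encode("iso-8859-2")
print(recode(page, "iso-8859-2", "utf-8").decode("utf-8"))
```

The point is only that the source encoding and the output encoding are
treated as two independent things, with a conversion in between.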

All of this requires that man has a set of tables with things like
required groff devices, LESSCHARSET values, and so on. In particular,
there's a big table at the start of src/encodings.c listing the source
encoding of each subdirectory of /usr/share/man and friends, which is
very likely to be missing a number of entries.
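
For anyone wondering what such a table amounts to, here is a small
sketch of the idea in Python. The three entries are illustrative
guesses, not copied from src/encodings.c, and the names
DIRECTORY_ENCODINGS and source_encoding() are hypothetical. It also
shows why a missing entry matters: an unlisted directory silently falls
back to the default.

```python
import posixpath

# Illustrative entries only; these rows are assumptions for the example
# and are not taken from the real table in src/encodings.c.
DIRECTORY_ENCODINGS = {
    "ja": "EUC-JP",
    "pl": "ISO-8859-2",
    "ru": "KOI8-R",
}

# Pages directly under /usr/share/man/man* are assumed to be ISO-8859-1.
DEFAULT_ENCODING = "ISO-8859-1"


def source_encoding(page_path: str, man_root: str = "/usr/share/man") -> str:
    """Guess a page's source encoding from its language subdirectory."""
    relative = posixpath.relpath(page_path, man_root)
    first_component = relative.split("/")[0]
    # Unlisted directories (and man1, man2, ...) fall back to the default.
    return DIRECTORY_ENCODINGS.get(first_component, DEFAULT_ENCODING)
```

With these sample entries, a page under /usr/share/man/ru/ would be read
as KOI8-R, while one under a language directory that isn't listed would
be misread as ISO-8859-1.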

Please test the new version in your locale and let me know if you have
any problems. If you do, I'd appreciate the output of 'locale' and a
sample man page I can use for testing.

Known issues:

  * Pages like latin2(7) don't work, because /usr/share/man/man* is
    assumed to be ISO-8859-1. Maybe one day there'll be a way for pages
    to specify a non-default encoding, but it's not there yet, and
    probably won't be there properly until groff 2. I've been thinking
    of adding a --source-encoding option to man as a stopgap measure.

  * Although I haven't actually tried it, I'd be astonished if CJKV
    languages other than Japanese worked. Volunteers to fix this would
    be appreciated. I'd suggest starting by looking at 'man -d' output
    in a Japanese locale, finding the groff pipeline that man runs, and
    seeing if you can use a similar one by hand for your locale. I
    believe that the nippon device should work for other multibyte
    encodings, but I've never had the chance to verify this.
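
If the 'man -d' output is long, one low-tech way to find the pipeline is
to save the output to a file and keep only the lines that mention groff
or its relatives. The sketch below assumes nothing about the debug
format beyond it being line-oriented text; the function name and the
default list of program names to match are my own guesses.

```python
def pipeline_lines(debug_output: str, programs=("groff", "nroff", "troff")):
    """Return the lines of captured debug text that mention any program.

    Hypothetical helper: it does a plain substring match and makes no
    assumption about how the debug output is laid out.
    """
    return [
        line
        for line in debug_output.splitlines()
        if any(name in line for name in programs)
    ]
```

Capturing both stdout and stderr when saving the output is the safe
choice, since this filter only sees whatever text it is handed.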

I'm not subscribed to -i18n, sorry, so I'd appreciate copies of any
replies, but I will check the archives every so often.

Cheers,

-- 
Colin Watson                                  [cjwatson@flatline.org.uk]


