Re: Man pages and UTF-8
-----BEGIN PGP SIGNED MESSAGE-----
Ben Finney wrote:
>> The standard encoding for Japanese man pages is EUC-JP
> That's no more true than "the standard encoding for English text is
> ASCII". The world is moving to Unicode encodings, though legacy
> encodings will remain for some time.
> They're also both equally irrelevant. The standard encoding for Debian
> GNU/Linux is UTF-8.
man-db says otherwise --- if you're in a Japanese locale it looks up man pages
in /usr/share/man/ja and assumes they're in EUC-JP format.
> A previous message in this thread asserted that groff is capable of
> generating UTF-8 output; but has trouble consuming UTF-8 input.
Again, man-db says otherwise.
In fact, man-db says that while there's a default table of hard-coded
encodings, this may be overridden by an explicit encoding in the directory name.
(I missed that comment last time.) Does this mean that if I install my UTF-8
encoded man page into /usr/share/man/en.UTF-8, it'll all work? What happens if
someone tries to read the man page in a non-English locale? I know that if a
locale-specific man page isn't found it'll fall back to the C locale (i.e.
/usr/share/man/manX), but can it be set to also fall back explicitly to English?
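
To make the question concrete, here is a rough sketch of the lookup as I
understand it. It is not man-db's actual code. The function names, the
one-entry default table, the ISO-8859-1 default and the extra_fallbacks
parameter are all assumptions of mine; only the ja -> EUC-JP mapping and the
"explicit encoding in the directory name" rule come from what is described
above.

```python
import posixpath

# Hypothetical default table; only the "ja" entry is taken from this thread.
DEFAULT_ENCODINGS = {"ja": "EUC-JP"}
# Assumed encoding for languages with no table entry (illustration only).
FALLBACK_ENCODING = "ISO-8859-1"


def page_encoding(man_dir):
    """Guess the encoding of pages under e.g. /usr/share/man/ja."""
    name = posixpath.basename(man_dir.rstrip("/"))
    lang, dot, codeset = name.partition(".")
    if dot and codeset:
        # An explicit encoding in the directory name overrides the table.
        return codeset
    return DEFAULT_ENCODINGS.get(lang, FALLBACK_ENCODING)


def candidate_dirs(lang, root="/usr/share/man", extra_fallbacks=()):
    """List directories to search, most specific first.

    extra_fallbacks is the hypothetical "also try English" setting asked
    about above; the C-locale pages (manX) live directly under root.
    """
    dirs = [posixpath.join(root, lang)]
    dirs += [posixpath.join(root, fb) for fb in extra_fallbacks]
    dirs.append(root)
    return dirs


if __name__ == "__main__":
    for d in candidate_dirs("ja", extra_fallbacks=("en.UTF-8",)):
        print(d, page_encoding(d))
```

If the real lookup behaves anything like this, the English fallback would
just be one more entry between the locale directory and the C-locale pages.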
┌── ｄｇ＠ｃｏｗｌａｒｋ．ｃｏｍ ─── http://www.cowlark.com ───────────────────
│ "There does not now, nor will there ever, exist a programming language in
│ which it is the least bit hard to write bad programs." --- Flon's Axiom
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
-----END PGP SIGNATURE-----