
Re: Man pages and UTF-8



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ben Finney wrote:
[...]
>> The standard encoding for Japanese man pages is EUC-JP
> 
> That's no more true than "the standard encoding for English text is
> ASCII". The world is moving to Unicode encodings, though legacy
> encodings will remain for some time.
> 
> They're also both equally irrelevant. The standard encoding for Debian
> GNU/Linux is UTF-8.

man-db says otherwise: if you're in a Japanese locale, it looks up man pages
in /usr/share/man/ja and assumes they're encoded in EUC-JP.

> A previous message in this thread asserted that groff is capable of
> generating UTF-8 output; but has trouble consuming UTF-8 input.

Again, man-db says otherwise.

In fact, man-db says that while there's a default table of hard-coded
encodings, this may be overridden by an explicit encoding in the directory
name, e.g.:

/usr/share/man/ru.KOI8-R
/usr/share/man/ru.UTF-8
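
A minimal sketch of the directory-name rule as described above, in Python.
This is not man-db's actual code: split_man_dir and DEFAULT_ENCODINGS are
hypothetical names, and the table holds only the one default mentioned in
this thread (ja -> EUC-JP); the real table is larger.

```python
# Hypothetical stand-in for man-db's hard-coded table. Only the "ja"
# entry comes from this thread; the real table has more entries.
DEFAULT_ENCODINGS = {"ja": "EUC-JP"}


def split_man_dir(name):
    """Split a man directory name such as 'ru.KOI8-R' into (language, encoding).

    An explicit '.ENCODING' suffix wins; otherwise the default table is
    consulted, and None is returned when the language is not listed.
    """
    lang, sep, enc = name.partition(".")
    if sep and enc:
        return lang, enc
    return lang, DEFAULT_ENCODINGS.get(lang)


if __name__ == "__main__":
    for d in ("ja", "ru.KOI8-R", "ru.UTF-8"):
        print(d, split_man_dir(d))
```

Under this reading, a directory named en.UTF-8 would be treated as English
pages in UTF-8 regardless of what the default table says for "en".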

(I missed that comment last time.) Does this mean that if I install my
UTF-8-encoded man page into /usr/share/man/en.UTF-8, it'll all work? What
happens if someone tries to read the man page in a non-English locale? I know
that if a locale-specific man page isn't found, man-db falls back to the C
locale (i.e. /usr/share/man/manX), but can it be set to also fall back
explicitly to English?
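
To make the question concrete, here is a rough Python sketch of the search
order described above: the locale's language directory first, then the C
locale directory. It is a simplification, not man-db's real lookup;
candidate_dirs and its extra_fallbacks parameter are hypothetical, and
extra_fallbacks only illustrates what an explicit English fallback would
look like if such a setting existed.

```python
def candidate_dirs(locale, section, root="/usr/share/man", extra_fallbacks=()):
    """Return the directories to search, in order, for one manual section.

    Simplified: only the bare language code of the locale is tried.
    extra_fallbacks is a hypothetical knob, not a known man-db option.
    """
    # "ja_JP.UTF-8" -> "ja_JP" -> "ja"
    lang = locale.split(".")[0].split("_")[0]
    dirs = []
    if lang not in ("C", "POSIX"):
        dirs.append(f"{root}/{lang}/man{section}")
    for fb in extra_fallbacks:
        dirs.append(f"{root}/{fb}/man{section}")
    # The C locale pages live directly under root/manX.
    dirs.append(f"{root}/man{section}")
    return dirs


if __name__ == "__main__":
    for d in candidate_dirs("ja_JP.UTF-8", 1, extra_fallbacks=("en.UTF-8",)):
        print(d)
```

Without extra_fallbacks, a page installed only under en.UTF-8 would never be
reached from a Japanese locale in this model, which is the case being asked
about.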

- --
┌── dg@cowlark.com ─── http://www.cowlark.com ───────────────────
│
│ "There does not now, nor will there ever, exist a programming language in
│ which it is the least bit hard to write bad programs." --- Flon's Axiom
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGwHDvf9E0noFvlzgRAjjuAJ0WWBwgg/1BJFHEcOkeUiLmWdQ2lgCfVSaa
Mf37sGztIml9GsBr4tc66rY=
=Ot16
-----END PGP SIGNATURE-----


