Please keep an eye on manpage encoding issues (especially in etch).
Hi debian-i18n members,
Some packages ship translated manpages in the wrong encoding for some
languages; typically the manpages are encoded in UTF-8 and are
unreadable in any environment.
Since older manpages carry no encoding information themselves,
each parent directory of translated manpages has a default input
manpage encoding, which is defined in src/encodings.c of the man-db
program (e.g. Japanese manpages under /usr/share/man/ja are all
assumed to be EUC-JP-encoded files). In the past, no one violated this
convention and it worked well, but these days automatic generation of
manpages from other formats and the trend toward UTF-8 sometimes break it.
I've found the following four bug reports on wrongly encoded manpages,
which affect the upcoming etch release:
Bug#391061 (aptitude):
Japanese, due to DocBook XSL, open in testing (0.4.3-1),
fixed in 0.4.4-1 (unstable).
Bug#391699 (apt):
Japanese, due to DocBook XSL, open,
patch available but local regeneration before package rebuild required.
Bug#395503 (manpage-es):
Spanish, open.
Bug#397953 (debhelper):
French, open.
I'm afraid more packages are affected by this issue.
Since the etch release is approaching and I think unreadable
manpages are important yet easy-to-fix bugs, I'd like to get them
squashed.
For Japanese manpages, Junichi Uekawa confirmed that only the manpages
in apt and aptitude were affected, using the following procedure [1]:
(1) Extract the /usr/share/man/ja entries from Contents-i386.gz and
unpack the corresponding packages.
(2) Run the following one-liner and find the manpages that cause errors:
for A in $(find usr/share/man/ja/ -type f) ;do echo ---------$A ; \
zcat $A | iconv -f euc-jp -t euc-jp > /dev/null; done
This procedure should be applicable to other encoding-directory pairs.
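As a rough sketch of how the check could be parameterized for another
encoding-directory pair, the one-liner can be wrapped in a shell function.
The function name check_man_encoding and its two arguments are hypothetical,
not part of any existing tool; the sketch assumes the manpages are
gzip-compressed and that the local iconv knows the encoding name given.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): print every gzip-compressed
# manpage under DIR whose contents are not valid in ENCODING.
# Usage: check_man_encoding DIR ENCODING
check_man_encoding() {
    dir=$1
    enc=$2
    find "$dir" -type f -name '*.gz' | while IFS= read -r page; do
        # The pipeline's exit status is iconv's; iconv exits non-zero
        # when the input contains a sequence invalid in the source encoding.
        if ! gzip -dc "$page" | iconv -f "$enc" -t "$enc" >/dev/null 2>&1; then
            printf '%s\n' "$page"
        fi
    done
}
```

For the Japanese case this would be called as
"check_man_encoding usr/share/man/ja EUC-JP". Unlike the one-liner, it
prints only the failing files rather than a separator line per file.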
Since I cannot find out the encoding-directory relationships for
other languages without reading src/encodings.c, I would be grateful
if you could check the languages you are familiar with.
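One caveat when checking other languages: if a directory's default turns
out to be a single-byte encoding such as ISO-8859-1, every byte sequence is
valid in it, so a round trip through iconv cannot fail and will not reveal
a UTF-8 page. A possible alternative, sketched below as an assumption
rather than an established procedure, is to look for pages that contain
non-ASCII bytes and nevertheless decode cleanly as UTF-8. The function
name find_utf8_pages is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): print every gzip-compressed
# manpage under DIR that contains non-ASCII bytes and is valid UTF-8.
# Usage: find_utf8_pages DIR
find_utf8_pages() {
    dir=$1
    find "$dir" -type f -name '*.gz' | while IFS= read -r page; do
        # Pure-ASCII pages read the same in all of these encodings; skip them.
        if gzip -dc "$page" | iconv -f ASCII -t ASCII >/dev/null 2>&1; then
            continue
        fi
        # Non-ASCII text in a legacy 8-bit encoding rarely forms valid
        # UTF-8, so a page that passes this check is probably UTF-8.
        if gzip -dc "$page" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
            printf '%s\n' "$page"
        fi
    done
}
```

This is a heuristic: a hit means "probably UTF-8", and it is only
meaningful for directories whose expected encoding is not UTF-8.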
I also propose that, in the future (post-etch), we document the
encoding-directory relationships of manpages somewhere and check them
with package checkers (lintian, linda, and piuparts), so that package
maintainers catch this issue themselves.
[1] http://lists.debian.or.jp/debian-devel/200610/msg00010.html (in Japanese)
Regards,
-nori