
Re: Please keep an eye on manpage encoding issues (especially in etch).



Hi,

Thank you for raising this problem, Clytie. :-)

From: Clytie Siddall <clytie@riverland.net.au>
Subject: Re: Please keep an eye on manpage encoding issues (especially in etch).
Date: Sun, 12 Nov 2006 17:06:40 +1030

> On 12/11/2006, at 3:48 PM, Kobayashi Noritada wrote:
> >
> >
> >>> Manpage encoding issues are seen for some packages and for some
> >>> languages; some manpages are encoded in UTF-8 and unreadable in any
> >>> environment.
> >>
> >> Can we specify the encoding in the manpage text in some way?
> >
> > No, the input encoding is determined by the locale and input manpage
> > path, which is hard-coded in man-db's src/encodings.c.
> 
> :(
> 
> Users of some languages, like mine, will need UTF8 anyway, and will  
> set their man conf correspondingly. In any case, UTF8 will be used  
> increasingly. It is, or should be, the standard encoding for  
> internationalization.

Yes, I think exactly the same.  I hear that Fedora has required
manpages to be encoded in UTF-8.  Debian should do the same in the
future, or at least find some way to support UTF-8 manpages for
languages that have so far used non-UTF-8 ones (e.g. by detecting
input encodings, or by switching to UTF-8 completely).
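
To illustrate what "detecting input encodings" could mean, here is a
minimal Python sketch.  It is not how man-db works; the function name
and the idea of passing the legacy encoding as a parameter are my own
assumptions.  It tries a strict UTF-8 decode first and falls back to
the legacy encoding for the language only when that fails.

```python
def guess_manpage_encoding(data: bytes, legacy: str) -> str:
    """Guess whether raw manpage bytes are UTF-8 or the legacy encoding.

    Hypothetical helper: 'legacy' is the encoding traditionally used for
    the page's language (e.g. "euc_jp" or "iso8859_1").
    """
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return legacy
    # Pure ASCII pages also land here; they are valid in both encodings.
    return "utf-8"
```

The UTF-8 test has to come first, because an encoding like latin-1
accepts every byte sequence and so can never be ruled out by decoding
alone.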

However, switching manpage encodings to UTF-8 completely will require
both modifying man-db's encoding handling and converting all the old
non-UTF-8 manpages (e.g. the EUC-JP ones used for Japanese, and the
latin-1 ones used for European languages).  That will be a big
transition and will be controversial.  In fact, it was already
controversial, at least in a thread[1] that started from my post to a
Japanese list this April.

So, in my first mail, I proposed deciding on a policy for manpage
encodings and creating a system to check manpage encodings in the
future.  After that, or in parallel, we should carry out a transition
or add support.
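
As a rough idea of what such a checking system might look like, here is
a small Python sketch.  Everything in it is an assumption for
illustration: the EXPECTED table is hypothetical and is not copied from
man-db's src/encodings.c, and the directory layout
<root>/<lang>/man<section>/<page> is simplified.  It reports pages that
contain non-ASCII bytes and decode cleanly as UTF-8 in a language
directory where a legacy encoding is expected.

```python
import gzip
from pathlib import Path

# Hypothetical policy table: language directory -> approved legacy encoding.
EXPECTED = {"ja": "euc_jp", "fr": "iso8859_1"}


def read_page(path: Path) -> bytes:
    """Read a manpage, transparently handling gzip-compressed files."""
    if path.suffix == ".gz":
        with gzip.open(path, "rb") as fh:
            return fh.read()
    return path.read_bytes()


def looks_like_utf8(data: bytes) -> bool:
    """True if data has non-ASCII bytes and is valid UTF-8 (a heuristic)."""
    if data.isascii():
        return False  # ASCII-only pages are fine under any policy here
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def check_tree(root: Path, expected=EXPECTED):
    """Return (relative path, expected encoding) for suspicious pages."""
    problems = []
    for lang, encoding in sorted(expected.items()):
        lang_dir = root / lang
        if not lang_dir.is_dir():
            continue
        for page in sorted(lang_dir.glob("man*/*")):
            if page.is_file() and looks_like_utf8(read_page(page)):
                problems.append((page.relative_to(root).as_posix(), encoding))
    return problems
```

Something like check_tree(Path("/usr/share/man")) could then be run over
an installed tree, although real locale directory names are more varied
than this sketch assumes.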

> But meanwhile, we need a way to label non-UTF8 manpages, or manpages  
> which don't match the man conf encoding setting. Bruno, what do you  

Yes, a number of non-UTF-8 manpages are currently in use, since for
many languages only non-UTF-8 ones are approved.  And at least for
etch, the UTF-8 manpages for those languages now need to be converted.
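
Assuming that "conversion" here means re-encoding a UTF-8 page into the
legacy encoding approved for its language, a minimal Python sketch
could look like the following.  The function names are hypothetical,
and the second helper is there because some characters in a UTF-8 page
may have no representation in the legacy encoding.

```python
def utf8_to_legacy(data: bytes, legacy: str) -> bytes:
    """Re-encode UTF-8 manpage bytes into a legacy encoding.

    Raises UnicodeDecodeError if the input is not valid UTF-8, and
    UnicodeEncodeError if a character does not exist in 'legacy'.
    """
    return data.decode("utf-8").encode(legacy)


def unrepresentable(text: str, legacy: str) -> list[str]:
    """List the distinct characters of text that 'legacy' cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(legacy)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad
```

Checking unrepresentable() before converting lets a maintainer decide
how to replace such characters by hand instead of losing them silently.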

[1] http://lists.debian.or.jp/debian-users/200604/threads.html#00078 (in Japanese)

Thanks,

-nori


