Re: [debian-i18n] i18n of man-db improved; please test

On Mon, Sep 22, 2003 at 02:27:25AM +0200, Rüdiger Kuhlmann wrote:
> >--[Colin Watson]--<cjwatson@debian.org>
> > On Sun, Sep 21, 2003 at 11:52:06AM +0200, Rüdiger Kuhlmann wrote:
> > > >--[Colin Watson]--<cjwatson@debian.org>
> > > > pages properly, including in particular ISO-8859-2, KOI8-R, and UTF-8
> > > > encodings: http://people.debian.org/~cjwatson/man-db/freaky.png is a
> > > Well, there are man pages in the ru and ru_RU subdirectories. At least
> > > those don't work as expected, though they might be misplaced there.
> > > They belong to rpm, mc and adduser.
> > That's very curious, because I used those pages as test cases.
> You're right: I tested my mICQ man pages that ended up there as well. The
> difference is that those contain an .encoding line, while the others don't.
> Oops. I thought those were helping things.

.encoding has received, er, about ->.<- this much testing in some time,
and probably doesn't interact well with man also doing recoding. Also I
think some of its assumptions are flawed with current groff, although
Ukai-san might disagree with me. :-) I'd avoid it for now if I were you.
The ascii8 device is almost as much of a hack, but probably a safer one
now that man can make good use of it in a variety of locales.

What .encoding line were you using?

> Btw, is there a way to display a man page in another encoding without the
> necessity to open a new terminal for it? Because e.g.
> LC_ALL=ru_RU.KOI8-R LC_CTYPE=UTF-8 man rpm
> doesn't give me UTF-8 output, but KOI8-R.

Your LC_CTYPE there seems wrong to me: you want LC_CTYPE=ru_RU.UTF-8,
surely? (LC_CTYPE controls things other than just the character set,
such as case conversion rules, and I don't think glibc supports
specifying only one bit of it.) Furthermore, LC_ALL overrides LC_CTYPE,
not the other way round, so maybe using LANG for a default instead of
LC_ALL would be better.
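
To illustrate the precedence described above, here is a minimal sketch in Python. It assumes the usual POSIX lookup order for a single category (LC_ALL first, then the category's own variable, then LANG), and the function name effective_ctype is made up for this example; it is not how man or glibc implement the lookup.

```python
def effective_ctype(env):
    """Return the locale name that governs LC_CTYPE for a given environment.

    `env` is a plain dict standing in for the process environment.
    Precedence assumed here: LC_ALL, then LC_CTYPE, then LANG.
    Unset and empty variables are both skipped.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    # Nothing set: fall back to the portable "C" locale (an assumption;
    # the real default is implementation-defined).
    return "C"


# The command from the quoted question: LC_ALL wins, so KOI8-R is used.
print(effective_ctype({"LC_ALL": "ru_RU.KOI8-R", "LC_CTYPE": "UTF-8"}))

# Using LANG as the default lets LC_CTYPE take effect.
print(effective_ctype({"LANG": "ru_RU.KOI8-R", "LC_CTYPE": "ru_RU.UTF-8"}))
```

With LANG supplying the default, the second call resolves to ru_RU.UTF-8, which is the behaviour the original command was presumably after.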

> > > Another bug I noticed is that in the ru_RU.UTF-8 locale, man won't
> > > find the man pages under ru_RU.KOI8-R.
> > Hm. Yes, that is a bug (although not a regression; I think man-db 2.4.1
> > behaved the same way). I wonder how to solve that correctly and
> > generally.
> No idea. I consider the idea of making language and encoding depend on
> each other fundamentally flawed anyway. Not least because, without some
> installed locale, glibc can't even figure out that de_DE.UTF-8 uses the
> UTF-8 encoding. Wanna check some man pages? No problem, become root, add
> locale to the ever-increasing list of locales to generate, regenerate all
> those, start a new terminal for it, then finally you can check it. Duh.

I think that was a space-saving exercise. Can't comment on its wisdom or
otherwise. I certainly swear at locale-gen's slowness every time I have
to add a new one for testing, and generating anything in UTF-8 is
particularly slothful. I wonder if there's room for optimization there?
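
On the quoted point about needing a generated locale just to learn the codeset: when the name follows the common language[_territory][.codeset][@modifier] form, the codeset can be read from the name by plain string splitting. The following is only a sketch of that idea in Python; split_locale is a hypothetical helper, not something glibc or man-db provides, and it says nothing about names that omit the codeset (for those, only the locale data knows the answer).

```python
def split_locale(name):
    """Split a locale name of the form language[_territory][.codeset][@modifier].

    Returns (language, territory, codeset, modifier); missing parts are
    returned as empty strings. This is purely textual and never consults
    the installed locale data.
    """
    base, _, modifier = name.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    return language, territory, codeset, modifier


print(split_locale("de_DE.UTF-8"))
print(split_locale("ru_RU.KOI8-R"))
```

For a name such as de_DE@euro the codeset field comes back empty, which is exactly the case where the name alone is not enough.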

> PS. Did I mention that latex-ucs* needs updating? Not that -Tdvi seems
> to actually use it...

You'd probably have to take that up with groff upstream. I don't really
know the dvi device well at all.


Colin Watson                                  [cjwatson@flatline.org.uk]