
Re: Bug#467249: FW by lidaobing@gmail.com : Bug#467249: man-db: over sensitive on the spell of locale

On Thu, Feb 28, 2008 at 09:21:41PM +0100, Adam Borowski wrote:
> On Thu, Feb 28, 2008 at 10:42:30AM +0100, Michelle Konzack wrote:
> > It seems there is a common problem with setting up the correct UNICODE
> > locale on systems.  As the poster of the attached message has written,
> > he has set up his locale as "zh_CN.utf8", which is wrong, but as he has
> > also written, the output of "locale -a" shows it.
> No matter which way the _locale_ is spelt (including "vi_VI" without even the
> word "utf" inside),

Irrelevant to this bug, as you'll see if you look at the code.

> the _charset_ is UTF-8.  No program ever should look at the locale's
> name, as it has more quirks like this.  Checking the charset will get
> you what you want.
> > I think, there should be a global solution for this, since patching
> > man-db is worthless.
> Actually, it's groff that's at fault here.  Mostly.

man-db really does have some special-casing here. Trust me. It was
necessary at the time. There are a finite number of known aliases for
the very small number of locales in question, and until it becomes
unnecessary I will simply support those.

(And I agree that it should go away, but can't easily just yet.)
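
To illustrate the alias problem only: the sketch below is not man-db's
code, and the function names and the normalisation rule (lowercase the
codeset and drop punctuation) are assumptions made for the example. It
shows one way spellings such as "zh_CN.utf8" and "zh_CN.UTF-8" could be
treated as the same locale.

```python
import re


def split_locale(name):
    """Split 'lang[_TERRITORY][.codeset][@modifier]' into a 4-tuple.

    Missing parts are returned as None.
    """
    m = re.fullmatch(
        r"([^_.@]+)(?:_([^.@]+))?(?:\.([^@]+))?(?:@(.+))?", name
    )
    if m is None:
        raise ValueError(f"not a locale name: {name!r}")
    return m.groups()


def normalize_codeset(codeset):
    """Assumed rule: lowercase and keep only letters and digits."""
    return re.sub(r"[^0-9a-z]", "", (codeset or "").lower())


def same_locale(a, b):
    """True if two locale names differ only in codeset spelling."""
    lang_a, terr_a, code_a, mod_a = split_locale(a)
    lang_b, terr_b, code_b, mod_b = split_locale(b)
    return (
        (lang_a, terr_a, mod_a) == (lang_b, terr_b, mod_b)
        and normalize_codeset(code_a) == normalize_codeset(code_b)
    )
```

With this rule, same_locale("zh_CN.UTF-8", "zh_CN.utf8") is true, while
a name with a different codeset, such as "zh_CN.GB2312", does not match.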

Please don't drag groff into this bug. I really hate it when bugs drift
wildly off their original (accurately-constrained) topic despite
attempts to haul them back. It makes them impossible to keep organised.

> > $ LANG=zh_CN.UTF-8 man --warnings -l ls.zh_CN.1 > /dev/null
> > $ LANG=zh_CN.utf8 man --warnings -l ls.zh_CN.1 > /dev/null
> > <standard input>:9: warning: can't find special character `u013F'
> > <standard input>:9: warning: can't find special character `u011A'
> > <standard input>:9: warning: can't find special character `u021D'
> > <standard input>:11: warning: can't find special character `u0321'
> > <standard input>:11: warning: can't find special character `u04AA'
> > <standard input>:12: warning: can't find special character `u0461'
> > // snip
> Too bad, groff doesn't have real Unicode support, and supports only several
> special-cased locales (which may then be transcoded as UTF-8, but they still
> get wrapped into their old-style charsets).
> Instead of changing the special-case recognition, I would completely
> skip special-casing and just treat all characters equally.  Including, but
> not limited to, u013F and u0461.

Are you working with Brian M. Carlson on this? He has been working on a
solution acceptable to groff upstream, which is, frankly, the only way I
want to go now. He has already made substantial progress with character
class support.

Treating all characters equally will absolutely not be acceptable to
groff upstream. groff is a typesetter and needs to know about properties
of characters.
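
As a rough illustration of what "properties of characters" can mean:
the sketch below uses Python's unicodedata module, not groff's character
classes, and the choice of properties is an assumption made for the
example. It shows that code points differ in ways a typesetter may care
about, such as general category, display width and combining behaviour.

```python
import unicodedata


def describe(ch):
    """Return a few Unicode properties of a single character."""
    return {
        "codepoint": f"U+{ord(ch):04X}",
        # General category, e.g. 'Ll' (lowercase letter), 'Mn' (nonspacing mark)
        "category": unicodedata.category(ch),
        # East Asian width, e.g. 'W' (wide) for most CJK ideographs
        "east_asian_width": unicodedata.east_asian_width(ch),
        # Canonical combining class; 0 means the character does not combine
        "combining": unicodedata.combining(ch),
    }


if __name__ == "__main__":
    for ch in ("a", "\u4e2d", "\u0301"):
        print(describe(ch))
```

A Latin letter, a CJK ideograph and a combining accent each report
different values here, so handling them identically would discard
information of this kind.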


Colin Watson                                       [cjwatson@debian.org]
