Re: Man pages and UTF-8

On Fri, Aug 10, 2007 at 11:24:08AM +0100, David Given wrote:
> Ben Finney wrote:
> [...]
> > That sounds like a bug. I was under the impression that the default
> > encoding of everything in lenny was supposed to be UTF-8.
> > 
> > What tool is it that has this different default encoding?
> Well, I tried UTF-8 with the assumption that it would work, and it threw up a
> lintian warning and produced gibberish when viewing the man page (with default
> 'man'). After searching the 'net I found this list in the LFS:
> http://www.linuxfromscratch.org/lfs/view/6.2/chapter06/man-db.html
> (See 6.45.2.)

I would call this a bug; in Etch it was "only" "important".
ANY non-UTF-8 file on a modern system installed by the distribution (and not
in the user's private data, /mnt/win/ or an upstream source tarball) is bad
for a number of reasons, mangling people's surnames being one of the less
important ones.

All data files should be in UTF-8 (or UCS4, or any other format which does
not cause data loss).  If a user then chooses to use a broken charset due
to his/her historic preferences, so be it -- but you cannot inflict data
loss on others.  If man-db does this, it needs to be beaten with a large
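
For anyone who wants to see which installed pages are affected, here is a
rough sketch (not anything man-db or lintian actually ships) that reports
files which fail to decode as strict UTF-8.  The function names are made up
for illustration, and the directory you point it at is your own choice.

```python
import gzip
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def read_page(path: Path) -> bytes:
    """Read a file, decompressing it first if it starts with the gzip magic."""
    raw = path.read_bytes()
    if raw[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.decompress(raw)
    return raw


def non_utf8_pages(root: Path) -> list[Path]:
    """List regular files under root whose contents are not valid UTF-8."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and not is_valid_utf8(read_page(p))
    )
```

Calling non_utf8_pages(Path("/usr/share/man")) would be one way to use it,
assuming that is where your pages live.  Pure ASCII pages pass, since ASCII
is a subset of UTF-8; a legacy 8-bit page will usually fail, though a byte
sequence can occasionally be valid UTF-8 by coincidence, so treat a pass as
a hint rather than proof.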

1KB		// Microsoft corollary to Hanlon's razor:
		//	Never attribute to stupidity what can be
		//	adequately explained by malice.

